Pattern Matching and Regular Expressions
In earlier lessons, you were introduced to the wildcard symbol (*), which helps you find the files you need by name. Regular expressions (REs) take the same idea further: used in conjunction with grep, they let you search the contents of files on your Unix system at a very granular level. In this section of the lesson, we will look at how to use REs to search for content within a file. This can be helpful if you have saved email that you want to parse for specific content, or a long file in which you are searching only for a company name such as "Que" or "Sams." With REs, this information can be found quickly. Let's make a file and then use REs to search within it for specific content.
What Is Perl?
Perl is one of the most commonly used programming languages on the Web today.
Often expanded as Practical Extraction and Report Language, Perl is a programming language developed by Larry Wall. Perl was designed especially for processing text. Because of its strong text-processing capabilities, Perl is one of the most popular languages for writing CGI scripts. Perl is an interpreted language, which makes it easy to build and test simple programs.
Like REs, Perl takes seriously committed time and practice to learn, and you need a fundamental understanding of programming to make sense of it. Coverage of both Perl and regular expressions in this chapter is limited, because the purpose of the chapter is not for you to master either one.
Perl comes with many distributions of Unix and Linux. You can learn more about Perl at http://www.perl.org/.
As we just mentioned, REs are a method of specifying a pattern of characters that can then be matched against existing text. In this example, we will make a text file containing text that we can search through.
The format for specifying the regular expression in grep is as follows: grep <regular expression> <filename> <filename> .... Because this lesson uses grep as its example, familiarize yourself with this format so that you can draw from it as we continue to use it throughout the chapter.
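To make the format concrete, here is a minimal sketch of it in use. The file names sample.txt and sample2.txt and the lines of text in them are hypothetical, invented only for illustration; the pattern here is a plain word, which is the simplest regular expression there is.

```shell
# Create a small sample file. The file name and its contents are
# assumptions made up for this example.
printf '%s\n' \
  'A message that mentions Que.' \
  'A message that mentions Sams.' \
  'A message that mentions neither company.' > sample.txt

# grep <regular expression> <filename>
# Print every line of sample.txt that contains "Que".
grep 'Que' sample.txt

# grep <regular expression> <filename> <filename>
# With more than one file, grep prefixes each matching line
# with the name of the file it came from.
cp sample.txt sample2.txt
grep 'Sams' sample.txt sample2.txt
```

Quoting the pattern with single quotes is a good habit: it keeps the shell from interpreting any special characters before grep sees them.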
What Is This, Another Root Directory? I am Confused
Do not get overwhelmed by the number of characters and their meanings all at once. I have found this to be one of the biggest hurdles when learning Unix as a beginner: trying to remember the countless commands and their options, dealing with case sensitivity, and now a whole slew of characters that have their own meanings and functionality.
With REs, other programs sometimes require that the regular expression be set off with a / on either side of it; this is not the case with grep. Syntax can differ from one program to another, so consult your local man pages and online documentation (or your systems administrator) if you are in a jam.
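As a rough illustration of the difference, the sketch below runs the same search with grep and with two other common tools, sed and awk, which are examples of programs that expect the pattern between slashes. The file name notes.txt and its contents are hypothetical, and the exact behavior of these tools can vary between systems, so treat this as a starting point rather than a reference.

```shell
# Hypothetical sample file, made up for this example.
printf '%s\n' 'Que is mentioned here.' 'No match on this line.' > notes.txt

# grep takes the bare pattern; no slashes are needed.
grep 'Que' notes.txt

# sed and awk expect the pattern to be set off with a / on either side.
sed -n '/Que/p' notes.txt
awk '/Que/' notes.txt

# If you give grep the slashes, it looks for them as literal characters,
# so this search finds nothing in notes.txt.
grep '/Que/' notes.txt || echo 'no match'
```

The first three commands print the same matching line; only the way the pattern is written changes.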