Using Grep, TR and Sed With Regular Expressions
Objectives
Use the available commands and utilities in Linux to search and replace lines of text in files
Be able to utilize Linux utilities together with pattern matching, wildcards and regular expressions
grep searches the file(s) for occurrences of pattern, where pattern may be a literal string or a regular expression.

    grep this sample_file
    this tutorial has made me more knowledgeable about Linux
    Beyond this point, is a new dawn

If no target file is specified, grep reads standard input and works as a filter, as in a pipe:

    ps ax | grep clock
    765 tty1  S 0:00 xclock
    901 pts/1 S 0:00 grep clock
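The two modes described above can be tried with a short sketch. The temporary file, its middle line, and the variable names are assumptions made for illustration; only the two matching lines come from the sample above.

```shell
# Create a throwaway sample file (the middle line is a made-up non-matching line).
sample=$(mktemp)
printf '%s\n' \
  'this tutorial has made me more knowledgeable about Linux' \
  'a line without the search word' \
  'Beyond this point, is a new dawn' > "$sample"

# Mode 1: grep reads the named file and prints the matching lines.
from_file=$(grep 'this' "$sample")

# Mode 2: with no file argument, grep filters whatever arrives on standard input.
from_pipe=$(printf 'xclock\nbash\nsshd\n' | grep 'clock')

printf '%s\n' "$from_file" "$from_pipe"
rm -f "$sample"
```

Feeding grep from printf instead of ps keeps the result the same on every machine.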
grep Options
    -i   makes the matching case-insensitive
    -r   searches through files in specified directories, recursively
    -l   prints just the names of files which contain matching lines
    -c   prints the count of matching lines in each file
    -n   numbers the matching lines in the output
    -v   reverses the test, printing lines which do not match
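A minimal sketch of these options at work, assuming a scratch directory with two made-up files (a.txt and b.txt are illustrative names, not part of the lesson files):

```shell
# Build a scratch directory with two small files.
dir=$(mktemp -d)
printf 'Linux rocks\nlinux kernel\nwindows\n' > "$dir/a.txt"
printf 'nothing here\n' > "$dir/b.txt"

count_ci=$(grep -ci 'linux' "$dir/a.txt")    # -c with -i: number of matching lines, any case
numbered=$(grep -n 'kernel' "$dir/a.txt")    # -n: prefix each match with its line number
inverted=$(grep -vi 'linux' "$dir/a.txt")    # -v: lines that do not match
files=$(cd "$dir" && grep -rl 'Linux' .)     # -r with -l: names of files containing a match

printf '%s\n' "$count_ci" "$numbered" "$inverted" "$files"
rm -rf "$dir"
```

Note that -c counts lines, so a line containing the pattern twice is still counted once.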
Special wildcard characters
    ?           Match one character
    *           Any string
    [emacgEl]   One character in set
    [e-x]       One character in range
    [!x-z]      Not in set
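The wildcards can be observed by letting the shell expand them against a few empty files. This is only a sketch; the file names (cat, cut, cot, coat, xat) are invented for the demonstration.

```shell
# Create empty files with short, similar names in a scratch directory.
dir=$(mktemp -d)
(cd "$dir" && touch cat cut cot coat xat)

# Each pattern is expanded by the shell inside the scratch directory.
one_char=$(cd "$dir" && echo c?t)          # ? stands for exactly one character
any_string=$(cd "$dir" && echo c*t)        # * stands for any string, including none
in_set=$(cd "$dir" && echo c[au]t)         # [au] stands for one character from the set
not_in_set=$(cd "$dir" && echo [!x-z]at)   # [!x-z] stands for one character outside x-z

printf '%s\n' "$one_char" "$any_string" "$in_set" "$not_in_set"
rm -rf "$dir"
```

These patterns are expanded by the shell before the command runs, which is why echo alone is enough to show the matches.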
Regular Expressions
Regular expressions are one of the advanced concepts we need for writing efficient shell scripts and for effective system administration. A regular expression is a set of symbols and syntactic elements used to match patterns of text.

Why Do I Need To Study Regular Expressions?
In the first instance, you've copied all the files which end in ".html" (as opposed to copying them one by one); in the second, you've conducted a search not only for "garden," but for "garden, gardening, gardens, and gardeners" all at once.

Using a good regex engine and a well-crafted regular expression, one can easily search through a text file (or a hundred text files) for words that have the suffix ".html" (but only if the word begins with a capital letter and occurs at the beginning of the line), replace the .html suffix with a .sgml suffix, and then change all the lower case characters to upper case. With the right tools, this series of regular expressions would do just that:
    ^      Matches the expression at the start of a line
    $      Matches the expression at the end of a line
    ^$     Matches an empty line; use ^ and $ together to find the empty lines in a file
    .      Matches a single character of any value, except end of line
    *      Matches zero or more of the preceding character or expression
    \      Escapes the special character that follows
    [ ]    Matches any one of the enclosed characters, as in [aeiou]. Use a hyphen ( - ) for a range, as in [a-z]
    [^ ]   Matches any one character that is not enclosed
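Each metacharacter in the table can be checked with grep against a small file. The sketch below assumes a temporary file whose seven lines were made up for the purpose; the variable names are illustrative.

```shell
# Sample input: seven lines, the second of which is empty.
f=$(mktemp)
printf 'garden\n\nthe gardener\ncost: $5\nbat\nbet\nbt\n' > "$f"

starts=$(grep -c '^garden' "$f")   # ^  : only the line that begins with "garden"
ends=$(grep -c 'er$' "$f")         # $  : only the line that ends with "er"
empty=$(grep -c '^$' "$f")         # ^$ : empty lines
dot=$(grep 'b.t' "$f")             # .  : exactly one character between b and t
star=$(grep 'be*t' "$f")           # *  : zero or more "e" between b and t
dollar=$(grep '\$' "$f")           # \  : a literal dollar sign
negated=$(grep 'b[^a]t' "$f")      # [^a]: one character other than "a"

printf '%s\n' "$starts" "$ends" "$empty" "$dot" "$star" "$dollar" "$negated"
rm -f "$f"
```

The single quotes matter: they stop the shell from interpreting $, * and \ before grep sees them.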
Examples:
3.1. Search document.txt for lines containing employers.
    grep employers document.txt
3.2. Search for certification or Certification at the start of a line.
    grep '^[cC]ertification' document.txt
3.3. List all running applications and processes started by bsit.
    ps aux | grep bsit
3.4. List the lines containing a dollar amount in document.txt. The pattern is quoted so that grep receives the backslash and treats $ as a literal character rather than as the end-of-line anchor.
    grep '\$' document.txt
Using sed For Substitution: The s Command
The syntax of the s (as in substitute) command is

    s/regex/replacement/flags

The s command can be followed by zero or more of the following flags:
o g - Apply the replacement to all matches of the regex, not just the first.
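The difference the g flag makes can be seen on a single line of input. This is a small sketch with an invented sentence and illustrative variable names.

```shell
line='one fish two fish red fish'

# Without g, only the first match on each line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/fish/cat/')

# With g, every match on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/fish/cat/g')

printf '%s\n' "$first_only" "$all"
# -> one cat two fish red fish
# -> one cat two cat red cat
```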
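Using sed For Deleting One or More Lines: The d Command
Syntax:
    sed '{[/]<n>|<string>|<regex>[/]}d' <fileName>
    sed '{[/]<adr1>[,<adr2>][/]}d' <fileName>
o n - line number
o string - string found in the line
o regex - regular expression corresponding to the search pattern
In this notation, { | } groups alternatives and [ ] marks optional parts; they are not typed. The slashes are needed only around a string or regex, not around a line number.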
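The address forms accepted by d can be compared side by side on a five-line input. The input words and variable names below are assumptions chosen for the sketch.

```shell
input=$(printf 'one\ntwo\nthree\nfour\nfive')

by_number=$(printf '%s\n' "$input" | sed '2d')         # delete line 2
by_range=$(printf '%s\n' "$input" | sed '2,4d')        # delete lines 2 through 4
by_regex=$(printf '%s\n' "$input" | sed '/^t/d')       # delete lines starting with "t"
to_end=$(printf '%s\n' "$input" | sed '/three/,$d')    # delete from first match to last line

# Print each result on one line for easy comparison.
for result in "$by_number" "$by_range" "$by_regex" "$to_end"; do
  printf '%s\n' "$result" | tr '\n' ' '
  printf '\n'
done
```

sed writes the edited text to standard output and leaves the input untouched.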
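Using sed To Append Lines: The a Command
The a command appends a line after each line selected by the address (a line number, range or pattern).
Syntax:
    sed 'address a Line which you want to append' filename
    sed '/pattern/ a Line which you want to append' filename
This one-line form is accepted by GNU sed; POSIX sed expects a\ followed by the text on the next line.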
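A short sketch of a with a line-number address and with a pattern address. It assumes GNU sed (for the one-line form of a) and uses a made-up three-line input.

```shell
input=$(printf 'alpha\nbeta\ngamma')

# Append after line 2.
after_line=$(printf '%s\n' "$input" | sed '2a inserted after line 2')

# Append after every line that matches the pattern.
after_match=$(printf '%s\n' "$input" | sed '/alpha/a follows alpha')

printf '%s\n' "$after_line" '---' "$after_match"
```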
Using sed To Insert A Line: The i Command
The i command of sed lets you insert a new line before each line selected by the pattern or range.
Syntax:
    sed 'address i Line which you want to insert' filename
    sed '/pattern/i Line which you want to insert' filename
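The i command can be tried the same way. The sketch assumes GNU sed (one-line form of i) and an invented three-line input.

```shell
input=$(printf 'alpha\nbeta\ngamma')

# Insert before line 2.
before_line=$(printf '%s\n' "$input" | sed '2i goes before line 2')

# Insert before the last line; $ is the address of the last line.
before_last=$(printf '%s\n' "$input" | sed '$i goes before the last line')

printf '%s\n' "$before_line" '---' "$before_last"
```

The single quotes keep the shell from treating $i as a variable.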
Using sed To Replace Line Of Text: The c Command
The c command in sed replaces every line that matches the pattern, or falls in the range, with the given new line.
Syntax:
    sed 'address c new line' filename
    sed '/pattern/c new line' filename

Using sed To Print Line Numbers: The = Command
= is a sed command that prints the current line number to standard output.
Syntax:
    sed = filename
o The above command prints each line number on one line and the original line from the file on the next line
o In POSIX sed the = command accepts only one address, so if you want to print line numbers for a range of lines, you must use curly braces
Syntax:
    sed -n '/pattern/,/pattern/{=;p;}' filename

sed Examples:
Example 3.5. You mistakenly spelled the name Gel as Geli. Using sed, replace every occurrence of Geli with Gel wherever it occurs in the file.
    sed 's/Geli/Gel/g' oldname > editedname
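The c and = commands described above can be checked on a short input. This sketch assumes GNU sed for the one-line form of c; the input lines and variable names are invented.

```shell
input=$(printf 'alpha\nbeta\ngamma')

# c: replace each matching line with the given text.
changed=$(printf '%s\n' "$input" | sed '/beta/c BETA LINE')

# =: print the line number before every line.
numbered=$(printf '%s\n' "$input" | sed '=')

# Range plus braces: print number and text only for lines from beta to gamma.
ranged=$(printf '%s\n' "$input" | sed -n '/beta/,/gamma/{=;p;}')

printf '%s\n' "$changed" '---' "$numbered" '---' "$ranged"
```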
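Example 3.6. Sometimes you want to search for a pattern and add some characters, like parentheses, around or near the pattern you found. Search for every run of digits and enclose it in brackets.
    sed 's/[0-9][0-9]*/[&]/g' old_file > new_file
The pattern requires at least one digit. A bare [0-9]* also matches the empty string, so it would insert empty brackets between the other characters.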
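The role of & in the replacement can be seen on one line of input. In this sketch the sample sentence is made up, and the pattern [0-9][0-9]* is used so that only non-empty runs of digits match.

```shell
line='room 42 floor 7'

# & stands for whatever the regex matched, so each number is wrapped in brackets.
bracketed=$(printf '%s\n' "$line" | sed 's/[0-9][0-9]*/[&]/g')

# & may appear more than once in the replacement.
doubled=$(printf '%s\n' "$line" | sed 's/[0-9][0-9]*/&-&/g')

printf '%s\n' "$bracketed" "$doubled"
# -> room [42] floor [7]
# -> room 42-42 floor 7-7
```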
& - Corresponds to the pattern found. Note that you can use & more than once in the replacement string.

Example 3.7. Create a file called sysadminstuff.dat
    cat sysadminstuff.dat
    1. Linux - Sysadmin, Scripting etc.
    2. Databases - Oracle, mySQL etc.
    3. Hardware
    4. Security (Firewall, Network, Online Security etc)
    5. Storage
    6. Cool gadgets and websites
    7. Productivity (Too many technologies to explore, not much time available)
    8. Website Design
    9. Software Development
    10.Windows- Sysadmin, reboot etc.
Example 3.7.1. Referring to the file sysadminstuff.dat, delete the third line using sed.
    sed '3d' sysadminstuff.dat
Example 3.7.2. Referring to the file sysadminstuff.dat, delete starting from the 3rd line and every 2nd line from there. The first~step address is a GNU sed extension.
    sed '3~2d' sysadminstuff.dat
Example 3.7.3. Referring to the file sysadminstuff.dat, delete from the 4th line up to the 8th line.
    sed '4,8d' sysadminstuff.dat
Example 3.7.4. Referring to the file sysadminstuff.dat, delete the last line of the file.
    sed '$d' sysadminstuff.dat
Example 3.7.5. Referring to the file sysadminstuff.dat, delete every line which matches the pattern Sysadmin.
    sed '/Sysadmin/d' sysadminstuff.dat
Example 3.7.6. Referring to the file sysadminstuff.dat, delete from the first line which matches the pattern Website up to the end of the file.
    sed '/Website/,$d' sysadminstuff.dat

Example 3.8. Create a file called techie_stuff.txt
    cat techie_stuff.txt
    Linux Sysadmin
    Databases - Oracle, mySQL etc.
    Security (Firewall, Network, Online Security etc)
    Storage in Linux
    Productivity (Too many technologies to explore, not much time available)
    Windows- Sysadmin, reboot etc.
Example 3.8.1. Add the line Amazing tools and toys after the third line of the file techie_stuff.txt.
    sed '3 a Amazing tools and toys' techie_stuff.txt
Example 3.8.2. Append the line Linux shell scripting after every line that matches Sysadmin in the file techie_stuff.txt.
    sed '/Sysadmin/a Linux shell scripting' techie_stuff.txt
Example 3.8.3. Append the line Developing websites at the end of the file techie_stuff.txt.
    sed '$ a Developing websites' techie_stuff.txt
Example 3.8.4. Add the line Cool gadgets and websites before the 4th line of the file techie_stuff.txt.
    sed '4 i Cool gadgets and websites' techie_stuff.txt
Example 3.8.5. Add the line Linux Scripting before every line that matches the pattern Sysadmin.
    sed '/Sysadmin/i Linux Scripting' techie_stuff.txt
Example 3.8.6. Insert the line Website Design before the last line of the file.
    sed '$ i Website Design' techie_stuff.txt
Example 3.8.7. Replace the first line of the file techie_stuff.txt with The Linux Guru Things To Master.
    sed '1 c The Linux Guru Things To Master' techie_stuff.txt
Example 3.8.8. Replace every line which matches the pattern Linux Sysadmin with Linux Sysadmin Scripting.
    sed '/Linux Sysadmin/c Linux Sysadmin Scripting' techie_stuff.txt
Example 3.8.9. Use the sed command to print the line number of each line which matches the pattern Databases.
    sed -n '/Databases/=' techie_stuff.txt
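A few of the examples above can be run against a scratch copy of the file. This is a sketch only: it assumes GNU sed (one-line a and c), writes the six lines of techie_stuff.txt to a temporary file, and uses illustrative variable names.

```shell
# Recreate the contents of techie_stuff.txt in a temporary file.
f=$(mktemp)
printf '%s\n' \
  'Linux Sysadmin' \
  'Databases - Oracle, mySQL etc.' \
  'Security (Firewall, Network, Online Security etc)' \
  'Storage in Linux' \
  'Productivity (Too many technologies to explore, not much time available)' \
  'Windows- Sysadmin, reboot etc.' > "$f"

db_line=$(sed -n '/Databases/=' "$f")                                  # as in Example 3.8.9
first=$(sed '1 c The Linux Guru Things To Master' "$f" | head -n 1)    # as in Example 3.8.7
last=$(sed '$ a Developing websites' "$f" | tail -n 1)                 # as in Example 3.8.3
# As in Example 3.8.2, then count how many lines were appended.
appended=$(sed '/Sysadmin/a Linux shell scripting' "$f" | grep -c '^Linux shell scripting$')

printf '%s\n' "$db_line" "$first" "$last" "$appended"
rm -f "$f"
```

Two lines contain Sysadmin (the first and the last), so the appended line appears twice.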
The tr Command
tr translates one set of characters to another.
Usage:
    tr start-set end-set
o Replaces all characters in start-set with the corresponding characters in end-set
o tr cannot take a file as an argument; it reads standard input and writes standard output
Options:
o -d deletes characters in start-set instead of translating them
o -s replaces a sequence of identical characters with just one
Examples:
Example 4. Translate the word linux to upper case.
    echo linux | tr 'a-z' 'A-Z'
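Translation, the -d and -s options, and reading from a file through redirection can be sketched together. The sample strings, the temporary file, and the variable names are assumptions for illustration.

```shell
# Translate lower case to upper case, as in Example 4.
upper=$(echo linux | tr 'a-z' 'A-Z')

# -d deletes every character in the set.
no_digits=$(echo 'r2d2 c3po' | tr -d '0-9')

# -s squeezes each run of the listed character down to one.
squeezed=$(echo 'too    many   spaces' | tr -s ' ')

# tr takes no file argument, so a file is fed in with input redirection.
f=$(mktemp)
echo 'from a file' > "$f"
from_file=$(tr 'a-z' 'A-Z' < "$f")
rm -f "$f"

printf '%s\n' "$upper" "$no_digits" "$squeezed" "$from_file"
# -> LINUX
# -> rd cpo
# -> too many spaces
# -> FROM A FILE
```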