
Introawk

Awk is a programming language useful for manipulating text-based data and performing computations on data. It reads input files line-by-line, splits each line into fields, and allows the user to run commands to test or transform the data on matches. Awk programs consist of pattern-action statements that are applied to each line, and features include variables, built-in functions, flow control, arrays, and the ability to write both short one-liners and longer scripts.

Uploaded by

saleempkp
Copyright
© Attribution Non-Commercial (BY-NC)

Introduction to Awk

Awk is a convenient and expressive programming language that can be applied to a wide variety of computing and data manipulation tasks.

Awk
Works well on record-type data
Reads input file(s) a line at a time
Parses each line into fields
Performs user-defined tests against each line, performs actions on matches

Other Common Uses


Input validation
    Does every record have the same # of fields?
    Do values make sense (negative time, hourly wage > $100, etc.)?
Filtering out certain fields
Searches
    Who got a zero on lab 3?
    Who got the highest grade?

Many others (it's late)
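
A minimal sketch of the validation and search questions above. The data is made up for illustration: each record is assumed to hold a name followed by scores for labs 1 to 3, and one record is deliberately short.

```shell
# Hypothetical data: name, then scores for labs 1-3 (the third record is short).
data='ann 10 9 0
bob 8 7 9
cy 10 10
dee 9 0 10'

# Validation: report any record that does not have exactly 4 fields.
bad=$(printf '%s\n' "$data" | awk 'NF != 4 { print "line " NR ": " NF " fields" }')

# Search: who got a zero on lab 3 (field 4)? Only complete records are checked.
zero=$(printf '%s\n' "$data" | awk 'NF == 4 && $4 == 0 { print $1 }')

# Search: who has the highest lab 1 score (field 2)? The first record wins ties.
top=$(printf '%s\n' "$data" |
      awk 'NR == 1 || $2 + 0 > max { max = $2 + 0; who = $1 } END { print who, max }')

printf '%s\n' "$bad" "$zero" "$top"
```

Adding 0 to a field forces a numeric comparison, which avoids surprises if a value would otherwise be compared as a string.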

Invocation
Can write little one-liners on the command line (very handy):
Print the 3rd field of every line:
$ awk '{ print $3 }' input.txt

Execute an awk script file:


$ awk -f script.awk input.txt

Or, use this sha-bang as the first line, and give your script execute permissions:
#!/bin/awk -f
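
A small sketch comparing the first two invocation styles. The scratch directory and the file names input.txt and third.awk are placeholders chosen for this example, and `mktemp -d` is assumed to be available.

```shell
# Work in a scratch directory so nothing existing is overwritten.
dir=$(mktemp -d)
printf 'a b c\nd e f\n' > "$dir/input.txt"

# 1. One-liner given directly on the command line.
one=$(awk '{ print $3 }' "$dir/input.txt")

# 2. The same program stored in a file and run with -f.
printf '{ print $3 }\n' > "$dir/third.awk"
two=$(awk -f "$dir/third.awk" "$dir/input.txt")

printf '%s\n' "$one" "$two"
rm -r "$dir"
```

Both runs print the same two lines. For the sha-bang form, the path after `#!` has to match where awk is actually installed; `command -v awk` shows the location on a given system.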

Form of an AWK program


AWK programs are entries of the form:

pattern { action }

pattern: some test, looking for a pattern (regular expressions) or C-like conditions
    if null, actions are applied to every line

action: a statement or set of statements
    if not provided, the default action is to print the entire line, much like grep
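
The three shapes an entry can take (pattern only, action only, both) can be sketched as follows. The fruit data is invented for the example.

```shell
data='apple 3
banana 0
cherry 7'

# Pattern only: the default action prints the whole matching line.
p_only=$(printf '%s\n' "$data" | awk '/an/')

# Action only: with no pattern, the action runs on every line.
a_only=$(printf '%s\n' "$data" | awk '{ print $1 }')

# Both: a C-like condition selects lines, the action formats them.
both=$(printf '%s\n' "$data" | awk '$2 > 0 { print $1 " has " $2 }')

printf '%s\n' "$p_only" "$a_only" "$both"
```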

Form of an AWK program


Input files are parsed, a record (line) at a time
Each line is checked against each pattern, in order
There are 2 special patterns:
    BEGIN: true before any records are read
    END: true at end of input (after all records have been read)
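
A short sketch of how BEGIN, ordinary rules, and END fit together. The two input numbers are arbitrary.

```shell
out=$(printf '5\n7\n' | awk '
BEGIN   { print "start" }         # runs once, before any input is read
$1 > 6  { print "big: " $1 }      # checked first for every record
        { sum += $1 }             # then this rule runs for every record
END     { print "sum = " sum }    # runs once, after the last record
')
printf '%s\n' "$out"
```

The record 7 matches both ordinary rules, so both actions run for it, in the order the rules are written.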

Awk Features
Patterns can be regular expressions or C-like conditions. Each line of the input is matched against the patterns, one after the next. If a match occurs, the corresponding action is performed. Input lines are parsed and split into fields, which are accessed by $1, ..., $NF, where NF is a variable set to the number of fields. The variable $0 contains the entire line, and by default lines are split by white space (blanks, tabs).
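
As a quick illustration of field splitting, the sketch below feeds awk a single made-up line with uneven spacing and prints NF, the first field, the last field, and $0.

```shell
out=$(printf 'one two   three\n' | awk '{ print NF; print $1; print $NF; print $0 }')
printf '%s\n' "$out"
```

The run of blanks counts as one separator, so NF is 3, while $0 still holds the line exactly as it was read.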

Variables
Not declared, nor typed
No character type
Only strings and floats (support for ints)
$n refers to the nth field (where n is some integer value)

# prints each field on the line (fields start at $1; $0 is the whole line)
for( i=1; i<=NF; ++i )
    print $i
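
A minimal sketch of what "not declared, nor typed" means in practice. The variable names x, y, z, n, and s are arbitrary.

```shell
out=$(awk 'BEGIN {
    x = "3"          # assigned a string...
    y = x + 4        # ...but used as a number here, giving 7
    z = x y          # concatenation treats both as strings, giving "37"
    print y, z, n + 0, "[" s "]"   # n and s were never assigned
}')
printf '%s\n' "$out"
```

An unassigned variable acts as 0 in a numeric context and as the empty string in a string context, so the last two items print as 0 and [].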

Some Built-in Variables


FS    the input field separator
OFS   the output field separator
NF    # of fields; changes w/each record
NR    the # of records read (so far). So, the current record #.
$0    the entire input line
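
The sketch below shows these variables together on two made-up colon-separated records of different lengths.

```shell
out=$(printf 'a:b:c\nd:e\n' |
      awk 'BEGIN { FS = ":"; OFS = "-" } { print NR, NF, $1 }')
printf '%s\n' "$out"
```

NR counts up with each record, NF changes from 3 to 2, and the commas in print are replaced by OFS on output.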

Example
Print those employees who actually worked

$ awk '$3 > 0 { print $1, $2*$3 }' emp.data
Kathy 40
Mark 100
Mary 121
Susie 76.5

$ cat emp.data
Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18

Example CSV file


$ cat students.csv
smith,john,js12
jones,fred,fj84
bee,sue,sb23
fife,ralph,rf86
james,jim,jj22
cook,nancy,nc54
banana,anna,ab67
russ,sam,sr77
loeb,lisa,guitarHottie

$ cat getEmails.awk
#!/bin/awk -f
BEGIN { FS = "," }
{ printf( "%s's email is: %s@school.edu\n", $2, $3 ); }

$ getEmails.awk students.csv
john's email is: js12@school.edu
fred's email is: fj84@school.edu
sue's email is: sb23@school.edu
ralph's email is: rf86@school.edu
jim's email is: jj22@school.edu
nancy's email is: nc54@school.edu
anna's email is: ab67@school.edu
sam's email is: sr77@school.edu
lisa's email is: guitarHottie@schoo

Example output separator


$ cat out.awk
#!/bin/awk -f
BEGIN { FS = ","; OFS = "-*-"; }
{ print $1, $2, $3; }

$ out.awk students.csv
smith-*-john-*-js12
jones-*-fred-*-fj84
bee-*-sue-*-sb23
fife-*-ralph-*-rf86
james-*-jim-*-jj22
cook-*-nancy-*-nc54
banana-*-anna-*-ab67
russ-*-sam-*-sr77
loeb-*-lisa-*-guitarHottie

Flow Control
Awk syntax is much like C
Same loops, if statements, etc.
AWK: Aho, Weinberger, Kernighan
Kernighan and Ritchie wrote the book "The C Programming Language"
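
A small sketch of the C-like control flow, using an if/else and a while loop. The task (classify a number and count how many times it can be halved before reaching 1) is invented purely to exercise the syntax.

```shell
out=$(printf '8\n5\n' | awk '{
    if ($1 % 2 == 0) {
        kind = "even"
    } else {
        kind = "odd"
    }
    n = $1
    steps = 0
    while (n > 1) {          # integer-halve until we reach 1
        n = int(n / 2)
        steps++
    }
    print $1, kind, steps
}')
printf '%s\n' "$out"
```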

Associative Arrays
Awk also supports arrays that can be indexed by arbitrary strings. They are implemented using hash tables.

Total["Sue"] = 100;

It is possible to loop over all indices that have currently been assigned values.
for (name in Total) print name, Total[name];
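
Beyond assignment and looping, two related operations are often handy: testing whether an index exists with `in`, and removing an entry with `delete`. The sketch below is an illustration under that assumption, reusing the Total array name from above with made-up values.

```shell
out=$(awk 'BEGIN {
    Total["Sue"] = 100
    Total["Fred"] = 90
    if ("Sue" in Total)    print "Sue is present"
    if (!("Sam" in Total)) print "Sam is absent"
    delete Total["Fred"]             # remove one entry
    n = 0
    for (name in Total) n++          # count what is left
    print n " entry left"
}')
printf '%s\n' "$out"
```

Using `in` for the test matters because merely referring to Total["Sam"] would create that entry with an empty value.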

Example using Associative Arrays


$ cat scores
Fred 90
Sue 100
Fred 85
Sam 70
Sue 98
Sam 50
Fred 70

$ cat total.awk
{ Total[$1] += $2 }
END { for (i in Total) print i, Total[i]; }

$ awk -f total.awk scores
Sue 198
Sam 120
Fred 245

Useful one-liners
Line count:
awk 'END {print NR}'

grep
awk '/pat/'

head
awk 'NR<=10'

Add line #s to a file


awk '{print NR, $0}'
awk '{ printf( "%5d %s\n", NR, $0 )}'
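
The one-liners above can be tried on a few lines of sample text, as in this sketch. The three input lines are invented, and the head example uses 2 rather than 10 so the effect is visible on such a short input.

```shell
data='alpha
beta pat
gamma'

count=$(printf '%s\n' "$data" | awk 'END { print NR }')        # line count
found=$(printf '%s\n' "$data" | awk '/pat/')                   # grep
first=$(printf '%s\n' "$data" | awk 'NR <= 2')                 # head, first 2 lines
numbered=$(printf '%s\n' "$data" | awk '{ printf("%5d %s\n", NR, $0) }')

printf '%s\n' "$count" "$found" "$first" "$numbered"
```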

Many more. See the resources tab on the course webpage for links to more examples.
