Programming with awk

Simple patterns

You can select specific records for printing or other processing by using simple patterns. awk has three kinds of patterns. First, you can use patterns called relational expressions that make comparisons. For example, the operator == tests for equality. To print the lines for which the fourth field equals the string Asia, we can use the program consisting of the single pattern

   $4 == "Asia"
With the file countries as input, this program yields
   USSR	8650	262	Asia
   China	3692	866	Asia
   India	1269	637	Asia

The complete set of comparisons is >, >=, <, <=, == (equal to) and != (not equal to). These comparisons can be used to test both numbers and strings. For example, suppose we want to print only countries with a population greater than 100 million. The program

   $3 > 100
is all that is needed. It prints all lines in which the third field exceeds 100. (Remember that the third field in the file countries is the population in millions.)
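
The following shell sketch shows one numeric and one string comparison side by side. It is only an illustration: the file name countries.sample is hypothetical and holds just the four records quoted in this section, and the thresholds 500 and "S" are arbitrary choices, not values from the original text.

```shell
# Hypothetical sample input: the four records quoted in this section,
# tab-separated like the file countries.
printf '%s\t%s\t%s\t%s\n' \
    USSR  8650 262 Asia \
    China 3692 866 Asia \
    India 1269 637 Asia \
    USA   3615 219 'North America' > countries.sample

# Numeric comparison: population (field 3, in millions) greater than 500.
big=$(awk '$3 > 500' countries.sample)

# String comparison: country names that sort at or after the string "S".
late=$(awk '$1 >= "S"' countries.sample)

printf '%s\n' "$big" "$late"
```

With this sample, the first program should select the China and India records, and the second the USSR and USA records. Comparing against a quoted constant such as "S" makes awk compare the field as a string rather than as a number.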

Second, you can use patterns called extended regular expressions that search for specified characters to select records. The simplest form of an extended regular expression is a string of characters enclosed in slashes:

   /US/
This program prints each line that contains the (adjacent) letters US anywhere; with the file countries as input, it prints
   USSR	8650	262	Asia
   USA	3615	219	North America
We will have a lot more to say about extended regular expressions later in this topic.
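
As a minimal sketch of how such a pattern might be run from the shell, the example below applies a slash-delimited regular expression to a small test file. The file name countries.sample is hypothetical, and it contains only the four records quoted in this section rather than the full file countries.

```shell
# Hypothetical sample input: the four records quoted in this section,
# tab-separated like the file countries.
printf '%s\t%s\t%s\t%s\n' \
    USSR  8650 262 Asia \
    China 3692 866 Asia \
    India 1269 637 Asia \
    USA   3615 219 'North America' > countries.sample

# A pattern with no action prints every record that matches.
# Single quotes keep the shell from interpreting the program text.
us_lines=$(awk '/US/' countries.sample)

printf '%s\n' "$us_lines"
```

The pattern matches anywhere in the record, so both USSR and USA are selected even though neither name consists of the letters US alone.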

Third, you can use two special patterns, BEGIN and END. BEGIN matches before the first record has been read, and END matches after the last record has been processed. This program uses BEGIN to print a title:

   BEGIN   { print "Countries of Asia:" }
   /Asia/  { print "     ", $1 }
The output is
   Countries of Asia:
         USSR
         China
         India
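
The text above demonstrates BEGIN only. The sketch below adds an END pattern to the same program so that both special patterns appear together. The counter n, the trailing message and the file name countries.sample are assumptions made for illustration, and the sample file holds only the four records quoted in this section.

```shell
# Hypothetical sample input: the four records quoted in this section,
# tab-separated like the file countries.
printf '%s\t%s\t%s\t%s\n' \
    USSR  8650 262 Asia \
    China 3692 866 Asia \
    India 1269 637 Asia \
    USA   3615 219 'North America' > countries.sample

# BEGIN runs once before any input is read; END runs once after the
# last record. The counter n is an illustrative addition.
report=$(awk '
BEGIN   { print "Countries of Asia:" }
/Asia/  { print "     ", $1; n++ }
END     { print n, "countries listed" }
' countries.sample)

printf '%s\n' "$report"
```

On the sample file this should print the title, the three indented Asian country names, and a final line reading "3 countries listed".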

© 2005 The SCO Group, Inc. All rights reserved.
SCO OpenServer Release 6.0.0 -- 02 June 2005