Hacker Public Radio

Your ideas, projects, opinions - podcasted.

New episodes Monday through Friday.


HPR2330: Awk Part 7

Hosted by Mr. Young on 2017-07-07 00:00:00
In this episode, I will (very) briefly go over loops in the Awk programming language. Loops are useful when you want to run the same command(s) on a collection of data or when you just want to repeat the same commands many times.

When using loops, a command or group of commands is repeated for as long as a condition (or a combination of conditions) holds, and the loop ends once the condition is no longer true.

While Loop

Here is a silly example of a while loop:

#!/bin/awk -f
BEGIN {

# Print the squares from 1 to 10 the first way

    i=1;
    while (i <= 10) {
        print "The square of ", i, " is ", i*i;
        i = i+1;
    }

exit;
}

Our condition is set in the parentheses after the while keyword. We set a variable, i, before entering the loop, then increment i inside of the loop. If you forget to change something that the condition depends on, the condition never becomes false and the while loop will go on forever.
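A while loop is also handy outside of BEGIN, for example to walk through the fields of each input line. Here is a minimal sketch run from the shell. The sample input and the names total and line_sums are made up for illustration.

```shell
# Sum the numbers on each input line with a while loop.
# The two input lines are invented sample data.
line_sums=$(printf '1 2 3\n10 20\n' | awk '{
    i = 1
    total = 0
    while (i <= NF) {          # NF is the number of fields on this line
        total = total + $i
        i = i + 1              # without this increment the loop never ends
    }
    print total
}')
echo "$line_sums"
```

With this input, the script prints 6 and then 30, one sum per line.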

Do While Loop

Here is an equally silly example of a do while loop:

#!/bin/awk -f
BEGIN {

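    i=2;
    do {
        # The body runs once before the condition is first tested
        print "The square of ", i, " is ", i*i;
        i = i + 1
    } while (i <= 10)   # "i != 2" would never become false, so the loop would not end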

exit;
}

Here, the commands in the do code block are executed once before the condition is checked for the first time. After that, the loop repeats for as long as the condition is true.
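The difference from a plain while loop shows up when the condition is false from the very start. The following sketch (the starting value 11 and the name loop_demo are chosen only for illustration) runs both forms with the same condition:

```shell
# Compare while and do-while when the condition is false from the start.
loop_demo=$(awk 'BEGIN {
    i = 11
    while (i <= 10) {          # tested first, so the body never runs
        print "while:", i
        i++
    }

    i = 11
    do {
        print "do-while:", i   # runs once before the test
        i++
    } while (i <= 10)
}')
echo "$loop_demo"
```

Only the do-while line is printed, because the while loop tests its condition before running the body at all.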

For Loop

Another silly example of a for loop:

#!/bin/awk -f
BEGIN {

    for (i=1; i <= 10; i++) {
        print "The square of ", i, " is ", i*i;
    }

exit;
}

As you can see, we set the variable, the condition, and the increment all in the parentheses after the for keyword, separated by semicolons.
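The three parts of the for statement do not have to count upward. As a small sketch, the loop below starts at the last field (NF) and counts down to print each line with its fields reversed. The sample input and the names line, sep and reversed are assumptions made for this example.

```shell
# Print the fields of each input line in reverse order.
reversed=$(printf 'a b c\nred green\n' | awk '{
    line = ""
    for (i = NF; i >= 1; i--) {    # start at the last field, count down
        sep = (i > 1) ? " " : ""   # no trailing space after the final field
        line = line $i sep
    }
    print line
}')
echo "$reversed"
```

For the two sample lines, this prints "c b a" and "green red".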

For Loop Over Arrays

Here is a more useful example of a for loop. Here, we use each distinct value of column 2 as a key in an array (a hash table) called a. After processing the file, we loop over the keys of the array and print them.

For file.txt:

name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5

Using the awk file of:

NR != 1 {
    a[$2]++
}
END {
    for (b in a) {
        print b
    }
}

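We get results like the following (awk does not guarantee the order in which for (b in a) visits the keys, so your order may differ):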

brown
purple
red
yellow
green
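Because the script uses a[$2]++, each array element also holds the number of lines seen for that color. A hedged sketch of printing those counts, with the output piped through sort to get a stable order, might look like this. The shell variables data and color_counts are made up for the example, and the data is the same as file.txt with the column spacing collapsed.

```shell
# Same data as file.txt, held in a shell variable for this sketch.
data='name color amount
apple red 4
banana yellow 6
strawberry red 3
grape purple 10
apple green 8
plum purple 2
kiwi brown 4
potato brown 9
pineapple yellow 5'

# Count the lines per color, then sort so the order is predictable.
color_counts=$(printf '%s\n' "$data" | awk '
NR != 1 {
    a[$2]++                # a[color] is the number of lines with that color
}
END {
    for (b in a) {
        print b, a[b]      # key, then the count stored under that key
    }
}' | sort)
echo "$color_counts"
```

This prints one line per color, such as "brown 2" and "green 1", in alphabetical order.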

In another example, we do a similar process. This time, not only do we store all the distinct values of the second column, we perform a sum operation on column 3 for each distinct value of column 2.

For file.csv:

name,color,amount
apple,red,4
banana,yellow,6
strawberry,red,3
grape,purple,10
apple,green,8
plum,purple,2
kiwi,brown,4
potato,brown,9
pineapple,yellow,5

Using the awk file of:

BEGIN {
    FS=",";
    OFS=",";
    print "color,sum";
}
NR != 1 {
    a[$2]+=$3;
}
END {
    for (b in a) {
        print b, a[b]
    }
}

We get the results of:

color,sum
brown,13
purple,12
red,7
yellow,11
green,8

As you can see, we are also printing a header line prior to processing the file using the BEGIN code block.
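The same pattern extends to more than one array. As a rough sketch, keeping a second array with the number of rows per color lets the END block print an average next to each sum. The array names sum and n, the extra average column, and the shell variables csv and averages are assumptions for this example, not part of the original script.

```shell
# Same data as file.csv, held in a shell variable for this sketch.
csv='name,color,amount
apple,red,4
banana,yellow,6
strawberry,red,3
grape,purple,10
apple,green,8
plum,purple,2
kiwi,brown,4
potato,brown,9
pineapple,yellow,5'

averages=$(printf '%s\n' "$csv" | awk '
BEGIN {
    FS = ","
    OFS = ","
    print "color,sum,average"
}
NR != 1 {
    sum[$2] += $3          # running total of column 3 per color
    n[$2]++                # number of rows seen per color
}
END {
    for (c in sum) {
        print c, sum[c], sum[c] / n[c]
    }
}')
echo "$averages"
```

The header comes first because BEGIN runs before any input is read. The data lines after it may appear in any order, for the same reason as in the earlier for (b in a) examples.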

Copyright Information

Unless otherwise stated, our shows are released under a Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license.

The HPR Website Design is released to the Public Domain.