Hacker Public Radio

Your ideas, projects, opinions - podcasted.

New episodes Monday through Friday.


HPR2114: Gnu Awk - Part 1

Hosted by Mr. Young on 2016-09-08 00:00:00

Introduction to Awk

Awk is a powerful text-parsing tool for Unix and Unix-like systems. It reads its input one line at a time and runs an action on every line that matches a pattern.

The basic syntax is:

awk [options] 'pattern {action}' file

Here is a simple example file that we will be using, called file1.txt:

name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5

First command:

awk '{print $2}' file1.txt

As you can see, the “print” command displays whatever follows it. In this case we are showing the second column using “$2”. To display the whole line, use “$0”.

This example will output:

color
red
yellow
red
purple
green
purple
brown
brown
yellow
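To try the “$0” form and a multi-column print, here is a minimal sketch. It recreates file1.txt in a scratch directory; the variable names (tmp, whole, pairs) are just for illustration.

```shell
# Recreate the sample file in a scratch directory (location chosen for this sketch).
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# $0 is the whole input line, printed with its original spacing.
whole=$(awk '{print $0}' "$tmp/file1.txt")

# A comma between fields makes print join them with a single space.
pairs=$(awk '{print $1, $3}' "$tmp/file1.txt")

printf '%s\n' "$pairs"
```

The second command prints the first and third columns of every line, header included.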

Second command:

awk '$2=="yellow"{print $1}' file1.txt

This will output:

banana
pineapple

As you can see, the command selects the lines whose second column is exactly “yellow”, but prints column 1.
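Either half of 'pattern {action}' can be left out. A pattern with no action prints the whole matching line, and an action with no pattern runs on every line. The sketch below shows the pattern-only form on a scratch copy of file1.txt; the variable names are made up for the example.

```shell
# Recreate the sample file in a scratch directory (location chosen for this sketch).
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# Pattern only: each line whose second column is "red" is printed in full.
red_rows=$(awk '$2=="red"' "$tmp/file1.txt")

# Pattern plus action: same selection, but only the name and amount are printed.
red_amounts=$(awk '$2=="red" {print $1, $3}' "$tmp/file1.txt")

printf '%s\n' "$red_rows" "$red_amounts"
```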

Field separator

By default, awk uses white space as the field separator. You can change this with the -F option. For instance, file1.csv looks like this:

name,color,amount
apple,red,4
banana,yellow,6
strawberry,red,3
grape,purple,10
apple,green,8
plum,purple,2
kiwi,brown,4
potato,brown,9
pineapple,yellow,5

A command similar to the previous one:

awk -F"," '$2=="yellow" {print $1}' file1.csv

will still output:

banana
pineapple
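To see why -F matters here, the sketch below runs the same query with and without it on a scratch copy of file1.csv. The variable names are only for illustration.

```shell
# Recreate the CSV sample in a scratch directory (location chosen for this sketch).
tmp=$(mktemp -d)
cat > "$tmp/file1.csv" <<'EOF'
name,color,amount
apple,red,4
banana,yellow,6
strawberry,red,3
grape,purple,10
apple,green,8
plum,purple,2
kiwi,brown,4
potato,brown,9
pineapple,yellow,5
EOF

# Without -F, a line containing no white space is one big field,
# so $2 is empty and nothing matches.
no_sep=$(awk '$2=="yellow" {print $1}' "$tmp/file1.csv")

# -F, and -F"," are the same thing to awk: the shell removes the quotes.
with_sep=$(awk -F, '$2=="yellow" {print $1}' "$tmp/file1.csv")
quoted_sep=$(awk -F"," '$2=="yellow" {print $1}' "$tmp/file1.csv")

printf '%s\n' "$with_sep"
```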

Regular expressions work as well:

awk '$2 ~ /p.+p/ {print $0}' file1.txt

This returns:

grape      purple 10
plum       purple 2
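The “~” operator tests a column against a regular expression, and the usual anchors work too. As a small sketch on a scratch copy of file1.txt (variable names are made up for the example), here is the same purple match next to an anchored one:

```shell
# Recreate the sample file in a scratch directory (location chosen for this sketch).
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# Column 2 contains a "p", one or more characters, then another "p".
purple_names=$(awk '$2 ~ /p.+p/ {print $1}' "$tmp/file1.txt")

# ^ anchors the match to the start of the column: names beginning with "p".
starts_p=$(awk '$1 ~ /^p/ {print $1}' "$tmp/file1.txt")

printf '%s\n' "$purple_names" "$starts_p"
```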

Numbers are interpreted automatically:

awk '$3>5 {print $1, $2}' file1.txt

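This will output:

name color
banana yellow
grape purple
apple green
potato brown

The comma in “print $1, $2” separates the two values with a single space. The header row is included because “amount” is not a number, so awk compares it with 5 as text.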
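If the header row is not wanted in the result, there are a couple of ways to leave it out. The sketch below shows two on a scratch copy of file1.txt: adding 0 to the column to force a numeric comparison, and skipping the first line with NR, a built-in variable holding the current line number that this episode does not otherwise cover. Variable names are only for illustration.

```shell
# Recreate the sample file in a scratch directory (location chosen for this sketch).
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# Original form: the header passes because "amount" is compared with 5 as text.
with_header=$(awk '$3>5 {print $1, $2}' "$tmp/file1.txt")

# Adding 0 forces a numeric comparison; "amount"+0 is 0, so the header drops out.
numeric_only=$(awk '$3+0 > 5 {print $1, $2}' "$tmp/file1.txt")

# NR is the current line number, so NR>1 skips the first line.
skip_first=$(awk 'NR>1 && $3>5 {print $1, $2}' "$tmp/file1.txt")

printf '%s\n' "$numeric_only"
```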

Using output redirection, you can write your results to a file. For example:

awk -F, '$3>5 {print $1, $2}' file1.csv > output.txt

This creates output.txt, containing the lines the command would otherwise print to the screen.
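Redirection is done by the shell, not by awk, so the usual shell rules apply: “>” replaces the file each time, while “>>” adds to the end. A minimal sketch on a scratch copy of file1.csv, with made-up variable names:

```shell
# Recreate the CSV sample in a scratch directory (location chosen for this sketch).
tmp=$(mktemp -d)
cat > "$tmp/file1.csv" <<'EOF'
name,color,amount
apple,red,4
banana,yellow,6
strawberry,red,3
grape,purple,10
apple,green,8
plum,purple,2
kiwi,brown,4
potato,brown,9
pineapple,yellow,5
EOF

# ">" creates output.txt, or empties it first if it already exists,
# so running the command twice still leaves one copy of the results.
awk -F, '$3>5 {print $1, $2}' "$tmp/file1.csv" > "$tmp/output.txt"
awk -F, '$3>5 {print $1, $2}' "$tmp/file1.csv" > "$tmp/output.txt"
after_overwrite=$(wc -l < "$tmp/output.txt" | tr -d ' ')

# ">>" appends instead, so the file now holds the results twice.
awk -F, '$3>5 {print $1, $2}' "$tmp/file1.csv" >> "$tmp/output.txt"
after_append=$(wc -l < "$tmp/output.txt" | tr -d ' ')

echo "$after_overwrite lines, then $after_append lines"
```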

Here’s a cool trick! You can automatically split a file into multiple files grouped by column. For example, to split file1.txt into one file per color, use this command:

awk '{print > ($2 ".txt")}' file1.txt

This will produce files named yellow.txt, red.txt, etc. Because the header row is processed like any other line, it also produces color.txt. The parentheses around the file name keep the command unambiguous across awk implementations. In upcoming episodes, we will show how to improve the outputs.
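To check what the split actually creates, here is a sketch that runs it inside a scratch directory so the new files do not land in your working directory. The file name is written with parentheses, which is a portability precaution rather than something gawk needs.

```shell
# Recreate the sample file in a scratch directory (location chosen for this sketch).
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
name       color  amount
apple      red    4
banana     yellow 6
strawberry red    3
grape      purple 10
apple      green  8
plum       purple 2
kiwi       brown  4
potato     brown  9
pineapple  yellow 5
EOF

# Run inside the scratch directory so the per-color files are created there.
# Each line is written to a file named after its second column.
(cd "$tmp" && awk '{print > ($2 ".txt")}' file1.txt)

# List the results and look at one of them.
ls "$tmp"
cat "$tmp/red.txt"
```

Besides one file per color, the listing includes color.txt, which holds only the header line.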

Resources

  1. https://www.theunixschool.com/p/awk-sed.html
  2. https://www.tecmint.com/category/awk-command/
  3. https://linux.die.net/man/1/awk

Coming up

  • More options
  • Built-in Variables
  • Arithmetic operations
  • Awk language and syntax

Copyright Information

Unless otherwise stated, our shows are released under a Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license.

The HPR Website Design is released to the Public Domain.