Technology Blog: print a column using awk and remove dups

Tuesday, January 11, 2011

print a column using awk and remove dups

The following command will print the 3rd column in file input.txt:

awk '{print $3}' input.txt

Using redirect to output the result to a file

awk '{print $3}' input.txt > output.txt

To get rid of dups from output.txt and output to file unique.txt:

sort output.txt | uniq -u > unique.txt

We can combine these steps to one:

awk '{print $3}' input.txt | sort | uniq -u > unique.txt

Some important info about awk:

NR -- The current line's sequential number
NF -- The number of fields in the current line
FS -- The input field separator; defaults to whitespace and is reset by the -F command line parameter

For example, to get the last field of a line:

path=" ../dist/myjar-1.7.jar"
jar_file=`echo $path | awk -F '/' '{print $NF}'`

print the sum of the number in a file:

file test.txt has the following format:

   a=1.2
   bc=2.3
   xyz=1.3

awk -F '=' '{SUM += $NF} END {print SUM/NR}' test.txt

Technology Blog

Search This Blog

Tuesday, January 11, 2011

print a column using awk and remove dups

No comments:

Post a Comment

Popular Posts