Tuesday, January 11, 2011

print a column using awk and remove dups

The following command will print the 3rd column in file input.txt:
awk '{print $3}' input.txt
Use redirection to write the result to a file:
awk '{print $3}' input.txt > output.txt
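To remove duplicates from output.txt and write the result to unique.txt:
sort output.txt | uniq > unique.txt
Note that uniq -u is not what we want here: it drops every line that occurs more than once and keeps only the lines that appear exactly once.
We can combine these steps into one:

awk '{print $3}' input.txt | sort | uniq > unique.txt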
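
To see how the different ways of de-duplicating behave, here is a small sketch on made-up data. The file sample.txt and its contents are assumptions for illustration only, and the awk-only variant is an alternative that keeps the original order instead of sorting.

```shell
# Hypothetical sample data: the third column contains a repeated value.
tmpdir=$(mktemp -d)
printf '%s\n' 'x 1 apple' 'y 2 pear' 'z 3 apple' > "$tmpdir/sample.txt"

# sort | uniq keeps one copy of every value.
deduped=$(awk '{print $3}' "$tmpdir/sample.txt" | sort | uniq)

# sort | uniq -u keeps only the values that occur exactly once.
singletons=$(awk '{print $3}' "$tmpdir/sample.txt" | sort | uniq -u)

# awk alone can skip repeats while preserving the original order.
in_order=$(awk '!seen[$3]++ {print $3}' "$tmpdir/sample.txt")

printf 'uniq:\n%s\nuniq -u:\n%s\nawk only:\n%s\n' "$deduped" "$singletons" "$in_order"
rm -r "$tmpdir"
```

With this data, "apple" survives in the first and third results but is missing from the uniq -u result, because it appears twice.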

Some useful built-in awk variables:
  • NR -- The number of the current record (line), counted from the start of the input
  • NF -- The number of fields in the current line
  • FS -- The input field separator; defaults to whitespace and can be set with the -F command-line option
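
A quick way to watch these variables change is to print them for each line. This is only a sketch; the colon-separated input is made up for the example.

```shell
# Hypothetical colon-separated input; -F ':' sets FS for this run.
# For each line, print the line number, the field count and the last field.
fields=$(printf '%s\n' 'a:b:c' 'd:e' | awk -F ':' '{print NR, NF, $NF}')
echo "$fields"
```

The first line has three fields and the second has two, so $NF refers to a different column on each line.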
For example, to get the last field of a line:

path="../dist/myjar-1.7.jar"
jar_file=$(echo "$path" | awk -F '/' '{print $NF}')
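
As a cross-check, the sketch below compares the awk result with the shell's own parameter expansion, which can do the same job without starting another process. The variable names from_awk and from_shell are made up for this example.

```shell
path="../dist/myjar-1.7.jar"

# awk: split on '/' and print the last field.
from_awk=$(echo "$path" | awk -F '/' '{print $NF}')

# Shell parameter expansion: strip the longest prefix ending in '/'.
from_shell=${path##*/}

echo "$from_awk"
echo "$from_shell"
```

Both lines should show the same file name; the awk form is handy when the separator is something other than '/' or when more fields are needed at once.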

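Print the sum of the numbers in a file:

If test.txt has the following format:

   a=1.2
   bc=2.3
   xyz=1.3

then this prints the sum of the values after the '=':
awk -F '=' '{SUM += $NF} END {print SUM}' test.txt
To print the average instead, divide by the number of lines: print SUM/NR.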
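
Here is a minimal sketch that runs on the sample data above and reports both the sum and the average, so the two are easy to tell apart. The temporary file and the NR > 0 guard (to avoid dividing by zero on empty input) are additions for this example.

```shell
# Recreate the sample data in a temporary file.
tmpfile=$(mktemp)
printf '%s\n' 'a=1.2' 'bc=2.3' 'xyz=1.3' > "$tmpfile"

# Accumulate the value after '=' on each line, then report sum and average.
result=$(awk -F '=' '{sum += $NF} END {if (NR > 0) print "sum=" sum, "avg=" sum/NR}' "$tmpfile")
echo "$result"   # sum=4.8 avg=1.6
rm "$tmpfile"
```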
