Search This Blog

Tuesday, January 11, 2011

print a column using awk and remove dups

The following command will print the 3rd column in file input.txt:
awk '{print $3}' input.txt
Using redirect to output the result to a file
awk '{print $3}' input.txt > output.txt
To get rid of dups from output.txt and output to file unique.txt:
sort output.txt | uniq -u > unique.txt
We can combine these steps to one:

awk '{print $3}' input.txt | sort | uniq -u > unique.txt

Some important info about awk:
  • NR -- The current line's sequential number
  • NF -- The number of fields in the current line
  • FS -- The input field separator; defaults to whitespace and is reset by the -F command line parameter 
For example, to get the last field of a line:

path=" ../dist/myjar-1.7.jar"
jar_file=`echo $path | awk -F '/' '{print $NF}'`

print the sum of the number in a file:

file test.txt has the following format:

   a=1.2
   bc=2.3
   xyz=1.3
awk -F '=' '{SUM += $NF} END {print SUM/NR}' test.txt

Friday, January 7, 2011

Select a range from oracle, mysql

I talked about how to select first/top n rows from oracle, mysql, and ms sql server. How do we get the range, say from m to n, where m < n?

Oracle
select id, age from (select id, age, rownum as rn from customer order by age) where rn between :m and :n
MySql
select * from customer order by age limit :m, :n - :m
MS SQL

I don't know how to do it with MS sql server yet. I don't have ms sql server installed. If you happen to know it, please post your solution in the comment.