Technology Blog: 2011

Wednesday, December 7, 2011

Create a maven archetype from an existing project

In your project's root directory, run the following command:

mvn archetype:create-from-project

The generated archetype is in directory target/generated-sources/archetype

Move this folder somewhere else, say /path/to/myarchetype. Open its pom.xml and change the <name> tag, groupId, and artifactId. To add an additional required field, open the archetype-metadata.xml file (in src/main/resources/META-INF/maven) and add a required property using the <requiredProperties> tag.

To install the archetype into the local catalog:
mvn clean install

To generate a new project from the installed archetype:

mvn archetype:generate -DarchetypeCatalog=local

Thursday, October 6, 2011

Show hidden characters in vi

In vi, type the following to show hidden characters:

:set list

To turn it off:

:set nolist

Sunday, September 11, 2011

Creating JAXBContext is slow

Recently, I needed to create XML or JSON from a POJO. I used JAXB and Jettison to do the job. At first, I created the following class:

import java.io.Writer;

import javax.xml.bind.*;
import javax.xml.stream.XMLStreamWriter;

import org.codehaus.jettison.mapped.*;

public class BadJaxbPrinter {
    /**
     * To print an object to xml format
     * @param writer
     * @param jaxbAnnotedObj the object to be printed. The class must be annoted with @XmlRootElement
     * @throws JAXBException
     */
    public void toXml(Writer writer, Object jaxbAnnotedObj) throws JAXBException {
        JAXBContext context = JAXBContext.newInstance(jaxbAnnotedObj.getClass());
        Marshaller marshaller = context.createMarshaller();
        marshaller.marshal(jaxbAnnotedObj, writer);
    }

    /**
     * To print an object to json format
     * @param writer
     * @param jaxbAnnotedObj the object to be printed. The class must be annoted with @XmlRootElement
     * @throws JAXBException
     */
    public void toJson(Writer writer, Object jaxbAnnotedObj) throws JAXBException {
        Configuration config = new Configuration();
        MappedNamespaceConvention convention = new MappedNamespaceConvention(config);
        XMLStreamWriter xmlStreamWriter = new MappedXMLStreamWriter(convention, writer);

        JAXBContext context = JAXBContext.newInstance(jaxbAnnotedObj.getClass());
        Marshaller marshaller = context.createMarshaller();
        marshaller.marshal(jaxbAnnotedObj, xmlStreamWriter);
    }
}

However, the above implementation is really slow. The reason is that creating a JAXBContext is expensive because it uses reflection to parse the object's annotations. Since JAXBContext is thread safe, it should be created only once and then reused. Here is the new version of the implementation:

public class JaxbPrinter {
    private final MappedNamespaceConvention convention;
    //creating JaxbContext seems expensive, do NOT recreate it
    private final JAXBContext context;

    public JaxbPrinter(Class<?> clazz) throws JAXBException {
        Configuration config = new Configuration();
        convention = new MappedNamespaceConvention(config);
        context = JAXBContext.newInstance(clazz);
    }
    /**
     * To print an object to xml format
     * @param writer
     * @param jaxbAnnotedObj the object to be printed. The class must be annoted with @XmlRootElement
     * @throws JAXBException
     */
    public void toXml(Writer writer, Object jaxbAnnotedObj) throws JAXBException {
        Marshaller marshaller = context.createMarshaller();
        marshaller.marshal(jaxbAnnotedObj, writer);
    }

    /**
     * To print an object to json format
     * @param writer
     * @param jaxbAnnotedObj the object to be printed. The class must be annoted with @XmlRootElement
     * @throws JAXBException
     */
    public void toJson(Writer writer, Object jaxbAnnotedObj) throws JAXBException {
        XMLStreamWriter xmlStreamWriter = new MappedXMLStreamWriter(convention, writer);
        Marshaller marshaller = context.createMarshaller();
        marshaller.marshal(jaxbAnnotedObj, xmlStreamWriter);
    }
}

It is 25 times faster than the previous version over 10000 executions of toXml. Note that Marshaller and Unmarshaller are not thread safe, but they are cheap to create, so a new one is created on every call.

Also, toJson is 5 times slower than toXml. Since the response time meets the SLA, I didn't bother to try another JSON implementation (like Jackson).

Monday, August 15, 2011

Failed to install the mysql gem for Ruby

Suppose you try to install the mysql gem on Linux and the build fails:

$ gem install mysql

Building native extensions.  This could take a while...
ERROR:  Error installing mysql:
	ERROR: Failed to build gem native extension.

/home/t/ruby-1.9.2-p136-1/bin/ruby extconf.rb
checking for mysql_ssl_set()... *** extconf.rb failed ***
Could not create Makefile due to some reason, probably lack of
necessary libraries and/or headers.  Check the mkmf.log file for more
details.  You may need configuration options.

Provided configuration options:
	--with-opt-dir
	--without-opt-dir
	--with-opt-include
	--without-opt-include=${opt-dir}/include
	--with-opt-lib
	--without-opt-lib=${opt-dir}/lib
	--with-make-prog
	--without-make-prog
	--srcdir=.
	--curdir
	--ruby=/home/t/ruby-1.9.2-p136-1/bin/ruby
	--with-mysql-config
	--without-mysql-config
/home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:368:in `try_do': The complier failed to generate an executable file. (RuntimeError)
You have to install development tools first.
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:435:in `try_link0'
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:440:in `try_link'
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:552:in `try_func'
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:797:in `block in have_func'
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:693:in `block in checking_for'
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:280:in `block (2 levels) in postpone'
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:254:in `open'
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:280:in `block in postpone'
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:254:in `open'
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:276:in `postpone'
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:692:in `checking_for'
	from /home/t/ruby-1.9.2-p136-1/lib/ruby/1.9.1/mkmf.rb:796:in `have_func'
	from extconf.rb:50:in `<main>'


Gem files will remain installed in /home/t/ruby-1.9.2-p136-1/lib/ruby/gems/1.9.1/gems/mysql-2.8.1 for inspection.
Results logged to /home/t/ruby-1.9.2-p136-1/lib/ruby/gems/1.9.1/gems/mysql-2.8.1/ext/mysql_api/gem_make.out

You need to check mkmf.log, which is in the same folder as gem_make.out. In my case, mkmf.log is in /home/t/ruby-1.9.2-p136-1/lib/ruby/gems/1.9.1/gems/mysql-2.8.1/ext/mysql_api/

Read mkmf.log. If you see something similar to the following:

/usr/bin/ld: skipping incompatible /usr/lib/libz.so when searching for -lz
/usr/bin/ld: skipping incompatible /usr/lib/libz.a when searching for -lz
/usr/bin/ld: cannot find -lz
collect2: ld returned 1 exit status

it means you have an incompatible zlib-devel binary. In this case, the 32-bit zlib-devel is installed, but not the 64-bit one. Run the following commands to fix it:

yum erase zlib-devel
yum install zlib-devel

Thursday, June 23, 2011

Reverse a singly linked list

I was asked this question during an interview today. I didn't do well though. I guess because it was an interview, I was under some pressure and could not think fast and clearly. In fact, this question is pretty easy with recursion. When I got home, it didn't take me long to figure out the correct answer. Here is the code in Java.

class Node {
    Node next;
}

class Util {
    public static void reverseNode(Node node) {
        reverseNode(node, null);
    }

    private static void reverseNode(Node node, Node previous) {
        if (node != null) {
            reverseNode(node.next, node);
            node.next = previous;
        }
    }
}

Update:

The recursive solution is easy to understand and implement. However, you may get a StackOverflowError if the list is long, since the recursion depth equals the list length. Here is the non-recursive solution:

class Util {
    public static void reverseNode(Node node) {
        Node previous = null;
        while (node != null) {
            Node next = node.next;
            node.next = previous;
            previous = node;
            node = next;
        }
    }
}
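
Both versions above return void, so the caller has to keep a reference to the old tail in order to find the new head. Below is a minimal sketch of a variant that returns the new head instead. The names IntNode, ReverseDemo, and render, and the value field, are assumptions added here so the result can be inspected; they are not part of the original code.

```java
// Hypothetical node type: the value field exists only to make the order visible.
class IntNode {
    final int value;
    IntNode next;

    IntNode(int value, IntNode next) {
        this.value = value;
        this.next = next;
    }
}

class ReverseDemo {
    // Iteratively reverses the list and returns the new head (the old tail).
    static IntNode reverse(IntNode head) {
        IntNode previous = null;
        while (head != null) {
            IntNode next = head.next; // remember the rest of the list
            head.next = previous;     // point the current node backwards
            previous = head;
            head = next;
        }
        return previous;
    }

    // Renders the list as "1 -> 2 -> 3" for inspection.
    static String render(IntNode head) {
        StringBuilder sb = new StringBuilder();
        for (IntNode n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        IntNode list = new IntNode(1, new IntNode(2, new IntNode(3, null)));
        System.out.println(render(reverse(list))); // prints "3 -> 2 -> 1"
    }
}
```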

Tuesday, June 21, 2011

Swap 2 integers without using a temp variable

There are 2 ways to do it.

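1. use the sum of the 2 variables:

void swap(int a, int b) {
    // Java passes int by value, so this only swaps the local copies;
    // it illustrates the arithmetic, not a reusable helper.
    a = a + b;
    b = a - b;
    a = a - b;
}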

2. use bitwise operation (xor)

void swap(int a, int b) {
    a = a ^ b;
    b = a ^ b;
    a = a ^ b;
}
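
Because Java passes int arguments by value, a swap(int, int) method cannot change the caller's variables. A minimal sketch of the XOR technique applied to two slots of an array, where the effect is visible to the caller, might look like this. The class name XorSwapDemo and the i == j guard are additions made here for illustration.

```java
class XorSwapDemo {
    // Swaps data[i] and data[j] in place without a temporary variable.
    static void swap(int[] data, int i, int j) {
        if (i == j) {
            return; // XOR-ing a slot with itself would set it to 0
        }
        data[i] ^= data[j];
        data[j] ^= data[i];
        data[i] ^= data[j];
    }

    public static void main(String[] args) {
        int[] data = {3, 7};
        swap(data, 0, 1);
        System.out.println(data[0] + ", " + data[1]); // prints "7, 3"
    }
}
```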

Create Singleton in Java

A singleton is simply a class that is instantiated exactly once. There are two approaches to create a singleton class.

Early initialization,
Lazy initialization.

There are several ways to create singleton in each approach.

For early initialization:

1. Singleton with public final field

public class Singleton {

    public static final Singleton INSTANCE = new Singleton();

    private Singleton() {
        // do something
    }

    public void someMethod() {
        // do something
    }
}

2. Singleton with static factory

public class Singleton {

    private static final Singleton INSTANCE = new Singleton();

    private Singleton() {
        // do something
    }

    public static Singleton getInstance() {
        return INSTANCE;
    }

    public void someMethod() {
        // do something
    }
}

3. Singleton with enum

public enum Singleton {

    INSTANCE;

    private Singleton() {
        // do something
    }

    public void someMethod() {
        // do something
    }
}

For lazy initialization:

However unless you absolutely need it, don't use lazy initialization.

1. Lazy initialization holder class idiom for static fields (a class will not be initialized until it is used)

public class LazySingleton {

    private static class SingletonHolder {
        static final LazySingleton INSTANCE = new LazySingleton();
    }

    private LazySingleton() {
        // do something
    }

    public static LazySingleton getInstance() {
        return SingletonHolder.INSTANCE;
    }

    public void someMethod() {
        // do something
    }
}
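
To see that the holder class really delays creation, here is a small sketch that counts constructor calls. The names HolderIdiomDemo, Lazy, created, and name() are hypothetical and exist only for this demonstration; the counter is not thread safe and is not something a real singleton would need.

```java
class HolderIdiomDemo {
    // Counts constructor calls so the moment of creation can be observed.
    static int created = 0;

    static class Lazy {
        // The nested holder is initialized only when getInstance() first reads INSTANCE.
        private static class Holder {
            static final Lazy INSTANCE = new Lazy();
        }

        private Lazy() {
            created++;
        }

        static Lazy getInstance() {
            return Holder.INSTANCE;
        }

        // Uses the Lazy class without touching Holder.
        static String name() {
            return "Lazy";
        }
    }

    public static void main(String[] args) {
        Lazy.name();
        System.out.println(created); // prints 0: no instance has been created yet
        Lazy.getInstance();
        Lazy.getInstance();
        System.out.println(created); // prints 1: created once, on first getInstance()
    }
}
```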

2. Double-check idiom for lazy initialization

class LazySingleton {
    // volatile is required for the double-check idiom to be safe
    private static volatile LazySingleton instance;

    private LazySingleton() {
        // do something
    }

    public static LazySingleton getInstance() {
        if (instance == null) {
            synchronized (LazySingleton.class) {
                if (instance == null) {
                    instance = new LazySingleton();
                }
            }
        }
        return instance;
    }
}

Monday, June 20, 2011

Reverse the string word by word, in place

The idea is to reverse the whole string character by character first, then reverse each word character by character. For example, take the string "abc defg hijk". The first step reverses the entire string:

"abc defg hijk" ==> "kjih gfed cba"

Then reverse each word in the output string character by character:

"kjih gfed cba" ==> "hijk defg abc"

The run time is O(n) with constant extra space. Here is the code in Java:

    public static void reverseString(char[] input) {
        reverseString(input, 0, input.length - 1);
        int start = 0;
        int end = 0;
        for (char temp : input) {
            if (temp == ' ') {
                reverseString(input, start, end - 1);
                start = end + 1;
            }
            end++;
        }
        reverseString(input, start, end - 1);
    }

    private static void reverseString(char[] input, int start, int end) {
        while (start < end) {
            swap(input, start, end);
            start++;
            end--;
        }
    }

    private static void swap (char[] input, int i, int j) {
        char temp = input[i];
        input[i] = input[j];
        input[j] = temp;
    }

Wednesday, June 15, 2011

Weak Reference in Java

Today, during a phone interview, I was asked a question about WeakReference in Java. I only knew that a weakly referenced object can be garbage collected by the JVM, and nothing more. I did a little research, and realized that there are 4 types of references in Java:

Strong Reference
Soft Reference
Weak Reference
Phantom Reference

They are listed in order from strongest to weakest. Three classes, SoftReference, WeakReference, and PhantomReference (all in java.lang.ref), represent the last 3 kinds. A strong reference is just an ordinary variable that points at an object, for example one created with the new operator. Below is a great blog post covering all 4 kinds of references:

Understanding Weak References
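
As a rough illustration of the difference between a strong and a weak reference, here is a minimal sketch using java.lang.ref.WeakReference. The class and method names are made up for this example. Garbage collection timing is not guaranteed, so the second printed value can vary between runs and JVMs.

```java
import java.lang.ref.WeakReference;

class WeakReferenceDemo {
    // While a strong reference exists, the weak reference still yields the object.
    static boolean visibleWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // only a hint to the JVM
        return weak.get() == strong; // strong is still in use here, so the object is reachable
    }

    public static void main(String[] args) {
        System.out.println(visibleWhileStronglyHeld());

        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        strong = null; // drop the only strong reference
        System.gc();   // request a collection; the JVM may or may not run one now
        // Usually true after a collection, but this is not guaranteed.
        System.out.println(weak.get() == null);
    }
}
```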

Tuesday, June 14, 2011

Print a sequence of Fibonacci numbers

The trick is that you should use BigInteger instead of long. Otherwise, you will get overflow pretty soon. Here is the code:

import java.math.BigInteger;

public class Fibonacci {
    /**
    * print out a list of Fibonacci sequence to n places.
    * if n = 0, no number will be printed out. if n = 1, print the first Fibonacci number.
    *
    * @param n the number of Fibonacci to be printed out
    */
    public static void print(int n) {
        BigInteger f0 = BigInteger.ZERO;
        BigInteger f1 = BigInteger.ONE;
        for (int i = 0; i < n; i++) {

            System.out.print(f0);
            if (i != n - 1) {
                System.out.print(", ");
            }
            BigInteger temp = f1;
            f1 = f1.add(f0);
            f0 = temp;
        }
    }
}

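Update: Another thing to consider is that you should not use the naive recursive method to compute Fibonacci numbers because it is too expensive. Without memoization, the run time of the recursive computation is exponential (bounded by O(2^n)), because the same values are recomputed over and over.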
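
To make the cost of the naive recursion concrete, the following sketch counts how many calls it makes. The class FibonacciCost and its methods are illustrative names, and long is used instead of BigInteger only because the inputs here are small.

```java
class FibonacciCost {
    // Number of calls made by the most recent countCalls() run.
    static long calls = 0;

    // Naive recursion: recomputes the same subproblems many times.
    static long naive(int n) {
        calls++;
        return n < 2 ? n : naive(n - 1) + naive(n - 2);
    }

    // Returns how many calls naive(n) makes in total.
    static long countCalls(int n) {
        calls = 0;
        naive(n);
        return calls;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 20, 30}) {
            System.out.println("n = " + n + ", calls = " + countCalls(n));
        }
    }
}
```

The call count grows by roughly a factor of 1.6 for each increment of n, while the loop in the print method above does a constant amount of work per number.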

Monday, June 13, 2011

Print a singly linked list in reverse

For a singly linked list, you can only access the elements starting from the head. At first glance, it seems difficult to print the elements in reverse (tail first). However, if you know recursion, the problem becomes quite easy. Here is how to solve this problem in Java:

import java.util.Iterator;
import java.util.List;

public class ListUtil {
    public static <T> void reversePrint(List<T> list) {
        if (list == null || list.isEmpty()) {
            System.out.println("empty list");
        } else {
            Iterator<T> iter = list.iterator();
            reversePrint(iter);
        }
    }

    private static <T> void reversePrint(Iterator<T> iter) {
        if (iter.hasNext()) {
            T data = iter.next();
            reversePrint(iter);
            System.out.println(data);
        }
    }
}

Tuesday, April 19, 2011

Count the number of occurrences of a given key in an array

This was asked in an on-site Microsoft interview (see original post). The answers posted there do not seem correct. Assuming the array is sorted, this question can be solved in O(log N) with binary search. We know how to find the first occurrence of a given key in a sorted array (see my previous post). It is easy to modify the algorithm to find the last occurrence of the given key in a sorted array:

    /**
     * find the last occurrence of the target in a sorted array
     * @param input a sorted array
     * @param target
     * @return the last index of target or -1
     */
    public static int findLastOccurence(int[] input, int target) {
        int lower = -1;
        int upper = input.length;

        while (lower + 1 != upper) {
            int middle = (lower + upper) >>> 1;

            if (input[middle] > target) {
                upper = middle;
            } else {
                lower = middle;
            }
        }

        if (lower != -1 && input[lower] == target) {
            return lower;
        } else {
            return -1;
        }
    }

Once we know the indexes of the first and last occurrences, the number of occurrences is last - first + 1 (or 0 if the key is not found). The run time for finding the number of occurrences is O(log N), since finding the first and the last occurrence each take O(log N).
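
Putting the two searches together, a minimal sketch of the counting step could look like the following. The class name OccurrenceCounter and the method names firstIndex, lastIndex, and count are chosen here for illustration; the two searches follow the same loop structure as the methods in these posts.

```java
class OccurrenceCounter {
    // Index of the first occurrence of target in a sorted array, or -1.
    static int firstIndex(int[] input, int target) {
        int lower = -1;
        int upper = input.length;
        while (lower + 1 != upper) {
            int middle = (lower + upper) >>> 1;
            if (input[middle] < target) {
                lower = middle;
            } else {
                upper = middle;
            }
        }
        return (upper < input.length && input[upper] == target) ? upper : -1;
    }

    // Index of the last occurrence of target in a sorted array, or -1.
    static int lastIndex(int[] input, int target) {
        int lower = -1;
        int upper = input.length;
        while (lower + 1 != upper) {
            int middle = (lower + upper) >>> 1;
            if (input[middle] > target) {
                upper = middle;
            } else {
                lower = middle;
            }
        }
        return (lower != -1 && input[lower] == target) ? lower : -1;
    }

    // Number of occurrences of target: last - first + 1, or 0 if absent.
    static int count(int[] input, int target) {
        int first = firstIndex(input, target);
        if (first == -1) {
            return 0;
        }
        return lastIndex(input, target) - first + 1;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 2, 2, 3, 5};
        System.out.println(count(sorted, 2)); // prints 3
        System.out.println(count(sorted, 4)); // prints 0
    }
}
```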

Monday, April 18, 2011

Find the first occurrence of an integer in sorted array

A normal binary search returns the index of an arbitrary match if the target occurs multiple times in the given array. Below is a modified version of binary search that always returns the first match. It also needs only one element comparison per loop iteration, instead of the two that a typical implementation makes:

    /**
     * find the index of first occurrence of target
     * @param input is a sorted array
     * @param target
     * @return the first index of target or -1
     */
    public static int findFirstOccurence(int[] input, int target) {
        int lower = -1;
        int upper = input.length;

        while (lower + 1 != upper) {
            // unsigned shift keeps the midpoint correct even if lower + upper overflows int
            int middle = (lower + upper) >>> 1;

            if (input[middle] < target) {
                lower = middle;
            } else {
                upper = middle;
            }
        }

        if (upper < input.length && input[upper] == target) {
            return upper;
        } else {
            return -1;
        }
    }

The run time of the above method is still O(log n).

Wednesday, April 13, 2011

Implementation of the procedure RANDOM(a, b) that only makes calls to RANDOM(0, 1).

This is a question from the book "Introduction to Algorithms": describe an implementation of the procedure RANDOM(a, b) that only makes calls to RANDOM(0, 1). I spent some time figuring out the following solution, implemented in Java. The assumption is that we can get a random 0 or 1, for which I use the Java class Random.

import java.util.Random;
public class RandomGenerator {
    private static Random rand = new Random();

    /**
     * generate a random number between a (inclusive) and b (exclusive)
     */
    public static int random(int a, int b) {
        if (a >= b) {
            throw new UnsupportedOperationException("2nd param must be greater than 1st param");
        }
        return a + generate(b - a);
    }
    /**
     * generate the random between 0 to a (exclusive)
     */
    private static int generate(int a) {
        int run = leastPowerOfTwo(a);

        while (true) {
            int power = 1;
            int sum = 0;
            for (int i = 0; i < run; i++) {
                sum += random01() * power;
                power *= 2;
            }
            if (sum < a) {
                return sum;
            }
        }
    }

    /**
     * find the smallest exponent p such that 2^p is greater than or equal to the given input a
     */
    private static int leastPowerOfTwo(int a) {
        int power = 0;
        int temp = a;
        while (temp > 0) {
            temp >>>= 1;
            power++;
        }
        //if a is a power of 2
        if ((a & (a - 1)) == 0) {
            power--;
        }
        return power;
    }

    /**
     * a method to randomly generate 0 or 1
     * @return 0 or 1
     */
    private static int random01() {
        return rand.nextInt(2);
    }
}

The trick is to use the binary representation of the generated random number. For example, to generate a random number within [0, 5), we need to call random(0, 1) 3 times since 2 ^ 3 = 8 >= 5. If we call random(0, 1) only 2 times, we get just 2 ^ 2 = 4 distinct numbers. We keep only results less than 5; if we get a number greater than 4, we retry until the result is less than 5. Below is the binary representation for 3 runs of random(0, 1):

run:   #3   #2   #1   value
        0    0    0     0
        0    0    1     1
        0    1    0     2
        0    1    1     3
        1    0    0     4
        1    0    1     5
        1    1    0     6
        1    1    1     7

The informal argument for correctness is that each of the 8 outcomes is generated with the same probability (1/8). Since we keep only the outcomes 0 to 4 and retry otherwise, each kept number is still equally likely (1/5 each). So the algorithm does generate uniformly distributed random numbers. I ran this program on my machine to generate random numbers from 7 to 12 (exclusive) 10 million times, and here is the result:

7:   1997855
8:   2000311
9:   2000140
10: 2000369
11: 2001325

Tuesday, April 12, 2011

Maximum subarray

This is an interview question:

Given an integer array, find the max sum in any contiguous sub-array. If all elements are negative, then the sum is 0. The following code just finds the max sum without returning the start and end indexes:

public class MaxSubArray {
    public static int maxSum(int[] input) {
        int currentSum = 0;
        int maxSum = 0;
        for (int i = 0; i < input.length; i++) {
            currentSum = Math.max(currentSum + input[i], 0);
            maxSum = Math.max(maxSum, currentSum);
        }
        return maxSum;
    }
}

The following code returns max sum with starting and ending indexes. If all elements are negative, return start and end indexes as 0 and -1, respectively.

public class MaxSubArray {
    public static MaxSubArrayResult compute(int[] input) {

        MaxSubArrayResult result = new MaxSubArrayResult(0, -1, 0);
        int currentSum = 0;
        int currentStart = 0;
        for (int currentEnd = 0; currentEnd < input.length; currentEnd++) {
            currentSum += input[currentEnd];
            if (currentSum > result.maxSum) {
                result.start = currentStart;
                result.end = currentEnd;
                result.maxSum = currentSum;
            } else if (currentSum < 0) {
                currentStart = currentEnd + 1;
                currentSum = 0;
            }
        }
        return result;
    }

    public static class MaxSubArrayResult {
        public int start;
        public int end;
        public int maxSum;

        public MaxSubArrayResult(int start, int end, int sum) {
            super();
            this.start = start;
            this.end = end;
            this.maxSum = sum;
        }

        @Override
        public String toString() {
            return String.format("start = %d, end = %d, maxSum = %d", start, end, maxSum);
        }
    }
}

The run time for both methods is linear (O(n)).

Tuesday, January 11, 2011

print a column using awk and remove dups

The following command will print the 3rd column in file input.txt:

awk '{print $3}' input.txt

Use a redirect to write the result to a file:

awk '{print $3}' input.txt > output.txt

To get rid of dups from output.txt and write the result to unique.txt (note that uniq -u would instead keep only the lines that occur exactly once):

sort output.txt | uniq > unique.txt

We can combine these steps into one:

awk '{print $3}' input.txt | sort | uniq > unique.txt

Some important info about awk:

NR -- The current line's sequential number
NF -- The number of fields in the current line
FS -- The input field separator; defaults to whitespace and is reset by the -F command line parameter

For example, to get the last field of a line:

path=" ../dist/myjar-1.7.jar"
jar_file=`echo $path | awk -F '/' '{print $NF}'`

Print the average of the numbers in a file (SUM/NR; use {print SUM} in the END block to print the sum instead):

file test.txt has the following format:

   a=1.2
   bc=2.3
   xyz=1.3

awk -F '=' '{SUM += $NF} END {print SUM/NR}' test.txt

Friday, January 7, 2011

Select a range from oracle, mysql

I talked about how to select first/top n rows from oracle, mysql, and ms sql server. How do we get the range, say from m to n, where m < n?

Oracle

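select id, age from (select id, age, rownum as rn from (select id, age from customer order by age)) where rn between :m and :n

The extra inner query is needed because Oracle assigns rownum before applying order by within the same query block.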

MySql

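select * from customer order by age limit :offset, :count

Here offset = m - 1 (LIMIT offsets start at 0) and count = n - m + 1. Compute both values in the application, since LIMIT expects integer constants or bound parameters rather than expressions.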

MS SQL

I don't know how to do it with MS sql server yet. I don't have ms sql server installed. If you happen to know it, please post your solution in the comment.
