
Friday, September 4, 2020

Elasticsearch with Geopoint

Install Elasticsearch on Mac

Install the tap

brew tap elastic/tap

Install Elasticsearch

brew install elastic/tap/elasticsearch-full

Start Elasticsearch

elasticsearch

Confirm the installation works

curl http://localhost:9200

Something similar to the following should be shown:

{
  "name" : "MACC02Y753HJGH5",
  "cluster_name" : "elasticsearch_zhentao.li",
  "cluster_uuid" : "mkwsHWW8SoCcPzXg_KOyuQ",
  "version" : {
    "number" : "7.9.1",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "083627f112ba94dffc1232e8b42b73492789ef91",
    "build_date" : "2020-09-01T21:22:21.964974Z",
    "build_snapshot" : false,
    "lucene_version" : "8.6.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Search by Geo Distance

Define the index for Geo Type

PUT http://localhost:9200/regions
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

With the following response:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "regions"
}

Download Geo data file

curl https://restcountries.eu/rest/v1/all | jq -c '.[] | {"index": {"_index": "regions", "_id": .alpha3Code}}, {name: .name, location: [.latlng[1], .latlng[0]] }' > regions

or download it from here.
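
For readers without jq, the same transformation can be sketched in Python. This is a minimal sketch, assuming the input is a list of country objects with name, alpha3Code and latlng ([lat, lon]) fields, as the jq filter above implies. The function name to_bulk_lines and the two sample records are made up for illustration, and unlike the jq filter the sketch skips records that have no coordinates. Note the swap: a geo_point written as an array is [lon, lat].

```python
import json


def to_bulk_lines(countries):
    """Return bulk-API lines: one action line plus one document line per country."""
    lines = []
    for country in countries:
        latlng = country.get("latlng") or []
        if len(latlng) < 2:
            # Assumption of this sketch: skip records without coordinates.
            continue
        lat, lon = latlng[0], latlng[1]
        action = {"index": {"_index": "regions", "_id": country["alpha3Code"]}}
        # A geo_point written as an array is [lon, lat], the reverse of latlng.
        doc = {"name": country["name"], "location": [lon, lat]}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return lines


# Hypothetical sample input in the shape the jq filter expects.
sample = [
    {"name": "China", "alpha3Code": "CHN", "latlng": [35, 105]},
    {"name": "Mongolia", "alpha3Code": "MNG", "latlng": [46, 105]},
]
bulk_lines = to_bulk_lines(sample)
# The bulk body is newline-delimited JSON and must end with a newline.
bulk_body = "\n".join(bulk_lines) + "\n"
print(bulk_body, end="")
```

Writing bulk_body to a file named regions would give the same kind of input that the import step below sends to the _bulk endpoint.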

Import the file

curl -H 'Content-Type: application/json'  -s -XPUT localhost:9200/_bulk --data-binary "@regions"

List all regions

GET http://localhost:9200/regions/_search?size=100

or, with a request body:

GET /regions/_search
{
  "size": 100,
  "query": {
    "match_all": {}
  }
}

Filter by Geo distance

GET /regions/_search
{
  "size": 50,
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_distance": {
          "distance": "3000km",
          "location": {
            "lat": 35,
            "lon": 105
          }
        }
      }
    }
  },
  "sort": [
    {
      "_geo_distance": {
        "location": {
          "lat": 35,
          "lon": 105
        },
        "order": "asc",
        "unit": "km",
        "distance_type": "arc"
      }
    }
  ]
}
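
The center point appears twice in this request, once in the filter and once in the sort, so the two can drift apart when the body is edited by hand. Here is a small sketch that builds the body from one set of parameters. The function name geo_distance_query and its parameters are hypothetical; the field name location matches the mapping defined earlier.

```python
import json


def geo_distance_query(lat, lon, distance_km, size=50, field="location"):
    """Build a search body that filters by distance and sorts nearest first."""
    center = {"lat": lat, "lon": lon}
    return {
        "size": size,
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "geo_distance": {
                        "distance": f"{distance_km}km",
                        field: center,
                    }
                },
            }
        },
        "sort": [
            {
                "_geo_distance": {
                    field: center,
                    "order": "asc",
                    "unit": "km",
                    "distance_type": "arc",
                }
            }
        ],
    }


body = geo_distance_query(lat=35, lon=105, distance_km=3000)
print(json.dumps(body, indent=2))
```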

The result looks like this:

{
  "took": 15,
  "timed_out": false,
  "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
  "hits": {
    "total": {"value": 19, "relation": "eq"},
    "max_score": null,
    "hits": [
      {"_index": "regions", "_type": "_doc", "_id": "CHN", "_score": null, "_source": {"name": "China", "location": [105, 35]}, "sort": [0.0]},
      {"_index": "regions", "_type": "_doc", "_id": "MNG", "_score": null, "_source": {"name": "Mongolia", "location": [105, 46]}, "sort": [1223.1458751104453]},
      {"_index": "regions", "_type": "_doc", "_id": "MMR", "_score": null, "_source": {"name": "Myanmar", "location": [98, 22]}, "sort": [1597.9864780876323]},
      {"_index": "regions", "_type": "_doc", "_id": "BTN", "_score": null, "_source": {"name": "Bhutan", "location": [90.5, 27.5]}, "sort": [1608.420695556179]},
      {"_index": "regions", "_type": "_doc", "_id": "MAC", "_score": null, "_source": {"name": "Macau", "location": [113.55, 22.16666666]}, "sort": [1651.5104987054583]},
      {"_index": "regions", "_type": "_doc", "_id": "HKG", "_score": null, "_source": {"name": "Hong Kong", "location": [114.16666666, 22.25]}, "sort": [1674.4580484734988]},
      {"_index": "regions", "_type": "_doc", "_id": "LAO", "_score": null, "_source": {"name": "Laos", "location": [105, 18]}, "sort": [1890.3163582803475]},
      {"_index": "regions", "_type": "_doc", "_id": "BGD", "_score": null, "_source": {"name": "Bangladesh", "location": [90, 24]}, "sort": [1894.1562153610287]},
      {"_index": "regions", "_type": "_doc", "_id": "TWN", "_score": null, "_source": {"name": "Taiwan", "location": [121, 23.5]}, "sort": [2006.2988706499348]},
      {"_index": "regions", "_type": "_doc", "_id": "PRK", "_score": null, "_source": {"name": "North Korea", "location": [127, 40]}, "sort": [2012.9111514604588]},
      {"_index": "regions", "_type": "_doc", "_id": "KOR", "_score": null, "_source": {"name": "South Korea", "location": [127.5, 37]}, "sort": [2031.4778900496392]},
      {"_index": "regions", "_type": "_doc", "_id": "VNM", "_score": null, "_source": {"name": "Vietnam", "location": [107.83333333, 16.16666666]}, "sort": [2113.0731526216714]},
      {"_index": "regions", "_type": "_doc", "_id": "NPL", "_score": null, "_source": {"name": "Nepal", "location": [84, 28]}, "sort": [2132.4138804176705]},
      {"_index": "regions", "_type": "_doc", "_id": "THA", "_score": null, "_source": {"name": "Thailand", "location": [100, 15]}, "sort": [2279.3258680765903]},
      {"_index": "regions", "_type": "_doc", "_id": "KHM", "_score": null, "_source": {"name": "Cambodia", "location": [105, 13]}, "sort": [2446.291756434398]},
      {"_index": "regions", "_type": "_doc", "_id": "KGZ", "_score": null, "_source": {"name": "Kyrgyzstan", "location": [75, 41]}, "sort": [2697.507030320444]},
      {"_index": "regions", "_type": "_doc", "_id": "RUS", "_score": null, "_source": {"name": "Russia", "location": [100, 60]}, "sort": [2803.2803033336695]},
      {"_index": "regions", "_type": "_doc", "_id": "JPN", "_score": null, "_source": {"name": "Japan", "location": [138, 36]}, "sort": [2975.112941971521]},
      {"_index": "regions", "_type": "_doc", "_id": "PHL", "_score": null, "_source": {"name": "Philippines", "location": [122, 13]}, "sort": [2983.948851295192]}
    ]
  }
}

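When you filter by distance, Elasticsearch doesn't return the distance itself, because it isn't a real field. If you need the distance, Elasticsearch advises sorting by _geo_distance: the computed value is then returned in each hit's sort array, in the unit given in the sort clause (km here).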
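
To make this concrete, the sketch below reads the distance out of each hit's sort array and cross-checks one value with the haversine formula. It assumes the response has been parsed into a dict shaped like the result above (only two hits are reproduced here). The helper names and the mean Earth radius of 6371.0088 km are my assumptions, so small differences from the figure Elasticsearch reports are to be expected.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption, not taken from Elasticsearch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distances_by_name(response):
    """Map each hit's name to the distance stored in its sort array."""
    return {hit["_source"]["name"]: hit["sort"][0] for hit in response["hits"]["hits"]}


# Two hits copied from the result above; location arrays are [lon, lat].
response = {
    "hits": {
        "hits": [
            {"_id": "CHN", "_source": {"name": "China", "location": [105, 35]}, "sort": [0.0]},
            {"_id": "MNG", "_source": {"name": "Mongolia", "location": [105, 46]}, "sort": [1223.1458751104453]},
        ]
    }
}

reported = distances_by_name(response)
# Search center (lat 35, lon 105) to Mongolia's point (lat 46, lon 105).
computed = haversine_km(35, 105, 46, 105)
print(reported["Mongolia"], computed)
```

The two numbers should agree to well within a kilometer, which confirms that the sort value is the arc distance in kilometers from the search center.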



Sunday, February 21, 2016

Spring with Spark Framework for Microservices

Recently, my team decided to use Spark (the Java micro web framework, not Apache Spark) to build our new REST API.  Spark is easy to use and needs little code.  We still wanted to use Spring for dependency injection, and we also wanted to use Tomcat as the standalone web container.  After some research, we figured out how to build the REST API with Spring, Spark and Tomcat.
  1. Create class SpringSparkFilter which extends SparkFilter. Override method getApplication:
  2. Create class MySparkApplication, which implements SparkApplication. Implement init method:
  3. Configure web.xml to use SpringSparkFilter:
Then either build a war file and deploy it to Tomcat, or use tomcat7-maven-plugin for a test run.  I usually use tomcat7-maven-plugin for fast development.  Here is the command:

mvn tomcat7:run

Then access http://localhost:8080/current-date.  The complete example can be found here.

Thursday, September 18, 2014

HBase, Thrift2 and Erlang on Mac

There are not too many documents showing how Erlang can connect to HBase using the Thrift protocol.  One very useful link was published more than 5 years ago, and it discusses Thrift1.  According to Lars George's post, Thrift2 will eventually replace Thrift1.  The following shows what I did to get Erlang working with HBase using Thrift2 on a Mac.  The steps should be similar on Linux.  Note that only Java is supported as a first-class citizen for HBase.  If you have the choice, I suggest using Java directly.
  1. Download thrift-0.9.0 and follow this tutorial to install it.
    cp -r thrift-0.9.0/lib/erl/*  /usr/local/opt/erlang/lib/erlang/lib/thrift-0.9.0 
    (Homebrew installed Erlang to /usr/local/opt/erlang.  Find out your own Erlang directory.)
  2. Download the latest stable HBase and decompress it.  I installed it to /usr/local/hbase-0.98.6.1-hadoop2, and will refer to it as $HBASE_FOLDER.
  3. Download hbase.thrift and generate erlang bindings.
    thrift -gen erl hbase.thrift
    It generates erlang bindings in gen-erl folder.
  4. Start the HBase server.
    Before starting it, update conf/hbase-site.xml:
    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>file:///some-folder-hbase</value>
      </property>
      <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/some-folder-zookeeper</value>
      </property>
    </configuration>

    Then start hbase server:
    $HBASE_FOLDER/bin/start-hbase.sh
  5. Start Thrift2:
    $HBASE_FOLDER/bin/hbase-daemon.sh start thrift2
  6. Create table 'ups' with column family 'ui' using HBase Shell (Thrift2 doesn't support create table yet):
    $HBASE_FOLDER/bin/hbase shell
    hbase(main):001:0> create 'ups', 'ui'
  7. Download erlang client file and save it to gen-erl folder.
  8. Go to gen-erl folder, compile erlang files and start erlang shell:
    cd gen-erl
    erlc *.erl
    erl 
    1>hbase_thrift2_client:put("localhost", 9090).
    {{tclient,tHBaseService_thrift,
              {protocol,thrift_binary_protocol,
                        {binary_protocol,{transport,thrift_buffered_transport,
                                                    {buffered_transport,{transport,thrift_socket_transport,
                                                                                   {data,#Port<0.716>,infinity}},
                                                                        []}},
                                         true,true}},
              0},
     {ok,ok}}

    2>hbase_thrift2_client:get("localhost", 9090).
    {{tclient,tHBaseService_thrift,
              {protocol,thrift_binary_protocol,
                        {binary_protocol,{transport,thrift_buffered_transport,
                                                    {buffered_transport,{transport,thrift_socket_transport,
                                                                                   {data,#Port<0.717>,infinity}},
                                                                        []}},
                                         true,true}},
              0},
     {ok,{tResult,<<"zhentao-key1">>,
                  [{tColumnValue,<<"ui">>,<<"test">>,<<"xyz1">>,1411075681134,
                                 undefined},
                   {tColumnValue,<<"ui">>,<<"test1">>,<<"abc1">>,1411075681134,
                                 undefined}]}}}
The complete Erlang project can be found here.

Friday, August 30, 2013

Spring AMQP and RabbitMQ High Availability (HA)

Recently, I finished a project that uses RabbitMQ as the message broker.  It is pretty easy to set up a RabbitMQ cluster with High Availability (HA).  RabbitMQ has pretty good documentation on how to set up the cluster.  I also documented here how to set up a 2-node cluster on a single Mac machine.  In this post, I would like to show how we can leverage spring-amqp to connect to a RabbitMQ cluster.  I have a RabbitMQ cluster running with 2 nodes: localhost:5673 and localhost:5672.

It is pretty easy to use spring-amqp.  I used maven to manage the dependencies:


jackson-jaxrs-json-provider is used to serialize Java objects to JSON, and to deserialize JSON back to Java objects.

When creating ConnectionFactory, the addresses should be used instead of the host and port:


The addresses value is a comma-separated list of host:port pairs, one for each node that makes up the cluster.
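
As an illustration of the format only (this is not spring-amqp code), a comma-separated addresses string can be split into host and port pairs like this. The function name parse_addresses and the fallback to port 5672 when an entry has no port are assumptions of the sketch.

```python
DEFAULT_AMQP_PORT = 5672  # standard AMQP port; used here as an assumed fallback


def parse_addresses(addresses):
    """Split 'host1:port1,host2:port2' into a list of (host, port) tuples."""
    result = []
    for entry in addresses.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty input and stray commas
        host, sep, port = entry.partition(":")
        result.append((host, int(port) if sep and port else DEFAULT_AMQP_PORT))
    return result


# The two nodes of the cluster used in this post.
nodes = parse_addresses("localhost:5673,localhost:5672")
print(nodes)
```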

For the producer, we use rabbitTemplate to send messages:



For the consumer, a MessageListenerContainer is created to consume messages asynchronously:



The MessageHandler code is as follows:


This class can be called anything you like, but the method must be called handleMessage and must have the correct signature (here it takes an Employee, to match the producer).  If you want to change the method name, you have to call:

          MessageListenerAdapter.setDefaultListenerMethod
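
The naming convention can be pictured with a small stand-in. This is not Spring's MessageListenerAdapter, only a sketch of the idea of dispatching to a delegate by method name. The classes ListenerAdapter and EmployeeHandler and the method name onEmployee are invented for the example.

```python
class ListenerAdapter:
    """Toy adapter that forwards each message to a named method on a delegate."""

    def __init__(self, delegate, default_listener_method="handleMessage"):
        self.delegate = delegate
        self.default_listener_method = default_listener_method

    def on_message(self, payload):
        # The method is looked up by name at call time, so a wrong name fails here.
        method = getattr(self.delegate, self.default_listener_method, None)
        if not callable(method):
            raise AttributeError(
                f"delegate has no method named {self.default_listener_method!r}"
            )
        return method(payload)


class EmployeeHandler:
    """Hypothetical handler with the default method name and a custom one."""

    def __init__(self):
        self.received = []

    def handleMessage(self, employee):
        self.received.append(("handleMessage", employee))

    def onEmployee(self, employee):
        self.received.append(("onEmployee", employee))


handler = EmployeeHandler()
# Default name: the adapter calls handleMessage.
ListenerAdapter(handler).on_message({"name": "Alice"})
# Custom name: the adapter must be told which method to call.
ListenerAdapter(handler, default_listener_method="onEmployee").on_message({"name": "Bob"})
print(handler.received)
```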

The source code can be downloaded from GitHub.


Tuesday, June 11, 2013

Skip cobertura coverage check when skipping unit tests

After I introduced embedded MySQL for unit testing to my project, the build takes longer.  For local development, I sometimes want to skip the unit tests:

      mvn clean package -DskipTests

Note that you can even skip compiling the test classes by using

     mvn clean package -Dmaven.test.skip

However, if you use cobertura for code coverage check, you would get the following error:

[ERROR] Project failed check. Total branch coverage rate of 0.0% is below 85.0%
Project failed check. Total line coverage rate of 0.0% is below 85.0%


So you need to skip cobertura as well.  If you google "skip cobertura", most answers suggest using haltOnFailure.  However, cobertura already provides the skip functionality:

     mvn clean package -DskipTests -Dcobertura.skip

With -Dcobertura.skip, cobertura check will be skipped:

[INFO] --- cobertura-maven-plugin:2.5.2:instrument (default) @ tag-service ---
[INFO] Skipping cobertura execution

Monday, June 10, 2013

Using embedded mysql database for unit test with maven and spring

I used the H2 in-memory database to unit test my DAO.  However, my production database is MySQL.  Generally, H2 is compatible with MySQL.  However, H2 doesn't support the ENUM type yet, and my schema uses ENUM for a column definition.  Inspired by this post and another, I successfully used embedded MySQL for unit testing.  Here is what I did:

1.  Add mysql-connector-mxj dependency:


2. Create EmbeddedMysqlDatabase.java and EmbeddedMysqlDatabaseBuilder.java

3. Create spring bean datasource:

4. Run unit test with Spring:

A couple of things are worth mentioning:

1. The test uses the Spring annotation @DirtiesContext(classMode = ClassMode.AFTER_EACH_TEST_METHOD), which reloads the context after every test method.  This means every test method starts a new MySQL instance.  To speed up testing, you can split the test methods into 2 categories: those that change data and those that don't.  The methods that don't change data only need a single shared MySQL instance.

2. If you see the following error message on a Linux server:
/lib64/libc.so.6: version `GLIBC_2.7' not found (required by /tmp/test_db_2420035739165052/bin/mysqld)

You need to either downgrade mysql-connector-mxj to 5.0.11 or upgrade your Linux OS.  I got the above error message on CentOS 5.4 with 5.0.12, but had no problem with 5.0.11.  However, 5.0.11 isn't in any public Maven repo; you can download it from here and install it to your local repo, or upload it to your company's repo.  Use the following command to check the CentOS version:

cat /etc/redhat-release

3. The code is tested on Mac and CentOS only, and can be downloaded from GitHub.

Monday, June 3, 2013

Example for enabling CORS support in Spring Rest Api 3.2

My post from last year, "Enable CORS support in REST services with Spring 3.1", seems to have caused some confusion.  I decided to create an example to show how to enable CORS with a Spring REST API.  The CorsFilter is the same as before:


Below are 2 endpoints from EmployeeController.java:

The update method adds the header Access-Control-Allow-Origin with "*", but the delete method doesn't.  Therefore, the update method is enabled with CORS, but delete isn't.  If the delete endpoint is called, the following error will be shown in Chrome:

"cannot load http://localhost:8080/rest/employee/1. Origin http://127.0.0.1:8080 is not allowed by Access-Control-Allow-Origin."

However, the delete method is still invoked on the server side, since the pre-flight request (OPTIONS) allows the DELETE method to be called.
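
Why the server still runs the handler can be seen in a small simulation. This is a sketch, not the project's CorsFilter: the function names, the allowed-method list, the use of PUT for update, and the simplified browser check are all assumptions made for illustration. The point is that the browser decides whether to send the DELETE from the pre-flight response alone; a missing Access-Control-Allow-Origin on the actual response only stops the page from reading the result.

```python
ALLOWED_METHODS = ["GET", "POST", "PUT", "DELETE"]  # assumed pre-flight allow list


def preflight_headers(request_method, request_headers):
    """Headers a filter might add when it sees a CORS pre-flight request."""
    if request_method == "OPTIONS" and "Access-Control-Request-Method" in request_headers:
        return {
            "Access-Control-Allow-Origin": "*",
            "Access-Control-Allow-Methods": ", ".join(ALLOWED_METHODS),
        }
    return {}


def call_endpoint(method, handler_adds_origin_header, server_log):
    """Simulate a cross-origin call; return True if the page may read the response."""
    # Step 1: the browser sends the pre-flight and checks only that response.
    pre = preflight_headers("OPTIONS", {"Access-Control-Request-Method": method})
    allowed = [m.strip() for m in pre.get("Access-Control-Allow-Methods", "").split(",")]
    if pre.get("Access-Control-Allow-Origin") != "*" or method not in allowed:
        return False  # the actual request is never sent
    # Step 2: the actual request reaches the server and the handler runs.
    server_log.append(method)
    response_headers = {"Access-Control-Allow-Origin": "*"} if handler_adds_origin_header else {}
    # Step 3: the browser exposes the response only if the origin header is present.
    return response_headers.get("Access-Control-Allow-Origin") == "*"


log = []
update_readable = call_endpoint("PUT", handler_adds_origin_header=True, server_log=log)
delete_readable = call_endpoint("DELETE", handler_adds_origin_header=False, server_log=log)
print(update_readable, delete_readable, log)
```

In this model the server log contains both calls even though only the update response is readable by the page, which matches the behavior described above.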

The entire project can be downloaded from GitHub.  Follow the README to test it.

My intention was to disable/enable CORS support in each individual method by setting "Access-Control-Allow-Origin", but it doesn't work as expected: although the browser reports the error correctly, the method is still invoked on the server side even when Access-Control-Allow-Origin is not set.  If you are allowed to enable CORS support for all endpoints, the code can be simplified as below:

The only difference is that addHeader("Access-Control-Allow-Origin") is moved out of the if check. And then the update method can be simplified as:

The code can be downloaded from github too.