GeoMesa Transformations¶
GeoMesa allows users to perform relational projections on query results. We call these “transformations” to distinguish them from the overloaded term “projection”, which has a different meaning in a spatial context. These transformations have the following uses and advantages:
- Subset to specified columns - reduces network overhead of returning results
- Rename specified columns - alters the schema of data on the fly
- Compute new attributes from one or more original attributes - adds derived fields to results
The transformations are applied in parallel across the cluster, which makes them fast. They are analogous to the map tasks in a map-reduce job. Transformations are also extensible; developers can implement new functions and plug them into the system using standard mechanisms from GeoTools.
Note
When this tutorial refers to “projections”, it means in the relational sense - see Projection - Relational Algebra. Projection also has many other meanings in spatial discussions - they are not used in this tutorial. Although projections can also modify an attribute’s value, in this tutorial we will refer to such modifications as “transformations” to keep things clearer.
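To make the relational sense of “projection” concrete, here is a minimal plain-Java sketch that does not use GeoMesa at all. The class `ProjectionSketch` and its methods are hypothetical names invented for illustration. Rows are modeled as maps, and the parallel stream only stands in for the idea of applying the same function to every result independently, much like a map task.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Plain-Java illustration of a relational projection; not GeoMesa's implementation.
class ProjectionSketch {

    // Keep only the requested columns. Each entry maps an output name to a source
    // column, so "geom" -> "geom" subsets and "actor" -> "Actor1Name" renames.
    static Map<String, Object> project(Map<String, Object> row, Map<String, String> columns) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> column : columns.entrySet()) {
            out.put(column.getKey(), row.get(column.getValue()));
        }
        return out;
    }

    // Apply the same projection to every row independently.
    static List<Map<String, Object>> projectAll(List<Map<String, Object>> rows,
                                                Map<String, String> columns) {
        return rows.parallelStream()
                .map(row -> project(row, columns))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("GLOBALEVENTID", "284464526");
        row.put("Actor1Name", "UNITED STATES");
        row.put("geom", "POINT (30.5167 50.4333)");

        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("actor", "Actor1Name"); // rename
        columns.put("geom", "geom");        // keep as-is

        System.out.println(projectAll(Arrays.asList(row), columns));
    }
}
```

Dropping `GLOBALEVENTID` here corresponds to subsetting columns, and the `actor` entry corresponds to renaming one.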
This tutorial will show you how to write custom Java code using GeoMesa to do the following:
- Query previously-ingested data.
- Apply relational projections to your query results.
- Apply transformations to your query results.
Prerequisites¶
You will need:
- an instance of Accumulo 1.7 or 1.8 running on Hadoop 2.2 or better,
- an Accumulo user that has appropriate permissions to query your data,
- Java JDK 8,
- Apache Maven 3.2.2 or better, and
- a git client.
This tutorial queries the GDELT data set. Instructions on ingesting GDELT data are available in the Map-Reduce Ingest of GDELT tutorial.
Warning
Before continuing, ingest the GDELT data set as described in the Map-Reduce Ingest of GDELT tutorial.
Download and Build the Tutorial¶
Pick a reasonable directory on your machine, and run:
$ git clone https://github.com/geomesa/geomesa-tutorials.git
$ cd geomesa-tutorials
Note
You may need to download a particular release of the tutorials project to target a particular GeoMesa release. See About Tutorial Versions.
To build, run
$ mvn clean install -pl geomesa-examples-transformations
Note
Ensure that the versions of Accumulo, Hadoop, etc. in the root pom.xml match your environment.
Note
Depending on the version, you may also need to build GeoMesa locally. Instructions can be found in Installation.
Run the Tutorial¶
Warning
Before continuing, ensure that you have ingested the GDELT data set described in the Map-Reduce Ingest of GDELT tutorial. If using GDELT data from a time period different from that used in the GDELT tutorial, change the date range in the QueryTutorial createBaseFilter function and recompile.
On the command line, run:
$ java -cp geomesa-examples-transformations/target/geomesa-examples-transformations-<version>.jar \
com.example.geomesa.transformations.QueryTutorial \
-instanceId <instance> \
-zookeepers <zoos> \
-user <user> \
-password <pwd> \
-tableName <table> \
-featureName <feature>
where you provide the following arguments:
- <instance>: the name of your Accumulo instance
- <zoos>: comma-separated list of your Zookeeper nodes, e.g. zoo1:2181,zoo2:2181,zoo3:2181
- <user>: the name of an Accumulo user that will execute the scans, e.g. root
- <pwd>: the password for the previously-mentioned Accumulo user
- <table>: the name of the Accumulo table that has the GeoMesa GDELT dataset, e.g. gdelt if you followed the GDELT tutorial
- <feature>: the feature name used to ingest the GeoMesa GDELT dataset, e.g. event if you followed the GDELT tutorial
You should see several queries run and the results printed out to your console.
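The command line above passes each setting as a `-name value` pair. As a rough illustration of that shape only, the following sketch collects such pairs into a map. `ArgsSketch` and `parse` are hypothetical names; the tutorial's own argument handling may well be implemented differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects "-name value" pairs such as "-tableName gdelt".
class ArgsSketch {

    // Returns a map keyed by the flag name without its leading dash.
    static Map<String, String> parse(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("-") || i + 1 >= args.length) {
                throw new IllegalArgumentException("Expected '-name value' at position " + i);
            }
            params.put(args[i].substring(1), args[i + 1]);
        }
        return params;
    }

    public static void main(String[] args) {
        String[] example = {
            "-instanceId", "myInstance",
            "-zookeepers", "zoo1:2181,zoo2:2181,zoo3:2181",
            "-tableName", "gdelt",
            "-featureName", "event"
        };
        System.out.println(parse(example));
    }
}
```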
Insight into How the Tutorial Works¶
The code for querying and projections is available in the class com.example.geomesa.transformations.QueryTutorial. The source code is meant to be accessible, but the following is a high-level breakdown of the relevant methods:
- basicQuery: executes a base filter without any further options. All attributes are returned in the data set.
- basicProjectionQuery: executes a base filter but specifies a subset of attributes to return.
- basicTransformationQuery: executes a base filter and transforms one of the attributes that is returned.
- renamedTransformationQuery: executes a base filter and transforms one of the attributes, returning it in a separate derived attribute.
- mutliFieldTransformationQuery: executes a base filter and transforms two attributes into a single derived attribute.
- geometricTransformationQuery: executes a base filter and transforms the geometry returned from a point into a polygon by buffering it.
Additional transformation functions are listed here.
Please note that currently not all functions are supported by GeoMesa.
Sample Code and Output¶
The following code snippets show the basic aspects of creating queries for GeoMesa.
Create a basic query with no projections¶
This query does not use any projections or transformations. Note that all attributes are returned in the results.
Query query = new Query(simpleFeatureTypeName, cqlFilter);
Output
Result | GLOBALEVENTID | SQLDATE | MonthYear | Year | FractionDate | Actor1Code | Actor1Name | Actor1CountryCode | Actor1KnownGroupCode | Actor1EthnicCode | Actor1Religion1Code | Actor1Religion2Code | Actor1Type1Code | Actor1Type2Code | Actor1Type3Code | Actor2Code | Actor2Name | Actor2CountryCode | Actor2KnownGroupCode | Actor2EthnicCode | Actor2Religion1Code | Actor2Religion2Code | Actor2Type1Code | Actor2Type2Code | Actor2Type3Code | IsRootEvent | EventCode | EventBaseCode | EventRootCode | QuadClass | GoldsteinScale | NumMentions | NumSources | NumArticles | AvgTone | Actor1Geo_Type | Actor1Geo_FullName | Actor1Geo_CountryCode | Actor1Geo_ADM1Code | Actor1Geo_Lat | Actor1Geo_Long | Actor1Geo_FeatureID | Actor2Geo_Type | Actor2Geo_FullName | Actor2Geo_CountryCode | Actor2Geo_ADM1Code | Actor2Geo_Lat | Actor2Geo_Long | Actor2Geo_FeatureID | ActionGeo_Type | ActionGeo_FullName | ActionGeo_CountryCode | ActionGeo_ADM1Code | ActionGeo_Lat | ActionGeo_Long | ActionGeo_FeatureID | DATEADDED | geom |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 284464526 | Sun Feb 02 00:00:00 EST 2014 | 201402 | 2014 | 2014.0876 | USA | UNITED STATES | USA | USAGOV | UNITED STATES | USA | GOV | 0 | 010 | 010 | 01 | 1 | 0.0 | 2 | 1 | 2 | 2.6362038 | 4 | Kyiv, Kyyiv, Misto, Ukraine | UP | UP12 | 50.4333 | 30.5167 | -1044367 | 1 | United States | US | US | 38.0 | -97.0 | null | 1 | United States | US | US | 38.0 | -97.0 | null | 20140202 | POINT (30.5167 50.4333) | |||||||||||||
2 | 284466704 | Sun Feb 02 00:00:00 EST 2014 | 201402 | 2014 | 2014.0876 | USAGOV | UNITED STATES | USA | GOV | USA | UNITED STATES | USA | 1 | 036 | 036 | 03 | 1 | 4.0 | 4 | 1 | 4 | 1.5810276 | 1 | Ukraine | UP | UP | 49.0 | 32.0 | null | 1 | Ukraine | UP | UP | 49.0 | 32.0 | null | 1 | Ukraine | UP | UP | 49.0 | 32.0 | null | 20140202 | POINT (32 49) | |||||||||||||
3 | 284427971 | Sun Feb 02 00:00:00 EST 2014 | 201402 | 2014 | 2014.0876 | IGOUNO | UNITED NATIONS | UNO | IGO | USA | UNITED STATES | USA | 0 | 012 | 012 | 01 | 1 | -0.4 | 27 | 3 | 27 | 1.0064903 | 4 | Kiev, Ukraine (general), Ukraine | UP | UP00 | 50.4333 | 30.5167 | -1044367 | 4 | Kiev, Ukraine (general), Ukraine | UP | UP00 | 50.4333 | 30.5167 | -1044367 | 4 | Kiev, Ukraine (general), Ukraine | UP | UP00 | 50.4333 | 30.5167 | -1044367 | 20140202 | POINT (30.5167 50.4333) | |||||||||||||
4 | 284466607 | Sun Feb 02 00:00:00 EST 2014 | 201402 | 2014 | 2014.0876 | USAGOV | UNITED STATES | USA | GOV | UKR | UKRAINE | UKR | 1 | 100 | 100 | 10 | 3 | -5.0 | 2 | 1 | 2 | 7.826087 | 1 | Ukraine | UP | UP | 49.0 | 32.0 | null | 1 | Ukraine | UP | UP | 49.0 | 32.0 | null | 1 | Ukraine | UP | UP | 49.0 | 32.0 | null | 20140202 | POINT (32 49) | |||||||||||||
5 | 284464187 | Sun Feb 02 00:00:00 EST 2014 | 201402 | 2014 | 2014.0876 | USA | UNITED STATES | USA | UKR | UKRAINE | UKR | 0 | 111 | 111 | 11 | 3 | -2.0 | 5 | 1 | 5 | 1.4492754 | 4 | Kiev, Ukraine (general), Ukraine | UP | UP00 | 50.4333 | 30.5167 | -1044367 | 4 | Kiev, Ukraine (general), Ukraine | UP | UP00 | 50.4333 | 30.5167 | -1044367 | 4 | Kiev, Ukraine (general), Ukraine | UP | UP00 | 50.4333 | 30.5167 | -1044367 | 20140202 | POINT (30.5167 50.4333) |
Create a query with a projection for two attributes¶
This query uses a projection to only return the ‘Actor1Name’ and ‘geom’ attributes.
String[] properties = new String[] {"Actor1Name", "geom"};
Query query = new Query(simpleFeatureTypeName, cqlFilter, properties);
Output
Result | Actor1Name | geom |
---|---|---|
1 | UNITED STATES | POINT (32 49) |
2 | UNITED STATES | POINT (30.5167 50.4333) |
3 | UNITED STATES | POINT (30.5167 50.4333) |
4 | UNITED STATES | POINT (30.5167 50.4333) |
5 | UNITED STATES | POINT (30.5167 50.4333) |
Create a query with an attribute transformation¶
This query performs a transformation on the ‘Actor1Name’ attribute, to print it in a more user-friendly format.
String[] properties = new String[] {"Actor1Name=strCapitalize(Actor1Name)", "geom"};
Query query = new Query(simpleFeatureTypeName, cqlFilter, properties);
Output
Result | geom | Actor1Name |
---|---|---|
1 | POINT (30.5167 50.4333) | United States |
2 | POINT (32 49) | United States |
3 | POINT (32 49) | United States |
4 | POINT (30.5167 50.4333) | United States |
5 | POINT (30.5167 50.4333) | United States |
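The effect visible in the output above, “UNITED STATES” becoming “United States”, can be mimicked in plain Java. This is only a sketch of the observable behaviour on this data, not the source of the `strCapitalize` function; `CapitalizeSketch` and `capitalizeWords` are hypothetical names.

```java
// Plain-Java illustration of word capitalization; not the GeoTools implementation.
class CapitalizeSketch {

    // Upper-case the first character of each whitespace-delimited word
    // and lower-case the remaining characters.
    static String capitalizeWords(String text) {
        StringBuilder out = new StringBuilder(text.length());
        boolean startOfWord = true;
        for (char c : text.toCharArray()) {
            if (Character.isWhitespace(c)) {
                startOfWord = true;
                out.append(c);
            } else if (startOfWord) {
                out.append(Character.toUpperCase(c));
                startOfWord = false;
            } else {
                out.append(Character.toLowerCase(c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(capitalizeWords("UNITED STATES"));
    }
}
```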
Create a query with a derived attribute¶
This query creates a new attribute called ‘derived’ by concatenating the ‘Actor1Name’ and ‘Actor1Geo_FullName’ attributes, separated by ‘ - ’. This could be used to show the actor and location of the event, for example.
// one array element per returned attribute
String property = "derived=strConcat(Actor1Name,strConcat(' - ',Actor1Geo_FullName))";
String[] properties = new String[] {property, "geom"};
Query query = new Query(simpleFeatureTypeName, cqlFilter, properties);
Output
Result | geom | derived |
---|---|---|
1 | POINT (30.5167 50.4333) | UNITED STATES - Kyiv, Kyyiv, Misto, Ukraine |
2 | POINT (32 49) | UNITED STATES - Ukraine |
3 | POINT (30.5167 50.4333) | UNITED STATES - Kiev, Ukraine (general), Ukraine |
4 | POINT (32 49) | UNITED STATES - Ukraine |
5 | POINT (30.5167 50.4333) | UNITED NATIONS - Kiev, Ukraine (general), Ukraine |
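The nested `strConcat` calls in the transformation definition read inside-out: the inner call joins the separator to the location, and the outer call prefixes the actor name. The sketch below mirrors that nesting in plain Java. The local `strConcat` method is a stand-in written for this example, not the GeoTools function, and `DerivedAttributeSketch` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration of deriving one attribute from two others.
class DerivedAttributeSketch {

    // Local stand-in for a two-argument string concatenation function.
    static String strConcat(String left, String right) {
        return left + right;
    }

    // Mirrors strConcat(Actor1Name, strConcat(' - ', Actor1Geo_FullName)).
    static String derived(Map<String, String> row) {
        return strConcat(row.get("Actor1Name"),
                strConcat(" - ", row.get("Actor1Geo_FullName")));
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("Actor1Name", "UNITED STATES");
        row.put("Actor1Geo_FullName", "Ukraine");
        System.out.println(derived(row));
    }
}
```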
Create a query with a geometric transformation¶
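This query performs a geometric transformation on the points returned, buffering them by a fixed distance. The distance is expressed in the units of the geometry's coordinates, so a value of 2 here means 2 degrees, as the output below shows. This could be used to estimate an area of impact around a particular event, for example.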
String[] properties = new String[] {"geom", "derived=buffer(geom, 2)"};
Query query = new Query(simpleFeatureTypeName, cqlFilter, properties);
Output
Result | geom | derived |
---|---|---|
1 | POINT (30.5167 50.4333) | POLYGON ((32.5167 50.4333, 32.478270560806465 50.04311935596775, 32.36445906502257 49.66793313526982, 32.17963922460509 49.3221595339608, 31.930913562373096 49.01908643762691, 31.627840466039206 48.77036077539491, 31.28206686473018 48.58554093497743, 30.906880644032256 48.47172943919354, 30.5167 48.4333, 30.126519355967744 48.47172943919354, 29.75133313526982 48.58554093497743, 29.405559533960798 48.77036077539491, 29.102486437626904 49.01908643762691, 28.85376077539491 49.3221595339608, 28.668940934977428 49.66793313526983, 28.55512943919354 50.04311935596775, 28.5167 50.4333, 28.55512943919354 50.82348064403226, 28.668940934977428 51.198666864730185, 28.85376077539491 51.54444046603921, 29.102486437626908 51.8475135623731, 29.405559533960798 52.09623922460509, 29.751333135269824 52.281059065022575, 30.126519355967748 52.39487056080647, 30.516700000000004 52.4333, 30.906880644032263 52.39487056080646, 31.282066864730186 52.281059065022575, 31.62784046603921 52.09623922460509, 31.9309135623731 51.847513562373095, 32.1796392246051 51.5444404660392, 32.36445906502258 51.19866686473018, 32.478270560806465 50.82348064403225, 32.5167 50.4333)) |
2 | POINT (30.5167 50.4333) | POLYGON ((32.5167 50.4333, 32.478270560806465 50.04311935596775, 32.36445906502257 49.66793313526982, 32.17963922460509 49.3221595339608, 31.930913562373096 49.01908643762691, 31.627840466039206 48.77036077539491, 31.28206686473018 48.58554093497743, 30.906880644032256 48.47172943919354, 30.5167 48.4333, 30.126519355967744 48.47172943919354, 29.75133313526982 48.58554093497743, 29.405559533960798 48.77036077539491, 29.102486437626904 49.01908643762691, 28.85376077539491 49.3221595339608, 28.668940934977428 49.66793313526983, 28.55512943919354 50.04311935596775, 28.5167 50.4333, 28.55512943919354 50.82348064403226, 28.668940934977428 51.198666864730185, 28.85376077539491 51.54444046603921, 29.102486437626908 51.8475135623731, 29.405559533960798 52.09623922460509, 29.751333135269824 52.281059065022575, 30.126519355967748 52.39487056080647, 30.516700000000004 52.4333, 30.906880644032263 52.39487056080646, 31.282066864730186 52.281059065022575, 31.62784046603921 52.09623922460509, 31.9309135623731 51.847513562373095, 32.1796392246051 51.5444404660392, 32.36445906502258 51.19866686473018, 32.478270560806465 50.82348064403225, 32.5167 50.4333)) |
3 | POINT (32 49) | POLYGON ((34 49, 33.961570560806464 48.609819355967744, 33.84775906502257 48.23463313526982, 33.66293922460509 47.8888595339608, 33.41421356237309 47.58578643762691, 33.1111404660392 47.33706077539491, 32.76536686473018 47.15224093497743, 32.390180644032256 47.038429439193536, 32 47, 31.609819355967744 47.038429439193536, 31.23463313526982 47.15224093497743, 30.888859533960797 47.33706077539491, 30.585786437626904 47.58578643762691, 30.33706077539491 47.8888595339608, 30.152240934977428 48.234633135269824, 30.03842943919354 48.609819355967744, 30 49, 30.03842943919354 49.390180644032256, 30.152240934977428 49.76536686473018, 30.33706077539491 50.11114046603921, 30.585786437626908 50.4142135623731, 30.888859533960797 50.66293922460509, 31.234633135269824 50.84775906502257, 31.609819355967748 50.961570560806464, 32.00000000000001 51, 32.39018064403226 50.96157056080646, 32.76536686473018 50.84775906502257, 33.11114046603921 50.66293922460509, 33.4142135623731 50.41421356237309, 33.6629392246051 50.111140466039195, 33.84775906502258 49.765366864730176, 33.961570560806464 49.39018064403225, 34 49)) |
4 | POINT (30.5167 50.4333) | POLYGON ((32.5167 50.4333, 32.478270560806465 50.04311935596775, 32.36445906502257 49.66793313526982, 32.17963922460509 49.3221595339608, 31.930913562373096 49.01908643762691, 31.627840466039206 48.77036077539491, 31.28206686473018 48.58554093497743, 30.906880644032256 48.47172943919354, 30.5167 48.4333, 30.126519355967744 48.47172943919354, 29.75133313526982 48.58554093497743, 29.405559533960798 48.77036077539491, 29.102486437626904 49.01908643762691, 28.85376077539491 49.3221595339608, 28.668940934977428 49.66793313526983, 28.55512943919354 50.04311935596775, 28.5167 50.4333, 28.55512943919354 50.82348064403226, 28.668940934977428 51.198666864730185, 28.85376077539491 51.54444046603921, 29.102486437626908 51.8475135623731, 29.405559533960798 52.09623922460509, 29.751333135269824 52.281059065022575, 30.126519355967748 52.39487056080647, 30.516700000000004 52.4333, 30.906880644032263 52.39487056080646, 31.282066864730186 52.281059065022575, 31.62784046603921 52.09623922460509, 31.9309135623731 51.847513562373095, 32.1796392246051 51.5444404660392, 32.36445906502258 51.19866686473018, 32.478270560806465 50.82348064403225, 32.5167 50.4333)) |
5 | POINT (30.5167 50.4333) | POLYGON ((32.5167 50.4333, 32.478270560806465 50.04311935596775, 32.36445906502257 49.66793313526982, 32.17963922460509 49.3221595339608, 31.930913562373096 49.01908643762691, 31.627840466039206 48.77036077539491, 31.28206686473018 48.58554093497743, 30.906880644032256 48.47172943919354, 30.5167 48.4333, 30.126519355967744 48.47172943919354, 29.75133313526982 48.58554093497743, 29.405559533960798 48.77036077539491, 29.102486437626904 49.01908643762691, 28.85376077539491 49.3221595339608, 28.668940934977428 49.66793313526983, 28.55512943919354 50.04311935596775, 28.5167 50.4333, 28.55512943919354 50.82348064403226, 28.668940934977428 51.198666864730185, 28.85376077539491 51.54444046603921, 29.102486437626908 51.8475135623731, 29.405559533960798 52.09623922460509, 29.751333135269824 52.281059065022575, 30.126519355967748 52.39487056080647, 30.516700000000004 52.4333, 30.906880644032263 52.39487056080646, 31.282066864730186 52.281059065022575, 31.62784046603921 52.09623922460509, 31.9309135623731 51.847513562373095, 32.1796392246051 51.5444404660392, 32.36445906502258 51.19866686473018, 32.478270560806465 50.82348064403225, 32.5167 50.4333)) |
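The polygons above each have 33 coordinates: a closed ring of 32 segments that starts 2 units due east of the point and proceeds clockwise. The following plain-Java sketch reproduces a ring of that shape for a single point. It is an illustration derived from the printed output, not the code GeoMesa runs; `BufferSketch` and `bufferPoint` are hypothetical names, and the segment count is hard-coded to match what the output shows.

```java
// Plain-Java illustration of buffering a point into a polygon ring.
class BufferSketch {

    // Approximate a circle of the given radius around (x, y) as a closed ring of
    // 32 segments, starting due east of the center and moving clockwise.
    static double[][] bufferPoint(double x, double y, double radius) {
        int segments = 32;
        double[][] ring = new double[segments + 1][2];
        for (int i = 0; i < segments; i++) {
            double angle = -2.0 * Math.PI * i / segments; // negative angle = clockwise
            ring[i][0] = x + radius * Math.cos(angle);
            ring[i][1] = y + radius * Math.sin(angle);
        }
        // Repeat the first vertex to close the ring.
        ring[segments][0] = ring[0][0];
        ring[segments][1] = ring[0][1];
        return ring;
    }

    public static void main(String[] args) {
        double[][] ring = bufferPoint(30.5167, 50.4333, 2.0);
        StringBuilder wkt = new StringBuilder("POLYGON ((");
        for (int i = 0; i < ring.length; i++) {
            if (i > 0) {
                wkt.append(", ");
            }
            wkt.append(ring[i][0]).append(' ').append(ring[i][1]);
        }
        System.out.println(wkt.append("))"));
    }
}
```

Because the radius is applied directly to longitude and latitude, the buffer is a circle in degrees rather than a fixed ground distance.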