GeoMesa Transformations
=======================

GeoMesa allows users to perform relational projections on query results. We call these
"transformations" to distinguish them from the overloaded term "projection", which has a
different meaning in a spatial context. These transformations have the following uses and
advantages:

1. Subset to specified columns - reduces network overhead of returning results
2. Rename specified columns - alters the schema of data on the fly
3. Compute new attributes from one or more original attributes - adds derived fields to results

Transformations are applied in parallel across the cluster, which makes them very fast. They
are analogous to the map tasks in a map-reduce job. Transformations are also extensible;
developers can implement new functions and plug them into the system using standard
mechanisms from GeoTools.

.. note::

    When this tutorial refers to "projections", it means projection in the relational-algebra
    sense. Projection also has many other meanings in spatial discussions; they are not used
    in this tutorial. Although projections can also modify an attribute's value, this tutorial
    refers to such modifications as "transformations" to keep things clearer.

This tutorial will show you how to write custom Java code using GeoMesa to do the following:

1. Query previously-ingested data.
2. Apply relational projections to your query results.
3. Apply transformations to your query results.

Prerequisites
-------------

You will need:

-  an instance of Accumulo |accumulo_version| running on Hadoop |hadoop_version|,
-  an Accumulo user that has appropriate permissions to query your data,
-  Java JDK 8,
-  Apache Maven |maven_version|, and
-  a git client.

This tutorial queries the GDELT data set. Instructions on ingesting GDELT data are available
in the :doc:`./geomesa-examples-gdelt` tutorial.

.. warning::

    Before continuing, ingest the GDELT data set as described in the
    :doc:`./geomesa-examples-gdelt` tutorial.

Download and Build the Tutorial
-------------------------------

Pick a reasonable directory on your machine, and run:

.. code-block:: bash

    $ git clone https://github.com/geomesa/geomesa-tutorials.git
    $ cd geomesa-tutorials

.. note::

    You may need to download a particular release of the tutorials project
    to target a particular GeoMesa release. See :ref:`tutorial_versions`.

To build, run:

.. code-block:: bash

    $ mvn clean install -pl geomesa-examples-transformations

.. note::

    Ensure that the versions of Accumulo, Hadoop, etc. in the root ``pom.xml``
    match your environment.

.. note::

    Depending on the version, you may also need to build GeoMesa locally.
    Instructions can be found in :ref:`installation`.

Run the Tutorial
----------------

.. warning::

    Before continuing, ensure that you have ingested the GDELT data set described in the
    :doc:`./geomesa-examples-gdelt` tutorial. If you are using GDELT data from a time period
    different from the one used in the GDELT tutorial, change the date range in the
    ``QueryTutorial`` ``createBaseFilter`` function and recompile.

On the command line, run:

.. code-block:: bash

    $ java -cp geomesa-examples-transformations/target/geomesa-examples-transformations-.jar \
        com.example.geomesa.transformations.QueryTutorial \
        -instanceId \
        -zookeepers \
        -user \
        -password \
        -tableName \
        -featureName

where you provide a value for each of the following arguments:

-  ``-instanceId``: the name of your Accumulo instance
-  ``-zookeepers``: a comma-separated list of your Zookeeper nodes, e.g.
   ``zoo1:2181,zoo2:2181,zoo3:2181``
-  ``-user``: the name of an Accumulo user that will execute the scans, e.g. ``root``
-  ``-password``: the password for the previously-mentioned Accumulo user
-  ``-tableName``: the name of the Accumulo table that has the GeoMesa GDELT dataset, e.g.
   ``gdelt`` if you followed the GDELT tutorial
-  ``-featureName``: the feature name used to ingest the GeoMesa GDELT dataset, e.g.
   ``event`` if you followed the GDELT tutorial

You should see several queries run and the results printed out to your console.

Insight into How the Tutorial Works
-----------------------------------

The code for querying and projections is available in the class
``com.example.geomesa.transformations.QueryTutorial``. The source code is meant to be
accessible, but the following is a high-level breakdown of the relevant methods:

-  ``basicQuery`` executes a base filter without any further options. All attributes are
   returned in the data set.
-  ``basicProjectionQuery`` executes a base filter but specifies a subset of attributes to
   return.
-  ``basicTransformationQuery`` executes a base filter and transforms one of the attributes
   that is returned.
-  ``renamedTransformationQuery`` executes a base filter and transforms one of the attributes,
   returning it in a separate derived attribute.
-  ``mutliFieldTransformationQuery`` executes a base filter and transforms two attributes into
   a single derived attribute.
-  ``geometricTransformationQuery`` executes a base filter and transforms the geometry
   returned from a point into a polygon by buffering it.

Additional transformation functions are available beyond those shown here. *Please note that
currently not all functions are supported by GeoMesa.*

Sample Code and Output
----------------------

The following code snippets show the basic aspects of creating queries for GeoMesa.

Create a basic query with no projections
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This query does not use any projections or transformations. Note that all attributes are
returned in the results.

.. code-block:: java

    // No property names are given, so every attribute is returned.
    Query query = new Query(simpleFeatureTypeName, cqlFilter);

**Output**

.. csv-table::
    :header-rows: 1

    Result,GLOBALEVENTID,SQLDATE,MonthYear,Year,FractionDate,Actor1Code,Actor1Name,Actor1CountryCode,Actor1KnownGroupCode,Actor1EthnicCode,Actor1Religion1Code,Actor1Religion2Code,Actor1Type1Code,Actor1Type2Code,Actor1Type3Code,Actor2Code,Actor2Name,Actor2CountryCode,Actor2KnownGroupCode,Actor2EthnicCode,Actor2Religion1Code,Actor2Religion2Code,Actor2Type1Code,Actor2Type2Code,Actor2Type3Code,IsRootEvent,EventCode,EventBaseCode,EventRootCode,QuadClass,GoldsteinScale,NumMentions,NumSources,NumArticles,AvgTone,Actor1Geo_Type,Actor1Geo_FullName,Actor1Geo_CountryCode,Actor1Geo_ADM1Code,Actor1Geo_Lat,Actor1Geo_Long,Actor1Geo_FeatureID,Actor2Geo_Type,Actor2Geo_FullName,Actor2Geo_CountryCode,Actor2Geo_ADM1Code,Actor2Geo_Lat,Actor2Geo_Long,Actor2Geo_FeatureID,ActionGeo_Type,ActionGeo_FullName,ActionGeo_CountryCode,ActionGeo_ADM1Code,ActionGeo_Lat,ActionGeo_Long,ActionGeo_FeatureID,DATEADDED,geom
    1,284464526,Sun Feb 02 00:00:00 EST 2014,201402,2014,2014.0876,USA,UNITED STATES,USA,,,,,,,,USAGOV,UNITED STATES,USA,,,,,GOV,,,0,010,010,01,1,0.0,2,1,2,2.6362038,4,"Kyiv, Kyyiv, Misto, Ukraine",UP,UP12,50.4333,30.5167,\-1044367,1,United States,US,US,38.0,\-97.0,null,1,United States,US,US,38.0,\-97.0,null,20140202,POINT (30.5167 50.4333)
    2,284466704,Sun Feb 02 00:00:00 EST 2014,201402,2014,2014.0876,USAGOV,UNITED STATES,USA,,,,,GOV,,,USA,UNITED STATES,USA,,,,,,,,1,036,036,03,1,4.0,4,1,4,1.5810276,1,Ukraine,UP,UP,49.0,32.0,null,1,Ukraine,UP,UP,49.0,32.0,null,1,Ukraine,UP,UP,49.0,32.0,null,20140202,POINT (32 49)
    3,284427971,Sun Feb 02 00:00:00 EST 2014,201402,2014,2014.0876,IGOUNO,UNITED NATIONS,,UNO,,,,IGO,,,USA,UNITED STATES,USA,,,,,,,,0,012,012,01,1,\-0.4,27,3,27,1.0064903,4,"Kiev, Ukraine (general), Ukraine",UP,UP00,50.4333,30.5167,\-1044367,4,"Kiev, Ukraine (general), Ukraine",UP,UP00,50.4333,30.5167,\-1044367,4,"Kiev, Ukraine (general), Ukraine",UP,UP00,50.4333,30.5167,\-1044367,20140202,POINT (30.5167 50.4333)
    4,284466607,Sun Feb 02 00:00:00 EST 2014,201402,2014,2014.0876,USAGOV,UNITED STATES,USA,,,,,GOV,,,UKR,UKRAINE,UKR,,,,,,,,1,100,100,10,3,\-5.0,2,1,2,7.826087,1,Ukraine,UP,UP,49.0,32.0,null,1,Ukraine,UP,UP,49.0,32.0,null,1,Ukraine,UP,UP,49.0,32.0,null,20140202,POINT (32 49)
    5,284464187,Sun Feb 02 00:00:00 EST 2014,201402,2014,2014.0876,USA,UNITED STATES,USA,,,,,,,,UKR,UKRAINE,UKR,,,,,,,,0,111,111,11,3,\-2.0,5,1,5,1.4492754,4,"Kiev, Ukraine (general), Ukraine",UP,UP00,50.4333,30.5167,\-1044367,4,"Kiev, Ukraine (general), Ukraine",UP,UP00,50.4333,30.5167,\-1044367,4,"Kiev, Ukraine (general), Ukraine",UP,UP00,50.4333,30.5167,\-1044367,20140202,POINT (30.5167 50.4333)

Create a query with a projection for two attributes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This query uses a projection to return only the ``Actor1Name`` and ``geom`` attributes.

.. code-block:: java

    // Return only the listed attributes.
    String[] properties = new String[] {"Actor1Name", "geom"};
    Query query = new Query(simpleFeatureTypeName, cqlFilter, properties);

**Output**

+----------+-----------------+---------------------------+
| Result   | Actor1Name      | geom                      |
+==========+=================+===========================+
| 1        | UNITED STATES   | POINT (32 49)             |
+----------+-----------------+---------------------------+
| 2        | UNITED STATES   | POINT (30.5167 50.4333)   |
+----------+-----------------+---------------------------+
| 3        | UNITED STATES   | POINT (30.5167 50.4333)   |
+----------+-----------------+---------------------------+
| 4        | UNITED STATES   | POINT (30.5167 50.4333)   |
+----------+-----------------+---------------------------+
| 5        | UNITED STATES   | POINT (30.5167 50.4333)   |
+----------+-----------------+---------------------------+

Create a query with an attribute transformation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This query performs a transformation on the ``Actor1Name`` attribute, capitalizing it so that
it prints in a more user-friendly format.

.. code-block:: java

    // "name=expression" replaces Actor1Name with the result of strCapitalize.
    String[] properties = new String[] {"Actor1Name=strCapitalize(Actor1Name)", "geom"};
    Query query = new Query(simpleFeatureTypeName, cqlFilter, properties);

**Output**

+----------+---------------------------+-----------------+
| Result   | geom                      | Actor1Name      |
+==========+===========================+=================+
| 1        | POINT (30.5167 50.4333)   | United States   |
+----------+---------------------------+-----------------+
| 2        | POINT (32 49)             | United States   |
+----------+---------------------------+-----------------+
| 3        | POINT (32 49)             | United States   |
+----------+---------------------------+-----------------+
| 4        | POINT (30.5167 50.4333)   | United States   |
+----------+---------------------------+-----------------+
| 5        | POINT (30.5167 50.4333)   | United States   |
+----------+---------------------------+-----------------+

Create a query with a derived attribute
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This query creates a new attribute called ``derived`` by concatenating the ``Actor1Name`` and
``Actor1Geo_FullName`` attributes. This could be used to show the actor and location of the
event, for example.

.. code-block:: java

    // 'derived' is a new attribute built by concatenating two existing attributes.
    String property = "derived=strConcat(Actor1Name,strConcat(' - ',Actor1Geo_FullName)),geom";
    String[] properties = new String[] { property };
    Query query = new Query(simpleFeatureTypeName, cqlFilter, properties);

**Output**

+----------+---------------------------+-----------------------------------------------------+
| Result   | geom                      | derived                                             |
+==========+===========================+=====================================================+
| 1        | POINT (30.5167 50.4333)   | UNITED STATES - Kyiv, Kyyiv, Misto, Ukraine         |
+----------+---------------------------+-----------------------------------------------------+
| 2        | POINT (32 49)             | UNITED STATES - Ukraine                             |
+----------+---------------------------+-----------------------------------------------------+
| 3        | POINT (30.5167 50.4333)   | UNITED STATES - Kiev, Ukraine (general), Ukraine    |
+----------+---------------------------+-----------------------------------------------------+
| 4        | POINT (32 49)             | UNITED STATES - Ukraine                             |
+----------+---------------------------+-----------------------------------------------------+
| 5        | POINT (30.5167 50.4333)   | UNITED NATIONS - Kiev, Ukraine (general), Ukraine   |
+----------+---------------------------+-----------------------------------------------------+

Create a query with a geometric transformation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This query performs a geometric transformation on the points returned, buffering each one by
a fixed distance of 2 in the units of the geometry's coordinates (degrees of longitude and
latitude in the output below). This could be used to estimate an area of impact around a
particular event, for example.

.. code-block:: java

    // 'derived' holds each point buffered by 2 units; 'geom' is returned unchanged.
    String[] properties = new String[] {"geom,derived=buffer(geom, 2)"};
    Query query = new Query(simpleFeatureTypeName, cqlFilter, properties);

**Output**

.. list-table::
    :header-rows: 1

    * - Result
      - geom
      - derived
    * - 1
      - POINT (30.5167 50.4333)
      - POLYGON ((32.5167 50.4333, 32.478270560806465 50.04311935596775, 32.36445906502257 49.66793313526982, 32.17963922460509 49.3221595339608, 31.930913562373096 49.01908643762691, 31.627840466039206 48.77036077539491, 31.28206686473018 48.58554093497743, 30.906880644032256 48.47172943919354, 30.5167 48.4333, 30.126519355967744 48.47172943919354, 29.75133313526982 48.58554093497743, 29.405559533960798 48.77036077539491, 29.102486437626904 49.01908643762691, 28.85376077539491 49.3221595339608, 28.668940934977428 49.66793313526983, 28.55512943919354 50.04311935596775, 28.5167 50.4333, 28.55512943919354 50.82348064403226, 28.668940934977428 51.198666864730185, 28.85376077539491 51.54444046603921, 29.102486437626908 51.8475135623731, 29.405559533960798 52.09623922460509, 29.751333135269824 52.281059065022575, 30.126519355967748 52.39487056080647, 30.516700000000004 52.4333, 30.906880644032263 52.39487056080646, 31.282066864730186 52.281059065022575, 31.62784046603921 52.09623922460509, 31.9309135623731 51.847513562373095, 32.1796392246051 51.5444404660392, 32.36445906502258 51.19866686473018, 32.478270560806465 50.82348064403225, 32.5167 50.4333))
    * - 2
      - POINT (30.5167 50.4333)
      - POLYGON ((32.5167 50.4333, 32.478270560806465 50.04311935596775, 32.36445906502257 49.66793313526982, 32.17963922460509 49.3221595339608, 31.930913562373096 49.01908643762691, 31.627840466039206 48.77036077539491, 31.28206686473018 48.58554093497743, 30.906880644032256 48.47172943919354, 30.5167 48.4333, 30.126519355967744 48.47172943919354, 29.75133313526982 48.58554093497743, 29.405559533960798 48.77036077539491, 29.102486437626904 49.01908643762691, 28.85376077539491 49.3221595339608, 28.668940934977428 49.66793313526983, 28.55512943919354 50.04311935596775, 28.5167 50.4333, 28.55512943919354 50.82348064403226, 28.668940934977428 51.198666864730185, 28.85376077539491 51.54444046603921, 29.102486437626908 51.8475135623731, 29.405559533960798 52.09623922460509, 29.751333135269824 52.281059065022575, 30.126519355967748 52.39487056080647, 30.516700000000004 52.4333, 30.906880644032263 52.39487056080646, 31.282066864730186 52.281059065022575, 31.62784046603921 52.09623922460509, 31.9309135623731 51.847513562373095, 32.1796392246051 51.5444404660392, 32.36445906502258 51.19866686473018, 32.478270560806465 50.82348064403225, 32.5167 50.4333))
    * - 3
      - POINT (32 49)
      - POLYGON ((34 49, 33.961570560806464 48.609819355967744, 33.84775906502257 48.23463313526982, 33.66293922460509 47.8888595339608, 33.41421356237309 47.58578643762691, 33.1111404660392 47.33706077539491, 32.76536686473018 47.15224093497743, 32.390180644032256 47.038429439193536, 32 47, 31.609819355967744 47.038429439193536, 31.23463313526982 47.15224093497743, 30.888859533960797 47.33706077539491, 30.585786437626904 47.58578643762691, 30.33706077539491 47.8888595339608, 30.152240934977428 48.234633135269824, 30.03842943919354 48.609819355967744, 30 49, 30.03842943919354 49.390180644032256, 30.152240934977428 49.76536686473018, 30.33706077539491 50.11114046603921, 30.585786437626908 50.4142135623731, 30.888859533960797 50.66293922460509, 31.234633135269824 50.84775906502257, 31.609819355967748 50.961570560806464, 32.00000000000001 51, 32.39018064403226 50.96157056080646, 32.76536686473018 50.84775906502257, 33.11114046603921 50.66293922460509, 33.4142135623731 50.41421356237309, 33.6629392246051 50.111140466039195, 33.84775906502258 49.765366864730176, 33.961570560806464 49.39018064403225, 34 49))
    * - 4
      - POINT (30.5167 50.4333)
      - POLYGON ((32.5167 50.4333, 32.478270560806465 50.04311935596775, 32.36445906502257 49.66793313526982, 32.17963922460509 49.3221595339608, 31.930913562373096 49.01908643762691, 31.627840466039206 48.77036077539491, 31.28206686473018 48.58554093497743, 30.906880644032256 48.47172943919354, 30.5167 48.4333, 30.126519355967744 48.47172943919354, 29.75133313526982 48.58554093497743, 29.405559533960798 48.77036077539491, 29.102486437626904 49.01908643762691, 28.85376077539491 49.3221595339608, 28.668940934977428 49.66793313526983, 28.55512943919354 50.04311935596775, 28.5167 50.4333, 28.55512943919354 50.82348064403226, 28.668940934977428 51.198666864730185, 28.85376077539491 51.54444046603921, 29.102486437626908 51.8475135623731, 29.405559533960798 52.09623922460509, 29.751333135269824 52.281059065022575, 30.126519355967748 52.39487056080647, 30.516700000000004 52.4333, 30.906880644032263 52.39487056080646, 31.282066864730186 52.281059065022575, 31.62784046603921 52.09623922460509, 31.9309135623731 51.847513562373095, 32.1796392246051 51.5444404660392, 32.36445906502258 51.19866686473018, 32.478270560806465 50.82348064403225, 32.5167 50.4333))
    * - 5
      - POINT (30.5167 50.4333)
      - POLYGON ((32.5167 50.4333, 32.478270560806465 50.04311935596775, 32.36445906502257 49.66793313526982, 32.17963922460509 49.3221595339608, 31.930913562373096 49.01908643762691, 31.627840466039206 48.77036077539491, 31.28206686473018 48.58554093497743, 30.906880644032256 48.47172943919354, 30.5167 48.4333, 30.126519355967744 48.47172943919354, 29.75133313526982 48.58554093497743, 29.405559533960798 48.77036077539491, 29.102486437626904 49.01908643762691, 28.85376077539491 49.3221595339608, 28.668940934977428 49.66793313526983, 28.55512943919354 50.04311935596775, 28.5167 50.4333, 28.55512943919354 50.82348064403226, 28.668940934977428 51.198666864730185, 28.85376077539491 51.54444046603921, 29.102486437626908 51.8475135623731, 29.405559533960798 52.09623922460509, 29.751333135269824 52.281059065022575, 30.126519355967748 52.39487056080647, 30.516700000000004 52.4333, 30.906880644032263 52.39487056080646, 31.282066864730186 52.281059065022575, 31.62784046603921 52.09623922460509, 31.9309135623731 51.847513562373095, 32.1796392246051 51.5444404660392, 32.36445906502258 51.19866686473018, 32.478270560806465 50.82348064403225, 32.5167 50.4333))
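
The property strings in the snippets above are either a plain attribute name or a
``name=expression`` definition. Nested function calls such as the ``strConcat`` example are
easy to mistype by hand, so the following is a minimal sketch of a helper that assembles them
in plain Java. ``TransformProperties`` and its methods are hypothetical names introduced here
for illustration; they are not part of GeoMesa or GeoTools. The sketch also assumes that each
attribute is passed as its own array element, as in the two-attribute projection example.

.. code-block:: java

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical helper (not a GeoMesa or GeoTools class) that builds the
    // property strings passed to a Query.
    final class TransformProperties {

        private final List<String> properties = new ArrayList<>();

        // Plain projection: return the named attribute unchanged.
        TransformProperties keep(String attribute) {
            properties.add(attribute);
            return this;
        }

        // Transformation: bind the result of an expression to an output attribute name.
        TransformProperties derive(String name, String expression) {
            properties.add(name + "=" + expression);
            return this;
        }

        // The array form expected by the Query constructor used in this tutorial.
        String[] toArray() {
            return properties.toArray(new String[0]);
        }

        // Formats a call to the strConcat function used in the derived-attribute example.
        static String strConcat(String left, String right) {
            return "strConcat(" + left + "," + right + ")";
        }

        public static void main(String[] args) {
            String[] properties = new TransformProperties()
                .derive("derived", strConcat("Actor1Name", strConcat("' - '", "Actor1Geo_FullName")))
                .keep("geom")
                .toArray();
            // Prints one property definition per line.
            for (String property : properties) {
                System.out.println(property);
            }
        }
    }

The resulting array could then be passed as the third argument of
``new Query(simpleFeatureTypeName, cqlFilter, properties)``, in the same way as the
hand-written arrays above.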