13. GeoMesa Processes

The analytic processes described in this chapter are optimized for GeoMesa data stores and are found in the geomesa-process module.

Where possible, calculations are pushed out to the distributed back-end for faster performance. This is currently implemented in the Accumulo data store and, partially, in the HBase data store. Other back-ends can still be used, but they fall back to local processing.

13.1. Installation

Although the processes can be used independently, they are most commonly used with GeoServer. Deploying them in GeoServer requires:

  1. a GeoMesa datastore plugin

  2. the GeoServer WPS extension

Note

Some processes also require custom output formats, available separately in the GPL-licensed GeoMesa GeoServer WFS module.

The GeoMesa datastore plugin is available in the binary distribution in the gs-plugins directory.

Documentation about the GeoServer WPS extension (including download instructions) is available in the GeoServer documentation.

To verify the installation, start GeoServer and look for a log line like INFO [geoserver.wps] - Found 15 bindable processes in GeoMesa Process Factory.

In the GeoServer web UI, click ‘Demos’ and then ‘WPS request builder’. From the request builder, under ‘Choose Process’, click on any of the ‘geomesa:’ options to build up example requests and in some cases see results.

13.2. Processors

13.2.1. ArrowConversionProcess

The ArrowConversionProcess converts an input feature collection to arrow format.

Parameters

Description

features

Input feature collection to encode

includeFids

Include feature IDs in arrow file

proxyFids

Proxy feature IDs to integers instead of strings

formatVersion

Arrow IPC format version

dictionaryFields

Attributes to dictionary encode

useCachedDictionaries

Use cached top-k stats (if available), or run a dynamic stats query to build dictionaries

sortField

Attribute to sort by

sortReverse

Reverse the default sort order

batchSize

Number of features to include in each record batch

doublePass

Build dictionaries first, then query results in a separate scan
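
As a rough illustration of what dictionary-encoding an attribute (the dictionaryFields parameter) means, the following plain-Python sketch replaces repeated values with integer indices. It is not GeoMesa's Arrow writer; dictionary_encode is a hypothetical helper, and the sample values are made up for the example.

```python
def dictionary_encode(values):
    """Replace repeated values with integer indices into a dictionary.

    Hypothetical helper for illustration only; the real process writes
    Arrow dictionary batches, which this sketch does not attempt.
    """
    dictionary = []  # distinct values, in order of first appearance
    index_of = {}    # value -> position in `dictionary`
    indices = []     # one small integer per input value
    for value in values:
        if value not in index_of:
            index_of[value] = len(dictionary)
            dictionary.append(value)
        indices.append(index_of[value])
    return dictionary, indices


dictionary, indices = dictionary_encode(["Addams", "Bierce", "Addams", "Clemens"])
print(dictionary)  # ['Addams', 'Bierce', 'Clemens']
print(indices)     # [0, 1, 0, 2]
```

The benefit grows with repetition: an attribute with few distinct values is stored once in the dictionary, and each record carries only an index.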

13.2.2. BinConversionProcess

The BinConversionProcess converts an input feature collection to BIN format.

Parameters

Description

features

Input feature collection to query

track

Track field to use for BIN records

geom

Geometry field to use for BIN records

dtg

Date field to use for BIN records

label

Label field to use for BIN records

axisOrder

Axis order of the encoded coordinates (latlon or lonlat)

13.2.3. DensityProcess

The DensityProcess computes a density map over a set of features stored in GeoMesa. A raster image is returned.

Parameters

Description

data

Input Simple Feature Collection to run the density process over

radiusPixels

Radius of the density kernel in pixels. Controls the “fuzziness” of the density map

weightAttr

Name of the attribute to use for data point weights

outputBBOX

Bounding box and CRS of the output raster

outputWidth

Width of the output raster in pixels

outputHeight

Height of the output raster in pixels
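
To make the roles of radiusPixels, weightAttr, outputBBOX, outputWidth and outputHeight concrete, here is a minimal sketch of a density grid in plain Python. It assumes a uniform disc kernel, which this section does not specify, and density_grid is a hypothetical function, not part of GeoMesa.

```python
def density_grid(points, bbox, width, height, radius_pixels, weights=None):
    """Accumulate point weights into a width x height grid.

    Assumption: every point adds its weight to all pixels within
    radius_pixels of its own pixel (a flat disc kernel).
    """
    min_x, min_y, max_x, max_y = bbox
    grid = [[0.0] * width for _ in range(height)]
    for i, (x, y) in enumerate(points):
        if not (min_x <= x <= max_x and min_y <= y <= max_y):
            continue  # outside the output bounding box
        # Map the coordinate to a pixel; row 0 is the top of the raster.
        col = min(int((x - min_x) / (max_x - min_x) * width), width - 1)
        row = min(int((max_y - y) / (max_y - min_y) * height), height - 1)
        weight = 1.0 if weights is None else weights[i]
        for r in range(max(0, row - radius_pixels), min(height, row + radius_pixels + 1)):
            for c in range(max(0, col - radius_pixels), min(width, col + radius_pixels + 1)):
                if (r - row) ** 2 + (c - col) ** 2 <= radius_pixels ** 2:
                    grid[r][c] += weight
    return grid


grid = density_grid([(2.5, 2.5)], bbox=(0, 0, 5, 5), width=5, height=5, radius_pixels=1)
for row in grid:
    print(row)
```

A larger radius spreads each point over more pixels, which is the "fuzziness" the parameter description refers to.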

13.2.4. DateOffsetProcess

The DateOffsetProcess modifies the specified date field in a feature collection by an input time period.

Parameters

Description

data

Input features

dateField

The date attribute to modify

timeOffset

Time offset (e.g. P1D)
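
The timeOffset value uses ISO-8601 duration notation, as in the P1D example. The sketch below shows the effect on a date attribute using plain Python dictionaries as stand-ins for features. parse_offset and shift_dates are hypothetical helpers, and only day, hour, minute and second designators are handled.

```python
import re
from datetime import datetime, timedelta

# Handles only the day, hour, minute and second designators of an ISO-8601
# duration (e.g. P1D, PT12H, P1DT30M); years, months and weeks are not covered.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_offset(text):
    """Turn a duration string such as 'P1D' into a timedelta."""
    match = _DURATION.match(text)
    parts = {k: int(v) for k, v in (match.groupdict() if match else {}).items() if v}
    if not parts:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**parts)


def shift_dates(features, date_field, time_offset):
    """Return copies of the features with date_field moved by time_offset."""
    delta = parse_offset(time_offset)
    return [{**f, date_field: f[date_field] + delta} for f in features]


features = [{"id": "a", "dtg": datetime(2016, 5, 2, 18, 0, 44)}]
shifted = shift_dates(features, "dtg", "P1D")
print(shifted[0]["dtg"])  # 2016-05-03 18:00:44
```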

13.2.5. HashAttributeProcess

The HashAttributeProcess adds an attribute to each SimpleFeature containing the hash of the configured attribute, modulo the configured divisor.

Parameters

Description

data

Input Simple Feature Collection to run the hash process over

attribute

The attribute to hash on

modulo

The divisor
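
The idea of hashing an attribute modulo a divisor can be sketched as follows. The hash function GeoMesa uses is not stated in this section, so zlib.crc32 stands in for it here, and add_hash is a hypothetical helper operating on plain dictionaries. The output attribute name hash follows the sample result in the example below.

```python
import zlib


def add_hash(features, attribute, modulo):
    """Return copies of the features with a 'hash' attribute in [0, modulo).

    Assumption: CRC32 of the attribute's string form is used as the hash;
    the real process may use a different function.
    """
    hashed = []
    for feature in features:
        raw = str(feature[attribute]).encode("utf-8")
        hashed.append({**feature, "hash": zlib.crc32(raw) % modulo})
    return hashed


features = [{"CabId": 150002}, {"CabId": 150003}, {"CabId": 150002}]
for feature in add_hash(features, "CabId", 256):
    print(feature)
```

Features that share an attribute value always receive the same hash, which makes the result useful for grouping or styling by a bounded number of buckets.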

13.2.5.1. Hash example (XML)

HashAttributeProcess_wps.xml is a GeoServer WPS call to the GeoMesa HashAttributeProcess. It can be run with the following curl call:

curl -v -u admin:geoserver -H "Content-Type: text/xml" -d@HashAttributeProcess_wps.xml localhost:8080/geoserver/wps

The query should generate results that look like this:

{
    "id" : "d0971735-f8fe-47ed-a7cd-2e12280e8ac1",
    "geometry" : {
        "coordinates" : [
            151.1554,
            18.2014
        ],
        "type" : "Point"
    },
    "type" : "Feature",
    "properties" : {
        "Vitesse" : 614,
        "Heading" : 244,
        "Date" : "2016-05-02T18:00:44.030+0000",
        "hash" : 237,
        "CabId" : 150002
    }
}

13.2.6. HashAttributeColorProcess

The HashAttributeColorProcess adds an attribute to each SimpleFeature that hashes the configured attribute modulo the configured divisor and emits a color.

Parameters

Description

data

Input Simple Feature Collection to run the hash process over

attribute

The attribute to hash on

modulo

The divisor

13.2.7. JoinProcess

The JoinProcess queries a feature type based on attributes from a second feature type.

Parameters

Description

primary

Primary feature collection being queried

secondary

Secondary feature collection to be joined

joinAttribute

Attribute field to join on

joinFilter

Additional filter to apply to joined features

attributes

Attributes to return. Attribute names should be qualified with the schema name, e.g. foo.bar

13.2.8. KNearestNeighborSearchProcess

The KNearestNeighborSearchProcess performs a K-nearest-neighbor search on a feature collection using a second feature collection as input. It returns k neighbors for each point in the input data set. Note that if a feature is the nearest neighbor of multiple points in the input data set, it is returned only once.

Parameters

Description

inputFeatures

Input feature collection. The geometries of the features define the KNN search

dataFeatures

The data set to query for neighbors

numDesired

k, number of nearest neighbors to return

estimatedDistance

Estimate of the distance in meters for the k-th nearest neighbor, used for the initial query window

maxSearchDistance

Maximum search distance in meters, used to prevent runaway queries of the entire data set
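
The interplay of numDesired, estimatedDistance and maxSearchDistance can be sketched with a brute-force search in plain Python. This is an illustration under stated assumptions, not GeoMesa's implementation: the window is doubled on each pass (the growth strategy is an assumption), distances use the haversine formula on (lon, lat) tuples, and knn and knn_search are hypothetical names.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(a, b):
    """Great-circle distance in meters between two (lon, lat) points."""
    lon1, lat1 = map(math.radians, a)
    lon2, lat2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def knn(query_point, data_points, num_desired, estimated_distance, max_search_distance):
    """Find up to num_desired neighbors, widening the window until enough are found."""
    if estimated_distance <= 0:
        raise ValueError("estimated_distance must be positive")
    radius = estimated_distance
    while True:
        radius = min(radius, max_search_distance)
        hits = sorted(
            (haversine_m(query_point, p), p)
            for p in data_points
            if haversine_m(query_point, p) <= radius
        )
        # Stop when enough neighbors were found or the cap has been reached.
        if len(hits) >= num_desired or radius >= max_search_distance:
            return [p for _, p in hits[:num_desired]]
        radius *= 2  # assumed growth strategy


def knn_search(input_points, data_points, num_desired, estimated_distance, max_search_distance):
    """Run knn for each input point, returning each neighbor only once."""
    seen, results = set(), []
    for query_point in input_points:
        for p in knn(query_point, data_points, num_desired,
                     estimated_distance, max_search_distance):
            if p not in seen:
                seen.add(p)
                results.append(p)
    return results


data = [(0.0, 0.001), (0.0, 0.002), (0.0, 1.0)]
print(knn_search([(0.0, 0.0), (0.0, 0.0005)], data, 1, 50.0, 10_000.0))
```

A good estimatedDistance keeps the first window small, while maxSearchDistance bounds the cost when fewer than k neighbors exist nearby.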

13.2.8.1. K-Nearest-Neighbor Example (XML)

KNNProcess_wps.xml is a GeoServer WPS call to the GeoMesa KNearestNeighborSearchProcess. In this example, it is chained with a Query process (see Chaining Processes), in order to avoid returning the query features as data. It can be run with the following curl call:

curl -v -u admin:geoserver -H "Content-Type: text/xml" -d@KNNProcess_wps.xml localhost:8080/geoserver/wps

13.2.9. Point2PointProcess

The Point2PointProcess aggregates a collection of points into a collection of line segments.

Parameters

Description

data

Input feature collection

groupingField

Field on which to group

sortField

Field on which to sort (must be Date type)

minimumNumberOfPoints

Minimum number of points

breakOnDay

Break connections on day marks

filterSingularPoints

Filter out segments that fall on the same point
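
The parameters above can be read as the steps of a small pipeline: group, sort, then connect consecutive points. The sketch below follows that reading in plain Python. point2point is a hypothetical function, the geom key is an assumed name for the point geometry, and the _start / _end property names mirror the sample output in the example below.

```python
from collections import defaultdict
from datetime import datetime


def point2point(features, grouping_field, sort_field, *, minimum_number_of_points,
                break_on_day, filter_singular_points):
    """Connect consecutive points of each group into two-point segments."""
    groups = defaultdict(list)
    for feature in features:
        groups[feature[grouping_field]].append(feature)

    segments = []
    for key, group in groups.items():
        if len(group) < minimum_number_of_points:
            continue  # not enough points to form a track
        ordered = sorted(group, key=lambda f: f[sort_field])
        for start, end in zip(ordered, ordered[1:]):
            if break_on_day and start[sort_field].date() != end[sort_field].date():
                continue  # do not connect points across a day boundary
            if filter_singular_points and start["geom"] == end["geom"]:
                continue  # both ends fall on the same point
            segments.append({
                "geometry": [start["geom"], end["geom"]],
                grouping_field: key,
                f"{sort_field}_start": start[sort_field],
                f"{sort_field}_end": end[sort_field],
            })
    return segments


points = [
    {"CabId": 1, "Date": datetime(2018, 2, 5, 14, 53), "geom": (-13.4041, 37.8067)},
    {"CabId": 1, "Date": datetime(2018, 2, 5, 14, 54), "geom": (-13.4041, 37.8068)},
    {"CabId": 1, "Date": datetime(2018, 2, 5, 14, 55), "geom": (-13.4041, 37.8068)},
    {"CabId": 2, "Date": datetime(2018, 2, 5, 15, 0), "geom": (0.0, 0.0)},
]
segments = point2point(points, "CabId", "Date", minimum_number_of_points=2,
                       break_on_day=False, filter_singular_points=True)
print(len(segments))  # 1
```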

13.2.9.1. Point2Point example (XML)

Point2PointProcess_wps.xml is a GeoServer WPS call to the GeoMesa Point2PointProcess. It can be run with the following curl call:

curl -v -u admin:geoserver -H "Content-Type: text/xml" -d@Point2PointProcess_wps.xml localhost:8080/geoserver/wps

The query should generate results that look like this:

{
    "id" : "367152240-4",
    "geometry" : {
        "coordinates" : [
            [
                -13.4041,
                37.8067
            ],
            [
                -13.4041,
                37.8068
            ]
        ],
        "type" : "LineString"
    },
    "type" : "Feature",
    "properties" : {
        "Date_end" : "2018-02-05T14:54:36.598+0000",
        "CabId" : 367152240,
        "Date_start" : "2018-02-05T14:53:58.078+0000"
    }
}

13.2.10. ProximitySearchProcess

The ProximitySearchProcess performs a proximity search on a GeoMesa feature collection using another feature collection as input.

Parameters

Description

inputFeatures

Input feature collection that defines the proximity search

dataFeatures

The data set to query for matching features

bufferDistance

Buffer size in meters

13.2.10.1. Proximity search example (XML)

ProximitySearchProcess_wps.xml is a GeoServer WPS call to the GeoMesa ProximitySearchProcess. It can be run with the following curl call:

curl -v -u admin:geoserver -H "Content-Type: text/xml" -d@ProximitySearchProcess_wps.xml localhost:8080/geoserver/wps
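
The curl calls in this chapter can also be issued from a script. The following sketch builds the equivalent HTTP request with the Python standard library. build_wps_request is a hypothetical helper; the URL and the admin:geoserver credentials are taken from the curl command above, with the http scheme assumed.

```python
import base64
import urllib.request


def build_wps_request(body, url="http://localhost:8080/geoserver/wps",
                      user="admin", password="geoserver"):
    """Build a POST request equivalent to the curl call (nothing is sent yet)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/xml", "Authorization": f"Basic {token}"},
        method="POST",
    )


# Sending requires a running GeoServer, so it is left commented out:
# with open("ProximitySearchProcess_wps.xml", "rb") as f:
#     request = build_wps_request(f.read())
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Any of the other request files named in this chapter can be substituted for the file name.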

13.2.11. RouteSearchProcess

The RouteSearchProcess finds features around a route that are heading along the route and not just crossing over it.

Parameters

Description

features

Input feature collection to query

routes

Routes to search along. Features must have a geometry of LineString

bufferSize

Buffer size (in meters) to search around the route

headingThreshold

Threshold for comparing headings, in degrees

routeGeomField

Attribute that will be examined for routes to match. Must be a LineString

geomField

Attribute that will be examined for route matching

bidirectional

Consider the direction of the route or just the path of the route

headingField

Attribute that will be examined for heading in the input features. If not provided, input features geometries must be LineStrings
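
The heading comparison controlled by headingThreshold and bidirectional can be sketched as below. This assumes headings are compass bearings in degrees and that bidirectional means a feature travelling against the route direction also matches. Both functions are hypothetical and cover only the heading test, not the buffer search.

```python
def heading_difference(a, b):
    """Smallest angle in degrees between two compass headings."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def heading_matches(feature_heading, route_heading, threshold, bidirectional=False):
    """True if the feature is heading along the route within the threshold."""
    if heading_difference(feature_heading, route_heading) <= threshold:
        return True
    if bidirectional:
        # Assumption: also accept travel in the opposite direction of the route.
        reverse = (route_heading + 180.0) % 360.0
        return heading_difference(feature_heading, reverse) <= threshold
    return False


print(heading_difference(350.0, 10.0))                        # 20.0
print(heading_matches(185.0, 0.0, 10.0))                      # False
print(heading_matches(185.0, 0.0, 10.0, bidirectional=True))  # True
```

A feature that merely crosses the route typically has a heading far from the route's bearing, so it fails this test even though it lies inside the buffer.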

13.2.11.1. Route search example (XML)

RouteSearchProcess_wps.xml is a GeoServer WPS call to the GeoMesa RouteSearchProcess. It can be run with the following curl call:

curl -v -u admin:geoserver -H "Content-Type: text/xml" -d@RouteSearchProcess_wps.xml localhost:8080/geoserver/wps

13.2.12. SamplingProcess

The SamplingProcess uses statistical sampling to reduce the features returned by a query.

Parameters

Description

data

Input features.

samplePercent

Fraction of features to return, between 0 and 1.

threadBy

Attribute field to link associated features for sampling.
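
The effect of threadBy is easiest to see in code: sampling is applied per group of associated features, so every group stays represented in the result. The sketch below assumes a deterministic "keep every n-th feature" strategy, which this section does not confirm, and sample is a hypothetical function.

```python
from collections import defaultdict


def sample(features, sample_percent, thread_by=None):
    """Keep roughly sample_percent of the features, counted per thread_by value."""
    if not 0 < sample_percent <= 1:
        raise ValueError("sample_percent must be in (0, 1]")
    nth = max(1, round(1 / sample_percent))  # keep one feature out of every nth
    counters = defaultdict(int)              # features seen so far, per group
    kept = []
    for feature in features:
        key = feature[thread_by] if thread_by else None
        if counters[key] % nth == 0:
            kept.append(feature)
        counters[key] += 1
    return kept


tracks = [{"track": t, "seq": i} for i in range(4) for t in ("a", "b")]
print(len(sample(tracks, 0.5)))                     # 4
print(len(sample(tracks, 0.5, thread_by="track")))  # 4
```

Without threadBy, a sparse track could be dropped entirely; with it, the first feature of each track is always kept in this sketch.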

13.2.12.1. Sampling example (XML)

SamplingProcess_wps.xml is a GeoServer WPS call to the GeoMesa SamplingProcess. It can be run with the following curl call:

curl -v -u admin:geoserver -H "Content-Type: text/xml" -d@SamplingProcess_wps.xml localhost:8080/geoserver/wps

13.2.13. StatsProcess

The StatsProcess runs statistics on a given feature set.

Parameters

Description

features

The feature set on which to query. Can be a raw text input, reference to a remote URL, a subquery or a vector layer

statString

Stat string indicating which stats to instantiate - see below

encode

Return the values encoded as JSON. Must be true or false; empty values will not work

properties

The properties / transforms to apply before gathering stats

13.2.13.1. Stat Strings

Stat strings are a GeoMesa domain-specific language (DSL) for specifying which stats the iterators should collect. See Statistical Queries for an explanation of the available stats.

13.2.14. TrackLabelProcess

The TrackLabelProcess returns a single feature that is the head of a track of related simple features.

Parameters

Description

data

Input features

track

Track attribute to use for grouping features

dtg

Date attribute to use for ordering tracks
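
A minimal sketch of the grouping logic, assuming that the "head" of a track is its most recent feature according to the dtg attribute. That reading is an assumption, and track_labels is a hypothetical function working on plain dictionaries.

```python
from datetime import datetime


def track_labels(features, track, dtg):
    """Return one feature per track value: the one with the latest dtg."""
    heads = {}
    for feature in features:
        current = heads.get(feature[track])
        if current is None or feature[dtg] > current[dtg]:
            heads[feature[track]] = feature
    return list(heads.values())


features = [
    {"CabId": 1, "Date": datetime(2018, 2, 5, 14, 53)},
    {"CabId": 1, "Date": datetime(2018, 2, 5, 14, 55)},
    {"CabId": 2, "Date": datetime(2018, 2, 5, 15, 0)},
    {"CabId": 1, "Date": datetime(2018, 2, 5, 14, 54)},
]
for head in track_labels(features, "CabId", "Date"):
    print(head)
```

Returning a single feature per track is what makes the result suitable for labelling: a map layer can draw one label per track instead of one per point.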

13.2.14.1. TrackLabel example (XML)

TrackLabelProcess_wps.xml is a GeoServer WPS call to the GeoMesa TrackLabelProcess. It can be run with the following curl call:

curl -v -u admin:geoserver -H "Content-Type: text/xml" -d@TrackLabelProcess_wps.xml localhost:8080/geoserver/wps

13.2.15. TubeSelectProcess

The TubeSelectProcess performs a tube select on a GeoMesa feature collection based on another feature collection. For more information on the TubeSelectProcess and how to use it, see the TubeSelect tutorial.

Parameters

Description

tubeFeatures

Input feature collection (must have geometry and datetime)

featureCollection

The data set to query for matching features

filter

The filter to apply to the featureCollection

maxSpeed

Max speed of the object in m/s for nofill & line gapfill methods

maxTime

Time as seconds for nofill & line gapfill methods

bufferSize

Buffer size in meters to use instead of maxSpeed/maxTime calculation

maxBins

Number of bins to use for breaking up query into individual queries

gapFill

Method of filling gap (nofill, line)

13.2.15.1. TubeSelect example (XML)

TubeSelectProcess_wps.xml is a GeoServer WPS call to the GeoMesa TubeSelectProcess. It can be run with the following curl call:

curl -v -u admin:geoserver -H "Content-Type: text/xml" -d@TubeSelectProcess_wps.xml localhost:8080/geoserver/wps

13.2.16. QueryProcess

The QueryProcess takes an (E)CQL query/filter for a given feature set as a text object and returns the result as a JSON object.

Parameters

Description

features

The data source feature collection to query. Reference as store:layername.

For an XML file enter <wfs:Query typeName="store:layername" />. For the interactive WPS request builder, select VECTOR_LAYER and choose store:layername.

filter

The filter to apply to the feature collection.

For an XML file enter:

<wps:ComplexData mimeType="text/plain; subtype=cql">
   <![CDATA[some-query-text]]>
</wps:ComplexData>

For the interactive WPS request builder, select TEXT, choose "text/plain; subtype=cql", and enter the query text in the text box.

output

Specify how the output feature collection will be presented.

For an XML file enter:

<wps:ResponseForm>
   <wps:RawDataOutput mimeType="application/json">
      <ows:Identifier>result</ows:Identifier>
   </wps:RawDataOutput>
</wps:ResponseForm>

For the interactive WPS request builder, check the Generate box and choose “application/json”.

properties

The properties / transforms to apply to the feature collection.

13.2.16.1. Query example (XML)

QueryProcess_wps.xml is a GeoServer WPS call to the GeoMesa QueryProcess that performs the same query shown in the Accumulo quickstart. It can be run with the following curl call:

curl -v -u admin:geoserver -H "Content-Type: text/xml" -d@QueryProcess_wps.xml localhost:8080/geoserver/wps

The query should generate results that look like this:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          -76.513,
          -37.4941
        ]
      },
      "properties": {
        "Who": "Bierce",
        "What": 931,
        "When": "2014-07-04T22:25:38.000+0000"
      },
      "id": "Observation.931"
    }
  ]
}

13.2.17. UniqueProcess

The UniqueProcess class is optimized for GeoMesa to find unique attribute values for a feature collection, which are returned as a JSON object.

Parameters

Description

features

The data source feature collection to query. Reference as store:layername.

For an XML file enter <wfs:Query typeName="store:layername" />. For the interactive WPS request builder, select VECTOR_LAYER and choose store:layername.

attribute

The attribute for which unique values will be extracted. Attributes are expressed as a string.

For an XML file enter <wps:LiteralData>attribute-name</wps:LiteralData>

filter

The filter to apply to the feature collection.

For an XML file enter:

<wps:ComplexData mimeType="text/plain; subtype=cql">
   <![CDATA[some-query-text]]>
</wps:ComplexData>

For the interactive WPS request builder, select TEXT, choose "text/plain; subtype=cql", and enter the query text in the text box.

histogram

Create a histogram of attribute values. Expressed as a boolean (true/false).

For an XML file enter <wps:LiteralData>true/false</wps:LiteralData>

sort

Sort the results. Expressed as a string; allowed values are ASC or DESC.

For an XML file enter <wps:LiteralData>ASC/DESC</wps:LiteralData>

sortByCount

Sort by histogram counts instead of attribute values. Expressed as a boolean (true/false).

For an XML file enter <wps:LiteralData>true/false</wps:LiteralData>

output

Specify how the output feature collection will be presented.

For an XML file enter:

<wps:ResponseForm>
   <wps:RawDataOutput mimeType="application/json">
      <ows:Identifier>result</ows:Identifier>
   </wps:RawDataOutput>
</wps:ResponseForm>

For the interactive WPS request builder, check the Generate box and choose “application/json”.

13.2.17.1. Unique example (XML)

UniqueProcess_wps.xml is a GeoServer WPS call to the GeoMesa UniqueProcess that reports the unique names in the ‘Who’ field of the Accumulo quickstart data for a restricted bounding box (-77.5, -37.5, -76.5, -36.5). It can be run with the following curl call:

curl -v -u admin:geoserver -H "Content-Type: text/xml" -d@UniqueProcess_wps.xml localhost:8080/geoserver/wps

The query should generate results that look like this:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "value": "Addams",
        "count": 37
      },
      "id": "fid--21d4eb0_15b68e0e8ca_-7fd6"
    },
    {
      "type": "Feature",
      "properties": {
        "value": "Bierce",
        "count": 43
      },
      "id": "fid--21d4eb0_15b68e0e8ca_-7fd5"
    },
    {
      "type": "Feature",
      "properties": {
        "value": "Clemens",
        "count": 48
      },
      "id": "fid--21d4eb0_15b68e0e8ca_-7fd4"
    }
  ]
}
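
How histogram, sort and sortByCount interact can be sketched with a counter over plain dictionaries. This is an illustration of the parameter semantics as described above, not the optimized implementation. unique_values is a hypothetical function, and its value / count keys mirror the sample output.

```python
from collections import Counter


def unique_values(features, attribute, histogram=False, sort=None, sort_by_count=False):
    """Return the distinct values of an attribute, optionally with counts."""
    counts = Counter(f[attribute] for f in features if f.get(attribute) is not None)
    rows = list(counts.items())  # (value, count) pairs
    if sort in ("ASC", "DESC"):
        key = (lambda row: row[1]) if sort_by_count else (lambda row: row[0])
        rows.sort(key=key, reverse=(sort == "DESC"))
    if histogram:
        return [{"value": value, "count": count} for value, count in rows]
    return [{"value": value} for value, _ in rows]


observations = [{"Who": w} for w in
                ["Clemens", "Bierce", "Clemens", "Addams", "Bierce", "Clemens"]]
print(unique_values(observations, "Who", sort="ASC"))
print(unique_values(observations, "Who", histogram=True, sort="DESC", sort_by_count=True))
```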

13.2.18. Chaining Processes

WPS processes can be chained, using the result of one process as the input for another. For example, a bounding box in a GeoMesa QueryProcess can be used to restrict data sent to StatsProcess. GeoMesa_WPS_chain_example.xml will get all points from the AccumuloQuickStart table that are within a specified bounding box (-77.5, -37.5, -76.5, -36.5), and calculate descriptive statistics on the ‘What’ attribute of the results.

The query should generate results that look like this:

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {
        "type": "Point",
        "coordinates": [
          0,
          0
        ]
      },
      "properties": {
        "stats": "{\"count\":128,\"minimum\":[29.0],\"maximum\":[991.0],\"mean\":[508.5781249999999],\"population_variance\":[85116.25952148438],\"population_standard_deviation\":[291.74691004616375],\"population_skewness\":[-0.11170819256679464],\"population_kurtosis\":[1.7823482287566166],\"population_excess_kurtosis\":[-1.2176517712433834],\"sample_variance\":[85786.46628937007],\"sample_standard_deviation\":[292.893267743337],\"sample_skewness\":[-0.11303718280959842],\"sample_kurtosis\":[1.8519712064424219],\"sample_excess_kurtosis\":[-1.1480287935575781],\"population_covariance\":[85116.25952148438],\"population_correlation\":[1.0],\"sample_covariance\":[85786.46628937007],\"sample_correlation\":[1.0]}"
      },
      "id": "stat"
    }
  ]
}
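
Note that the stats property in the response above is itself a JSON document encoded as a string, so a client has to decode it a second time. A minimal sketch of that step follows. extract_stats is a hypothetical helper, and the sample response is abbreviated to three of the fields shown above.

```python
import json


def extract_stats(response_text):
    """Decode the FeatureCollection, then decode each embedded stats string."""
    collection = json.loads(response_text)
    return [json.loads(feature["properties"]["stats"])
            for feature in collection["features"]]


# Abbreviated stand-in for the response shown above.
response_text = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {
            "stats": json.dumps({"count": 128, "minimum": [29.0], "maximum": [991.0]})
        },
        "id": "stat",
    }],
})

stats = extract_stats(response_text)[0]
print(stats["count"], stats["minimum"][0], stats["maximum"][0])  # 128 29.0 991.0
```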