GeoTools Overview
=================

The main abstraction in GeoMesa is the GeoTools ``DataStore``. Understanding the GeoTools API is important for
integrating with GeoMesa. The full GeoTools documentation is available `here `__, but this section gives a
concise overview of the main ways to interact with a data store.

.. note::

    This section is focused on users who want to integrate with GeoMesa through code. Many use cases do not
    require this; data can be ingested using the GeoMesa command-line tools or Apache NiFi processors, and
    accessed through GeoServer OGC requests or Spark. Even so, this page can provide useful background on the
    concepts behind those operations.

A data store provides read and write access to spatial data. The API itself does not distinguish between
different storage formats. Thus, the API for accessing data stored in a local shapefile is the same as the
API for accessing data stored in an HBase cluster.

GeoMesa provides several different data store implementations, including HBase, Accumulo, Kafka, and others.
See :ref:`geomesa_data_stores` for more information on the different data stores available.

SimpleFeatureType and SimpleFeature
-----------------------------------

In GeoTools, a ``SimpleFeatureType`` defines the names and types of the attributes in a given schema. It is
similar to the table definition of a relational database. ``SimpleFeatureType``\ s can be described with a
type name and a specification, typically a string indicating the attribute names and types.

A ``SimpleFeature`` is a struct data type, equivalent to a single row in a relational database table. Each
``SimpleFeature`` is associated with a ``SimpleFeatureType``, and has a unique identifier (the feature ID)
and a list of values corresponding to the attributes in the ``SimpleFeatureType``.

See below for examples of creating and managing ``SimpleFeatureType``\ s.

The "simple" in ``SimpleFeatureType`` refers to its flat data structure.
It is also possible to have "complex" feature types, which are similar to joins in a relational database.
However, complex feature types are not widely used or supported.

Getting a Data Store Instance
-----------------------------

Data stores are accessed through ``org.geotools.data.DataStoreFinder#getDataStore``. The method takes a
parameter map, which is used to look up a matching data store implementation. For example, to load a GeoMesa
HBase data store, include the parameter key ``"hbase.catalog"``. If no matching implementation is found,
``getDataStore`` returns ``null``.

Data stores are dynamically loaded; the appropriate data store implementation and all of its required
dependencies must be on the classpath.

GeoMesa data stores are thread-safe (although not all methods on the data store return thread-safe objects).
Generally, a data store should be loaded once and then used repeatedly. When no longer needed, a data store
should be cleaned up by calling the ``dispose()`` method.

See the links in :ref:`geomesa_data_stores` for an explanation of the parameters for each data store
implementation.

.. tabs::

    .. code-tab:: java

        import org.geotools.data.DataStore;
        import org.geotools.data.DataStoreFinder;
        import org.locationtech.geomesa.hbase.data.HBaseDataStoreParams;

        import java.io.IOException;
        import java.util.HashMap;
        import java.util.Map;

        Map<String, String> parameters = new HashMap<>();
        // HBaseDataStoreParams.HBaseCatalogParam().key is the string "hbase.catalog"
        // the GeoMesa HBase data store will recognize the key and attempt to load itself
        parameters.put(HBaseDataStoreParams.HBaseCatalogParam().key, "mycatalog");
        DataStore store = null;
        try {
            store = DataStoreFinder.getDataStore(parameters);
        } catch (IOException e) {
            e.printStackTrace();
        }
        // when finished, be sure to clean up the store
        if (store != null) {
            store.dispose();
        }

    .. code-tab:: scala

        import org.geotools.data.DataStoreFinder
        import org.locationtech.geomesa.hbase.data.HBaseDataStoreParams

        import scala.collection.JavaConverters._

        // HBaseDataStoreParams.HBaseCatalogParam.key is the string "hbase.catalog"
        // the GeoMesa HBase data store will recognize the key and attempt to load itself
        val params = Map(HBaseDataStoreParams.HBaseCatalogParam.key -> "mycatalog")
        val store = DataStoreFinder.getDataStore(params.asJava)
        // when finished, be sure to clean up the store
        store.dispose()

Creating a Schema
-----------------

Each data store can contain multiple ``SimpleFeatureType``\ s, or schemas. The names of existing schemas can
be listed with the ``getTypeNames`` method, and an individual schema can be retrieved with the ``getSchema``
method. Schemas can be created, updated and deleted through the ``createSchema``, ``updateSchema`` and
``removeSchema`` methods, respectively.

See :ref:`attribute_types` for a list of the attribute type bindings available.

.. tabs::

    .. code-tab:: java

        import org.locationtech.geomesa.utils.interop.SimpleFeatureTypes;
        import org.opengis.feature.simple.SimpleFeatureType;

        import java.io.IOException;

        try {
            // check whether the schema already exists before creating it
            String[] types = store.getTypeNames();
            boolean exists = false;
            for (String type: types) {
                if (type.equals("purchases")) {
                    exists = true;
                    break;
                }
            }
            if (!exists) {
                SimpleFeatureType myType =
                    SimpleFeatureTypes.createType(
                        "purchases", "item:String,amount:Double,date:Date,location:Point:srid=4326");
                store.createSchema(myType);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

    .. code-tab:: scala

        import org.locationtech.geomesa.utils.geotools.SimpleFeatureTypes

        // check whether the schema already exists before creating it
        if (!store.getTypeNames.contains("purchases")) {
          val myType =
            SimpleFeatureTypes.createType(
              "purchases", "item:String,amount:Double,date:Date,location:Point:srid=4326")
          store.createSchema(myType)
        }

Writing Data
------------

Data stores support writing data on a row-by-row basis. There are two different write paths: appending writes
and modifying writes.

.. warning::

    Pay close attention to the use of ``PROVIDED_FID`` in the following sections.
    This hint controls the behavior of each feature ID.

Some data stores support transactions, which can be used to isolate a group of operations. GeoMesa does not
support transactions, so the default GeoTools ``Transaction.AUTO_COMMIT`` is used in the examples.

Generally, once a writer is successfully closed, the data has been persisted to the underlying store. Until
then, data may be cached and buffered locally, and may not be persisted or available to query.

Appending Writes
^^^^^^^^^^^^^^^^

An appending writer can be obtained through the ``getFeatureWriterAppend`` method. A feature writer is
similar to an iterator; ``next`` is called to obtain a new feature, the feature is updated with the values to
be written, and then ``write`` is called to persist it. Once all writes are complete, the feature writer
should be closed.

The ID used to uniquely identify a feature is called the feature ID, or ``FID``. By default, GeoTools will
generate a new feature ID for each feature. To specify a feature ID, set the ``PROVIDED_FID`` hint in the
feature user data, as shown below.

.. warning::

    It is a logical error to write the same feature ID more than once with an appending feature writer.
    This may result in inconsistencies in the persisted data. Refer to the next section for how to safely
    update existing features.

.. tabs::

    .. code-tab:: java

        import org.geotools.data.FeatureWriter;
        import org.geotools.data.Transaction;
        import org.geotools.util.factory.Hints;
        import org.opengis.feature.simple.SimpleFeature;
        import org.opengis.feature.simple.SimpleFeatureType;

        import java.io.IOException;

        // use try-with-resources to close the writer when done
        try (FeatureWriter<SimpleFeatureType, SimpleFeature> writer =
                 store.getFeatureWriterAppend("purchases", Transaction.AUTO_COMMIT)) {
            // repeat as needed, once per feature
            // note: hasNext() will always return false, but can be ignored
            SimpleFeature next = writer.next();
            next.getUserData().put(Hints.PROVIDED_FID, "id-01");
            next.setAttribute("item", "swag");
            next.setAttribute("amount", 20.0);
            // attributes will be converted to the appropriate type if needed
            next.setAttribute("date", "2020-01-01T00:00:00.000Z");
            next.setAttribute("location", "POINT (-82.379 34.1782)");
            writer.write();
        } catch (IOException e) {
            e.printStackTrace();
        }

    .. code-tab:: scala

        import org.geotools.data.Transaction
        import org.geotools.util.factory.Hints

        val writer = store.getFeatureWriterAppend("purchases", Transaction.AUTO_COMMIT)
        try {
          // repeat as needed, once per feature
          // note: hasNext will always return false, but can be ignored
          val next = writer.next()
          next.getUserData.put(Hints.PROVIDED_FID, "id-01")
          next.setAttribute("item", "swag")
          next.setAttribute("amount", 20.0)
          // attributes will be converted to the appropriate type if needed
          next.setAttribute("date", "2020-01-01T00:00:00.000Z")
          next.setAttribute("location", "POINT (-82.379 34.1782)")
          writer.write()
        } finally {
          writer.close()
        }

An alternative way to make appending writes is to use a ``FeatureStore``. GeoTools defines a
``FeatureSource`` as read-only. ``FeatureStore`` extends ``FeatureSource`` and provides write functionality,
but the ``FeatureSource`` returned by a data store must be checked and cast at runtime before it can be used
for writing.

.. tabs::

    .. code-tab:: java

        import org.geotools.data.simple.SimpleFeatureCollection;
        import org.geotools.data.simple.SimpleFeatureSource;
        import org.geotools.data.simple.SimpleFeatureStore;
        import org.geotools.feature.DefaultFeatureCollection;

        import java.io.IOException;

        try {
            SimpleFeatureSource source = store.getFeatureSource("purchases");
            // only a FeatureStore supports writes, so check the type at runtime
            if (source instanceof SimpleFeatureStore) {
                SimpleFeatureCollection collection = new DefaultFeatureCollection();
                // omitted - add features to the collection
                ((SimpleFeatureStore) source).addFeatures(collection);
            } else {
                throw new IllegalStateException("Store is read only");
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

    .. code-tab:: scala

        import org.geotools.data.simple.SimpleFeatureStore
        import org.geotools.feature.DefaultFeatureCollection

        store.getFeatureSource("purchases") match {
          // only a FeatureStore supports writes, so check the type at runtime
          case s: SimpleFeatureStore =>
            val collection = new DefaultFeatureCollection()
            // ??? is a placeholder - supply the feature(s) to add
            collection.add(???)
            s.addFeatures(collection)

          case _ => throw new IllegalStateException("Store is read only")
        }

Modifying Writes
^^^^^^^^^^^^^^^^

In order to update an existing feature, a modifying writer must be obtained through the method
``getFeatureWriter``, which requires a filter specifying the features to be updated. A modifying feature
writer is similar to an appending feature writer, except that the method ``hasNext`` will return ``true`` as
long as there are additional features to modify. The features returned from ``next`` will be pre-populated
with the current data for each feature.

Filters can be created through the GeoTools method ``ECQL.toFilter``. See the GeoTools
`documentation `__ for more information on CQL filters.

.. tabs::

    .. code-tab:: java

        import org.geotools.data.FeatureWriter;
        import org.geotools.data.Transaction;
        import org.geotools.filter.text.cql2.CQLException;
        import org.geotools.filter.text.ecql.ECQL;
        import org.opengis.feature.simple.SimpleFeature;
        import org.opengis.feature.simple.SimpleFeatureType;

        import java.io.IOException;

        // the filter selects the feature by its feature ID
        try (FeatureWriter<SimpleFeatureType, SimpleFeature> writer =
                 store.getFeatureWriter("purchases", ECQL.toFilter("IN ('id-01')"), Transaction.AUTO_COMMIT)) {
            while (writer.hasNext()) {
                SimpleFeature next = writer.next();
                next.setAttribute("amount", 21.0);
                writer.write(); // or, to delete it: writer.remove();
            }
        } catch (IOException | CQLException e) {
            e.printStackTrace();
        }

    .. code-tab:: scala

        import org.geotools.data.Transaction
        import org.geotools.filter.text.ecql.ECQL

        // the filter selects the feature by its feature ID
        val filter = ECQL.toFilter("IN ('id-01')")
        val writer = store.getFeatureWriter("purchases", filter, Transaction.AUTO_COMMIT)
        try {
          while (writer.hasNext) {
            val next = writer.next()
            next.setAttribute("amount", 21.0)
            writer.write() // or, to delete it: writer.remove()
          }
        } finally {
          writer.close()
        }

Reading Data
------------

Once data has been persisted, it can be read back through the ``getFeatureReader`` method. GeoTools returns a
"live" iterator of results that may point to a remote location. Generally data is not actually read from the
backing store until it is required, so it is possible to read a few records without fetching the entire
result set.

To filter the results that come back, predicates can be created using the "common query language", CQL.
Filters can be created through the GeoTools method ``ECQL.toFilter``. See the GeoTools
`documentation `__ for more information on CQL filters.

.. tabs::

    .. code-tab:: java

        import org.geotools.data.DataUtilities;
        import org.geotools.data.FeatureReader;
        import org.geotools.data.Query;
        import org.geotools.data.Transaction;
        import org.geotools.filter.text.cql2.CQLException;
        import org.geotools.filter.text.ecql.ECQL;
        import org.opengis.feature.simple.SimpleFeature;
        import org.opengis.feature.simple.SimpleFeatureType;

        import java.io.IOException;

        try {
            Query query = new Query("purchases", ECQL.toFilter("bbox(location,-85,30,-80,35)"));
            // use try-with-resources to close the reader when done
            try (FeatureReader<SimpleFeatureType, SimpleFeature> reader =
                     store.getFeatureReader(query, Transaction.AUTO_COMMIT)) {
                while (reader.hasNext()) {
                    SimpleFeature next = reader.next();
                    System.out.println(DataUtilities.encodeFeature(next));
                }
            }
        } catch (IOException | CQLException e) {
            e.printStackTrace();
        }

    .. code-tab:: scala

        import org.geotools.data.{DataUtilities, Query, Transaction}
        import org.geotools.filter.text.ecql.ECQL

        val query = new Query("purchases", ECQL.toFilter("bbox(location,-85,30,-80,35)"))
        val reader = store.getFeatureReader(query, Transaction.AUTO_COMMIT)
        try {
          while (reader.hasNext) {
            val next = reader.next()
            println(DataUtilities.encodeFeature(next))
          }
        } finally {
          reader.close()
        }
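
The "live" iterator behavior described in this section can be illustrated without a data store. The following
is a minimal sketch that uses only the Java standard library. ``LazyReader``, its ``fetched`` counter and the
``readFirst`` helper are hypothetical stand-ins invented for this illustration; they are not part of GeoTools
or GeoMesa, and a real feature reader may buffer or batch results differently. The sketch only shows the
general idea that records are pulled as ``next`` is called, and that the reader is closed when done.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class LazyReadSketch {

    // Hypothetical stand-in for a feature reader; not a GeoTools class.
    static class LazyReader implements Iterator<String>, AutoCloseable {
        private final int total;        // size of the simulated result set
        private int fetched = 0;        // records actually pulled so far
        private boolean closed = false;

        LazyReader(int total) {
            this.total = total;
        }

        @Override
        public boolean hasNext() {
            return !closed && fetched < total;
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            // a real reader would fetch from the backing store at this point
            fetched++;
            return "id-" + fetched;
        }

        int fetched() {
            return fetched;
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Reads at most `limit` records, then returns how many the reader has fetched in total.
    static int readFirst(LazyReader reader, int limit) {
        int count = 0;
        while (count < limit && reader.hasNext()) {
            reader.next();
            count++;
        }
        return reader.fetched();
    }

    public static void main(String[] args) {
        // try-with-resources closes the reader even though the results were not exhausted
        try (LazyReader reader = new LazyReader(1000)) {
            int fetched = readFirst(reader, 3);
            System.out.println("fetched " + fetched + " of 1000 records");
        }
    }
}
```

Running ``main`` prints ``fetched 3 of 1000 records``: stopping early leaves the rest of the simulated result
set untouched, which is why closing the reader promptly matters when only part of a result is consumed.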