15.2. Using the Accumulo Data Store Programmatically

15.2.1. Creating a Data Store

An instance of an Accumulo data store can be obtained through the normal GeoTools discovery methods, assuming that the GeoMesa code is on the classpath:

import org.geotools.data.*;

Map<String, String> parameters = new HashMap<>;
parameters.put("accumulo.instance.id", "myInstance");
parameters.put("accumulo.zookeepers", "myZoo1,myZoo2,myZoo3");
parameters.put("accumulo.user", "myUser");
parameters.put("accumulo.password", "myPassword");
parameters.put("accumulo.catalog", "myNamespace.myTable");
DataStore dataStore = DataStoreFinder.getDataStore(parameters);

Instead of specifying the cluster connection explicitly, an appropriate accumulo-client.properties (for Accumulo 2) or client.conf (for Accumulo 1) may be added to the classpath. See the Accumulo documentation for information on the necessary configuration keys. Any explicit data store parameters will take precedence over the configuration file.

More information on using GeoTools can be found in the GeoTools user guide.

15.2.2. Accumulo Data Store Parameters

The Accumulo Data Store takes several parameters (required parameters are marked with *):

Parameter Type Description
accumulo.catalog * String The name of the GeoMesa catalog table, including the Accumulo namespace (e.g. “myNamespace.myCatalog”)
accumulo.instance.id String The ID of the Accumulo instance
accumulo.zookeepers String A comma separated list of zookeeper servers (e.g. “zoo1,zoo2,zoo3” or “localhost:2181”)
accumulo.user String The username used to connect to Accumulo
accumulo.password String The password for the Accumulo user
accumulo.keytab.path String Path to a Kerberos keytab file containing an entry for the specified user
geomesa.security.auths String Comma-delimited superset of authorizations that will be used for queries via Accumulo
geomesa.security.force-empty-auths Boolean Forces authorizations to be empty
geomesa.security.auth-provider String Class name for an AuthorizationsProvider implementation
geomesa.query.audit Boolean Audit queries being run. Queries will be stored in a <catalog>_queries table
geomesa.query.timeout String The max time a query will be allowed to run before being killed. The timeout is specified as a duration, e.g. 1 minute or 60 seconds
geomesa.query.threads Integer The number of threads to use per query
geomesa.query.loose-bounding-box Boolean Use loose bounding boxes - queries will be faster but may return extraneous results
accumulo.query.record-threads Integer The number of threads to use for record retrieval
accumulo.write.threads Integer The number of threads to use for writing records
geomesa.stats.generate Boolean Toggle collection of statistics for newly created feature types
accumulo.remote.arrow.enable Boolean Process Arrow encoding in Accumulo tablets servers as a distributed call
accumulo.remote.bin.enable Boolean Process binary encoding in Accumulo tablets servers as a distributed call
accumulo.remote.density.enable Boolean Process heatmap encoding in Accumulo tablets servers as a distributed call
accumulo.remote.stats.enable Boolean Process statistical calculations in Accumulo tablets servers as a distributed call
geomesa.partition.scan.parallel Boolean For partitioned schemas, execute scans in parallel instead of sequentially
geomesa.query.caching Boolean Toggle caching of results

Note: it is an error to specify both accumulo.password and accumulo.keytab.path.