17. Kudu Data Store

Note

GeoMesa currently supports Kudu version 1.7.x.

The GeoMesa Kudu Data Store is an implementation of the GeoTools DataStore interface that is backed by Apache Kudu. It is found in the geomesa-kudu directory of the GeoMesa source distribution.

Apache Kudu is a columnar storage engine for the Hadoop ecosystem, designed to enable fast analytics on fast-changing data.

GeoMesa takes advantage of Kudu features such as predicate push-down and column selection, all through the GeoTools API. The Kudu Data Store is a good choice for running Spark analytics on a few attributes at a time, as the columnar storage format minimizes the underlying data reads. In addition, Kudu’s compression options and its ability to rapidly query non-indexed columns keep disk usage low compared to more complex systems like HBase.
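To make the point about column selection and predicate push-down concrete, the following is a toy, in-memory sketch of a columnar scan. It is not Kudu's storage engine, and it does not use the GeoMesa or GeoTools API; the class name `ColumnarScanSketch`, the attribute names, and the "cells read" counter are all hypothetical and exist only for illustration. It shows why filtering on one attribute and selecting another touches only part of the stored data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.DoublePredicate;

// Toy model only: NOT Kudu's storage engine or any GeoMesa/GeoTools API.
class ColumnarScanSketch {

    // Each attribute is stored as its own array, as in a columnar layout.
    private final Map<String, double[]> columns = new LinkedHashMap<>();
    private final int rowCount;
    // Number of individual values touched by scans, a stand-in for I/O.
    private long cellsRead = 0;

    ColumnarScanSketch(int rowCount) {
        this.rowCount = rowCount;
    }

    void addColumn(String name, double[] values) {
        if (values.length != rowCount) {
            throw new IllegalArgumentException("wrong length for column " + name);
        }
        columns.put(name, values);
    }

    long cellsRead() {
        return cellsRead;
    }

    // Evaluates the predicate against a single column (the "pushed-down"
    // filter), then reads only the projected columns for matching rows.
    List<double[]> scan(String filterColumn, DoublePredicate predicate,
                        List<String> projection) {
        double[] filterValues = columns.get(filterColumn);
        if (filterValues == null) {
            throw new IllegalArgumentException("unknown column " + filterColumn);
        }
        List<double[]> results = new ArrayList<>();
        for (int row = 0; row < rowCount; row++) {
            cellsRead++; // the filter column is read for every row
            if (!predicate.test(filterValues[row])) {
                continue; // non-matching rows never touch other columns
            }
            double[] out = new double[projection.size()];
            for (int i = 0; i < projection.size(); i++) {
                out[i] = columns.get(projection.get(i))[row];
                cellsRead++;
            }
            results.add(out);
        }
        return results;
    }

    public static void main(String[] args) {
        ColumnarScanSketch table = new ColumnarScanSketch(4);
        table.addColumn("speed", new double[] {3.0, 12.5, 8.0, 20.0});
        table.addColumn("heading", new double[] {90.0, 180.0, 270.0, 0.0});
        table.addColumn("altitude", new double[] {100.0, 250.0, 300.0, 50.0});

        // Select only "heading" for rows where speed > 10.
        List<double[]> rows = table.scan("speed", v -> v > 10.0, List.of("heading"));
        System.out.println("matching rows: " + rows.size());
        System.out.println("cells read: " + table.cellsRead() + " of " + (4 * 3));
    }
}
```

In this sketch the scan reads 6 of the 12 stored values: the `altitude` column is never touched, and `heading` is read only for the two matching rows. A row-oriented layout would typically have to read every attribute of every row it examines.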

Warning

The GeoMesa Kudu Data Store is an alpha-level feature and has not been robustly tested at scale.

To get started with the Kudu Data Store, try the GeoMesa Kudu Quick Start tutorial.