16.2. Using SFT and Converter Definitions with Command-Line Tools

Several GeoMesa binary distributions ship with prepackaged feature type and converter definitions for common data types including Twitter, GeoNames, T-drive, and many more. These converters can be used with the GeoMesa command-line tools in these distributions out of the box. See Prepackaged Converter Definitions.

Users can add additional SFT and converter types by providing a reference.conf file embedded with a JAR within the lib directory, or by adding the types to the application.conf file in the $GEOMESA_ACCUMULO_HOME/conf or $GEOMESA_KAKFA_HOME/conf directory.

Note

The example below is specific to the GeoMesa Accumulo distribution, but the general principle is the same for each distribution. Only the home variable and command-line tool name will differ depending on GeoMesa distribution.

Given the following sample CSV file example.csv:

ID,Name,Age,LastSeen,Friends,Lat,Lon
23623,Harry,20,2015-05-06,"Will, Mark, Suzan",-100.236523,23
26236,Hermione,25,2015-06-07,"Edward, Bill, Harry",40.232,-53.2356
3233,Severus,30,2015-10-23,"Tom, Riddle, Voldemort",3,-62.23

A “renegades” SFT and “renegades-csv” converter may be specified in the GeoMesa Tools configuration file ($GEOMESA_ACCUMULO_HOME/conf/application.conf) as shown below. By default, SFTs will be loaded from the file at the path geomesa.sfts and converters will be loaded at the path geomesa.converters. Each converter and SFT definition is keyed by the name that can be referenced in the converter and SFT loaders.

$GEOMESA_ACCUMULO_HOME/conf/application.conf:

geomesa = {
  sfts = {
     # other SFTs
     # ...
    "renegades" = {
      attributes = [
        { name = "fid",      type = "Integer",      index = false                             }
        { name = "name",     type = "String",       index = true                              }
        { name = "age",      type = "Integer",      index = false                             }
        { name = "lastseen", type = "Date",         index = true                              }
        { name = "friends",  type = "List[String]", index = true                              }
        { name = "geom",     type = "Point",        index = true, srid = 4326, default = true }
      ]
    }
  }
  converters = {
     # other converters
     # ...
    "renegades-csv" = {
      type = "delimited-text",
      format = "CSV",
      options {
        skip-lines = 1
      },
      id-field = "toString($fid)",
      fields = [
        { name = "fid",       transform = "$1::int"                 }
        { name = "name",     transform = "$2::string"              }
        { name = "age",      transform = "$3::int"                 }
        { name = "lastseen", transform = "date('YYYY-MM-dd', $4)"  }
        { name = "friends",  transform = "parseList('string', $5)" }
        { name = "lon",      transform = "$6::double"              }
        { name = "lat",      transform = "$7::double"              }
        { name = "geom",     transform = "point($lon, $lat)"       }
      ]
    }
  }
}

Use geomesa env to confirm that geomesa ingest can properly read the updated file.

$ geomesa env

Once the converter and SFT are registered, it can be used to ingest the example.csv file:

$ geomesa ingest -u <user> -p <pass> -i <instance> -z <zookeepers> -s renegades -C renegades-csv  example.csv