16.2. Using SFT and Converter Definitions with Command-Line Tools
Several GeoMesa binary distributions ship with prepackaged feature type and converter definitions for common data types including Twitter, GeoNames, T-drive, and many more. These converters can be used with the GeoMesa command-line tools in these distributions out of the box. See Prepackaged Converter Definitions.
Users can add additional SFT and converter types by providing a reference.conf file
embedded within a JAR in the lib directory, or by adding the types to the
application.conf file in the $GEOMESA_ACCUMULO_HOME/conf or $GEOMESA_KAFKA_HOME/conf
directory.
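As a rough illustration of the JAR route, the sketch below writes a small reference.conf into the root of a new JAR using Python's standard zipfile module (a JAR is a ZIP archive). The type name "my-type", the converter name "my-type-csv", and the file name my-types.jar are hypothetical, and the sketch assumes that a JAR without a manifest is sufficient for the definitions to be found on the classpath. The JDK's jar tool would work equally well.

```python
import os
import tempfile
import zipfile

# Hypothetical SFT and converter names, used only for illustration.
REFERENCE_CONF = """\
geomesa = {
  sfts = {
    "my-type" = {
      attributes = [
        { name = "name", type = "String", index = true }
        { name = "geom", type = "Point", index = true, srid = 4326, default = true }
      ]
    }
  }
  converters = {
    "my-type-csv" = {
      type = "delimited-text",
      format = "CSV",
      id-field = "$1",
      fields = [
        { name = "name", transform = "$2::string" }
        { name = "geom", transform = "point($3::double, $4::double)" }
      ]
    }
  }
}
"""


def build_types_jar(jar_path, conf_text=REFERENCE_CONF):
    """Write conf_text as reference.conf at the root of a new JAR file."""
    with zipfile.ZipFile(jar_path, "w", zipfile.ZIP_DEFLATED) as jar:
        jar.writestr("reference.conf", conf_text)
    return jar_path


# Build the JAR in a temporary directory and list its entries.
jar_path = build_types_jar(os.path.join(tempfile.mkdtemp(), "my-types.jar"))
with zipfile.ZipFile(jar_path) as jar:
    print(jar.namelist())
```

The resulting file would then be copied into the lib directory of the distribution, for example $GEOMESA_ACCUMULO_HOME/lib.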
Note
The example below is specific to the GeoMesa Accumulo distribution, but the general principle is the same for each distribution. Only the home variable and command-line tool name will differ depending on GeoMesa distribution.
Given the following sample CSV file example.csv:
ID,Name,Age,LastSeen,Friends,Lon,Lat
23623,Harry,20,2015-05-06,"Will, Mark, Suzan",-100.236523,23
26236,Hermione,25,2015-06-07,"Edward, Bill, Harry",40.232,-53.2356
3233,Severus,30,2015-10-23,"Tom, Riddle, Voldemort",3,-62.23
A “renegades” SFT and a “renegades-csv” converter may be specified in
the GeoMesa Tools configuration file ($GEOMESA_ACCUMULO_HOME/conf/application.conf)
as shown below. By default, SFTs are loaded from the configuration
path geomesa.sfts and converters are loaded from the path
geomesa.converters. Each SFT and converter definition is keyed by the name
used to reference it from the command-line tools.
$GEOMESA_ACCUMULO_HOME/conf/application.conf:
geomesa = {
  sfts = {
    # other SFTs
    # ...
    "renegades" = {
      attributes = [
        { name = "fid", type = "Integer", index = false }
        { name = "name", type = "String", index = true }
        { name = "age", type = "Integer", index = false }
        { name = "lastseen", type = "Date", index = true }
        { name = "friends", type = "List[String]", index = true }
        { name = "geom", type = "Point", index = true, srid = 4326, default = true }
      ]
    }
  }
  converters = {
    # other converters
    # ...
    "renegades-csv" = {
      type = "delimited-text",
      format = "CSV",
      options {
        skip-lines = 1
      },
      id-field = "toString($fid)",
      fields = [
        { name = "fid", transform = "$1::int" }
        { name = "name", transform = "$2::string" }
        { name = "age", transform = "$3::int" }
        { name = "lastseen", transform = "date('yyyy-MM-dd', $4)" }
        { name = "friends", transform = "parseList('string', $5)" }
        { name = "lon", transform = "$6::double" }
        { name = "lat", transform = "$7::double" }
        { name = "geom", transform = "point($lon, $lat)" }
      ]
    }
  }
}
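To make the column-to-attribute mapping concrete, the following Python sketch mimics what the renegades-csv transforms do to a single line of example.csv. It is not GeoMesa code and only approximates the converter functions. In particular, trimming whitespace around list items and returning the geometry as a plain (lon, lat) tuple are assumptions made for illustration, and convert_row is a hypothetical helper name.

```python
import csv
from datetime import datetime


def convert_row(row):
    """Mimic the renegades-csv field transforms for one parsed CSV row.

    Converter expressions number columns from 1 ($1, $2, ...),
    so $1 corresponds to row[0] here.
    """
    fid = int(row[0])                                 # $1::int
    name = str(row[1])                                # $2::string
    age = int(row[2])                                 # $3::int
    lastseen = datetime.strptime(row[3], "%Y-%m-%d")  # date(pattern, $4)
    # parseList('string', $5); stripping whitespace is an assumption
    friends = [item.strip() for item in row[4].split(",")]
    lon = float(row[5])                               # $6::double
    lat = float(row[6])                               # $7::double
    return {
        "id": str(fid),      # id-field = toString($fid)
        "fid": fid,
        "name": name,
        "age": age,
        "lastseen": lastseen,
        "friends": friends,
        "geom": (lon, lat),  # point($lon, $lat): x is longitude, y is latitude
    }


# First data line of example.csv. The header is assumed to be dropped already,
# which is what skip-lines = 1 does in the converter.
line = '23623,Harry,20,2015-05-06,"Will, Mark, Suzan",-100.236523,23'
feature = convert_row(next(csv.reader([line])))
print(feature["id"], feature["friends"], feature["geom"])
```

Note that lon and lat are intermediate converter fields only. They feed the geom transform but are not attributes of the renegades SFT, so they do not appear in the resulting feature.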
Use geomesa env to confirm that geomesa ingest can properly read
the updated file.
$ geomesa env
Once the converter and SFT are registered, they can be used to ingest the
example.csv file:
$ geomesa ingest -u <user> -p <pass> -i <instance> -z <zookeepers> -s renegades -C renegades-csv example.csv