.. _delimited_text_converter: Delimited Text Converter ======================== The delimited text converter handles plain delimited text files such as CSV or TSV. To use the delimited text converter, specify ``type = "delimited-text"`` in your converter definition. Configuration ------------- The format of the delimited files must be defined using the ``format`` element. GeoMesa uses `Apache Commons CSV `__ for parsing. The available formats are instances of ``org.apache.commons.csv.CSVFormat``: * **DEFAULT** or **CSV**: ``CSVFormat.DEFAULT`` * **TDF** or **TSV**: ``CSVFormat.TDF`` * **QUOTED**: ``CSVFormat.DEFAULT.withQuoteMode(QuoteMode.ALL)`` * **QUOTE_ESCAPE**: ``CSVFormat.DEFAULT.withEscape('"')`` * **QUOTED_WITH_QUOTE_ESCAPE**: ``CSVFormat.DEFAULT.withEscape('"').withQuoteMode(QuoteMode.ALL)`` * **EXCEL**: ``CSVFormat.EXCEL`` * **MYSQL**: ``CSVFormat.MYSQL`` * **RFC4180**: ``CSVFormat.RFC4180`` In addition, GeoMesa supports custom quote, escape and delimiter characters, which can be used to modify the base format. These can be specified through ``options.quote``, ``options.escape`` and ``options.delimiter``. Quotes and escapes can be disabled by setting the option to an empty string. If the input files have header lines, they can be skipped over by specifying a number of lines to skip using ``options.skip-lines``, e.g. ``options.skip-lines = 1``. Transform Functions ------------------- The ``transform`` element supports referencing each field in the record by its column number using ``$``. ``$0`` refers to the whole line, then the first columns is ``$1``, etc. Each column will initially be a string, so further transforms may be necessary to create the correct type. See :ref:`converter_functions` for more details. .. _convert_example_usage: Example Usage ------------- Suppose you have a ``SimpleFeatureType`` with the following schema: ``phrase:String,dtg:Date,*geom:Point:srid=4326`` And you have the following comma-separated data: :: first,hello,2015-01-01T00:00:00.000Z,45.0,45.0 second,world,2015-01-01T00:00:00.000Z,45.0,45.0 We want to concatenate the first two fields together to form the phrase, parse the third field as a date, and use the last two fields as coordinates for a ``Point`` geometry. The following configuration defines an appropriate converter for taking this CSV data and transforming it into our ``SimpleFeatureType``: :: geomesa.converters.example = { type = "delimited-text", format = "CSV", id-field = "md5(stringToBytes($0))", fields = [ { name = "phrase", transform = "concatenate($1, $2)" }, { name = "dtg", transform = "dateHourMinuteSecondMillis($3)" }, { name = "lat", transform = "$4::double" }, { name = "lon", transform = "$5::double" }, { name = "geom", transform = "point($lon, $lat)" } ] user-data = { // note: keys will be treated as strings and should not be quoted my.user.key = "$phrase" } } The ``id`` of the ``SimpleFeature`` is formed from an MD5 hash of the entire record (``$0`` is the original data). The simple feature attributes are created from the ``fields`` list with appropriate transforms (note the use of intermediate fields 'lat' and 'lon'). If desired, user data for the feature can be set by referencing fields. This can be used for setting data visibility constraints, among other things (see :ref:`data_security`).