19.5. Kafka Command-Line Tools¶
The GeoMesa Kafka distribution includes a set of command-line tools for feature management, ingest, export and debugging.
To install the tools, see Setting up the Kafka Command Line Tools.
Once installed, the tools should be available through the command geomesa-kafka:
$ geomesa-kafka
INFO Usage: geomesa-kafka [command] [command options]
Commands:
...
Commands that are common to multiple back ends are described in Command-Line Tools. The commands here are Kafka-specific.
19.5.1. General Arguments¶
Most commands require you to specify the connection to Kafka. This generally includes the list of Kafka brokers, specified
with --brokers (or -b), and the catalog topic for storing metadata, specified with --catalog (or -c).
Connection properties can be specified with --producer-config for producers, --consumer-config for consumers, or
--config, which will be applied to both. See the official Kafka documentation for the available producer and
consumer configs.
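As an illustration, a typical invocation might look like the following (the broker hosts, metadata topic, and properties file are placeholders, and get-type-names stands in for any command):

```shell
# Connect to two brokers, store metadata in the "geomesa-metadata" topic,
# and load extra consumer properties from a local file
$ geomesa-kafka get-type-names \
    --brokers broker1:9092,broker2:9092 \
    --catalog geomesa-metadata \
    --consumer-config ./consumer.properties
```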
When using the legacy Zookeeper metadata persistence, the Zookeeper servers must be specified with --zookeepers (or
-z). The Zookeeper path for storing metadata can be specified with --zkpath (or -p), which supersedes --catalog.
See Migration from Zookeeper for details on migrating away from Zookeeper.
To connect to Confluent Schema Registry topics, use --schema-registry
to provide the registry URL.
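A sketch of a registry-backed connection, with a placeholder registry URL:

```shell
# Read schemas from a Confluent Schema Registry instance
$ geomesa-kafka get-type-names \
    --brokers broker1:9092 \
    --schema-registry http://registry:8081
```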
The --auths argument corresponds to the data store parameter geomesa.security.auths. See
Data Security for more information.
19.5.2. Commands¶
19.5.2.1. create-schema¶
See create-schema for an overview of this command.
In addition to the regular options, Kafka allows the number of partitions and the replication factor of the Kafka topic to be specified.
| Argument | Description |
|---|---|
| `--partitions` | The number of partitions used for the Kafka topic |
| `--replication` | The replication factor for the Kafka topic |
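A sketch of creating a schema with explicit topic settings (the feature name, attribute specification, and numeric values are illustrative):

```shell
# Create a schema whose backing Kafka topic has 4 partitions,
# each replicated across 2 brokers
$ geomesa-kafka create-schema \
    --brokers broker1:9092 \
    --catalog geomesa-metadata \
    --feature-name gdelt \
    --spec 'name:String,dtg:Date,*geom:Point:srid=4326' \
    --partitions 4 \
    --replication 2
```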
19.5.2.2. export¶
See export for an overview of this command.
Unlike the standard export, this command will not terminate until it is cancelled (through a shell interrupt)
or until the number of features specified by --max-features has been read. Thus, it can be used to monitor a topic.
This command differs from the listen command (below) in that it allows filtering and output in various formats.
It will also ignore drop and clear messages generated by feature deletion.
In addition to the regular options, Kafka allows control over the consumer behavior:
| Argument | Description |
|---|---|
| `--from-beginning` | Start reading messages from the beginning of the Kafka topic, instead of the end |
| `--num-consumers` | Number of consumers used to read the topic |
The --num-consumers argument can be used to increase read speed. However, there can be at most one
consumer per topic partition.
The --from-beginning argument can be used to start reading the Kafka topic from the start. Otherwise,
only new messages that are sent after this command is invoked will be read.
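For example, a monitoring export might look like the following sketch (the feature name, filter, and limits are illustrative):

```shell
# Read the topic from the start with two consumers, filter to a
# bounding box, and stop after 100 matching features
$ geomesa-kafka export \
    --brokers broker1:9092 \
    --catalog geomesa-metadata \
    --feature-name gdelt \
    --from-beginning \
    --num-consumers 2 \
    --max-features 100 \
    --cql 'bbox(geom,-80,35,-75,40)'
```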
19.5.2.3. ingest¶
See ingest for an overview of this command.
In addition to the regular options, Kafka allows the number of partitions and the replication factor of the Kafka topic to be specified. An artificial delay can also be inserted to simulate a live data stream.
| Argument | Description |
|---|---|
| `--partitions` | The number of partitions used for the Kafka topic |
| `--replication` | The replication factor for the Kafka topic |
| `--serialization` | The serialization format to use |
| `--delay` | The delay inserted between messages |
The --delay argument should be specified as a duration, in plain language. For example, 100 millis
or 1 second. The ingest will pause after creating each SimpleFeature for the specified delay.
This can be used to simulate a live data stream.
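A sketch of a throttled ingest (the converter name and input file are placeholders):

```shell
# Ingest a CSV file, pausing one second after each feature
# to simulate a live stream
$ geomesa-kafka ingest \
    --brokers broker1:9092 \
    --catalog geomesa-metadata \
    --feature-name gdelt \
    --converter gdelt-converter \
    --delay '1 second' \
    data.csv
```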
19.5.2.4. playback¶
The playback command can simulate a data stream by replaying features from a file directly onto a Kafka Data Store. Features are replayed based on a date attribute in each feature. For example, if replaying three features whose dates are each one second apart, each feature will be emitted after a delay of one second. The rate of export can be modified to speed up or slow down the original time differences.
The playback command is an extension of the ingest command, and accepts all the parameters outlined there. Additionally, it accepts the following parameters:
| Argument | Description |
|---|---|
| `--dtg` | Date attribute to base playback on. If not specified, the default schema date field will be used |
| `--rate` | Rate multiplier to speed up (or slow down) features being returned, as a float |
| `--live` | Will modify the returned dates to match the current time |
Note
Input files (specified in --src-list or <files>...) must be time-ordered by the --dtg
attribute before ingest or the playback will not work as expected.
The --rate parameter can be used to speed up or slow down the replay. It is specified as a floating point
number. For example --rate 10 will make replay ten times faster, while --rate 0.1 will make replay
ten times slower.
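Putting it together, a playback run might look like the following sketch (the attribute name, converter, and input file are placeholders, and the --live flag is assumed from the table above):

```shell
# Replay a time-ordered file at 10x speed, using the 'dtg' attribute
# and shifting timestamps to the current time
$ geomesa-kafka playback \
    --brokers broker1:9092 \
    --catalog geomesa-metadata \
    --feature-name gdelt \
    --converter gdelt-converter \
    --dtg dtg \
    --rate 10 \
    --live \
    data.csv
```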
19.5.2.5. listen¶
This command behaves similarly to the export command (above), but it does not provide options
for filtering or output format. It will show each message on the Kafka topic, including drop and
clear messages generated from feature deletion.
This command will not terminate until it is cancelled (through a shell interrupt).
| Argument | Description |
|---|---|
| `-f, --feature-name` | The name of the schema |
| `--from-beginning` | Start reading messages from the beginning of the Kafka topic, instead of the end |
| `--num-consumers` | Number of consumers used to read the topic |
The --num-consumers argument can be used to increase read speed. However, there can be at most one
consumer per topic partition.
The --from-beginning argument can be used to start reading the Kafka topic from the start. Otherwise,
only new messages that are sent after this command is invoked will be read.
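For example (the feature name is a placeholder):

```shell
# Print every message on the topic, including deletes, from the beginning
$ geomesa-kafka listen \
    --brokers broker1:9092 \
    --catalog geomesa-metadata \
    --feature-name gdelt \
    --from-beginning
```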
19.5.2.6. migrate-zookeeper-metadata¶
This command will migrate schema metadata out of Zookeeper. For additional information, see Migration from Zookeeper.
| Argument | Description |
|---|---|
| `--delete` | Delete the metadata out of Zookeeper after migrating it |
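A sketch of a migration run (the hosts are placeholders, and the --delete flag is assumed from the table above):

```shell
# Copy schema metadata from Zookeeper into Kafka, then remove
# the Zookeeper copy
$ geomesa-kafka migrate-zookeeper-metadata \
    --brokers broker1:9092 \
    --catalog geomesa-metadata \
    --zookeepers zoo1:2181 \
    --delete
```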