21.9. FileSystem Metadata
The FileSystem data store (FSDS) stores metadata about partitions and data files to avoid repeatedly interrogating the filesystem. When a data file is added or removed, an associated metadata entry is created to track the operation. Two metadata options are supported:
21.9.1. File System Persistence
Metadata can be stored in a file under the root path for the FSDS. This is the simplest solution, as it does not require any additional infrastructure. However, the time required for the initial read of the metadata may be a limitation when dealing with a large number of partitions.
The file-based metadata may be specified by using the name file.
21.9.2. Relational Database Persistence
Alternatively, metadata may be stored in a relational database through JDBC. A relational database may be
specified by using the name jdbc, and supports the following configuration options, which can be specified
through the FileSystem Data Store Parameters fs.config.properties and fs.config.file (required options are marked with *):
| Key | Description |
|---|---|
| | Must be |
| | The JDBC connection URL, e.g. |
| | The fully-qualified name of a JDBC driver class, e.g. |
| | The database user used to create connections |
| | The password for the database user |
| | The minimum number of connections to keep idle in the connection pool |
| | The maximum number of connections to keep idle in the connection pool |
| | The maximum size of the connection pool |
| | Boolean to enable fairness when retrieving from the connection pool |
| | Boolean to enable testing connections when retrieving them from the connection pool |
| | Boolean to enable testing connections when initially creating them |
| | Boolean to enable testing idle connections in the connection pool |
Currently, only Postgres is officially supported. Other databases may work, but have not been tested.
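As a rough illustration of passing metadata options through the `fs.config.properties` data store parameter, the following sketch serializes a set of options into Java properties text and places it in a parameter map. It assumes that `fs.config.properties` accepts text in Java properties format. The class name, method names, option keys (`placeholder.*`) and connection values are all hypothetical placeholders, not actual configuration keys; substitute the keys listed in the table above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of the FSDS API.
class FsdsMetadataParams {

    // Serialize options into Java properties text (special characters such as ':' are escaped).
    static String toPropertiesText(Map<String, String> options) {
        Properties props = new Properties();
        props.putAll(options);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null); // writes a timestamp comment followed by key=value lines
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Parse properties text back into a Properties object, e.g. to verify a round trip.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Build the data store parameter carrying the metadata options.
    // Assumption: fs.config.properties takes Java properties text. Other data store
    // parameters (such as the root path) must be added by the caller.
    static Map<String, String> buildParams(Map<String, String> metadataOptions) {
        Map<String, String> params = new HashMap<>();
        params.put("fs.config.properties", toPropertiesText(metadataOptions));
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        // Placeholder keys and values: replace with the real option keys and your own settings.
        options.put("placeholder.url", "jdbc:postgresql://localhost/example");
        options.put("placeholder.user", "example_user");

        Map<String, String> params = buildParams(options);
        System.out.println(params.get("fs.config.properties"));
    }
}
```

The resulting map would then be merged with the remaining data store parameters before the data store is created. Alternatively, the same properties text could be written to a file and referenced through `fs.config.file`.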