21.8. Partition Schemes
Partition schemes define how data is laid out on the filesystem. The choice of scheme matters because it determines how data is queried: when a query filter is evaluated, the partition scheme is used to prune data files that cannot match the filter. Three main types of partition scheme are provided: spatial, temporal and attribute. A hash scheme is also available.
The partition scheme must be provided when creating a schema. The scheme is defined by a well-known name and additional configuration options. See Configuring the Partition Scheme for details on how to specify a partition scheme.
21.8.1. Temporal Schemes
Temporal schemes partition data based on time. The following names are supported:
year (or years/yearly)
month (or months/monthly)
week (or weeks/weekly)
day (or days/daily)
hour (or hours/hourly)
The following options are supported:
attribute - The name of a Date-type attribute from the SimpleFeatureType to use. If not specified, the default date attribute is used.
step - The number of time units (hours, days, etc.) to include in each partition. If not specified, the default is 1.
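The effect of the step option can be illustrated with a small sketch. This is not the library's implementation, and it says nothing about the directory names actually written to disk. It assumes that partitions are counted from the Unix epoch in UTC, and the function name temporal_bucket is hypothetical. Only fixed-length units are handled; months and years vary in length and are left out.

```python
from datetime import datetime, timezone

# Fixed-length units only; months and years are omitted from this sketch.
UNIT_SECONDS = {"hour": 3600, "day": 86400, "week": 7 * 86400}

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def temporal_bucket(ts: datetime, unit: str = "day", step: int = 1) -> int:
    """Return the index of the partition containing ts.

    Assumption: buckets are counted from the Unix epoch, and each bucket
    spans `step` units of the given size.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    delta = ts - EPOCH
    seconds = delta.days * 86400 + delta.seconds
    return seconds // (UNIT_SECONDS[unit] * step)


# With step=7 and unit="day", seven consecutive days share one partition.
first = temporal_bucket(datetime(1970, 1, 1, tzinfo=timezone.utc), "day", 7)
seventh = temporal_bucket(datetime(1970, 1, 7, tzinfo=timezone.utc), "day", 7)
eighth = temporal_bucket(datetime(1970, 1, 8, tzinfo=timezone.utc), "day", 7)
```

A larger step produces fewer, larger partitions, so a query over a narrow time range reads more data per partition but touches fewer partitions overall.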
21.8.2. Spatial Schemes
Spatial schemes lay out data based on a space-filling curve. The following names are supported:
z2 - A curve suitable for point-type geometries
xz2 - A curve suitable for geometries with extents (e.g. non-points such as line strings or polygons)
The following option is required:
bits - The number of bits to use for the curve, which defines the area of each partition. For example, 2 bits would create 2 ^ 2 (4) regions, while 3 bits would create 2 ^ 3 (8) regions.
The following option is supported:
attribute - The name of a Geometry-type attribute from the SimpleFeatureType to use. If not specified, the default geometry is used.
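The relationship between bits and the number of regions can be sketched as follows. The region count follows directly from the description above. The cell computation is a generic Z-order (Morton) interleave over longitude and latitude, given only as an illustration of how a space-filling curve assigns a point to a region; it is an assumption, not the library's z2 implementation, and it handles only an even number of bits. The function names are hypothetical.

```python
def region_count(bits: int) -> int:
    """Number of regions produced by a curve with the given number of bits."""
    return 2 ** bits


def z_order_cell(lon: float, lat: float, bits: int) -> int:
    """Map a lon/lat point to a cell index using a generic Morton interleave.

    Assumption: bits are split evenly between the two dimensions, so this
    sketch rejects odd values.
    """
    if bits <= 0 or bits % 2 != 0:
        raise ValueError("this sketch requires a positive, even number of bits")
    per_dim = bits // 2
    cells = 1 << per_dim
    # Quantize each coordinate to a grid index, clamping the upper edge.
    x = min(int((lon + 180.0) / 360.0 * cells), cells - 1)
    y = min(int((lat + 90.0) / 180.0 * cells), cells - 1)
    z = 0
    for i in range(per_dim):
        z |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return z


# With 2 bits the world is split into four quadrants.
south_west = z_order_cell(-10.0, -10.0, 2)
north_east = z_order_cell(10.0, 10.0, 2)
```

More bits give smaller regions, which prunes more precisely for small query windows at the cost of more partitions.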
21.8.3. Attribute Scheme
The attribute scheme partitions data based on a lexicoded attribute value. The name must be:
attribute
The following option is required:
attribute - The name of the attribute used to partition
The following options are supported:
default - A default value to use if the attribute is null
allow - An allowed value. allow may be specified more than once in order to allow multiple values. If an attribute value is not among the allowed values, the default value will be used instead
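The interaction between default and allow can be sketched as below. The function name resolve_partition_value is hypothetical, and the sketch assumes that when no allow values are configured, every non-null value is accepted; the text above does not state that case explicitly.

```python
def resolve_partition_value(value, default, allowed=None):
    """Pick the value used for partitioning.

    Assumption: an empty or missing allow list means no restriction.
    """
    if value is None:
        return default          # null attribute falls back to the default
    if allowed and value not in allowed:
        return default          # value is outside the allowed set
    return value


allowed_values = {"red", "green"}
kept = resolve_partition_value("red", "other", allowed_values)
replaced = resolve_partition_value("blue", "other", allowed_values)
missing = resolve_partition_value(None, "other", allowed_values)
```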
The following additional options are supported to bucket the partition values, depending on the type of attribute being used:
width - For string type attributes, the value will be truncated to a maximum length of width
divisor - For integral type attributes (e.g. ints and longs), the value will be rounded down so that it is divisible by divisor. For example, with divisor=10, the values 100, 109, etc. will all be rounded down to 100.
scale - For fractional type attributes (e.g. floats and doubles), the number of digits to keep to the right of the decimal place. For example, with scale=2, the values 100.001, 100.009, etc. will all be truncated to 100.00
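The three bucketing options can be illustrated with a short sketch that reproduces the examples above. The function names are hypothetical, and the use of decimal arithmetic for scale is a choice made here to avoid binary floating-point surprises; it is not necessarily how the library computes the value.

```python
from decimal import Decimal, ROUND_DOWN


def bucket_string(value: str, width: int) -> str:
    """Truncate a string to at most `width` characters."""
    return value[:width]


def bucket_integral(value: int, divisor: int) -> int:
    """Round down to the nearest multiple of `divisor`."""
    return (value // divisor) * divisor


def bucket_fractional(value: float, scale: int) -> str:
    """Keep `scale` digits after the decimal point, discarding the rest."""
    quantum = Decimal(1).scaleb(-scale)  # e.g. scale=2 gives 0.01
    return str(Decimal(str(value)).quantize(quantum, rounding=ROUND_DOWN))


prefix = bucket_string("abcdef", 3)
tens = bucket_integral(109, 10)
cents = bucket_fractional(100.009, 2)
```

Coarser buckets (a smaller width or scale, or a larger divisor) group more distinct values into the same partition.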
The attribute scheme supports the following attribute types: String, Integer, Long, Float and Double.
21.8.4. Hash Scheme
The hash scheme partitions data into buckets based on an attribute value. The name must be:
hash
The following options are required:
attribute - The name of the attribute used to partition
buckets - The number of buckets used to partition
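Hash partitioning in general can be sketched as a stable hash of the attribute value taken modulo the number of buckets. The hash function used by the library is not stated here, so CRC32 is swapped in purely for illustration, and the function name hash_bucket is hypothetical.

```python
import zlib


def hash_bucket(value, buckets: int) -> int:
    """Assign a value to one of `buckets` partitions.

    Assumption: CRC32 stands in for whatever hash the library really uses.
    """
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    data = value if isinstance(value, bytes) else str(value).encode("utf-8")
    return zlib.crc32(data) % buckets


bucket = hash_bucket("feature-123", 8)
```

Because the hash is deterministic, an equality filter on the attribute can be mapped to a single bucket, while range filters on the same attribute generally cannot be pruned.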
The hash scheme supports the following attribute types:
String, Integer, Long, Float, Double, Date, Bytes, and UUID.