Limitations
The TileDB connector supports most Presto functionality. Below is a list of the features not currently supported.
Encrypted Arrays
The connector does not currently support creating/writing/reading encrypted arrays
OpenAt Timestamp
The connector does not currently support the TileDB openAt
functionality to open an array at a specific timestamp.
Datatypes
TileDB Presto connector supports the following SQL datatypes:
BOOLEAN
TINYINT
INTEGER
BIGINT
REAL
DOUBLE
DECIMAL (treated as doubles)
STRING*
VARCHAR*
CHAR*
VARBINARY
No other datatypes are supported.
Unsigned Integers
The TileDB Presto connector does not have full support for unsigned values. Presto and all connectors are written in Java, and Java does not have unsigned values. As a result of this Java limitation, an unsigned 64-bit integer can overflow if it is larger than 2^63 - 1
. Unsigned integers that are 8, 16 or 32 bits are treated as larger integers. For instance, an unsigned 32-bit value is read into a Java type of long
.
Variable-length Char/Varchar fields
For varchar
, and char
datatypes the special case of char(1)
or varchar(1)
is stored on disk as a fixed-sized attribute of size 1. Any char
/varchar
greater than 1 is stored as a variable-length attribute in TileDB. TileDB will not enforce the length parameter but Presto will for inserts.
Decimal Type
Decimal types are currently treated as doubles. TileDB does not enforce the precision or scale of the decimal types.
Create Table
Create table is supported, however only a limited subset of TileDB parameters is supported.
No support for creating encrypted arrays
No support for setting custom filters on attributes, coordinates or offsets
Splits
The current split implementation is naive and splits domains evenly with user defined predicates (WHERE
clause) or from the non-empty domains. This even splitting will likely produce sub optimal splits for sparse domains. Future work will move splitting into core TileDB where better heuristics will be used to produce even splits.
For now, if splits are highly uneven consider increasing the number of splits via the tiledb.splits
session parameter or add where clauses to limit the data set to non-empty regions of the array.
Last updated