TileDB-Presto is a data source connector for PrestoDB that lets you run SQL queries on TileDB arrays. The connector supports column subselection on attributes and predicate pushdown on dimension fields, so projection and range queries read only the attributes and dimension ranges they ask for.
The TileDB-Presto connector supports most SQL operations from PrestoDB. Arrays can be referenced dynamically and are not required to be "pre-registered" with Presto. No external service (such as Apache Hive) is required.
A Docker image is provided for quick testing of the TileDB-Presto connector. The image starts a single-node Presto cluster and opens the Presto CLI, where SQL can be run. It also includes two example TileDB arrays. The image can be started in any of the following ways:
# Run PrestoDB
docker run -it --rm tiledb/tiledb-presto

# Run PrestoDB adding your S3 access keys as env variables
docker run -e AWS_ACCESS_KEY_ID="<key>" -e AWS_SECRET_ACCESS_KEY="<secret>" -it tiledb/tiledb-presto

# Run PrestoDB by mounting an existing local array
docker run -it --rm -v /local/array/path:/data/local_array tiledb/tiledb-presto
You can run a quick example to see if it works:
show columns from "file:///opt/tiledb_example_arrays/dense_global";
 Column |  Type   | Extra |  Comment
--------+---------+-------+-----------
 rows   | integer |       | Dimension
 cols   | integer |       | Dimension
 a      | integer |       | Attribute
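
As a rough sketch of the kind of query the connector is designed for, the statement below selects a single attribute and restricts both dimensions of the bundled dense_global array. The column names come from the output above. The bounds are illustrative assumptions, since the array's domain is not listed here, and the bare quoted URI assumes the CLI session already defaults to the TileDB catalog and schema, as the show columns statement does.

```sql
-- Projection: only attribute a is requested, so column subselection applies.
-- The range predicates on the dimensions rows and cols are the kind of
-- filter the connector can push down to TileDB.
-- The bounds 1..2 are illustrative, not taken from the array's actual domain.
select a
from "file:///opt/tiledb_example_arrays/dense_global"
where rows between 1 and 2
  and cols between 1 and 2;

-- Assumption: an array mounted with -v /local/array/path:/data/local_array
-- is addressed the same way, by its path inside the container.
show columns from "file:///data/local_array";
```

The second statement only applies if the container was started with the volume mount shown earlier; otherwise that path does not exist inside the container.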