Quickstart

TileDB-Presto is a data source connector for PrestoDB that lets you run SQL queries on TileDB arrays. The connector supports column subselection on attributes and predicate pushdown on dimension fields, so projection and range queries read only the attributes and dimension ranges they need.

The TileDB-Presto connector supports most SQL operations from PrestoDB. Arrays can be referenced dynamically and are not required to be "pre-registered" with Presto. No external service (such as Apache Hive) is required.

A Docker image is provided for quick testing of the TileDB-Presto connector. The image starts a single-node Presto cluster and opens the Presto CLI, where you can run SQL. It includes two example TileDB arrays:

  • /opt/tiledb_example_arrays/dense_global (dense array)

  • /opt/tiledb_example_arrays/sparse_global (sparse array)

Run one of the following commands, depending on where your arrays are stored:

# Run PrestoDB
docker run -it --rm tiledb/tiledb-presto

# Run PrestoDB adding your S3 access keys as env variables
docker run -e AWS_ACCESS_KEY_ID="<key>" -e AWS_SECRET_ACCESS_KEY="<secret>" -it tiledb/tiledb-presto

# Run PrestoDB by mounting an existing local array
docker run -it --rm -v /local/array/path:/data/local_array tiledb/tiledb-presto

Once the CLI prompt appears, run a quick example to confirm that the connector works:

show columns from "file:///opt/tiledb_example_arrays/dense_global";
 Column |  Type   | Extra |  Comment  
--------+---------+-------+-----------
 rows   | integer |       | Dimension 
 cols   | integer |       | Dimension 
 a      | integer |       | Attribute

select * from "file:///opt/tiledb_example_arrays/dense_global"
WHERE rows = 3 AND cols between 1 and 2;
 rows | cols | a 
------+------+---
    3 |    1 | 5 
    3 |    2 | 6

To run SQL from a file, mount the file into the container and pass it with --file:

echo 'select * from "file:///opt/tiledb_example_arrays/dense_global" limit 10;' > example.sql
docker run -it --rm -v "${PWD}/example.sql:/tmp/example.sql" tiledb/tiledb-presto /opt/presto/bin/entrypoint.sh --file /tmp/example.sql
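
For more than one statement, a here-document is often easier to maintain than echo. The following is a minimal sketch: the file name example_queries.sql is arbitrary, and it assumes that --file accepts several semicolon-terminated statements, as the standard Presto CLI does. The second query selects only attribute a and filters on both dimensions, which is the kind of query that the column subselection and predicate pushdown described above are meant to help.

```shell
# Write two statements to a SQL file. The quoted 'EOF' delimiter keeps the
# shell from expanding anything inside the here-document.
cat > example_queries.sql <<'EOF'
show columns from "file:///opt/tiledb_example_arrays/dense_global";

select a from "file:///opt/tiledb_example_arrays/dense_global"
where rows = 3 and cols between 1 and 2;
EOF

# Sanity check: count the lines that end a statement with a semicolon.
grep -c ';$' example_queries.sql
```

The resulting file can be mounted and passed with --file in the same way as example.sql in the command above.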

You can also run a SQL statement directly:

docker run -it --rm tiledb/tiledb-presto /opt/presto/bin/entrypoint.sh --execute 'select * from "file:///opt/tiledb_example_arrays/dense_global" limit 10;'
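
The --execute option can be combined with the volume mount shown earlier to query your own array in one command. The sketch below only assembles and prints that command so you can review it before running it. It assumes that a mounted array is addressed the same way as the bundled examples, that is, by its path inside the container written as a file:// URI. HOST_ARRAY is a hypothetical variable name, and /local/array/path is the placeholder path from the mount example.

```shell
# Hypothetical host path; replace with the directory of a real TileDB array.
HOST_ARRAY="${HOST_ARRAY:-/local/array/path}"
# Mount point inside the container, matching the mount example earlier.
CONTAINER_ARRAY="/data/local_array"

# Assumption: the table name is the container-side path as a file:// URI.
QUERY="select * from \"file://${CONTAINER_ARRAY}\" limit 10;"

# Print the full command for review instead of running it.
printf '%s\n' "docker run -it --rm -v ${HOST_ARRAY}:${CONTAINER_ARRAY} tiledb/tiledb-presto /opt/presto/bin/entrypoint.sh --execute '${QUERY}'"
```

Copy the printed command into your terminal once the paths look right.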
