Embedded SQL

It is possible for MariaDB to act as an embedded SQL library in TileDB. By using this embedded library, you can avoid installing and maintaining an active MariaDB server on your machine. MariaDB acts as any other library dependency to TileDB, and contains TileDB as a pluggable storage engine so that you can query any TileDB array via SQL. When embedding MariaDB, you should note that you are subject to the GPL license of MariaDB itself.

You can use embedded SQL either via a Python, or directly from the MariaDB REPL, explained below.

Usage in Python

For convenience, TileDB offers a python package TileDB-SQL-Py, which is a fork of mysqlclient-python. The changes in the fork include adding back support for embedded linking, and packaging with binary wheels in Conda, PyPi and a Docker image. This allows you to simply import the Python package and run SQL queries against TileDB Arrays using Pandas.

You can install the package as follows:

Docker
Conda
docker run -it --rm tiledb/tiledb-sql-py
# Create a new conda environment
conda create -n tiledb
conda activate tiledb
# Install libtiledb-sql
conda install -c conda-forge libtiledb-sql
# Install a compiler, needed for cpython until binary is available in conda
conda install -c conda-forge c-compiler
# Install python tiledb-sql from git
pip install git+https://github.com/TileDB-Inc/TileDB-SQL-Py@ss/packaging-for-conda

You can run SQL queries against any TileDB array (on any backend) as follows:

Python
import tiledb.sql, pandas
db = tiledb.sql.connect()
pandas.read_sql(sql="select * from `<array_uri>`", con=db)

To set TileDB configuration parameters, you just need an init_command argument in connect:

Python
import tiledb.sql, pandas
db = tiledb.sql.connect(db="test",
init_command="set mytile_tiledb_config='rest.token=yyy,rest.server_address=https://api.tiledb.com,...'")
pandas.read_sql(sql="select * from `<array_uri>`", con=db)

When importing tiledb.sql you may get the following errors: Got ERROR: "Could not open mysql.plugin table. Some plugins may be not loaded" errno: 2000 Got ERROR: "Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist" errno: 2000 Got ERROR: "Can't open the mysql.func table. Please run mysql_upgrade to create it." errno: 2000

You can safely ignore these, since they are an artifact of the way we embed MariaDB as a library; MariaDB is looking for tables that do not exist as we do not run it as a server. We will remove those errors soon.

Usage via MariaDB REPL

The MariaDB client REPL is also available to use from Conda for exploring arrays directly.

Conda
conda create -n tiledb
conda activate tiledb
conda install -c conda-forge libtiledb-sql
# Copy error messages to expected directory
cp ${CONDA_PREFIX}/share/english/errmsg.sys ${CONDA_PREFIX}/errmsg.sys
# Create empty test dir for database connection
mkdir ${CONDA_PREFIX}/test
mysql_embedded test

You can issue any SQL queries:

SQL
select * from `<array_uri>`;