Embedded SQL
It is possible for MariaDB to act as an embedded SQL library in TileDB. By using this embedded library, you can avoid installing and maintaining an active MariaDB server on your machine. MariaDB acts as any other library dependency to TileDB, and contains TileDB as a pluggable storage engine so that you can query any TileDB array via SQL. When embedding MariaDB, you should note that you are subject to the GPL license of MariaDB itself.
You can use embedded SQL either via a Python, or directly from the MariaDB REPL, explained below.
Usage in Python
For convenience, TileDB offers a python package TileDB-SQL-Py, which is a fork of mysqlclient-python. The changes in the fork include adding back support for embedded linking, and packaging with binary wheels in Conda, PyPi and a Docker image. This allows you to simply import the Python package and run SQL queries against TileDB Arrays using Pandas.
You can install the package as follows:
You can run SQL queries against any TileDB array (on any backend) as follows:
To set TileDB configuration parameters, you just need an init_command
argument in connect
:
Note that, if your array URI is longer than 64 characters, you need to use MariaDB's assisted table discovery for long names by simply doing the following:
When importing tiledb.sql
you may get the following errors:
Got ERROR: "Could not open mysql.plugin table. Some plugins may be not loaded" errno: 2000
Got ERROR: "Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist" errno: 2000
Got ERROR: "Can't open the mysql.func table. Please run mysql_upgrade to create it." errno: 2000
You can safely ignore these, since they are an artifact of the way we embed MariaDB as a library; MariaDB is looking for tables that do not exist as we do not run it as a server. We will remove those errors soon.
Usage via MariaDB REPL
The MariaDB client REPL is also available to use from Conda for exploring arrays directly.
You can issue any SQL queries:
Last updated