Serverless SQL

Basic Usage

SQL queries can be executed with the TileDB Cloud client as follows.
Python
R
1
import tiledb.cloud
2
3
tiledb.cloud.login(username="my_username", password="my_password")
4
# or tiledb.cloud.login(token="my_token")
5
6
# res is a Pandas dataframe
7
res = tiledb.cloud.sql.exec(
8
query = "select `rows`, AVG(a) as avg_a from `tiledb://TileDB-Inc/quickstart_dense` GROUP BY `rows`")
Copied!
1
library(tilebcloud)
2
3
tiledbcloud::login(username="my_username", password="my_password")
4
# or tiledbcloud::login(api_key="my_token")
5
6
# res is a data.frame
7
res <- execute_sql_query(
8
"select `cols`, AVG(a) as avg_a from `tiledb://TileDB-Inc/quickstart_dense` GROUP BY `cols`",
9
)
10
show(res)
Copied!
Supposing that there is an array tiledb://user/array_name that you have write permissions for, you can run a SQL query that writes the results to that array as follows:
Python
1
import tiledb.cloud
2
3
tiledb.cloud.login(username="my_username", password="my_password")
4
# or tiledb.cloud.login(token="my_token")
5
6
tiledb.cloud.sql.exec(
7
query = "select `rows`, AVG(a) as avg_a from `tiledb://TileDB-Inc/quickstart_dense` GROUP BY `rows`",
8
output_uri = "tiledb://user/array_name")
Copied!
You can run any SQL statement, including CTEs.
Python
1
import tiledb.cloud
2
3
tiledb.cloud.login(username="my_username", password="my_password")
4
# or tiledb.cloud.login(token="my_token")
5
6
# INSERT INTO ...
7
tiledb.cloud.sql.exec(query = "insert into `tiledb://user1/output1` select AVG(a) from `tiledb://user1/myarray1`")
8
9
# CTE
10
tiledb.cloud.sql.exec(query =
11
"""
12
WITH t1 AS (select `dim`, `dim2`, AVG(a), AVG(a) OVER (partition by dim2) FROM `tiledb://user1/array1`)
13
insert into `tiledb://user/array_name`
14
select * from t1
15
""")
Copied!
If the array does not exist, you will just need to create and pass an array schema that complies with the result schema.
Python
1
import tiledb.cloud
2
import numpy as np
3
import pandas
4
5
tiledb.cloud.login(username="my_username", password="my_password")
6
# or tiledb.cloud.login(token="my_token")
7
8
# The output array has a single dimension
9
dom = tiledb.Domain(tiledb.Dim(name="rows", domain=(1, 4), tile=4, dtype=np.int32))
10
11
# The output array has a single attribute
12
attr = tiledb.Attr(name="avg_a", dtype=np.float32)
13
14
# The output array is sparse
15
schema = tiledb.ArraySchema(domain=dom, sparse=True, attrs=[attr])
16
17
# The following will create the output array and write the results to it
18
tiledb.cloud.sql.exec(
19
query = "select `rows`, AVG(a) as avg_a from `tiledb://TileDB-Inc/quickstart_dense` GROUP BY `rows`",
20
output_schema = schema,
21
output_uri = "tiledb://user/s3://my_bucket/my_array") # Notice the full path here
Copied!
We also provide an auxiliary function exec_and_fetch, which executes a SQL query as above, but then opens the resulting output array so that it is ready for use.
Python
1
import tiledb.cloud
2
import pandas
3
4
tiledb.cloud.login(username="my_username", password="my_password")
5
# or tiledb.cloud.login(token="my_token")
6
7
A = tiledb.cloud.sql.exec_and_fetch(
8
query = "select `rows`, AVG(a) as avg_a from `tiledb://TileDB-Inc/quickstart_dense` GROUP BY `rows`",
9
output_uri="tiledb://user/array_name")
10
11
# `A` is an opened array for reading, ready for use
12
print(A.df[:])
Copied!

Selecting Who to Charge

If you are a member of an organization, then by default the organization is charged for your SQL query. If you would like to charge the SQL query to yourself, you just need to add one extra argument namespace.
Python
1
import tiledb.cloud
2
3
tiledb.cloud.login(username="my_username", password="my_password")
4
# or tiledb.cloud.login(token="my_token")
5
6
tiledb.cloud.sql.exec(
7
namespace ="my_username", # Who to charge the query to
8
query = "select `rows`, AVG(a) as avg_a from `tiledb://TileDB-Inc/quickstart_dense` GROUP BY `rows`",
9
output_uri = "tiledb://user/array_name")
Copied!

Async Execution

An asynchronous version of severless SQL is available. The _async version returns a future.
Python
1
import tiledb.cloud
2
3
tiledb.cloud.login(username="my_username", password="my_password")
4
# or tiledb.cloud.login(token="my_token")
5
6
# res is a future
7
res = tiledb.cloud.sql.exec_async(
8
query = "select `rows`, AVG(a) as avg_a from `tiledb://TileDB-Inc/quickstart_dense` GROUP BY `rows`")
9
10
# Call res.get() to block on the results
11
print(res.get())
Copied!

Creating New Arrays

It is also possible to use SQL to create a new array.
Python
1
import tiledb.cloud
2
3
tiledb.cloud.login(username="my_username", password="my_password")
4
# or tiledb.cloud.login(token="my_token")
5
6
# Create an array that will be registered in TileDB Cloud and stored in the
7
# S3 bucket and location specied in the `uri` parameter. The `uri` parameter
8
# overrides the normal use of the table name `example`.
9
# The array is automatically registered in TileDB cloud
10
# under `my_username`/`create_example_1`
11
tiledb.cloud.sql.exec("""
12
CREATE TABLE `example` (
13
dim0 integer DIMENSION=1 lower_bound="0" upper_bound="100" tile_extent="10",
14
dim1 integer DIMENSION=1 lower_bound="0" upper_bound="100" tile_extent="10",
15
attr1 varchar(255) filters="GZIP=-1",
16
attr2 FLOAT filters="GZIP=-1",
17
attr3 DATETIME filters="GZIP=-1"
18
) engine=MyTile
19
array_type='SPARSE'
20
coordinate_filters="NONE"
21
offset_filters="POSITIVE_DELTA=128"
22
uri='tiledb://my_username/s3://my_bucket/create_example_1';
23
""")
24
25
# You can also use `Primary Key` to specify which fields are the dimensions
26
# This replaces the `DIMENSION=1` parameter
27
# The array is automatically registered in TileDB cloud
28
# under `my_username`/`create_example_2`
29
tiledb.cloud.sql.exec("""
30
CREATE TABLE `example` (
31
dim0 integer lower_bound="0" upper_bound="100" tile_extent="10" NOT NULL,
32
dim1 bigint lower_bound="0" upper_bound="100" tile_extent="10" NOT NULL,
33
attr1 varchar(255) filters="GZIP=-1",
34
PRIMARY KEY(dim0, dim1)
35
) engine=MyTile
36
array_type='SPARSE'
37
coordinate_filters="NONE"
38
offset_filters="POSITIVE_DELTA=128"
39
uri='tiledb://my_username/s3://my_bucket/create_example_2';
40
""")
41
42
# Lastly you can also use `INDEX()` to set dimensions and allow duplicates
43
# By default duplicate coordinates for dimensions are not allowed,
44
# uniqueness is enforced just like a primary key. Using `Index()`
45
# instead triggers the array to be created allowing duplicates coordinate values.
46
# The array is automatically registered in TileDB cloud
47
# under `my_username`/`create_example_3`
48
tiledb.cloud.sql.exec("""
49
CREATE TABLE `example` (
50
dim0 integer lower_bound="0" upper_bound="100" tile_extent="10" NOT NULL,
51
dim1 varchar(255) NOT NULL,
52
attr1 varchar(255) filters="GZIP=-1",
53
INDEX(dim0, dim1)
54
) engine=MyTile
55
array_type='SPARSE'
56
coordinate_filters="NONE"
57
offset_filters="POSITIVE_DELTA=128"
58
uri='tiledb://my_username/s3://my_bucket/create_example_3';
59
""")
Copied!

Retry Settings

See Retry Settings for more information on this topic.