Array Access

For reads, writes, embedded SQL, any integration, and any API, you can use TileDB Open Source with only two changes:

  • Set the TileDB configuration parameters rest.username and rest.password with your TileDB Cloud username and password, or alternatively rest.token with the API token you created.

  • Every array registered with TileDB Cloud must be accessed using a URI of the form tiledb://<namespace>/<array-name>, where <namespace> is the user or organization that owns the array and <array-name> is the array name set by the owner upon array registration. This URI is displayed on the console when viewing the array details.

Accessing arrays by setting an API token is typically faster than using your username and password.

Here is a Python example, although the above changes will work with any TileDB API or integration:

import tiledb, tiledb.cloud, tiledb.sql
import pandas

# Create the configuration parameters
config = tiledb.Config()
config["rest.username"] = "xxx"
config["rest.password"] = "yyy"
# or, preferably, config["rest.token"] = "ttt"

# This is the array URI format in TileDB Cloud
array_name = "tiledb://TileDB-Inc/quickstart_sparse"

# Write code exactly as in TileDB Open Source
with tiledb.open(array_name, 'r', ctx=tiledb.Ctx(config)) as A:
    print(A.df[:])

# A helper function tiledb.cloud.Ctx() exists to create a context
# automatically based on a previous call to tiledb.cloud.login()
tiledb.cloud.login(token="ttt")
with tiledb.open(array_name, 'r', ctx=tiledb.cloud.Ctx()) as A:
    print(A.df[:])

# Using embedded SQL, you need to pass the username/password or token
# as config parameters in `init_command`
db = tiledb.sql.connect(db="test",
        init_command="set mytile_tiledb_config='rest.username=xxx,rest.password=yyy'")
pandas.read_sql(sql="select * from `tiledb://TileDB-Inc/quickstart_sparse`", con=db)
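
The embedded SQL comment above mentions that either a username/password pair or a token can be passed, but only the former is shown. The following is a minimal sketch of building the `init_command` string for both cases. The helper name `mytile_init_command` is hypothetical, and the `rest.token=...` form is an assumption that the token follows the same comma-separated `key=value` layout as the username/password example.

```python
def mytile_init_command(params):
    """Build the init_command string from a dict of TileDB config parameters.

    Hypothetical helper: it only formats the string in the layout used by
    the username/password example (comma-separated key=value pairs).
    """
    pairs = ",".join(f"{key}={value}" for key, value in params.items())
    return f"set mytile_tiledb_config='{pairs}'"


# Username/password form, as in the example above
print(mytile_init_command({"rest.username": "xxx", "rest.password": "yyy"}))

# Token form (assumed to use the same layout)
print(mytile_init_command({"rest.token": "ttt"}))
```

The resulting string would be passed as `init_command` to `tiledb.sql.connect`. Values that themselves contain commas or quotes are not handled by this sketch.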

You can create an array inside or outside TileDB Cloud. The benefit of creating an array with TileDB Cloud is that it will be logged for auditing purposes. Moreover, it will be registered automatically with your account upon creation.

Create New Arrays

To instruct TileDB Open Source that you are creating an array through the TileDB Cloud service, you just need a single change:

  • Instead of using <array-uri> as you would typically in TileDB Open Source, you must use tiledb://<username>/<array-uri>. For example, if you wish to create an array at s3://my_bucket/my_array, you need to set the array URI to tiledb://my_username/s3://my_bucket/my_array.

import tiledb
import numpy as np

#####################
# Define the Schema #
#####################
# The array will be 4x4 with dimensions "rows" and "cols", with domain [1,4].
dom = tiledb.Domain(
    tiledb.Dim(name="rows", domain=(1, 4), tile=4, dtype=np.int32),
    tiledb.Dim(name="cols", domain=(1, 4), tile=4, dtype=np.int32),
)

# The array will be sparse with a single attribute "a" so each (i,j) cell can store an integer.
schema = tiledb.ArraySchema(
    domain=dom, sparse=True, attrs=[tiledb.Attr(name="a", dtype=np.int32)]
)

####################
# Create the Array #
####################
array_uri = "tiledb://my_username/s3://my_bucket/my_array"
# Create the (empty) array at the given URI (here, an S3 location accessed through TileDB Cloud).
tiledb.SparseArray.create(array_uri, schema)
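
The creation URI is the storage URI prefixed with `tiledb://<username>/`. As a small illustration, the sketch below composes and splits such URIs with plain string handling. Both function names are hypothetical and are not part of the TileDB API.

```python
PREFIX = "tiledb://"


def creation_uri(username, storage_uri):
    """Compose tiledb://<username>/<array-uri> from its two parts."""
    return f"{PREFIX}{username}/{storage_uri}"


def split_creation_uri(uri):
    """Split a creation URI back into (username, storage_uri)."""
    if not uri.startswith(PREFIX):
        raise ValueError(f"not a tiledb:// URI: {uri}")
    # Only the first "/" after the prefix separates the username from the
    # storage URI, which may itself contain slashes (e.g. s3://...).
    username, _, storage_uri = uri[len(PREFIX):].partition("/")
    return username, storage_uri


uri = creation_uri("my_username", "s3://my_bucket/my_array")
print(uri)
print(split_creation_uri(uri))
```

Keeping the composition in one place can help avoid accidentally dropping the `s3://` scheme from the storage part of the URI.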

Register Existing Array

You can also register an existing array programmatically. To do so, use one of the TileDB Cloud clients. See: Installation

import tiledb.cloud

tiledb.cloud.array.register_array(
    uri="s3://my_bucket/my_array",
    namespace="my_organization",
    array_name="my_array",
    description=None, # Optional string for markdown description
    access_credentials_name=None, # Optional access credential name. Use this to specify a credential that is not the namespace default
)
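
Once registered, the array should be addressable through the `tiledb://<namespace>/<array-name>` form described under Array Access. The sketch below derives that URI from the same values passed to `register_array`. The helper `registered_array_uri` is hypothetical and only formats a string; it does not contact TileDB Cloud.

```python
def registered_array_uri(namespace, array_name):
    """Format the access URI for an array registered under a namespace."""
    if not namespace or not array_name:
        raise ValueError("namespace and array_name must be non-empty")
    return f"tiledb://{namespace}/{array_name}"


# Same values as in the registration call above
registration = {
    "uri": "s3://my_bucket/my_array",
    "namespace": "my_organization",
    "array_name": "my_array",
}
access_uri = registered_array_uri(
    registration["namespace"], registration["array_name"]
)
print(access_uri)
```

Note that the access URI uses the registered name rather than the underlying storage location.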
