The TileDB Cloud client offers several useful utilities. To use them, you must have the client installed (see Installation).
Login Sessions
TileDB Cloud lets you log in (with your username and password, or with an API token) so that the session token is cached and you do not have to log in again on every program execution. This is done as follows:
# With username/password
tiledb.cloud.login(username='xxx', password='xxx')

# Or, with token
tiledb.cloud.login(token='xxx')

# With username/password
tiledbcloud::login(username='xxx', password='xxx')

# Or, with token
tiledbcloud::login(api_key='xxx')

// Use the API token for the Java client. You can leave username and password as null.
TileDBClient tileDBClient = new TileDBClient(
    new TileDBLogin("username", "password", "<TILEDB_API_TOKEN>", true, true, true));
After you log in for the first time, the TileDB Cloud client stores a session token in the configuration file $HOME/.tiledb/cloud.json, which is created in your home directory.
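As a quick way to check whether a session has already been cached, the following minimal sketch looks for the configuration file at the path given above. The helper names are hypothetical and are not part of the TileDB Cloud client; the check only tells you that the file exists, not that the stored token is still valid.

```python
from pathlib import Path


def cloud_config_path(home=None):
    """Return the expected location of the cached TileDB Cloud config file."""
    base = Path(home) if home is not None else Path.home()
    return base / ".tiledb" / "cloud.json"


def has_cached_session(home=None):
    """Report whether the config file exists (this says nothing about token validity)."""
    return cloud_config_path(home).is_file()


if __name__ == "__main__":
    print(cloud_config_path(), has_cached_session())
```

Passing an explicit `home` directory makes the helper easy to try against a temporary directory without touching your real configuration.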
Retry Settings
The TileDB Cloud clients have the ability to retry failed HTTP requests automatically. By default this is enabled for retrying when TileDB Cloud indicates there is not enough capacity for the request (HTTP 503 errors). For convenience we also offer the ability to disable retries or to enable more forceful retry settings.
Built in modes
# Default mode: retry only "not enough capacity" responses
tiledb.cloud.client.client.retry_mode("default")

# Do not retry any requests
tiledb.cloud.client.client.retry_mode("disabled")

# Retry in a large number of scenarios
tiledb.cloud.client.client.retry_mode("forceful")
In "forceful" mode it is possible that the client might retry requests which will always fail, such as when there is a syntax error in a SQL query. This mode should be used with care to avoid increased costs from retrying.
All built-in modes (besides "disabled") will retry a request up to 10 times.
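To illustrate why "forceful" mode can increase costs, here is a minimal, self-contained sketch of a retry loop. It is not the client's implementation: `send_with_retries` and the status sets are hypothetical, and the "forceful" set below is only a stand-in, since the exact conditions that mode retries on are not listed here. The example also assumes that a request that can never succeed surfaces as HTTP 400.

```python
DEFAULT_STATUSES = {503}  # "not enough capacity", as described in the text
# Hypothetical stand-in for "forceful"; the real list is not given in this document.
FORCEFUL_STATUSES = {400, 500, 501, 502, 503}
DISABLED_STATUSES = set()  # never retry


def send_with_retries(send, retry_statuses, max_retries=10):
    """Call send() until it returns a non-retryable status or retries run out.

    Returns (final_status, number_of_attempts).
    """
    attempts = 0
    while True:
        status = send()
        attempts += 1
        if status not in retry_statuses or attempts > max_retries:
            return status, attempts


def always_bad_request():
    # Simulates a request that will always fail (assumed here to return HTTP 400).
    return 400


if __name__ == "__main__":
    for name, statuses in [
        ("default", DEFAULT_STATUSES),
        ("forceful", FORCEFUL_STATUSES),
        ("disabled", DISABLED_STATUSES),
    ]:
        print(name, send_with_retries(always_bad_request, statuses))
```

In this sketch the always-failing request is sent once under the default and disabled settings, but once plus ten retries under the broad set, which is the kind of repeated work the caution above refers to.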
Custom Retry Logic
It is also possible to manually set retry conditions to suit your needs.
import tiledb.cloud
from urllib3 import Retry

# Set the retries to a urllib3 Retry object
tiledb.cloud.config.config.retries = Retry(
    total=10,
    backoff_factor=0.25,
    status_forcelist=[400, 500, 501, 502, 503],
    allowed_methods=[
        "HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE", "POST", "PATCH",
    ],
    raise_on_status=False,
    remove_headers_on_redirect=[],
)

# After updating the config, make sure to update the package-level clients
tiledb.cloud.client.client.update_clients()
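When choosing `total` and `backoff_factor`, it can help to estimate how long a request might wait in the worst case. The sketch below approximates the exponential backoff that urllib3 documents for `Retry`: no sleep before the first retry, then `backoff_factor * 2 ** (n - 1)` seconds before the n-th consecutive retry, capped at a maximum that is assumed here to be 120 seconds. It is plain Python rather than a call into urllib3, it ignores jitter and `Retry-After` headers, and the exact behaviour may differ between urllib3 versions, so treat the numbers as an estimate.

```python
def backoff_schedule(total=10, backoff_factor=0.25, backoff_max=120.0):
    """Approximate sleep (in seconds) before each retry of consecutive failures.

    Assumed formula: 0 before the first retry, then
    backoff_factor * 2 ** (n - 1) before the n-th retry, capped at backoff_max.
    """
    delays = []
    for n in range(1, total + 1):
        raw = 0.0 if n == 1 else backoff_factor * 2 ** (n - 1)
        delays.append(min(backoff_max, raw))
    return delays


if __name__ == "__main__":
    delays = backoff_schedule()
    print("per-retry sleeps (s):", delays)
    print("worst-case total wait (s):", sum(delays))
```

With the values from the example above, most of the waiting time comes from the last few retries, so lowering `total` shortens the worst case far more than lowering `backoff_factor` slightly.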
Context and Config
Two helper functions let you easily create a TileDB config or context with the configuration needed for slicing arrays through TileDB Cloud.
# Create a TileDB Config object with `rest.token` set from the login
cfg = tiledb.cloud.Config()

# Create a TileDB Context which has a config with `rest.token` set from the login
ctx = tiledb.cloud.Ctx()

# Create a TileDB Config object with `rest.token` set from the login
config <- tiledb_config()

# Create a TileDB Context which has a config with `rest.token` set from the login
ctx <- tiledb_ctx(config)
Viewing the User Profile
You can see your user profile as follows:
prof = tiledb.cloud.user_profile()
print(prof)

prof <- tiledbcloud::user_profile()
print(prof)

UserApi apiInstance = new UserApi(defaultClient);
try {
    User result = apiInstance.getUser();
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling UserApi#getUser");
    System.err.println("Status code: " + e.getCode());
    System.err.println("Reason: " + e.getResponseBody());
    System.err.println("Response headers: " + e.getResponseHeaders());
    e.printStackTrace();
}
Listing Arrays
You can list arrays from the cloud service, passing a variety of filters:
# List all arrays you own
owned_arrays = tiledb.cloud.list_arrays()
print(owned_arrays)

# List all arrays that are shared with you
shared_arrays = tiledb.cloud.list_shared_arrays()
print(shared_arrays)

# List all public arrays
public_arrays = tiledb.cloud.list_public_arrays()
print(public_arrays)

# List arrays in a specific namespace
tiledb_inc_arrays = tiledb.cloud.list_arrays(namespace="TileDB-Inc")
print(tiledb_inc_arrays)

# Filter arrays to only those you have read and write permissions to
rw_arrays = tiledb.cloud.list_arrays(permissions=["read", "write"])
print(rw_arrays)

# You can combine filters
arrays = tiledb.cloud.list_arrays(namespace="TileDB-Inc", permissions=["read"])
print(arrays)

# Search keywords
arrays = tiledb.cloud.list_public_arrays(namespace="TileDB-Inc", search="nyc")
print(arrays)

# List specific asset types that are based upon Arrays. See also Listing Groups
# Available asset types:
#   FileType.FILE
#   FileType.USER_DEFINED_FUNCTION
#   FileType.REGISTERED_TASK_GRAPH
#   FileType.ML_MODEL
#   FileType.NOTEBOOK
assets = tiledb.cloud.list_public_arrays(
    namespace="TileDB-Inc",
    file_type=[
        tiledb.cloud.rest_api.FileType.NOTEBOOK,
        tiledb.cloud.rest_api.FileType.ML_MODEL,
    ],
)
print(assets)

# List all arrays you own
owned_arrays <- tiledbcloud::list_arrays()
str(owned_arrays)

# List all arrays that are shared with you
shared_arrays <- tiledbcloud::list_arrays(shared=TRUE)
str(shared_arrays)

# List all public arrays
public_arrays <- tiledbcloud::list_arrays(public=TRUE)
str(public_arrays)

# List arrays in a specific namespace
tiledb_inc_arrays <- tiledbcloud::list_arrays(namespace="TileDB-Inc")
str(tiledb_inc_arrays)

String namespace = "<TILEDB_NAMESPACE>"; // String | namespace array is in (an organization name or user's username)
try {
    // Get arrays in namespace
    List<ArrayInfo> result = apiInstance.getArraysInNamespace(namespace);
} catch (ApiException e) {
    System.err.println("Status code: " + e.getCode());
    System.err.println("Reason: " + e.getResponseBody());
    System.err.println("Response headers: " + e.getResponseHeaders());
    e.printStackTrace();
}
Getting Array Information
You can run the following to get basic information about the array, such as its description:
info = tiledb.cloud.info("tiledb://TileDB-Inc/quickstart_sparse")
print(info)

info <- tiledbcloud::array_info(namespace="TileDB-Inc", arrayname="quickstart_sparse")
str(info)

String namespace = "TileDB-Inc"; // String | namespace array is in (an organization name or user's username)
String array = "quickstart_sparse"; // String | name/uri of array that is url-encoded
try {
    ArrayInfo result = apiInstance.getArrayMetadata(namespace, array);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling ArrayApi#getArrayMetadata");
    System.err.println("Status code: " + e.getCode());
    System.err.println("Reason: " + e.getResponseBody());
    System.err.println("Response headers: " + e.getResponseHeaders());
    e.printStackTrace();
}
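The Python example above addresses the array with a single `tiledb://` URI, while the R and Java examples pass the namespace and the array name separately. If you need to move between the two forms, a small helper like the following sketch can split the URI. The function name is hypothetical and not part of any TileDB client; it only assumes the `tiledb://<namespace>/<array>` layout used throughout this page.

```python
def split_tiledb_uri(uri):
    """Split 'tiledb://<namespace>/<array>' into (namespace, array_name)."""
    prefix = "tiledb://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a tiledb:// URI: {uri!r}")
    # Split on the first slash only, so the array part may itself contain slashes.
    namespace, sep, array_name = uri[len(prefix):].partition("/")
    if not namespace or not sep or not array_name:
        raise ValueError(f"expected tiledb://<namespace>/<array>, got {uri!r}")
    return namespace, array_name


if __name__ == "__main__":
    print(split_tiledb_uri("tiledb://TileDB-Inc/quickstart_sparse"))
```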
Array Activity
Array activity can be fetched programmatically as follows:
String namespace = "<NAMESPACE>"; // String | namespace array is in (an organization name or user's username)
String array = "<ARRAY_NAME>"; // String | name/uri of array that is url-encoded
Integer start = null; // Integer | Start time of window of fetch logs, unix epoch in seconds (default: seven days ago)
Integer end = null; // Integer | End time of window of fetch logs, unix epoch in seconds (default: current utc timestamp)
String eventTypes = null; // String | Event values can be one or more of the following read, write, create, delete, register, deregister, comma separated
String taskId = null; // String | Array task ID to filter activity to
Boolean hasTaskId = false; // Boolean | Excludes activity log results that do not contain an array task UUID
try {
    List<ArrayActivityLog> result = apiInstance.arrayActivityLog(namespace, array, start, end, eventTypes, taskId, hasTaskId);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling ArrayApi#arrayActivityLog");
    System.err.println("Status code: " + e.getCode());
    System.err.println("Reason: " + e.getResponseBody());
    System.err.println("Response headers: " + e.getResponseHeaders());
    e.printStackTrace();
}
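The `start` and `end` parameters above are Unix epoch timestamps in seconds. As a minimal sketch of how such a window could be computed, the helper below returns the bounds for the last N days in UTC. The function name and its `now` parameter are hypothetical and only serve the illustration; the seven-day default mirrors the default window mentioned in the parameter comments.

```python
from datetime import datetime, timedelta, timezone


def activity_window(days_back=7, now=None):
    """Return (start, end) as Unix epoch seconds covering the last `days_back` days."""
    # Use a timezone-aware "now" so the timestamps are unambiguous UTC values.
    end_dt = now if now is not None else datetime.now(timezone.utc)
    start_dt = end_dt - timedelta(days=days_back)
    return int(start_dt.timestamp()), int(end_dt.timestamp())


if __name__ == "__main__":
    start, end = activity_window()
    print(start, end)
```

Accepting `now` as an argument keeps the helper deterministic when you want to query a fixed historical window rather than one relative to the current time.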
Listing Tasks
You can list tasks from the cloud service, passing a variety of filters:
# List all tasks
all_tasks = tiledb.cloud.fetch_tasks()
print(all_tasks)

# List only tasks on a specific array
array_tasks = tiledb.cloud.fetch_tasks(array="tiledb://TileDB-Inc/quickstart_sparse")
print(array_tasks)

# List tasks within a specific time period
import datetime
ninety_days_ago = datetime.datetime.utcnow() - datetime.timedelta(days=90)
datetime_tasks = tiledb.cloud.fetch_tasks(
    array="tiledb://TileDB-Inc/quickstart_sparse", start=ninety_days_ago
)
print(datetime_tasks)

# Filter tasks by status; valid statuses are RUNNING, FAILED, COMPLETED
running_tasks = tiledb.cloud.fetch_tasks(status="RUNNING")
print(running_tasks)

TasksApi apiInstance = new TasksApi(defaultClient);
String namespace = "<NAMESPACE>"; // String | namespace to filter
String createdBy = null; // String | username to filter
String array = "<ARRAY_URI>"; // String | name/uri of array that is url-encoded to filter
Integer start = null; // Integer | start time for tasks to filter by
Integer end = null; // Integer | end time for tasks to filter by
Integer page = null; // Integer | pagination offset
Integer perPage = null; // Integer | pagination limit
String type = null; // String | task type, "QUERY", "SQL", "UDF", "GENERIC_UDF"
List<String> excludeType = Arrays.asList(); // List<String> | task_type to exclude matching array in results, more than one can be included
List<String> fileType = Arrays.asList(); // List<String> | match file_type of task array, more than one can be included
List<String> excludeFileType = Arrays.asList(); // List<String> | exclude file_type of task arrays, more than one can be included
String status = null; // String | Filter to only return these statuses
String search = null; // String | search string that will look at name, namespace or description fields
String orderby = null; // String | sort by which field; valid values include start_time, name
try {
    ArrayTaskData result = apiInstance.tasksGet(namespace, createdBy, array, start, end, page, perPage, type, excludeType, fileType, excludeFileType, status, search, orderby);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling TasksApi#tasksGet");
    System.err.println("Status code: " + e.getCode());
    System.err.println("Reason: " + e.getResponseBody());
    System.err.println("Response headers: " + e.getResponseHeaders());
    e.printStackTrace();
}
For convenience, you can also see the last SQL or UDF task:
# Get last SQL task
tiledb.cloud.last_sql_task()

# Get last UDF task
tiledb.cloud.last_udf_task()
Or you can get a specific task with a given task ID (which can be found on the UI console):
task = tiledb.cloud.task(id='xxx')
Registering an Array
In addition to registering S3-stored TileDB arrays with TileDB Cloud via the console, you can also register them programmatically as follows:
tiledb.cloud.register_array(
    uri="s3://mybucket/myarray",
    namespace="user1",  # Optional, you may register it under your username, or one of your organizations
    array_name="myarray",
    description=None,  # Optional
    access_credentials_name="myCredentials",  # You must have already added your AWS credentials on the console
)

String namespace = "<NAMESPACE>"; // String | namespace array is in (an organization name or user's username)
String array = "s3://<S3_BUCKET>/<ARRAY_NAME>"; // String | name/uri of s3 array that is url-encoded
ArrayInfoUpdate arrayMetadata = new ArrayInfoUpdate(); // ArrayInfoUpdate | metadata associated with array
arrayMetadata.setUri("s3://<S3_BUCKET>/<ARRAY_NAME>");
arrayMetadata.setName("<ARRAY_NAME>");
try {
    ArrayInfo result = arrayApi.registerArray(namespace, array, arrayMetadata);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling ArrayApi#registerArray");
    System.err.println("Status code: " + e.getCode());
    System.err.println("Reason: " + e.getResponseBody());
    System.err.println("Response headers: " + e.getResponseHeaders());
    e.printStackTrace();
}

// Deregister a registered array
ArrayApi apiInstance = new ArrayApi(defaultClient);
String namespace = "<NAMESPACE>"; // String | namespace array is in (an organization name or user's username)
String array = "<ARRAY_NAME>"; // String | name/uri of array that is url-encoded
try {
    apiInstance.deregisterArray(namespace, array);
} catch (ApiException e) {
    System.err.println("Exception when calling ArrayApi#deregisterArray");
    System.err.println("Status code: " + e.getCode());
    System.err.println("Reason: " + e.getResponseBody());
    System.err.println("Response headers: " + e.getResponseHeaders());
    e.printStackTrace();
}
Deregistering an array will not physically delete it.
Sharing Arrays
You can programmatically share a registered array, "unshare" a registered array (i.e., revoke access) and list array sharing information as follows:
# Share an array with both read and write permissions with a user
tiledb.cloud.share_array(
    uri="tiledb://user1/myarray",
    namespace="user1",  # The user to share the array with
    permissions=["read", "write"],
)

# Revoke access to an array for a particular user
tiledb.cloud.unshare_array(uri="tiledb://user1/myarray", namespace="user1")

# Get sharing information about an array
shared_with = tiledb.cloud.list_shared_with("tiledb://user1/myarray")
print(shared_with)

ArrayApi apiInstance = new ArrayApi(defaultClient);
String namespace = "namespace_example"; // String | namespace array is in (an organization name or user's username) to share the array with.
String array = "array_example"; // String | name/uri of array that is url-encoded
// ArraySharing | Namespace and list of permissions to share with. An empty list of
// permissions will remove the namespace; if permissions already exist they will be
// deleted then new ones added. In the event of a failure, the new policies will be
// rolled back to prevent partial policies, and it's likely the array will not be
// shared with the namespace at all.
ArraySharing arraySharing = new ArraySharing();
arraySharing.addActionsItem(ArrayActions.READ); // Enable read permissions
try {
    // Share an array with read permissions
    apiInstance.shareArray(namespace, array, arraySharing);

    // To unshare an array, use an empty ArraySharing object (no permissions)
    arraySharing = new ArraySharing();
    apiInstance.shareArray(namespace, array, arraySharing);

    // Get sharing information about an array
    List<ArraySharing> result = apiInstance.getArraySharingPolicies(namespace, array);
} catch (ApiException e) {
    System.err.println("Status code: " + e.getCode());
    System.err.println("Reason: " + e.getResponseBody());
    System.err.println("Response headers: " + e.getResponseHeaders());
    e.printStackTrace();
}
Invite to Array
Similar to Sharing Arrays, you can invite users to an array as follows:
When accessing an array or group via the API, your request will be automatically routed to the instance closest to the data. If you already know the region, a compute region can be accessed directly with a configured parameter to manually bypass automatic redirection. Manually specifying the region can be helpful if you want to avoid the slight increase in latency that the redirection adds.
To access a region directly, use a domain of the form <region>.aws.api.tiledb.com.
The five domains we currently support are:
us-east-1.aws.api.tiledb.com
us-west-2.aws.api.tiledb.com
eu-west-1.aws.api.tiledb.com
eu-west-2.aws.api.tiledb.com
ap-southeast-1.aws.api.tiledb.com
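The domain scheme above can be captured in a small helper that builds the server address for a region and rejects anything outside the five listed domains. This is a sketch only: the function and constant names are hypothetical, and the region list is copied from this page, so it would need updating if the supported regions change.

```python
# Regions taken from the list of supported domains on this page.
SUPPORTED_REGIONS = (
    "us-east-1",
    "us-west-2",
    "eu-west-1",
    "eu-west-2",
    "ap-southeast-1",
)


def regional_server_address(region):
    """Build the HTTPS address for a region, following <region>.aws.api.tiledb.com."""
    if region not in SUPPORTED_REGIONS:
        raise ValueError(
            f"unsupported region {region!r}; expected one of {', '.join(SUPPORTED_REGIONS)}"
        )
    return f"https://{region}.aws.api.tiledb.com"


if __name__ == "__main__":
    print(regional_server_address("eu-west-2"))
```

The returned string is the kind of value used for the `rest.server_address` configuration parameter, or for the `host` argument of `tiledb.cloud.login`.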
You can manually set the domain to send a request directly to a region as follows:
import tiledb, tiledb.sql
import pandas

# Create the configuration parameters
config = tiledb.Config()
config["rest.username"] = "xxx"
config["rest.password"] = "yyy"
# or, more preferably, config["rest.token"] = "my_token"

# Manually set the server address to the redirection URL
config["rest.server_address"] = "https://eu-west-2.aws.api.tiledb.com"

# This is the array URI format in TileDB Cloud
array_name = "tiledb://TileDB-Inc/quickstart_sparse-eu-west-2"

# Write code exactly as in TileDB Developer
with tiledb.open(array_name, 'r', ctx=tiledb.Ctx(config)) as A:
    print(A.df[:])

# Using embedded SQL, you need to pass the username/password
# as config parameters as well as the server address in `init_command`
db = tiledb.sql.connect(
    db="test",
    init_command="set mytile_tiledb_config='rest.username=xxx,rest.password=yyy,rest.server_address=https://eu-west-2.aws.api.tiledb.com'",
)
pandas.read_sql(sql="select * from `tiledb://TileDB-Inc/quickstart_sparse-eu-west-2`", con=db)

import tiledb, tiledb.cloud, numpy

def median(numpy_ordered_dictionary):
    return numpy.median(numpy_ordered_dictionary["a"])

tiledb.cloud.login(username="xxx", password="yyy", host="https://eu-west-2.aws.api.tiledb.com")
# or tiledb.cloud.login(token="my_token", host="https://eu-west-2.aws.api.tiledb.com")

with tiledb.open("tiledb://TileDB-Inc/quickstart_sparse-eu-west-2", ctx=tiledb.cloud.Ctx()) as A:
    # Apply on subarray [1,2]x[1,2]
    res = A.apply(median, [(1, 2), (1, 2)], attrs=["a"])
    print(res)

// Set the config
Config config = new Config();
config.set("rest.username", "xxx");
config.set("rest.password", "yyy");
config.set("rest.server_address", "https://eu-west-2.aws.api.tiledb.com");
Context ctx = new Context(config);

// Open the array as you would normally do in the Java API
Array array = new Array(ctx, "tiledb://TileDB-Inc/quickstart_sparse-eu-west-2");

// Print the array schema
System.out.println(array.getSchema());
Files
TileDB Cloud can convert files to and from the TileDB File representation. This allows you to store any arbitrary file as a one-dimensional dense array. Importing from and exporting to the original file format is supported directly through TileDB Cloud. The file arrays can be stored directly on an object store, such as S3.
# Import from S3 to a TileDB array,
# automatically registering it with TileDB Cloud
tiledb.cloud.files.utils.create_file(
    namespace="my_organization",
    name="my_file",  # Optional name to set for registered file
    input_uri="s3://my_bucket/files/my_file.pdf",
    output_uri="s3://my_bucket/files/arrays/my_file",
)

# Export back to S3 in the original format
# The export happens completely in TileDB Cloud
tiledb.cloud.files.utils.export_file(
    uri="tiledb://my_organization/my_file",
    output_uri="s3://my_bucket/files/arrays/my_file",
)

# Export back to local filesystem in the original format
tiledb.cloud.files.utils.export_file(
    uri="tiledb://my_organization/my_file",
    output_uri="my_file_exported.pdf",
)

// Use the TileDB-Java package
// Set up the config with your TileDB-Cloud credentials
Config config = new Config();

// For S3 access
config.set("vfs.s3.aws_access_key_id", "<ID>");
config.set("vfs.s3.aws_secret_access_key", "<KEY>");

// For TileDB-Cloud access.
// You can either use rest.username and rest.password
config.set("rest.username", "<USERNAME>");
config.set("rest.password", "<PASSWORD>");
// Or rest.token
config.set("rest.token", "<TOKEN>");

Context ctx = new Context(config);

// Create the array schema of an array based on the file to be saved
ArraySchema arraySchema = FileStore.schemaCreate(ctx, "<FILENAME>");

// Create a TileDB array with the schema
Array.create("tiledb://<NAMESPACE_NAME>/s3://<BUCKET_NAME>/<ARRAY_NAME>", arraySchema);

// Import the file to be saved to the TileDB array
FileStore.uriImport(
    ctx, "tiledb://<NAMESPACE_NAME>/<ARRAY_NAME>", "<FILENAME>", MimeType.TILEDB_MIME_AUTODETECT);

// Export/download the file from TileDB and save it with a given name
FileStore.uriExport(ctx, "tiledb://<NAMESPACE_NAME>/<ARRAY_NAME>", "<OUTPUT_FILENAME>");
Registering a Group
In addition to registering S3-stored TileDB groups with TileDB Cloud via the console, you can also register them programmatically as follows:
tiledb.cloud.groups.register(
    "s3://mybucket/mygroup",
    namespace="user1",  # Optional, you may register it under your username, or one of your organizations
    name="mygroup",
    description=None,  # Optional
    credentials_name="myCredentials",  # You must have already added your AWS credentials on the console
)

String namespace = "<NAMESPACE>"; // String | namespace array is in (an organization name or user's username)
String array = "s3://<S3_BUCKET>/<ARRAY_NAME>"; // String | name/uri of s3 array that is url-encoded
GroupInfoUpdate groupInfo = new GroupInfoUpdate(); // GroupInfoUpdate | metadata associated with group
groupInfo.setUri("s3://<S3_BUCKET>/<ARRAY_NAME>");
groupInfo.setName("<ARRAY_NAME>");
try {
    GroupInfo result = arrayApi.registerGroup(namespace, array, groupInfo);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling ArrayApi#registerArray");
    System.err.println("Status code: " + e.getCode());
    System.err.println("Reason: " + e.getResponseBody());
    System.err.println("Response headers: " + e.getResponseHeaders());
    e.printStackTrace();
}

// Deregister a registered group
GroupApi apiInstance = new GroupApi(defaultClient);
String namespace = "<NAMESPACE>"; // String | namespace array is in (an organization name or user's username)
String array = "<ARRAY_NAME>"; // String | name/uri of array that is url-encoded
try {
    apiInstance.deregisterGroup(namespace, array);
} catch (ApiException e) {
    System.err.println("Exception when calling GroupApi#deregisterGroup");
    System.err.println("Status code: " + e.getCode());
    System.err.println("Reason: " + e.getResponseBody());
    System.err.println("Response headers: " + e.getResponseHeaders());
    e.printStackTrace();
}
Deregistering a group will not physically delete it.
Listing Groups
You can list groups from the cloud service, passing a variety of filters:
# List all groups you own
owned_groups = tiledb.cloud.list_groups()
print(owned_groups)

# List all groups that are shared with you
shared_groups = tiledb.cloud.list_shared_groups()
print(shared_groups)

# List all public groups
public_groups = tiledb.cloud.list_public_groups()
print(public_groups)

# List groups in a specific namespace
tiledb_inc_groups = tiledb.cloud.list_groups(namespace="TileDB-Inc")
print(tiledb_inc_groups)

# Search keywords
groups = tiledb.cloud.list_public_groups(namespace="TileDB-Inc", search="dragen")
print(groups)

# You can combine filters
groups = tiledb.cloud.list_public_groups(namespace="TileDB-Inc", tag="genomics", search="dragen")
print(groups)

# List specific asset types that are based upon Groups. See also Listing Arrays
# Available asset types:
#   GroupType.BIOIMG
#   GroupType.SOMA
#   GroupType.VCF
#   GroupType.POINTCLOUD
#   GroupType.RASTER
assets = tiledb.cloud.list_public_groups(
    namespace="TileDB-Inc", group_type=tiledb.cloud.rest_api.GroupType.SOMA
)
print(assets)
Getting Group Information
You can run the following to get basic information about the group, such as its description:
info = tiledb.cloud.groups.info("tiledb://TileDB-Inc/vcf-1kghicov-dragen-v376")
print(info)