Welcome to TileDB Cloud!

What is TileDB Cloud?

In a nutshell, TileDB Cloud allows you to share your TileDB arrays on the cloud with other users and perform serverless computations on them in a cost-effective manner (see Pricing) and with zero deployment hassle. Sign up now, earn $10 in credit and enjoy the following features.

Array Sharing

You can register any of your existing TileDB arrays stored on AWS S3 with TileDB Cloud and share them with other users by defining access policies. You retain full ownership of the data and nothing gets moved around. When a user slices or writes to a TileDB array, TileDB Cloud enforces the access policies and performs the query, sending any results back to the TileDB Cloud client. There is no longer any need to manage IAM roles or Apache Sentry/Ranger setups. TileDB Cloud handles everything transparently, while you write code in the same manner as in TileDB Developer.
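The enforcement flow described above can be pictured with a toy model. The sketch below is purely illustrative: the `SharedArray` class, the `"read"` and `"write"` action names, and the in-memory data are hypothetical and are not the TileDB Cloud API. It only shows the idea that a per-user policy is checked before a slice or write is performed, while the data itself stays in place.

```python
from dataclasses import dataclass, field


@dataclass
class SharedArray:
    """Hypothetical stand-in for an array registered with the service."""
    owner: str
    data: list
    # Maps a user name to the set of actions that user may perform.
    policies: dict = field(default_factory=dict)

    def share(self, user, actions):
        """Grant `user` the given actions, e.g. {"read"} or {"read", "write"}."""
        self.policies[user] = set(actions)

    def _check(self, user, action):
        # The owner always has full access; everyone else needs a policy entry.
        if user != self.owner and action not in self.policies.get(user, set()):
            raise PermissionError(f"{user} may not {action} this array")

    def read_slice(self, user, start, stop):
        self._check(user, "read")
        return self.data[start:stop]

    def write(self, user, index, value):
        self._check(user, "write")
        self.data[index] = value


array = SharedArray(owner="alice", data=[10, 20, 30, 40])
array.share("bob", {"read"})
print(array.read_slice("bob", 1, 3))  # bob holds the "read" permission
try:
    array.write("bob", 0, 99)         # bob holds no "write" permission
except PermissionError as err:
    print("denied:", err)
```

In this model the owner never copies the data to share it; sharing only adds an entry to the policy table, which mirrors the "nothing gets moved around" property described above.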

Serverless Computations

TileDB Cloud currently comes with two serverless capabilities, with more to be added soon:

  • SQL: Experience the power of the TileDB-MariaDB integration and perform rapid SQL queries on your S3-stored TileDB arrays.

  • Python UDFs: Run any custom Python function on a TileDB array slice, from reductions (e.g., aggregate queries) to arbitrarily sophisticated computations.
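To make the Python UDF idea concrete, here is a minimal sketch of a reduction-style function. It runs locally on a plain dictionary and does not call TileDB Cloud. The function name `summarize`, the attribute name `"a"`, and the assumption that a slice arrives as a mapping from attribute name to values are all made up for illustration.

```python
from statistics import mean, median


def summarize(slice_data):
    """Example UDF: reduce one attribute of an array slice to summary stats.

    `slice_data` is assumed to map an attribute name to the values in the
    requested slice (an assumption about the input shape, made for this
    sketch only).
    """
    values = slice_data["a"]
    return {"count": len(values), "mean": mean(values), "median": median(values)}


# Local stand-in for the slice that would be passed to the function.
fake_slice = {"a": [1, 2, 3, 4, 10]}
result = summarize(fake_slice)
print(result)
```

The point of the pattern is that only the small result dictionary needs to travel back to the client, not the slice itself.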

Full Auditability

All access to arrays (e.g., slicing and serverless computations) is logged and can be viewed for audit purposes. TileDB Cloud allows you to keep track of how your shared arrays are being used and gain valuable insights.

Organizations

Create organizations and invite members to share arrays team-wide and consolidate billing.

Public Arrays

Mark any of your arrays as public and let them be discovered and used by other users. You do not bear any extra charge for a public array; only the user who accesses the array is charged for usage.

Minimal Code Changes

The transition from TileDB Developer to TileDB Cloud is a couple of configuration parameters away. This allows you to test your code locally, and transition to the cloud or a shared array by changing 1-2 lines of code.
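As a rough illustration of how small that change can be, the sketch below builds the array URI and configuration for either a local array or a shared one. It does not import TileDB, so it runs anywhere. The `array_target` helper, the namespace and array names, and the token placeholder are hypothetical; the `tiledb://namespace/array` URI form and the `rest.token` key are assumptions of this sketch and should be checked against the current documentation.

```python
def array_target(use_cloud, local_path, namespace="my_namespace",
                 array_name="my_array", token="<API_TOKEN>"):
    """Return the (uri, config) pair for opening an array locally or remotely.

    Only these two values differ between the local and the cloud case; the
    code that slices or writes the array is meant to stay the same.
    """
    if use_cloud:
        uri = f"tiledb://{namespace}/{array_name}"  # assumed URI scheme
        config = {"rest.token": token}              # assumed config key
    else:
        uri = local_path
        config = {}
    return uri, config


local_uri, local_cfg = array_target(False, "/data/my_array")
cloud_uri, cloud_cfg = array_target(True, "/data/my_array")
print(local_uri, local_cfg)
print(cloud_uri, cloud_cfg)

# With the TileDB Python package installed, the pair would be used roughly as
# follows (not executed here):
#   ctx = tiledb.Ctx(config)
#   with tiledb.open(uri, ctx=ctx) as A:
#       ...
```

Keeping the switch in one helper means the same test suite can run against a local array first and a shared array afterwards.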

How it Works

TileDB Cloud currently runs on AWS, but in the future it will be deployed on other cloud providers. At present, the TileDB Cloud instances are deployed only in us-east-1, but support for other regions will follow soon. The user arrays can be stored in any S3 region.

TileDB Cloud consists of two components: (i) a global state that handles the customer accounts, all encrypted AWS keys, billing, and task queues, and (ii) an elastic cloud of stateless workers deployed with Kubernetes. The worker cloud scales transparently to the users and everything is performed in a serverless manner. TileDB Cloud may spin up any of the following EC2 instance types, and the user currently cannot choose a specific one: m5.4xlarge, c5.4xlarge and r5.4xlarge.

There are three types of user tasks:

  • Access (read/write): Read any array slice or write to a TileDB array.

  • SQL: Perform a SQL query on one or more TileDB arrays.

  • Python UDF: Run any Python function on a TileDB array slice.

Every query is dispatched from the TileDB client and is handled by a different worker pod, i.e., TileDB Cloud balances the load. An access query is handled by a REST server pod which securely enforces the array access policies and manages all encrypted AWS keys. A SQL or Python UDF query is placed on a worker pod and any involved slicing always goes through a separate REST server pod for security purposes (e.g., so that the user is not able to maliciously retrieve any keys by dumping the machine memory contents). Each query is given all the CPUs of the worker machine and 2GB of RAM.
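The routing rules in the paragraph above can be sketched as a small dispatcher. This is a toy model, not the actual scheduler: the pod names are hypothetical, and round-robin selection via `itertools.cycle` is an assumed balancing strategy, since the text does not say how load is balanced.

```python
from itertools import cycle

# Hypothetical pod pools; round-robin is an assumption of this sketch.
REST_PODS = cycle(["rest-0", "rest-1"])
WORKER_PODS = cycle(["worker-0", "worker-1", "worker-2"])


def route(task_type):
    """Return the pods a task touches, in order.

    Access tasks go straight to a REST server pod. SQL and UDF tasks run on a
    worker pod, and their array slicing is delegated to a separate REST
    server pod, so the worker never holds the storage keys.
    """
    if task_type == "access":
        return [next(REST_PODS)]
    if task_type in ("sql", "udf"):
        return [next(WORKER_PODS), next(REST_PODS)]
    raise ValueError(f"unknown task type: {task_type}")


for task in ["access", "sql", "udf", "access"]:
    print(task, "->", route(task))
```

Separating the two pools reflects the security argument in the text: user code runs only on worker pods, while the encrypted keys are handled only by REST server pods.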

TileDB Cloud architecture

Self-hosting TileDB Cloud

Do you wish to run TileDB Cloud under your full control on premises or on the cloud? See TileDB Enterprise.