Below you can find links to various tutorials for dataframes. Currently, all these tutorials are built in Python (using TileDB-Py), but soon we will add tutorials that use the other TileDB APIs as well.
Dataframe Basics: Learn how to ingest a CSV file as a dense (with row id indexing) or sparse (with multi-column indexing) array, inspect the array schema, slice the ingested dataframe, subselect on columns (array attributes), read into the Apache Arrow format, apply conditions on columns, and run SQL queries.
Below we provide links to various tutorials for dense and sparse arrays. Currently, all these tutorials are built in Python (using TileDB-Py), but soon we will add tutorials that use the other TileDB APIs as well.
Dense Array Basics: Learn how to create a dense array, inspect the array schema, write to and read from the array, write and read array metadata, create arrays with multiple attributes and var-sized attributes, treat dense arrays as dataframes and even run SQL queries.
Tile Filters: Learn how to set compression and other filters to attributes in dense arrays, and use encryption in dense arrays.
Sparse Array Basics: In this tutorial you will learn how to create a sparse array, inspect the array schema, write to and read from the array, write and read array metadata, create arrays with multiple attributes and var-sized attributes, create arrays with string dimensions, create arrays with heterogeneous dimensions, treat sparse arrays as dataframes and even run SQL queries.
Designing a universal data model
These are exciting times for anyone working on data problems, as the data industry is as hot and as hyped as ever. Numerous databases, data warehouses, data lakes, lakehouses, feature stores, metadata stores, file managers, etc. have been hitting the market in the past few years. At TileDB we are trying to answer a simple question: instead of building a new data system every time our data needs change, can we build a single database that can store, govern, and process all data — tables, images, video, genomics, LiDAR, features, metadata, flat files and any other data type that may pop up in the future?
This question was born from the simple observation that all database systems (and variations) share significant similarities, including laying data out on the storage medium of choice, and fetching it for processing based on certain query workloads. Therefore, to answer the above question, we had to ask a slightly different one: is there a data model that can efficiently capture all data from all applications? Because if such a universal data model exists, it can serve as the foundation for building a universal database with all the subsystems common to all databases (query planner, executor, authenticator, transaction manager, APIs, etc.). We discovered that such a model does exist, and it is based on multi-dimensional arrays.
Before elaborating on why arrays are universal by describing the data model and their use cases, we need to answer yet another question: why should you care about a universal data model and a universal database? Here are a few important reasons:
Data diversity. You may think that it’s all about tabular data for which a traditional data warehouse (or data lake, or lakehouse) can do the trick, but in reality organizations possess a ton of other very valuable data, such as images, video, audio, genomics, point clouds, flat files and many more. And they wish to perform a variety of operations on these data collections, from analytics, to data science and machine learning.
Vendor optimization. To manage their diverse data, organizations resort to buying numerous different data systems, e.g., a data warehouse, plus an ML platform, plus a metadata store, plus a file manager. That costs money and time: money because some vendors have overlapping functionality that gets paid for twice (e.g., authentication, access control, etc.), and time because teams have to learn to operate numerous different systems and wrangle data whenever insights require combining disparate data sources.
Holistic governance. Even if organizations are happy with their numerous vendors, each different data system has its own access controls and logging capabilities. Therefore, if an organization needs to enforce centralized governance over all its data, it needs to build it in-house. That costs more money and time.
Even if you are already convinced of the importance of a universal database, we need to make one last vital remark. A universal database is unusable if it does not offer excellent performance for all the data types it is serving. In other words, a universal database must perform as efficiently as purpose-built ones; otherwise, adoption will be met with skepticism. This is where the difficulty of universal databases lies, and why no one had built such a system before TileDB.
In these docs you will learn why multi-dimensional arrays are the right bet, not only for their universality but also for their performance. We describe many of the critical decisions we made at TileDB when designing the array data model and an efficient on-disk format, as well as when developing a powerful storage engine to support it.
For further reading on why we chose arrays as first-class citizens in TileDB, see our blog post Why Arrays as a Universal Data Model.
Our vision is to facilitate fast, large-scale genomics research at a fraction of the cost by providing infrastructure that is easy to setup and designed for extreme scale.
Storing genomic variant call data in a collection of VCF files can severely impact the performance of analyses at population scale. Motivated by these issues, we developed TileDB-VCF.
In a nutshell: Storing vast quantities of genomic variant samples in legacy file formats leads to slow access, especially on the cloud. Merging the files into multi-sample files is not scalable or maintainable, resulting in the so-called N+1 problem.
The de facto file format for storing genomic variant call data is VCF. This comes in two flavors: single-sample and multi-sample VCF (other names include combined VCF, project VCF, etc.). Below we explain the problems with each of those approaches, as well as the data engineering effort involved when storing genomic data in a legacy, non-interoperable file format.
Genomic analyses performed on collections of single-sample VCF files typically involve retrieving variant information from particular genomic ranges, for specific subsets of samples, along with any of the provided VCF fields/attributes. Such queries are often repeated millions of times, so it is imperative that data retrieval is performed in a fast, scalable, and cost-effective manner.
However, accessing random locations in thousands—or hundreds of thousands—of different files is prohibitively expensive, both computationally and monetarily. This is especially true on cloud object stores, where each byte range in a file is a separate request that goes over the network. Not only does this introduce non-negligible latency, it can also incur significant costs as cloud object stores charge for every such request. For a typical analysis involving millions of requests on large collections of VCF files, this quickly becomes unsustainable.
The problems with single-sample VCF collections were the motivation behind multi-sample VCF, or project/population VCF (pVCF), files, in which the entire collection of single-sample VCFs is merged into a single file. When indexed, specific records can be located within a pVCF file very quickly, as data retrieval is reduced to a simple, super-fast linear scan, minimizing latency and significantly boosting I/O performance. However, this approach comes with significant costs.
First, the size of multi-sample pVCF files can scale superlinearly with the number of included samples. The problem is that individual VCF files contain very sparse data, which the pVCF file densifies by adding dummy information to denote variants missing from the original single-sample VCF file. This means that the combined pVCF solution is not scalable because it can lead to an explosion of storage overhead and high merging cost for large population studies.
Another problem is that a multi-sample pVCF file cannot be updated. Because the sample-related information is listed in the last columns of the file, a new sample insertion needs to create a new column and inject values at the end of every line. This effectively means that a brand new pVCF file needs to be generated with every update. If the pVCF is large (typically in the order of many GBs or even TBs), the insertion process will be extremely slow. This is often referred to as the N+1 problem.
Regardless of whether you deal with single- or multi-sample VCF, the typical way of accessing these files is via custom CLI tools, such as bcftools. Those tools are extremely useful for analysis that can be performed on a local machine (e.g., a laptop). However, they become unwieldy when scalable analysis necessitates the use of numerous machines working in parallel. Spinning up new nodes, deploying the software and orchestrating the parallel analysis consumes the majority of the time of researchers and hinders progress.
Furthermore, domain-specific tools either re-invent a lot of the work that has been done in general-purpose data science tools, or they miss out on it. This leads researchers to write custom code for converting the VCF data into a format that a programming language (e.g., Python or R) or a data science tool (e.g., pandas or Spark) understands. This data access and conversion cost often becomes the bottleneck, instead of the actual analysis.
Population variant data can be efficiently represented using a 3D sparse array. For each sample, imagine a 2D plane where the vertical axis is the contig and the horizontal axis is the genomic position. Every variant can be represented as a range within this plane; it can be unary (i.e., a SNP) or it can be a longer range (e.g., INDEL or CNV). Each sample is then indexed by a third dimension, which is unbounded to accommodate populations of any size. The figure below shows an example for one sample, with several variants distributed across contigs chr1, chr2 and chr3.
In TileDB-VCF, we represent the start position of each range as a non-empty cell in a sparse array (black squares in the above figure). In each of those array cells, we store the end position of each cell (to create a range) along with all other fields in the corresponding single-sample VCF files for each variant (e.g., REF, ALT, etc.). Therefore, for every sample, we map variants to 2D non-empty sparse array cells.
To facilitate rapid retrieval of interval intersections (explained in the next section), we also inject anchors (green square in the above figure) to break up long ranges. Specifically, we create a new non-empty cell every anchor_gap bases from the start of the range (where anchor_gap is a user-defined parameter), which is identical to the range cell, except that (1) it has a new start coordinate and (2) it stores the real start position in an attribute.
Note that regardless of the number of samples, we do not inject any additional information other than that of the anchors, which is user configurable and turns out to be negligible for real datasets. In other words, this solution leads to linear storage in the number of samples, thus being scalable.
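To make this layout concrete, here is a minimal sketch in TileDB-Py of a 3D sparse schema along these lines. It is a simplified illustration only, not the actual TileDB-VCF schema (which contains more fields and tuning parameters); the array name, domain bounds and attribute names are placeholders.

```python
import numpy as np
import tiledb

# Simplified sketch of a 3D sparse layout for variant ranges: string sample
# and contig dimensions plus an integer start-position dimension.
dom = tiledb.Domain(
    tiledb.Dim(name="sample", domain=(None, None), tile=None, dtype="ascii"),
    tiledb.Dim(name="contig", domain=(None, None), tile=None, dtype="ascii"),
    tiledb.Dim(name="start_pos", domain=(1, 3_000_000_000), tile=100_000, dtype=np.uint64),
)
schema = tiledb.ArraySchema(
    domain=dom,
    sparse=True,
    attrs=[
        tiledb.Attr(name="end_pos", dtype=np.uint64),     # turns each cell into a range
        tiledb.Attr(name="real_start", dtype=np.uint64),  # real start, used by anchor cells
        # additional attributes would hold REF, ALT and the other VCF fields
    ],
)
tiledb.Array.create("variants_sketch", schema)
```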
The typical access pattern used for variant data involves one or more rectangles covering a set of genomic ranges across one or more samples. In the figure below, let the black rectangle be the user's query. Observe that the results are highlighted in blue (v1, v2, v4, v7). However, the rectangle misses v1, i.e., the case where an Indel/CNV range intersects the query rectangle, but the start position is outside the rectangle.
This is the motivation behind anchors. TileDB-VCF expands the user's query range on the left by anchor_gap. It then reports as results the cells that are included in the expanded query if their end position (stored in an attribute) comes after the query range start endpoint. In the example above, TileDB-VCF retrieves anchor a1 and Indel/CNV v3. It reports v1 as a result (as it can be extracted directly from anchor a1), but filters out v3.
Quite often, the analyses require data retrieval based on numerous query ranges (up to the order of millions), which must be submitted simultaneously. TileDB-VCF leverages the highly optimized multi-range subarray reads of TileDB for this purpose.
But what about updates? That's the topic of the next subsection.
TileDB-VCF is built in C++ for performance, and comes with an efficient Python API, a command-line tool (CLI) and integrations with Spark and Dask. We are working hard to expand on the APIs through which our users can access the variant data stored in the open-spec TileDB format, and we are happy to accommodate user requests (which can be filed here).
Being able to access the data in various programming languages and computational frameworks unlocks a vast opportunity for utilizing the numerous existing open-source data science and machine learning tools.
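For example, exporting a few genomic regions for a handful of samples into a pandas DataFrame with the Python API looks roughly like the sketch below; the dataset URI, sample names, regions and attribute list are placeholders.

```python
import tiledbvcf

# Open an existing TileDB-VCF dataset for reading (URI is a placeholder)
ds = tiledbvcf.Dataset("s3://my-bucket/my-vcf-dataset", mode="r")

# Slice a few genomic regions for specific samples and VCF fields
df = ds.read(
    attrs=["sample_name", "contig", "pos_start", "pos_end", "fmt_GT"],
    regions=["chr1:1000000-2000000", "chr2:500000-600000"],
    samples=["sampleA", "sampleB"],
)
print(df.head())
```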
You can find more detailed documentation about TileDB-VCF (including the installation instructions, API reference and How To guides) in the following dedicated section:
TileDB Open Source is a universal storage engine that stores any kind of data (beyond tables) in a powerful unified format, offering extreme interoperability via many APIs and tool integrations.
TileDB Open Source is a powerful engine architected around multi-dimensional arrays that enables storing and accessing:
Dense arrays (e.g., images, video and more)
Sparse arrays (e.g., LiDAR, genomics and more)
Dataframes (any tabular data, as either dense or sparse arrays)
Any data that can be modeled as arrays (e.g., graphs, key-values, ML models, etc.)
You can use TileDB to store data in a variety of applications, such as Genomics, Geospatial, Biomedical Imaging, Finance, Machine Learning, and more. The power of TileDB stems from the fact that any data can be modeled efficiently as either a dense or a sparse multi-dimensional array, which is the format used internally by most data science tooling. By storing your data and metadata in TileDB arrays, you abstract all the data storage and management pains, while efficiently accessing the data with your favorite programming language or data science tool via our numerous APIs and integrations.
TileDB Open Source is a fast embeddable C++ library with the following main features:
Open-source under the MIT license
Fast multi-dimensional slicing via tiling (i.e., chunking)
Multiple compression, encryption and checksum filters
Fast, lock-free ingestion
Parallel IO for both reads and writes
Cloud storage (AWS S3, Google Cloud Storage, Azure Blob Storage)
A fully multi-threaded implementation
Query condition execution push-down
Schema evolution
Data versioning and time traveling
Metadata stored alongside the array data
Groups for hierarchical organization of array data
A growing set of APIs (C, C++, C#, Python, Java, R, Go)
Numerous integrations (Spark, Dask, MariaDB, GDAL, and more)
The TileDB Open Source engine is built in C++ and exposes a C and a C++ API:
We maintain a growing set of language APIs built on top of the C and C++ APIs:
We extended TileDB Open Source to capture domain-specific aspects of important use cases:
There is a constantly growing set of tutorials in the TUTORIALS page group found in the left navigation menu of these docs.
If you'd like to take a deeper dive into the TileDB Open Source internals, you can check BACKGROUND in the left navigation menu. You can also always consult the HOW TO guides and the API REFERENCE.
Finally, detailed information about the various TileDB Open Source tool integrations and extensions can be found under the INTEGRATIONS & EXTENSIONS page group in the left navigation menu.
To make it easy to understand where to find what you are looking for, the documentation is structured in the following sections:
Tutorials: A series of examples for learning how to use TileDB in various use cases
Background: Explanation of key topics and concepts
How To: Short how-to guides for all different features of TileDB
API Reference: Technical reference to the APIs
Extensions & integrations: Detailed documentation on the TileDB Open Source extensions and integrations
The company maintains two offerings:
This tutorial section is under heavy development. Numerous new tutorials across a wide range of use cases are coming up soon, so stay tuned!
In this section we will be providing links to Jupyter notebooks hosted on TileDB Cloud. From there you can download and run them locally; no TileDB Cloud account is needed. Alternatively, you can launch them directly in TileDB Cloud. For that you will need to sign up, and contact us to tell us a bit about your use case if you'd like free credits for your trial (no credit card information is needed).
The pages listed below split the various tutorials by category. You can also navigate through all tutorials directly from the TileDB-Inc/Tutorials group on TileDB Cloud.
The basic array model we follow at TileDB is depicted below. We make an important distinction between a dense and a sparse array. A dense array can have any number of dimensions. Each dimension must have a domain with integer values, and all dimensions have the same data type. An array element is defined by a unique set of dimension coordinates and it is called a cell. In a dense array, all cells must store a value. A logical cell can store more than one value of potentially different types (which can be integers, floats, strings, etc.). An attribute defines a collection of values that have the same type across all cells. A dense array may be associated with a set of arbitrary key-value pairs, called array metadata.
A sparse array is very similar to a dense array, but it has three important differences:
Cells in a sparse array can be empty.
The dimensions can have heterogeneous types, including floats and strings (i.e., the domain can be “infinite”).
Cell multiplicities (i.e., cells with the same dimension coordinates) are allowed.
The decision on whether to model your data with a dense or sparse array depends on the application and it can greatly affect performance. Also extra care should be taken when choosing to model a data field as a dimension or an attribute. These decisions are covered in detail in other sections of the docs, but for now, you should know this: array systems are optimized for rapidly performing range conditions on dimension coordinates. Arrays can also support efficient conditions on attributes, but by design the most optimized selection performance will come from querying on dimensions, and the reason will become clear soon.
Range conditions on dimensions are often called “slicing” and the results constitute a “slice” or “subarray”. Some examples are shown in the figure below. In numpy notation, A[0:2, 1:3] is a slice that consists of the values of cells with coordinates 0 and 1 on the first dimension, and coordinates 1 and 2 on the second dimension (assuming a single attribute). Alternatively, this can be written in SQL as SELECT attr FROM A WHERE d1>=0 AND d1<=1 AND d2>=1 AND d2<=2, where attr is an attribute and d1 and d2 are the two dimensions of array A. Note also that slicing may contain more than one range per dimension (a multi-range slice/subarray).
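As a quick illustration of the A[0:2, 1:3] example, the following TileDB-Py sketch creates a small 4x4 dense array, writes some values and slices it; the array URI is a placeholder.

```python
import numpy as np
import tiledb

uri = "dense_example"

# 4x4 dense array, 2x2 space tiles, a single int32 attribute "attr"
dom = tiledb.Domain(
    tiledb.Dim(name="d1", domain=(0, 3), tile=2, dtype=np.int32),
    tiledb.Dim(name="d2", domain=(0, 3), tile=2, dtype=np.int32),
)
schema = tiledb.ArraySchema(
    domain=dom, sparse=False, attrs=[tiledb.Attr(name="attr", dtype=np.int32)]
)
tiledb.Array.create(uri, schema)

with tiledb.open(uri, "w") as A:
    A[:] = np.arange(16, dtype=np.int32).reshape(4, 4)

with tiledb.open(uri) as A:
    result = A[0:2, 1:3]      # numpy-style slice on the two dimensions
    print(result["attr"])     # values of cells (0..1, 1..2)
```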
The above model can be extended to include “dimension labels”. This extension can be applied to both dense and sparse arrays, but labels are particularly useful in dense arrays. Briefly stated, a dimension can accept a label vector which maps values of arbitrary data types to integer dimension coordinates. An example is demonstrated below. This is very powerful in applications where the data is quite dense (i.e., there are not too many empty cells), but the dimension fields are not integers or do not appear contiguously in the integer domain. In such cases, multi-dimensional slicing is performed by first efficiently looking up the integer coordinates in the label vectors, and then applying the slicing as explained above, which in the dense array case can be truly rapid.
Labeled dimensions are currently under development in TileDB. They will appear in a future release soon.
In many applications, it is useful to hierarchically organize different arrays into groups. TileDB incorporates the concept and functionality of groups into its Data Format.
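For instance, with TileDB-Py you could group related arrays as sketched below; the URIs are placeholders and the sketch assumes the member arrays already exist.

```python
import tiledb

# Create a group and register two existing arrays under it
tiledb.group_create("my_project")

g = tiledb.Group("my_project", mode="w")
g.add("my_project/images", name="images")
g.add("my_project/annotations", name="annotations")
g.close()
```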
We currently focus on the following use cases in Genomics:
Multi-dimensional arrays have been around for a long time. However, there have been two misconceptions about arrays:
Arrays are used solely in scientific applications. This is mainly due to their massive use in Python, Matlab, R, machine learning and other scientific applications. There is absolutely nothing wrong with arrays capturing scientific use cases. On the contrary, such applications are important and challenging, and there is no relational database that can efficiently accommodate them.
Arrays are only dense. Most array systems (i.e., storage engines or databases) built before TileDB focused solely on dense arrays. Despite their suitability for a wide spectrum of use cases, dense arrays are inadequate for sparse problems, such as genomics, LiDAR and tables. Sparse arrays have been ignored and, therefore, no array system was able to claim universality.
The sky is the limit in terms of applicability for a system that supports both dense and sparse arrays. An image is a 2D dense array, where each cell is a pixel that can store the RGBA color values. Similarly a video is a 3D dense array, two dimensions for the frame images and a third one for the time. LiDAR is a 3D sparse array with float coordinates. Genomic variants can be modeled by a 3D array where the dimensions are the sample name (string), the chromosome (string) and the position (integer). Time series tick data can be modeled by a 2D array, with time and tick symbol as labeled dimensions (this can of course be extended arbitrarily to a ND dense or sparse array). Similarly, weather data can be modeled with a 2D dense array with float labels (the lat/lon real coordinates). Graphs can be modeled as (sparse 2D) adjacency matrices. Finally, a flat file can be stored as a simple 1D dense array where each cell stores a byte.
But what about tabular data? Arrays have a lot of flexibility here. In the most contrived scenario, we can store a table as a set of 1D arrays, one per column (similar to Parquet for those familiar with it). This is useful if we want to slice a range of rows at a time. Alternatively, we can store a table as a ND sparse array, using a subset of columns as the dimensions. That would allow rapid slicing on the dimension columns. Finally, we can use labeled dense arrays as explained above for the time series tick data.
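As an example of the sparse-array approach for tables, the following TileDB-Py sketch ingests a hypothetical CSV of tick data, using the symbol and time columns as the dimensions; the file name, array URI and column names are assumptions made for illustration.

```python
import tiledb

# Hypothetical CSV with columns "symbol", "time", "price", "volume":
# ingest it as a 2D sparse array with symbol and time as the dimensions,
# so that slicing on those columns is fast.
tiledb.from_csv(
    "ticks_array",
    "ticks.csv",
    sparse=True,
    index_col=["symbol", "time"],   # these columns become the dimensions
)

with tiledb.open("ticks_array") as A:
    df = A.df[:]   # read the array back as a pandas DataFrame
    print(df.head())
```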
You may wonder how we can make all these decisions about dimensions vs. attributes and dense vs. sparse for each application. To answer that, we need to understand how dense and sparse arrays lay data out on the storage medium, and what factors affect performance when slicing, which is the focus of the key concepts and data format section:
In addition, check out the various TileDB use cases in more detail:
TileDB supports fast and parallel subarray reads, with the option to time travel, i.e., to read the array at a particular time in the past. The read algorithm is architected to handle multiple fragments efficiently and completely transparently to the user. To read an array, TileDB first "opens" the array and brings some lightweight fragment metadata into main memory. Using this metadata, TileDB knows which fragments to ignore and which to focus on, e.g., based on whether they overlap with the query subarray, or whether the fragment was created at or before the time of interest. Moreover, in case consolidation has occurred, TileDB is smart enough to ignore fragments that have been consolidated, by considering only the merged fragment that encompasses them.
When reading an array, the user provides:
a (single- or multi-range) subarray
the attributes to slice on (it can be any subset of the attributes, including the coordinates)
the layout with respect to the subarray to return the result cells in
The read algorithm is quite involved. It leverages spatial indexing to locate only the data tiles relevant to the slice, it makes sure it does not fetch a data tile twice in the case of multi-range queries, it performs selective decompression of tile chunks after a tile has been fetched from the backend, and it employs parallelism pretty much everywhere (in IO, decompression, sorting, etc.).
The figure below shows how to read the values of a single attribute from a dense array. The ideas extend to multi-attribute arrays and slicing on any subset of the attributes, including even retrieving the explicit coordinates of the results. The figure shows retrieving the results in 3 different layouts, all with respect to the subarray query. This means that you can ask TileDB to return the results in an order that is different than the actual physical order (which, recall, is always the global order), depending on the needs of your application.
You can also submit multi-range subarrays, as shown in the figure below. The supported orders here are row-major, column-major and unordered. The latter gives no guarantees about the order; TileDB will attempt to process the query in the fastest possible way and return the results in an arbitrary order. It is recommended to use this layout if you target performance and do not care about the order of the results. Also, you can ask TileDB to return the explicit coordinates of the returned values if you wish to know which value corresponds to which cell.
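The sketch below issues such a multi-range read with TileDB-Py against the small dense array created earlier (the URI is a placeholder), requesting an unordered layout and the explicit coordinates.

```python
import tiledb

with tiledb.open("dense_example") as A:
    # Unordered layout ('U') and explicit coordinates for the results
    q = A.query(attrs=["attr"], order="U", coords=True)

    # Multi-range subarray: rows {0..1, 3} (multi_index ranges are inclusive),
    # all columns
    result = q.multi_index[[slice(0, 1), 3], :]
    print(result["attr"])
    print(result["d1"], result["d2"])
```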
Recall that all cells in a dense fragment must have a value, which TileDB materializes on disk. This characteristic of dense fragments is important as it considerably simplifies spatial indexing, which becomes almost implicit. Consider the example in the figure below. Knowing the space tile extent along each dimension and the tile order, we can easily identify which space tiles intersect with a subarray query without maintaining any complicated index. Then, using lightweight bookkeeping (such as offsets of data tiles on disk, compressed data tile size, etc.), TileDB can fetch the tiles containing results from storage to main memory. Finally, knowing the cell order, it can locate each slab of contiguous cell results in constant time (again without extra indexing) and minimize the number of memory copy operations.
Note that the above ideas apply also to dense fragments that populate only a subset of the array domain; knowing the non-empty domain, TileDB can use similar arithmetic calculations to locate the overlapping tiles and cell results.
The figure below shows an example subarray query on a sparse array with a single attribute, where the query requests also the coordinates of the result cells. Similar to the case of dense arrays, the user can request the results in layouts that may be different from the physical layout of the cells in the array (global order).
Sparse arrays accept multi-range subarray queries as well. Similar to the dense case, global order is not applicable here, but instead an unordered layout is supported that returns the results in an arbitrary order (again, TileDB will try its best to return the results as fast as possible in this read mode).
A sparse fragment differs from a dense fragment in the following aspects:
A sparse fragment stores only non-empty cells that might appear in any position in the domain (i.e., they may not be concentrated in dense hyper-rectangles)
In sparse fragments there is no correspondence between space and data tiles. The data tiles are created by first sorting the cells on the global order, and then grouping adjacent cell values based on the tile capacity.
There is no way to know a priori the position of the non-empty cells, unless we maintain extra indexing information.
A sparse fragment materializes the coordinates of the non-empty cells in data tiles.
Given a subarray query, the R-Tree (which is small enough to fit in main memory) is used to identify the intersecting data tile MBRs. Then, the qualifying coordinate data tiles are fetched and the materialized coordinates therein are used to determine the actual results.
Recall that writing to TileDB arrays produces a number of timestamped fragments. TileDB supports reading an array at an arbitrary instance in time, by providing a timestamp upon opening the array for reading. Any fragment created after that timestamp will be ignored and the read will produce results as if only the fragments created at or before the given timestamp existed in the array. Time traveling applies to both dense and sparse arrays. The figure below shows an example of a dense array with 3 fragments, along with the results of a subarray depending on the timestamp the array gets opened with.
If the user opens the array at a timestamp that is larger than or equal to the second timestamp of a fragment name, then that fragment will be considered in the read.
If the user opens the array at a timestamp that is smaller than the second timestamp of a fragment, then that fragment will be ignored.
If a fragment that qualifies for reading has been consolidated into another fragment that is considered for reading, then it will be ignored.
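With TileDB-Py, time traveling boils down to passing a timestamp when opening the array; the URI and timestamp values below are placeholders.

```python
import tiledb

# Open the array as of a given time (milliseconds since the Unix epoch);
# fragments written after this point are ignored.
with tiledb.open("dense_example", mode="r", timestamp=1672531200000) as A:
    print(A[:]["attr"])

# A (start, end) tuple restricts reads to fragments created within that window.
with tiledb.open("dense_example", mode="r", timestamp=(0, 1672531200000)) as A:
    print(A[:]["attr"])
```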
But what portion of the query is served in each iteration? TileDB implements the incomplete query functionality via result estimation and subarray partitioning. Specifically, if TileDB assesses (via estimation heuristics) that the query subarray leads to a larger result size than the allocated buffers, it splits (i.e., partitions) it appropriately, such that a smaller subarray (single- or multi-range) can be served. The challenge is in partitioning the subarray in a way that the result cell order (defined by the user) is respected across the incomplete query iterations. TileDB efficiently and correctly performs this partitioning process transparently from the user.
TileDB caches data and metadata upon read queries. More specifically, it caches:
Fragment metadata for those fragments that are relevant to a submitted query. There are no restrictions on how large that cache space is. The user can flush the fragment metadata cache by simply closing all open instances of an array.
The use of caching can be quite beneficial especially if the data resides on cloud object stores like AWS S3.
This group of pages describes the internal mechanics of the TileDB Open Source storage engine. It is meant for more advanced users who would like to better understand how we implement the array model to achieve excellent performance and provide features such as atomicity, concurrency, eventual consistency, data versioning, time traveling, and consolidation.
Python:
R:
Java:
Go:
C#:
TileDB-VCF: An extension for storing and accessing genomic variant (VCF) data
: Integrations with , , and
: Integration with and
: Integration with , and
Our blog post Why Arrays as a Universal Data Model is a good starting point for understanding why we chose arrays as first-class citizens in TileDB Open Source.
TileDB started at MIT and Intel Labs as a research project in late 2014 that led to a research paper. In May 2017 it was spun out into TileDB, Inc., a company that has since raised over $20M to further develop and maintain the project.
The open-source storage engine, which is covered in this documentation.
The commercial data management platform called TileDB Cloud, which builds upon TileDB Embedded and offers data governance, scalable serverless compute and more.
TileDB Embedded along with its APIs and integrations are open-source projects and welcome all forms of contributions. Contributors to the project should read the contribution guidelines for more information.
We'd love to hear from you. Drop us a line, join our community channels, or follow us to stay informed of updates and news.
You can also check out the TileDB blog and events (webinars and workshops) to learn more about the TileDB vision, value proposition and use cases, as well as meet the team behind all this amazing work.
Note that reading dense arrays always returns dense results. This means that, if your subarray overlaps with empty (non-materialized) cells in the dense array, TileDB will return fill values for those cells. The figure below shows an example.
TileDB indexes sparse non-empty cells with R-Trees. Specifically, for every coordinate data tile it constructs the minimum bounding rectangle (MBR) using the coordinates in the tile. Then, it uses the MBRs of the data tiles as leaves and constructs an R-Tree bottom up by recursively grouping MBRs into larger MBRs using a fanout parameter. The figure below shows an example of a sparse fragment and its corresponding R-Tree.
In the case of consolidated fragments, time traveling works as follows:
There are situations where the memory allocated by the user to hold the result is not enough for a given query. Instead of erroring out, TileDB gracefully handles these cases by attempting to serve a portion of the query and reporting back with an "incomplete" query status. The user should then consume the returned result and resubmit the query, until the query returns a "complete" status. TileDB maintains all the necessary internal state inside the query object.
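A minimal sketch of this loop in TileDB-Py, assuming the return_incomplete option; the array URI and buffer size are placeholders.

```python
import tiledb

# Use deliberately small result buffers so the query completes in several passes
cfg = tiledb.Config({"py.init_buffer_bytes": str(16 * 1024 * 1024)})
ctx = tiledb.Ctx(cfg)

with tiledb.open("sparse_example", ctx=ctx) as A:
    # Each iteration returns the portion of the result that fit in the buffers
    for chunk in A.query(return_incomplete=True).df[:]:
        print(len(chunk), "cells in this batch")
```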
Data tiles that overlap a subarray (across fragments). This cache space is configurable (see Configuration Parameters). The data tiles are currently cached in their raw "disk" form (i.e., with all potential filters applied, as they are stored in the data files).
TileDB implements additional optimizations that improve decompression times and the overall memory footprint of a read query. Recall that each data tile is further decomposed into chunks. After fetching a data tile that contains result candidates from storage, the TileDB read algorithm knows exactly which chunks of the tile are relevant to the query and decompresses (unfilters) only those chunks.
In addition to using parallelism internally, TileDB is designed with parallel programming in mind. Specifically, scientific computing users may be accustomed to using multi-processing (e.g., via MPI, Dask or Spark), or writing multi-threaded programs to speed up performance. TileDB enables concurrency using a multiple writer / multiple reader model that is entirely lock-free.
Concurrent writes are achieved by having each thread or process create one or more separate fragments for each write operation. No synchronization is needed across processes and no internal state is shared across threads among the write operations and, thus, no locking is necessary. Regarding the concurrent creation of the fragments, thread- and process-safety is achieved because each thread/process creates a fragment with a unique name (as it incorporates a UUID). Therefore, there are no conflicts even at the storage backend level.
TileDB supports lock-free concurrent writes of array metadata as well. Each write creates a separate array metadata file with a unique name (also incorporating a UUID), and thus name collisions are prevented.
When opening the array, TileDB loads the array schema and fragment metadata into main memory once, and shares them across all array objects referring to the same array. Therefore, for the multi-threading case, it is highly recommended that you open the array once outside the atomic block and have all threads create the query on the same array object. This is to prevent the scenario where a thread opens the array, then closes it before another thread opens the array again, and so on. TileDB internally employs a reference-count system, discarding the array schema and fragment metadata each time the array is closed and the reference count reaches zero (the schema and metadata are typically cached, but they still need to be deserialized in the above scenario). Having all concurrent queries use the same array object eliminates the above problem.
Reads in the multi-processing setting are completely independent and no locking is required. In the multi-threading scenario, locking is employed (through mutexes) only when the queries access the tile cache, which incurs a very small overhead.
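Following the recommendation above, the sketch below opens an array once and shares it across a thread pool, with each thread issuing its own query; the URI is a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor
import tiledb

with tiledb.open("dense_example") as A:  # open once, share across threads

    def read_row(i):
        # Each thread creates and submits its own query on the shared array object
        return A[i : i + 1, :]["attr"]

    with ThreadPoolExecutor(max_workers=4) as pool:
        rows = list(pool.map(read_row, range(4)))
```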
Concurrent reads and writes can be arbitrarily mixed. Fragments are not visible unless the write query has been completed (and the .ok file appeared). Fragment-based writes make it so that reads simply see the logical view of the array without the new (incomplete) fragment. This multiple writers / multiple readers concurrency model of TileDB is more powerful than competing approaches, such as HDF5's single writer / multiple readers (SWMR) model. This feature comes with a more relaxed consistency model, which is described in the Consistency section.
Consolidation can be performed in the background in parallel with and independently of other reads and writes. The new fragment that is being created is not visible to reads before consolidation is completed.
Vacuuming deletes fragments that have been consolidated. Although it can never lead to a corrupted array state, it may lead to issues if there is a read operation that accesses a fragment that is being vacuumed. This is possible when the array is opened at a timestamp before some consolidation operation took place, therefore considering the fragment to be vacuumed. Most likely, that will lead to a segfault or some unexpected behavior.
TileDB locks the array upon vacuuming to prevent the above. This is achieved via mutexes in multi-threading, and file locking in multi-processing (for those storage backends that support it).
All POSIX-compliant filesystems and Windows filesystems support file locking. Note that Lustre supports POSIX file locking semantics and exposes local- (mount with -o localflock) and cluster- (mount with -o flock) level locking. Currently, TileDB does not use file locking on HDFS and S3 (these storage backends do not provide such functionality, but rather resource locking must be implemented as an external feature). For filesystems that do not support file locking, the multi-processing programs are responsible for synchronizing the concurrent writes.
Particular care must be taken when vacuuming arrays on AWS S3 and HDFS. Without file locking, TileDB has no way to prevent vacuuming from deleting the old consolidated fragments. If another process is reading those fragments while vacuuming is deleting them, the reading process is likely to error out or crash.
In general, avoid vacuuming while time traveling (i.e., reading at past timestamps) on cloud object stores. It is generally safe to vacuum if you are reading the array at the current timestamp.
Array creation (i.e., storing the array schema on persistent storage) is not thread-/process-safe. We do not expect a practical scenario where multiple threads/processes attempt to create the same array in parallel. We suggest that only one thread/process creates the array, before multiple threads/processes start working concurrently for writes and reads.
TileDB is fully parallelized internally, i.e., it uses multiple threads to process in parallel the most heavyweight tasks.
We explain how TileDB parallelizes the read and write queries, outlining the configuration parameters that you can use to control the amount of parallelization. Note that here we cover only the most important areas, as TileDB parallelizes numerous other internal tasks. See Configuration Parameters and Configuration for a summary of the parameters and the way to set them respectively.
A read query mainly involves the following steps in this order:
1. Identifying the physical attribute data tiles that are relevant to the query (pruning the rest).
2. Performing parallel IO to retrieve those tiles from the storage backend.
3. Unfiltering the data tiles in parallel to get the raw cell values and coordinates.
4. Performing a refining step to get the actual results and organize them in the query layout.
TileDB parallelizes all steps, but here we discuss mainly steps (2) and (3) that are the most heavyweight.
TileDB reads the relevant tiles from all attributes to be read in parallel as follows:
TileDB computes the byte ranges required to be fetched from each attribute file. Those byte ranges might be disconnected and could be numerous, especially in the case of multi-range subarrays. In order to reduce the latency of the IO requests (especially on S3), TileDB attempts to merge byte ranges that are close to each other and dispatch fewer larger IO requests instead of numerous smaller ones. More specifically, TileDB merges two byte ranges if their gap size is not bigger than vfs.min_batch_gap and their resulting size is not bigger than vfs.min_batch_size. Then, each byte range (always corresponding to the same attribute file) becomes an IO task. These IO tasks are dispatched for concurrent execution, where the maximum level of concurrency is controlled by the sm.io_concurrency_level parameter.
TileDB may further partition each byte range to be fetched based on the parameters vfs.file.max_parallel_ops (for POSIX and Windows), vfs.s3.max_parallel_ops (for S3) and vfs.min_parallel_size. Those partitions are then read in parallel. Currently, the maximum number of parallel operations for HDFS is set to 1, i.e., this task parallelization step does not apply to HDFS.
Once the relevant data tiles are in main memory, TileDB "unfilters" them (i.e., runs the filters applied during writes in reverse) in parallel in a nested manner as follows:
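The nested scheme can be pictured with the following schematic sketch (placeholder names, not the actual C++ implementation); both loops are executed in parallel by TileDB's internal thread pools.

```python
# Schematic only: both loops run in parallel, bounded by sm.compute_concurrency_level.
for tile in relevant_data_tiles:      # parallel across attributes and their data tiles
    for chunk in tile.chunks():       # parallel across the ~64KB chunks of each tile
        unfilter(chunk)               # reverse compression/encryption/checksum filters
```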
The “chunks” of a tile are controlled by a TileDB filter list parameter that defaults to 64KB.
The sm.compute_concurrency_level parameter impacts the for loops above, although it is not recommended to modify this configuration parameter from its default setting. The nested parallelism in reads allows for maximum utilization of the available cores for filtering (e.g., decompression), in either the case where the query intersects few large tiles or many small tiles.
A write query mainly involves the following steps in this order:
1. Re-organizing the cells in the global cell order and into attribute data tiles.
2. Filtering the attribute data tiles to be written.
3. Performing parallel IO to write those tiles to the storage backend.
TileDB parallelizes all steps, but here we discuss mainly steps (2) and (3) that are the most heavyweight.
For writes TileDB uses a similar strategy as for reads:
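Schematically (again with placeholder names, not the actual C++ implementation), the filtering stage of a write is parallelized as:

```python
# Schematic only: both loops run in parallel, bounded by sm.compute_concurrency_level.
for tile in attribute_data_tiles:     # parallel across attributes and their data tiles
    for chunk in tile.chunks():       # parallel across the chunks of each tile
        apply_filters(chunk)          # compression, checksums, encryption, etc.
```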
Similar to reads, the sm.compute_concurrency_level parameter impacts the for loops above, although it is not recommended to modify this configuration parameter from its default setting.
Similar to reads, IO tasks are created for each tile of every attribute. These IO tasks are dispatched for concurrent execution, where the maximum level of concurrency is controlled by the sm.io_concurrency_level parameter. For HDFS, this is the only parallelization TileDB provides for writes. For the other backends, TileDB parallelizes the writes further.
For POSIX and Windows, if a data tile is large enough, the VFS layer partitions the tile based on the configuration parameters vfs.file.max_parallel_ops and vfs.min_parallel_size. Those partitions are then written in parallel using the VFS thread pool, whose size is controlled by vfs.io_concurrency.
For S3, TileDB buffers potentially several tiles and issues parallel multipart upload requests to S3. The size of the buffer is equal to vfs.s3.max_parallel_ops * vfs.s3.multipart_part_size. When the buffer is filled, TileDB issues vfs.s3.max_parallel_ops parallel multipart upload requests to S3.
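The knobs mentioned above are plain configuration parameters; a TileDB-Py sketch of setting a few of them (the values and array URI are illustrative only) looks like this:

```python
import tiledb

cfg = tiledb.Config({
    "sm.compute_concurrency_level": "8",             # threads for filtering/unfiltering
    "sm.io_concurrency_level": "8",                  # threads for IO tasks
    "vfs.min_batch_size": str(20 * 1024 * 1024),     # merge nearby byte ranges
    "vfs.min_batch_gap": str(500 * 1024),
    "vfs.s3.max_parallel_ops": "8",
    "vfs.s3.multipart_part_size": str(5 * 1024 * 1024),
})
ctx = tiledb.Ctx(cfg)

with tiledb.open("my_array", ctx=ctx) as A:          # URI is a placeholder
    data = A[:]
```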
In TileDB, reads, writes, consolidation and vacuuming are all atomic and will never lead to array corruption.
A read operation is the process of (i) creating a read query object and (ii) submitting the query (potentially multiple times in the case of incomplete queries) until the query is completed. Each such read operation is atomic and can never corrupt the state of an array.
A write operation is the process of (i) creating a write query object, (ii) submitting the query (potentially multiple times in the case of global writes) and (iii) finalizing the query object (important only in global writes). Each such write operation is atomic, i.e., it is a set of functions (which depends on the API) that must be treated atomically by each thread. For example, multiple threads should not submit the query for the same query object. Instead, you can have multiple threads create separate query objects for the same array (even sharing the same context or array object) and prepare and submit them in parallel.
A write operation either succeeds and creates a fragment that is visible to future reads, or it fails and any folder and file relevant to the failed fragment is entirely ignored by future reads. A fragment creation is successful if a file <fragment_name>.ok appears in the array folder for the created fragment <fragment_name>. There will never be the case that a fragment will be partially written and still accessible by the reader. The user just needs to eventually delete the partially written folder to save space (i.e., a fragment folder without an associated .ok file). Furthermore, each fragment is immutable, so there is no way for a write operation to corrupt another fragment created by another operation.
Consolidation entails a read and a write and, therefore, it is atomic in the same sense as for writing. There is no way for consolidation to lead to a corrupted array state.
Vacuuming simply deletes fragment folders and array/fragment metadata files. Vacuuming always deletes the .ok files before proceeding to erasing the corresponding folders. It is atomic in the sense that it cannot lead to array corruption and if the vacuuming process is interrupted, it can be restarted without issues.
TileDB supports fast and parallel aggregation of results. Currently, the results can only be aggregated over the whole returned dataset, which this page calls the default channel. To add aggregates to a query, the first thing to do is to get the default channel. For count (a nullary aggregate), no operation needs to be created. For the other aggregates, an operation needs to be created on the desired column. That operation can then be applied to the default channel, while defining the output field name for the result (for count, a constant operation is provided that can be applied directly). Finally, buffers to receive the aggregate results can be specified using the regular buffer APIs on the query (see Basic Reading).
Note that ranges and query conditions can still be used to limit the rows to aggregate. Also note that TileDB allows getting the data and computing aggregates simultaneously. To do so, it is only required to specify buffers for the desired columns at the same time as the aggregated results. Here, the result of the aggregation will be available once the query is in a completed state (see Incomplete Queries).
Finally, here is a list of supported operations and information about the supported input field data types and the output data type.

Operation   | Input field types     | Output type
Count       | N/A                   | UINT64
Sum         | Numeric fields        | INT64 (signed), UINT64 (unsigned), FLOAT64 (floating point)
Min/Max     | Numeric/string fields | Same as input type
Null count  | Nullable fields       | UINT64
Mean        | Numeric fields        | FLOAT64
The operations are referred to by the following names:

Operation   | Name
Count       | "count"
Null count  | "null_count"
Sum         | "sum"
Min/Max     | "min", "max"
Mean        | "mean"
The presence of numerous fragments may impact the TileDB read performance. This is because many fragments would lead to fragment metadata being loaded to main memory from numerous different files in storage. Moreover, the locality of the result cells of a subarray query may be destroyed in case those cells appear in multiple different fragment files, instead of concentrated byte regions within the same fragment files.
To mitigate this problem, TileDB has a consolidation feature, which allows you to merge
Lightweight fragment metadata footers into a single file.
A subset of fragments into a single fragment.
A subset of array metadata files into a single one.
Consolidation is thread-/process-safe and can be done in the background while you continue reading from the array without being blocked. Moreover, consolidation does not hinder the ability to do time traveling at a fine granularity, as it does not delete fragments that participated in consolidation (and, therefore, they are still queryable). The user is responsible for vacuuming fragments, fragment metadata and array metadata that got consolidated to save space, at the cost of not being able to time travel across the old (finer) fragments.
Each fragment metadata file (located in a fragment folder) contains some lightweight information in its footer. This is mostly the non-empty domain and offsets for other metadata included in other parts of the file. If there are numerous fragments, reading the array may be slow on cloud object stores due to the numerous REST requests to fetch the fragment metadata footers. TileDB offers a consolidation process (with mode fragment_meta), which merges the fragment metadata footers of a subset of fragments into a single file that has suffix .meta, stored in the array folder. This file is named similarly to fragments, i.e., it carries a timestamp range that helps with time traveling. It also contains all the URIs of the fragments whose metadata footers are consolidated in that file. Upon reading an array, only this file is efficiently fetched from the backend, since it is typically very small in size (even for hundreds of thousands of fragments).
If mode fragments is passed to the consolidation function, then the fragment consolidation algorithm is executed, which is explained in detail below.
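Each consolidation mode is selected through the sm.consolidation.mode configuration parameter; a TileDB-Py sketch (the array URI is a placeholder):

```python
import tiledb

uri = "my_array"

# Merge fragment metadata footers into a single .meta file
tiledb.consolidate(uri, config=tiledb.Config({"sm.consolidation.mode": "fragment_meta"}))

# Merge a subset of fragments into a single fragment
tiledb.consolidate(uri, config=tiledb.Config({"sm.consolidation.mode": "fragments"}))

# Merge array metadata files into a single one
tiledb.consolidate(uri, config=tiledb.Config({"sm.consolidation.mode": "array_meta"}))
```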
There are two important points to stress regarding fragment consolidation:
Consolidating dense fragments produces a dense fragment, and may induce fill values.
Consolidating fragments where all fragments are sparse produces a sparse fragment.
The figure below shows consolidation of two dense fragments, the first containing only full tiles, and the second containing two tiles with a single cell written to each. Note that this can occur only in dense arrays, since sparse arrays can have only sparse fragments. The array in the figure has a 2x2 space tiling. Recall that a dense fragment consists of a dense hyper-rectangle and that it stores only integral tiles. Due to the partial cell in the second fragment that is located in the lower left space tile, the dense hyper-rectangle of the produced consolidated dense fragment must cover all four space tiles. Therefore, TileDB must fill the empty cells in this hyper-rectangle with empty values, illustrated in grey color in the figure below.
Consolidating only sparse fragments is simpler. The figure below illustrates consolidation of two sparse fragments, where the resulting consolidated fragment is also sparse and there is no injection of empty values.
Recall that each fragment is associated with its creation timestamp upon writing. A consolidated fragment instead is associated with the timestamp range that includes the timestamps of the fragments that produced it (see Consolidated Fragments). This is particularly important for time traveling, since opening an array at a timestamp will consider all the consolidated fragments whose end timestamp is at or before the query timestamp. In other words, although consolidation generally leads to better performance, it affects the granularity of time traveling.
Before the consolidation algorithm begins, TileDB applies a simple optimization in a pre-processing step, which may lead to great performance benefits depending on the “shape” of the existing fragments. Specifically, TileDB identifies dense fragments whose non-empty domain completely covers older adjacent (dense or sparse) fragments, and directly deletes the old fragment directories without performing any actual consolidation.
This clean-up process is illustrated with an example in the figure below. Suppose the first fragment is dense and covers the entire array, i.e., [1,4], [1,4], the second is dense and covers [1,2], [1,2], the third is sparse as shown in the figure, and the fourth one is dense covering [1,2], [1,4]. Observe that, if those four fragments were to be consolidated, the cells of the second and third fragments would be completely overwritten by the cells of the fourth fragment. Therefore, the existence of those two fragments would make no difference to the consolidation result. Deleting them altogether before the consolidation algorithm commences will boost the algorithm's performance (since fewer cells will be read and checked for overwrites).
The consolidation algorithm is performed in steps. In each step, a subset of adjacent (in the timeline) fragments is selected for consolidation. The algorithm proceeds until a configured number of steps has been executed, or until the algorithm determines that no further fragments should be consolidated. The choice of the next fragment subset for consolidation is based on certain rules and user-defined parameters, explained below. The number of steps is controlled by sm.consolidation.steps.
Let us focus on a single step, during which the algorithm must select and consolidate a subset of fragments based on certain criteria:
The first criterion is whether a subset of fragments is “consolidatable”, i.e., eligible for consolidation in a way that does not violate correctness. Any subset consisting solely of sparse fragments is always consolidatable. However, if a fragment subset contains one or more dense fragments, TileDB performs an important check: if the union of the non-empty domains of the fragments (which is equal to the non-empty domain of the resulting consolidated fragment) overlaps with any fragment created prior to this subset, then the subset is marked as non-consolidatable. Recall that the fragment resulting from consolidating a subset of fragments containing at least one dense fragment is always a dense fragment. Therefore, empty regions in the non-empty domain of the consolidated fragment will be filled with special values. Those values may erroneously overwrite older valid cell values. Such a scenario is illustrated in the figure below. The second and third fragments are not consolidatable, since their non-empty domain contains empty regions that overlap with the first (older) fragment. Consequently, consolidating the second and third fragments would result in a logical view that is not identical to the one before consolidation, violating correctness. This criterion detects and prevents such cases.
The second criterion is the comparative fragment size. Ideally, we must consolidate fragments of approximately equal size. Otherwise, we may end up in a situation where, for example, a 100GB fragment gets consolidated with a 1MB one, which would unnecessarily waste consolidation time. This is controlled by the parameter sm.consolidation.step_size_ratio; if the size ratio of two adjacent fragments is smaller than this parameter, then no fragment subset that contains those two fragments will be considered for consolidation.
The third criterion is the fragment amplification factor, applicable to the case where the fragment subset to be consolidated contains at least one dense fragment. If the non-empty domain of the resulting fragment has too many empty cells, its size may become considerably larger than the sum of sizes of the original fragments to be consolidated. This is because the consolidated fragment is dense and inserts special fill values for all empty cells in its non-empty domain (see figure below). The amplification factor is the ratio between the consolidated fragment size and the sum of sizes of the original fragments. This is controlled by sm.consolidation.amplification, which must not be exceeded for a fragment subset to be eligible for consolidation. The default value 1.0 means that the fragments will be consolidated only if there is no amplification at all, i.e., if the size of the resulting consolidated fragment is smaller than or equal to the sum of sizes of the original fragments. As an example, this happens when the non-empty domain of the consolidated fragment does not contain any empty cells.
The fourth criterion is the collective fragment size. Among all eligible fragment subsets for consolidation, we must first select to consolidate the ones that have the smallest sum of fragment sizes. This will quickly reduce the number of fragments (hence boosting read performance), without resorting to costly consolidation of larger fragments.
The final criterion is the number of fragments to consolidate in each step. This is controlled by sm.consolidation.step_min_frags and sm.consolidation.step_max_frags; the algorithm will select the subset of fragments (complying with all the above criteria) that has the maximum cardinality smaller than or equal to sm.consolidation.step_max_frags and larger than or equal to sm.consolidation.step_min_frags. If no fragment subset is eligible with cardinality at least sm.consolidation.step_min_frags, then the consolidation algorithm terminates.
The algorithm is based on dynamic programming and runs in time O(max_frags * total_frags), where total_frags is the total number of fragments considered in a given step, and max_frags is equal to the sm.consolidation.step_max_frags config parameter.
When computing the union of the non-empty domains of the fragments to be consolidated, if there is at least one dense fragment, the union is always expanded to coincide with the space tile extents. This affects criterion 1 (since the expanded domain union may now overlap with some older fragments) and criterion 3 (since the expanded union may amplify the resulting consolidated fragment size).
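As a minimal sketch in Python (TileDB-Py), the parameters above can be set through a configuration object that is passed to the consolidation call; the array URI and the parameter values are hypothetical examples, not recommendations:

```python
import tiledb

array_uri = "my_array"  # hypothetical array URI

config = tiledb.Config({
    "sm.consolidation.steps": "10",             # maximum number of consolidation steps
    "sm.consolidation.step_size_ratio": "0.5",  # adjacent fragments must have a size ratio >= 0.5
    "sm.consolidation.amplification": "1.0",    # skip subsets whose result would be amplified
    "sm.consolidation.step_min_frags": "2",     # consolidate at least 2 fragments per step
    "sm.consolidation.step_max_frags": "8",     # and at most 8 fragments per step
})

tiledb.consolidate(array_uri, config=config)
```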
Similar to array fragments, array metadata can also be consolidated (with mode array_meta). Since the array metadata is typically small and can fit in main memory, consolidating it is rather simple. TileDB simply reads all the array metadata (from all the existing array metadata fragments) into main memory, creates an up-to-date view of the metadata, and then flushes it to a new array metadata file that carries in its name the timestamp range determined by the first timestamp of the first and the second timestamp of the last array metadata files that got consolidated.
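A hedged TileDB-Py sketch of array metadata consolidation, assuming an existing array at a hypothetical URI:

```python
import tiledb

array_uri = "my_array"  # hypothetical array URI

# Consolidate only the array metadata by selecting the array_meta consolidation mode.
tiledb.consolidate(array_uri, config=tiledb.Config({"sm.consolidation.mode": "array_meta"}))
```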
Vacuuming applies to consolidated fragments, consolidated array metadata and consolidated fragment metadata as follows:
Fragments: During consolidation, a .vac file is produced with all the fragment URIs that participated in consolidation. When the vacuuming function is called with mode "fragments", all the fragment folders whose URI is in the .vac file get deleted.
Array metadata: During consolidation, a .vac file is produced with all the array metadata URIs that participated in consolidation. When the vacuuming function is called with mode "array_meta", all the array metadata files whose URI is in the .vac file get deleted.
Fragment metadata: Vacuuming simply deletes all .meta files except for the last one.
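The sketch below shows how these vacuuming modes can be invoked from Python (TileDB-Py); the array URI is hypothetical, and vacuuming should only be run once time traveling to the vacuumed fragments is no longer needed:

```python
import tiledb

array_uri = "my_array"  # hypothetical array URI

# Vacuum consolidated fragments, array metadata and fragment metadata in turn.
for mode in ("fragments", "array_meta", "fragment_meta"):
    tiledb.vacuum(array_uri, config=tiledb.Config({"sm.vacuum.mode": mode}))
```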
TileDB is architected to support parallel batch writes, i.e., writing collections of cells with multiple processes or threads. Each write operation creates one or more dense or sparse fragments. Updating an array is equivalent to initiating a new write operation, which could either insert cells in unpopulated areas of the domain or overwrite existing cells (or a combination of the two). TileDB handles each write separately and without any locking. Each fragment is immutable, i.e., write operations always create new fragments, without altering any other fragment.
A dense write is applicable to dense arrays and creates one or more dense fragments. In a dense write, the user provides:
The subarray to write into (it must be single-range).
The buffers that contain the attribute values of the cells that are being written.
The cell order within the subarray (which must be common across all attributes), so that TileDB knows which values correspond to which cells in the array domain. The cell order may be row-major, column-major, or global.
The example below illustrates writing into a subarray of an array with a single attribute. The figure depicts the order of the attribute values in the user buffers for the case of row- and column-major cell order. TileDB knows how to appropriately re-organize the user-provided values so that they obey the global cell order before storing them to disk. Moreover, note that TileDB always writes integral space tiles to disk. Therefore, it will inject special empty values (depicted in grey below) into the user data to create full data tiles for each space tile.
Writing in the array global order needs a little bit more care. The subarray must be specified such that it coincides with space tile boundaries, even if the user wishes to write in a smaller area within that subarray. The user is responsible for manually adding any necessary empty cell values in her buffers. This is illustrated in the figure below, where the user wishes to write in the blue cells, but has to expand the subarray to coincide with the two space tiles and provide the empty values for the grey cells as well. The user must provide all cell values in the global order, i.e., following the tile order of the space tiles and the cell order within each space tile.
Writing in global order requires knowledge of the space tiling and cell/tile order, and is rather cumbersome to use. However, this write mode leads to the best performance, because TileDB does not need to internally re-organize the cells along the global order. It is recommended for use cases where the data arrive already grouped according to the space tiling and global order (e.g., in geospatial applications).
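As a minimal Python (TileDB-Py) sketch of a dense write in row-major layout, assuming a hypothetical 2D dense array with a 2x3 domain and a single int32 attribute named "a":

```python
import numpy as np
import tiledb

array_uri = "dense_array"  # hypothetical 2D dense array (2x3 domain, int32 attribute "a")

# Row-major values covering the full domain; slicing a smaller region writes into a subarray instead.
data = np.array([[1, 2, 3],
                 [4, 5, 6]], dtype=np.int32)

with tiledb.open(array_uri, mode="w") as A:
    A[:] = data  # TileDB re-organizes the values along the global order before writing to disk
```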
TileDB uses the following default fill values for empty cells in dense writes, noting that the user can specify any other fill value upon array creation:
| Datatype | Default fill value |
| --- | --- |
| TILEDB_CHAR | Minimum char value |
| TILEDB_INT8 | Minimum int8 value |
| TILEDB_UINT8 | Maximum uint8 value |
| TILEDB_INT16 | Minimum int16 value |
| TILEDB_UINT16 | Maximum uint16 value |
| TILEDB_INT32 | Minimum int32 value |
| TILEDB_UINT32 | Maximum uint32 value |
| TILEDB_INT64 | Minimum int64 value |
| TILEDB_UINT64 | Maximum uint64 value |
| TILEDB_FLOAT32 | NaN |
| TILEDB_FLOAT64 | NaN |
| TILEDB_ASCII | 0 |
| TILEDB_UTF8 | 0 |
| TILEDB_UTF16 | 0 |
| TILEDB_UCS2 | 0 |
| TILEDB_UCS4 | 0 |
| TILEDB_ANY | 0 |
| TILEDB_DATETIME_* | Minimum int64 value |
In case a fixed-sized attribute stores more than one value per cell, all of the cell's values will be assigned the corresponding default fill value shown above.
Sparse writes are applicable to sparse arrays and create one or more sparse fragments. The user must provide:
The attribute values to be written.
The coordinates of the cells to be written.
The cell layout of the attribute and coordinate values to be written (must be the same across attributes and dimensions). The cell layout may be unordered or global.
Note that sparse writes do not need to be constrained in a subarray, since they contain the explicit coordinates of the cells to write into. The figure below shows a sparse write example with the two cell layouts. The unordered layout is the easiest and most typical; TileDB knows how to appropriately re-organize the cells along the global order internally before writing the values to disk. The global layout is once again more efficient but also more cumbersome, since the user must know the space tiling and the tile/cell order of the array, and manually sort the values before providing them to TileDB.
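A minimal Python (TileDB-Py) sketch of an unordered sparse write, assuming a hypothetical 2D sparse array with integer dimensions and a single int32 attribute named "a":

```python
import numpy as np
import tiledb

array_uri = "sparse_array"  # hypothetical 2D sparse array with int32 attribute "a"

# Explicit coordinates of the non-empty cells, plus the corresponding attribute values.
d1 = np.array([1, 2, 4], dtype=np.int64)
d2 = np.array([1, 3, 2], dtype=np.int64)
values = np.array([10, 20, 30], dtype=np.int32)

with tiledb.open(array_uri, mode="w") as A:
    A[d1, d2] = {"a": values}  # unordered layout; TileDB sorts along the global order internally
```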
TileDB enables concurrent writes and reads that can be arbitrarily mixed, without affecting the normal execution of a parallel program. This comes with a more relaxed consistency model, called eventual consistency. Informally, this guarantees that, if no new updates are made to an array, eventually all accesses to the array will “see” the last collective global view of the array (i.e., one that incorporates all the updates). Everything discussed in this section about array fragments is also applicable to array metadata.
We illustrate the concept of eventual consistency in the figure below (which is the same for both dense and sparse arrays). Suppose we perform two writes in parallel (by different threads or processes), producing two separate fragments. Assume also that there is a read at some point in time, which is also performed by a third thread/process (potentially in parallel with the writes). There are five possible scenarios regarding the logical view of the array at the time of the read (i.e., five different possible read query results). First, no write may have completed yet, therefore the read sees an empty array. Second, only the first write got completed. Third, only the second write got completed. Fourth, both writes got completed, but the first write was the one to create a fragment with an earlier timestamp than the second. Fifth, both writes got completed, but the second write was the one to create a fragment with an earlier timestamp than the first.
The concept of eventual consistency essentially tells you that, eventually (i.e., after all writes have completed), you will see the view of the array with all updates in. The order of the fragment creation will determine which cells are overwritten by others and, hence, greatly affects the final logical view of the array.
Eventual consistency allows high availability and concurrency. This model is followed by the AWS S3 object store and, thus, TileDB is ideal for integrating with such distributed storage backends. If strict consistency is required for some application (e.g., similar to that in transactional databases), then an extra layer must be built on top of TileDB Open Source to enforce additional synchronization.
But how does TileDB deal internally with consistency? This is where opening an array becomes important. When you open an array (at the current time or at a time in the past), TileDB takes a snapshot of the already completed fragments. This is the view of the array for all queries that will be using that opened array object. If writes happen (or get completed) after the array got opened, the queries will not see the new fragments. If you wish to see the new fragments, you will need to either open a new array object and use that one for the new queries, or reopen the array (reopening the array bypasses closing it first, permitting some performance optimizations).
We illustrate with the figure below. The first array depicts the logical view when opening the array. Next suppose a write occurs (after opening the array) that creates the fragment shown as the second array in the figure. If we attempt to read from the opened array, even after the new fragment creation, we will see the view of the third array in the figure. In other words, we will not see the updates that occurred between opening and reading from the array. If we'd like to read from the most up-to-date array view (fourth array in the figure), we will need to reopen the array after the creation of the fragment.
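The following Python (TileDB-Py) sketch illustrates the behavior described above; the array URI and timestamp are hypothetical:

```python
import tiledb

array_uri = "my_array"  # hypothetical array URI

# Opening takes a snapshot of the fragments that are complete at this moment.
A = tiledb.open(array_uri, mode="r")

# ... a concurrent write may create a new fragment here; queries on A will not see it ...

# Reopen to refresh the snapshot (cheaper than close + open), or open a new array object.
A.reopen()

# Time traveling: open the array at an earlier timestamp (milliseconds since the Unix epoch)
# to see only the fragments created at or before that time.
with tiledb.open(array_uri, mode="r", timestamp=1577836800000) as A_old:
    pass

A.close()
```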
When you write to TileDB with multiple processes, if your application is the one to be synchronizing the writes across machines, make sure that the machine clocks are synchronized as well. This is because TileDB sorts the fragments based on the timestamp in their names, which is calculated based on the machine clock.
Here is how TileDB reads achieve eventual consistency on AWS S3:
Upon opening the array, list the fragments in the array folder
Consider only the fragments that have an associated .ok file (the ones that do not have one are either in progress or not yet visible due to S3's eventual consistency)
The .ok file is PUT after all the fragment data and metadata files have been PUT in the fragment folder.
Any access inside the fragment folder is performed with a byte range GET request, never with LIST. Due to S3’s read-after-write consistency model, those GET requests are guaranteed to succeed.
The above practically tells you that a read operation will always succeed and never be corrupted (i.e., it will never have results from partially written fragments), but it will consider only the fragments that S3 makes visible (in their entirety) at the timestamp of opening the array.
TileDB allows you to encrypt your arrays at rest. It currently supports a single type of encryption, AES-256 in the GCM mode, which is a symmetric, authenticated encryption algorithm. When creating, reading or writing arrays you must provide the same 256-bit encryption key. The authenticated nature of the encryption scheme means that a message authentication code (MAC) is stored together with the encrypted data, allowing verification that the persisted ciphertext was not modified.
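As a hedged Python (TileDB-Py) sketch, the encryption key can be supplied through the configuration when opening an encrypted array; the URI and key below are placeholders:

```python
import tiledb

array_uri = "encrypted_array"             # hypothetical encrypted array URI
key = "0123456789abcdeF0123456789abcdeF"  # example 32-byte (256-bit) key; do not hard-code real keys

ctx = tiledb.Ctx(tiledb.Config({
    "sm.encryption_type": "AES_256_GCM",
    "sm.encryption_key": key,
}))

with tiledb.open(array_uri, mode="r", ctx=ctx) as A:
    print(A.schema)  # reads succeed only if the same key used at creation is provided
```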
Encryption libraries used:
macOS and Linux: OpenSSL
Windows: the native Windows cryptography API (CNG)
By default, TileDB caches array data and metadata in main memory after opening and reading from arrays. These caches will store decrypted (plaintext) array data in the case of encrypted arrays. For a bit of extra in-flight security (at the cost of performance), you can disable the TileDB caches via the relevant configuration parameters.
TileDB never persists the encryption key, but TileDB does store a copy of the encryption key in main memory while an encrypted array is open. When the array is closed, TileDB will zero out the memory used to store its copy of the key, and free the associated memory.
Due to the extra processing required to encrypt and decrypt array metadata and attribute data, you may experience lower performance on opening, reading and writing for encrypted arrays.
To mitigate this, TileDB internally parallelizes encryption and decryption using a chunking strategy. Additionally, when compression or other filtering is configured on array metadata or attribute data, encryption occurs last, meaning the compressed (or, in general, filtered) data is what gets encrypted.
Finally, newer generations of some Intel and AMD processors offer instructions for hardware acceleration of encryption and decryption. The encryption libraries that TileDB employs are configured to use hardware acceleration if it is available.
The array metadata are simple key-value pairs that the user can attach to an array. The key is a string and the value can be of any datatype. The array metadata is typically small. Time traveling applies to array metadata as well, i.e., opening an array at a timestamp will fetch only the array metadata created at or before the given timestamp.
Both the array schema and the array metadata store information about the array, and the user is responsible for setting and configuring them. The easiest way to remember the difference between the array metadata and the array schema is the following:
The array metadata stores user-specific data about the array in the form of arbitrary key-value pairs.
The array schema stores system-specific data about the array that has a fixed structure (e.g., a dimension name, domain and datatype).
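A minimal Python (TileDB-Py) sketch of writing and reading array metadata, assuming an existing array at a hypothetical URI:

```python
import tiledb

array_uri = "my_array"  # hypothetical array URI

# Write key-value metadata; keys are strings, values can be numbers, strings, etc.
with tiledb.open(array_uri, mode="w") as A:
    A.meta["description"] = "daily temperature readings"
    A.meta["version"] = 3

# Read the metadata back.
with tiledb.open(array_uri, mode="r") as A:
    print(A.meta["description"], A.meta["version"])
```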
The array schema stores all the details about the array definition. Some of the data it holds are:
Attributes (name, datatype, filters)
Dimensions (name, datatype, domain, filters)
Tile extent and capacity
Tile and cell order
See the array schema documentation for more details.
TileDB Cloud offers a very simple way of sharing arrays with anyone on the planet. It effectively provides a way to define access control on arrays, and to log every single action for auditing purposes.
A non-empty cell (in either a dense or sparse array) is not limited to storing a single value. Each cell stores a tuple with a structure that is common to all cells. Each tuple element corresponds to a value on a named attribute of a certain type. An attribute can be:
Fixed-sized: an attribute value in a cell may consist of one or a fixed number of values of the same datatype
Variable-sized: an attribute value in a cell may consist of a variable number of values of the same datatype, i.e., different cells may store a different number of values on this attribute.
The figure below shows an example of an array with 3 attributes: a1 of type int32, a2 of type char:var and a3 of type float32:2. Every non-empty cell must store 1 int32 value on a1, any number of char values on a2, and exactly 2 float32 values on a3.
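The three attributes of the figure could be declared in Python (TileDB-Py) roughly as sketched below; the names come from the figure, and the compound NumPy dtype used for the two-value float32 attribute is an assumption that may vary across TileDB-Py versions:

```python
import numpy as np
import tiledb

a1 = tiledb.Attr(name="a1", dtype=np.int32)                      # exactly 1 int32 value per cell
a2 = tiledb.Attr(name="a2", dtype=str)                           # variable number of characters per cell
a3 = tiledb.Attr(name="a3", dtype=np.dtype("float32, float32"))  # assumed encoding of 2 float32 values per cell
```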
An ordered tuple of dimension domain values, called coordinates, identifies an array cell. The order of the coordinates must follow the order in which the array dimensions were specified. The figure below depicts an example of cell (3, 4), assuming that the dimension order is d1, d2.
The coordinates of an array cell form an ordered tuple of dimension domain values that identifies the cell. In dense arrays, the coordinates of each cell are unique. In sparse arrays, the same coordinates may appear more than once.
TileDB adopts the so-called columnar format and stores the (non-empty) cell values for each attribute separately. A data tile is a subset of cell values on a particular attribute. We explain the data tile separately for dense and sparse fragments, and its relationship to the space tile. The data tile is the atomic unit of compression and IO.
Contrary to dense fragments, there is no correspondence between space tiles and data tiles in sparse fragments. Consider the 8x8 fragment with 4x4 space tiles in the figure below, and assume for simplicity that the array stores a single int32 attribute. The non-empty cells are depicted in blue. If we followed the data tiling technique of dense fragments, we would have to create 4 data tiles, one for each space tile. However, TileDB does not materialize empty cells, i.e., it stores only the values of the non-empty cells in the data files. Therefore, the space tiles would produce 4 data tiles with 3 (upper left), 12 (upper right), 1 (lower left) and 2 (lower right) non-empty cells.
The physical tile size imbalance that may result from space tiling can lead to ineffective compression (if numerous data tiles contain only a handful of values) and inefficient reads (if the subarray you wish to read only partially intersects with a huge tile, which needs to be fetched in its entirety and potentially decompressed). Ideally, we want every data tile to store the same number of non-empty cells. Recall that this is achieved in the dense case by definition, since each space tile has the same shape (equal number of cells) and all cells in each space tile are non-empty. Finally, since the distribution of the non-empty cells in the array may be arbitrary, it is extremely difficult, or even impossible, to fine-tune the space tiling in a way that leads to load-balanced data tiles storing an acceptable number of non-empty cells.
In other words, the space tiles in sparse fragments are used to determine the global cell order that will dictate which cell values will be grouped together in the same data tile. Another difference to dense fragments is that sparse fragments create extra data tiles for the coordinates of the non-empty cells, which is important in reading.
The main differences between a dense and a sparse array are the following:
A dense array is used when the majority of the cells are non-empty (within any hyper-rectangular sub domain), whereas a sparse array when the majority of the cells are empty.
The dimensions of a dense array must have the same datatype, whereas the dimensions of a sparse array may have different datatypes.
The dimensions of a dense array can only be of integer data type, whereas the dimensions of a sparse array may be of any data type (even real or string).
Every cell in a dense array is uniquely identified by its coordinates, whereas a sparse array can permit multiplicities, i.e., cells with the same coordinates but potentially different attribute values, as well as real (float32, float64) and string domains.
TileDB provides a unified API for both dense and sparse arrays.
A multi-dimensional array consists of a set of ordered dimensions. A dimension has a name, a datatype and a domain. The figure below shows an example of two int32 dimensions, d1 with domain [1,4] and d2 with domain [2,6].
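In Python (TileDB-Py), the two dimensions of the figure could be created as sketched below; the tile extents are arbitrary assumptions, since the figure does not specify them:

```python
import numpy as np
import tiledb

d1 = tiledb.Dim(name="d1", domain=(1, 4), tile=2, dtype=np.int32)
d2 = tiledb.Dim(name="d2", domain=(2, 6), tile=2, dtype=np.int32)
dom = tiledb.Domain(d1, d2)  # the array domain is the hyperspace [1,4] x [2,6]
```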
The array domain (or simply domain) is the hyperspace defined by the domains of the array dimensions. In a dense array, all dimensions must have the same datatype (homogeneous dimensions) and can only be integers. In a sparse array, the dimensions may have different datatypes (heterogeneous dimensions) and can be of any datatype (even real or string).
The non-empty domain is the tightest hyper-rectangle that contains all non-empty cells. An example is shown in the figure below.
The dimension domains can have negative, real and string values. An array cell is still identified by its coordinates, which take any value from the corresponding dimension domain.
In our examples, the orientation of each dimension domain is rather arbitrary and does not affect the array definition. It is just a matter of convention. For example, the lower values may be at the top or bottom of the vertical dimension.
Not all array cells may contain values. A cell that contains values is called non-empty; otherwise, it is called empty.
A fragment is a timestamped snapshot of a portion of the array, which is produced during writes. A fragment may be dense or sparse as shown in the figure below. In a dense fragment, the non-empty cells are contained in a full hyper-rectangle in the domain. This hyper-rectangle may cover the full domain or any subdomain. In a sparse fragment, the non-empty cells may be arbitrary, i.e., not necessarily comprise a full hyper-rectangle.
An array may consist of multiple fragments. Those fragments are completely transparent to the user, who only sees the combined logical view of the array upon reading. This is produced by superimposing the more recent fragments on top of the older ones, with the more recently written cells overwriting the older ones. A dense array may consist of both dense and sparse fragments, but a sparse array may consist only of sparse fragments.
The fragment metadata is system-specific information about a fragment. Some of the information this metadata includes is:
Dense or sparse
Non-empty domain
Tile offsets
Tile sizes
R-Tree (for the sparse case)
The tile and cell order collectively determine the global cell order. The global cell order is essentially a mapping from the multi-dimensional cell space to the 1-dimensional physical storage space for the non-empty cells, i.e., it is the order in which TileDB stores the cell values on disk. The figure below shows the 4 possible global cell orders resulting from all combinations of tile/cell orders. The numbers indicate the relative positions of the non-empty cells along the global order.
Groups allow hierarchically organizing arrays and other groups.
The non-empty domain of an array is the minimum bounding hyper-rectangle that tightly encompasses all non-empty cells in the array.
A space tile is defined by specifying a tile extent along each dimension. The domain of each dimension is partitioned into segments equal to the tile extent, and hyper-rectangular tiles are formed in the multi-dimensional array space. The space tile concept applies to both dense and sparse arrays (as well as real dimensions) and is independent of the actual data stored in the array.
A subarray is an array slice. A single-range subarray is defined by a single domain range along each dimension. A multi-range subarray is defined by multiple ranges per dimension; the resulting slice is formed by the cross-product of the ranges along all dimensions. Multi-range subarrays are applicable only to reads, and to both dense and sparse arrays.
Row-major: Assuming each tile or cell can be identified by a set of coordinates in the multi-dimensional space, row-major means that the rightmost coordinate index “varies the fastest”.
Column-major: Assuming each tile or cell can be identified by a set of coordinates in the multi-dimensional space, column-major means that the leftmost coordinate index “varies the fastest”.
Once you install TileDB, visit the Usage page to see how to use TileDB in your programs.
Conda will install pre-built TileDB-Py and TileDB core binaries for Windows, macOS, or Linux. Pip currently provides binary wheels for Linux, and will build all dependencies from source on other platforms (see the TileDB-Py documentation for more information).
TileDB needs to be installed beforehand (from a package or from source) for the TileDB-R package to build and link correctly. See the installation instructions for methods of installing TileDB.
If the TileDB library is installed in a custom location, you need to pass the explicit path:
To fix this, simply export the TAR environment variable before starting R.
After running the commands above, our TileDB-Project/ConsoleApp/ConsoleApp.csproj will have the following configuration, providing access to the TileDB-CSharp API in our project.
Here is a sample go.mod file:
TileDB needs to be installed beforehand (from a package or from source) for the TileDB-Go library to build and link correctly.
Default read/write queries in TileDB are synchronous or blocking. This means that the user function that submits the query has to block and wait until TileDB is done processing it. There are scenarios in which you may want to submit the query in an asynchronous or non-blocking fashion, i.e., submit the query but tell TileDB to process it in the background, while you proceed with the execution of your function and perform other tasks in parallel. TileDB supports asynchronous queries and enables you to check the query status (e.g., whether it is still in progress). It also allows you to pass a callback upon submission, i.e., to specify a function that you wish TileDB to execute upon finishing processing the query. This applies to both dense and sparse arrays, as well as to both write and read queries.
The figure below shows the difference between synchronous and asynchronous query execution.
TileDB allocates a separate thread pool for asynchronous queries, whose size is controlled by the configuration parameter sm.num_async_threads (defaulting to 1).
Note that the above is the logical representation of the attribute values in the cells. TileDB physically stores the values in a "columnar" manner, i.e., all the values along each of the attributes are stored in separate files. See the physical storage description for more details.
TileDB performs writes in immutable fragments, which are self-contained directories inside the array directory. If the number of fragments becomes excessive or if several fragments are small in size, it is beneficial to consolidate them into fewer, larger ones. TileDB offers various ways to consolidate fragments. It also enables consolidating only the fragment metadata footers in order to boost performance on cloud object stores.
In a dense fragment, each space tile corresponds to N data tiles, where N is the number of attributes.
TileDB solves this seemingly very challenging problem in a surprisingly simple manner. It first sorts the non-empty cells along the global cell order (defined by specifying the space tiles, tile and cell order as in the dense case), then separates the cell attribute values (as in the dense case), and then creates data tiles on each attribute by grouping adjacent cells based on a user-defined parameter called the capacity.
In case the user slices some empty space from a dense array, the selected attributes are assigned special fill values (to represent "empty"). TileDB uses default fill values but also provides a way to set custom fill values during the creation of an attribute and the array schema.
TileDB offers a variety of filters, which are applied to data tiles of the attributes and dimensions when writing to disk (called filtering) or when reading from disk (called unfiltering).
See the format specification for the detailed description of the fragment metadata.
An incomplete query occurs when the result size of a subarray is larger than the allocated buffers that will hold the result. TileDB handles this case via result estimation and subarray partitioning.
Each data tile of a nullable attribute is always accompanied by a validity tile, which stores the validity of each corresponding cell. Non-zero validity values represent non-null cells, while zero values represent null cells. This is applicable to both fixed-sized and var-sized attributes.
TileDB uses R-trees for multi-dimensional indexing in sparse fragments. This allows for fast pruning of irrelevant data tiles during reading.
TileDB allows the user to specify an order for the space tiles, as well as for the cells inside each space tile. The order can be:
TileDB performs writes in immutable fragments, each of which is timestamped. It then offers a way to open the array at user-specified timestamps and, therefore, read views of the array at different points in time.
TileDB Cloud allows for the execution of user-defined functions (UDFs) written in various languages. Those are essentially arbitrary computations that are dispatched to the TileDB Cloud platform, avoiding the need to manually spin up compute instances and clusters in the cloud. TileDB Cloud allows the execution of UDFs in parallel, enabling massive scalability.
TileDB retains the original fragments after a consolidation process in order to continue to offer fine-grained time traveling. TileDB also offers a vacuuming process that deletes the fragments that took part in consolidation in order to save space.
Each data tile of a variable-sized attribute is always accompanied by an offset tile, which stores the starting byte position of every variable-sized value in the data tile. This allows TileDB to locate the i-th cell value in a data tile in constant time.
If you are using R inside conda and want to install TileDB-R, you might run into a build error, shown below:
Use the NuGet package to create a .NET console application. The TileDB NuGet package is currently compatible with .NET 5 and above.
Your project must declare a runtime identifier (RID); otherwise the native TileDB Open Source binaries will not be imported and the app might fail at runtime.
To test our project, we can edit TileDB-Project/ConsoleApp/Program.cs and obtain the version of TileDB core currently in use by TileDB.CSharp. For more examples using the C# API, see the TileDB-CSharp repository on GitHub.
The TileDB-R package is available on CRAN, which provides binaries for Windows and macOS that can be installed via install.packages("tiledb"). On Linux, this results in installation from source. For all operating systems, one can also clone the repository and create a compressed tarfile to check and install as described in the R manual, or install directly from GitHub. We also describe installing releases from GitHub below.
If the TileDB library is installed in a custom location, you need to pass the explicit path:
To build the latest development version of TileDB-R:
install_github will delete all temporary files upon failure. To debug build failures, clone this repository locally and build a compressed tarfile to check and install, or run the command devtools::install("/path/to/TileDB-R").
If you are using the TileDB Conda package, you may need to explicitly add the conda path after activating the environment with conda activate tiledb.
Instructions for setting up a RStudio development environment, building, and testing the TileDB-R package are located in the developer documentation wiki.
If you experience issues when installing devtools, see these instructions. If the problem persists, you can install devtools with conda by running:
Once you install TileDB, visit the Usage page to see how to use TileDB in your programs.
The core TileDB library can be installed easily using the Homebrew package manager for macOS. Install instructions for Homebrew are provided on the package manager’s website.
To install the latest stable version of TileDB:
HDFS and S3 backends are enabled by default. To disable one or more backends, use the corresponding --without- switch:
A full list of build options can be viewed with the info command:
Other helpful brew commands:
The Homebrew Tap is located at https://github.com/TileDB-Inc/homebrew.
TileDB is available as a pre-built Docker image. For the latest version, run:
For a specific TileDB version, run:
A package for TileDB is available for the Conda package manager. Conda makes it easy to install software into separate, distinct environments on Windows, Linux, and macOS.
If you are compiling or linking against the TileDB conda package, you may need to explicitly add the conda path after activating the environment with conda activate tiledb, since conda activate sets the CONDA_PREFIX environment variable:
Instead of exporting those environment variables, you can pass them as command line flags during compilation:
You can download pre-built Windows binaries in the .zip file from the latest TileDB release. You can then simply configure your project (if you are using Visual Studio) according to the Windows usage instructions.
TileDB binaries for each release are available on GitHub Releases for the following operating system and architecture combination:
Windows on x64
macOS on x64
macOS on Apple Silicon (arm64)
glibc-based Linux on x64 (with and without AVX2 support)
These binaries are also available on NuGet, for use by the C# API.
Begin by downloading a release tarball, or by cloning the TileDB GitHub repo and checking out a release tag (where <version> is the version you wish to use, e.g., 1.7.4):
To configure TileDB, use the bootstrap script:
The flags for the bootstrap script and the CMake equivalents are as follows:
| Flag | Description | CMake Equivalent |
| --- | --- | --- |
| --help | Prints command line flag options | N/A |
| --prefix=PREFIX | Install files in tree rooted at PREFIX (defaults to TileDB/dist) | CMAKE_INSTALL_PREFIX=<PREFIX> |
| --dependency=DIRs | Colon-separated list of paths to binary dependencies | CMAKE_PREFIX_PATH=<DIRs> |
| --enable-debug | Enable debug build | CMAKE_BUILD_TYPE=Debug |
| --enable-coverage | Enable build with code coverage support | CMAKE_BUILD_TYPE=Coverage |
| --enable-verbose | Enable verbose status messages | TILEDB_VERBOSE=ON |
| --enable-hdfs | Enables building with HDFS storage backend support | TILEDB_HDFS=ON |
| --enable-s3 | Enables building with S3 storage backend support | TILEDB_S3=ON |
| --enable-azure | Enables building with Azure Blob Storage backend support | TILEDB_AZURE=ON |
| --enable-gcs | Enables building with Google Cloud Storage backend support | TILEDB_GCS=ON |
| --enable-serialization | Enables building with serialization and TileDB Cloud support | TILEDB_SERIALIZATION=ON |
| --enable-static-tiledb | Enables building TileDB as a static library | TILEDB_STATIC=ON |
| --disable-werror | Disables building with the -Werror flag | TILEDB_WERROR=OFF |
| --disable-cpp-api | Disables building the TileDB C++ API | TILEDB_CPP_API=OFF |
| --disable-stats | Disables internal TileDB statistics | TILEDB_STATS=OFF |
| --disable-tests | Disables building the TileDB test suite | TILEDB_TESTS=OFF |
To build after configuration, run the generated make script
To install to the configured prefix
Note that building against the installed shared library requires setting the library search path at build or run time, as documented in Usage. (System-wide installations requiring sudo permissions may avoid this step by running sudo ldconfig after installation.)
Other helpful makefile targets:
Building TileDB on Windows has been tested to work with Microsoft Visual Studio 2019 and later. You can install the free Community Edition if you’d like the full IDE, or the Build Tools if you don’t need or want the IDE installed.
During the Visual Studio setup process, make sure the Git for Windows component is selected if you do not already have a working Git installation. Also be sure to select the CMake component if you do not have a working CMake installation.
In addition, you will need to install PowerShell (free).
To build and install TileDB, first open PowerShell, clone the TileDB repository, and check out a release tag (where <version> is the version you wish to use, e.g., 1.7.4):
Next, ensure the CMake binaries are in your path. If you installed Visual Studio, execute
Create a build directory and configure TileDB
The flags for the bootstrap script and the CMake equivalents are as follows:
| Flag | Description | CMake Equivalent |
| --- | --- | --- |
| -? | Display a usage message | n/a |
| -Prefix | Install files in tree rooted at PREFIX (defaults to TileDB\dist) | CMAKE_INSTALL_PREFIX=<PREFIX> |
| -Dependency | Semicolon-separated list of paths to binary dependencies | CMAKE_PREFIX_PATH=<DIRs> |
| -CMakeGenerator | Optionally specify the CMake generator string, e.g. "Visual Studio 15 2017". Check 'cmake --help' for a list of supported generators. | -G <generator> |
| -EnableDebug | Enable debug build | CMAKE_BUILD_TYPE=Debug |
| -EnableVerbose | Enable verbose status messages | TILEDB_VERBOSE=ON |
| -EnableS3 | Enables building with the S3 storage backend | TILEDB_S3=ON |
| -EnableGcs | Enables building with the Google Cloud Storage backend | TILEDB_GCS=ON |
| -EnableSerialization | Enables serialization and TileDB Cloud support | TILEDB_SERIALIZATION=ON |
| -EnableStaticTileDB | Enables building TileDB as a static library | TILEDB_STATIC=ON |
| -DisableWerror | Disables building with the /WX flag | TILEDB_WERROR=OFF |
| -DisableCppApi | Disables building the TileDB C++ API | TILEDB_CPP_API=OFF |
| -DisableTBB | Disables use of TBB for parallelization | TILEDB_TBB=OFF |
| -DisableStats | Disables internal TileDB statistics | TILEDB_STATS=OFF |
| -DisableTests | Disables building the TileDB test suite | TILEDB_TESTS=OFF |
To build after configuration
To install
Other helpful build targets:
If you build libtiledb in Release mode (resp. Debug), make sure to build check and examples in Release mode as well (resp. Debug); otherwise the test and example executables will not run properly.
Should you experience any problem with the build, it is always a good idea to delete the build and dist directories in your TileDB repo path and restart the process, as cmake's cached state could present some unexpected problems.
Cygwin is a Unix-like environment and command line interface for Microsoft Windows that provides a large collection of GNU/open-source tools (including the gcc toolchain) and supporting libraries offering substantial POSIX API functionality. TileDB can be compiled from source in the Cygwin environment if Intel TBB is disabled and some TileDB dependencies are installed as Cygwin packages.
The following Cygwin packages need to be installed:
gcc / g++
git
cmake
make
lz4-devel
zlib-devel
libzstd-devel (+src)
bzip2 (+src)
openssl-devel
You can then clone and build TileDB using git / cmake / make:
To build the JNI extension you need to install:
CMake (>= 3.3)
JDK (>=1.8)
To build the library with the native library bundled in, run:
This will create the TileDB JNI library build/tiledb_jni/libtiledbjni.dylib. This will also download and build the TileDB core library if it is not found installed in a global system path, and place it in build/externals/install/lib/libtiledb.dylib.
If you wish to build with a custom version of the TileDB core library, you can define the TILEDB_HOME environment variable, e.g.:
env TILEDB_HOME=/path/to/TileDB/dist ./gradlew assemble
Note that if you build with a custom native TileDB library, it will only be bundled into the jar if the native static library was produced.
If TileDB is not globally installed in the system where the JNI library is being compiled, the TileDB core library will be compiled. There are multiple properties that can be configured, including S3 and HDFS support.
See gradle.properties for all properties which can be set for building.
The properties can be set via the -P option to gradlew:
To run the tests use:
TileDB has been tested on Ubuntu Linux (v.20.04+), CentOS Linux (v.7+, with updated devtoolset compiler), macOS (v.11) and Windows (7+), but TileDB should work with any reasonably recent version of Ubuntu, CentOS, macOS or Windows with an installed compiler supporting C++20 (minimum tested version: GCC 10).
Once you build TileDB, visit the Usage page to see how to use TileDB in your programs.
TileDB requires a recent version of the CMake build system (which vcpkg will update if needed) and a compiler supporting C++20. For compression, TileDB relies on the following libraries:
When building from source, TileDB will locate these dependencies if already installed on your system, and locally install (not system-wide) any of them that are missing.
Backend support for S3 stores requires the AWS C++ SDK. Similarly to the required dependencies, the TileDB build system will install the SDK locally if it is not already present on your system (when the S3 build option is enabled).
TileDB also integrates well with the S3-compliant minio object store.
Backend support for the Hadoop File System HDFS is optional. TileDB relies on the C interface to HDFS provided by libhdfs to interact with the distributed filesystem.
During the build process the following environment variables must be set:
JAVA_HOME: Path to the location of the Java installation.
HADOOP_HOME: Path to the location of the HDFS installation.
CLASSPATH: The Hadoop jars must be added to the CLASSPATH before interacting with libhdfs.
Consult the HDFS user guide for installing, setting up, and using the distributed Hadoop file system.
HDFS is not currently supported on Windows.
If any dependencies are not found pre-installed on your system, the TileDB build process will download and build them automatically. Preferentially, any dependencies built by this process will be built as static libraries, which are statically linked against the TileDB shared library during the build. This simplifies usage of TileDB, as it results in a single binary object, e.g., libtiledb.so, that contains all of the dependencies. When installing TileDB, only the TileDB include files and the dynamic object libtiledb.so will be copied into the installation prefix.
If TileDB is itself built as a static library (using the TILEDB_STATIC=ON CMake variable or corresponding bootstrap flag), the dependency static libraries must be installed alongside the resulting static libtiledb.a object. This is because static libraries cannot be statically linked together into a single object (at least not in a portable way). Therefore, when installing TileDB, all static dependency libraries will be copied into the installation prefix alongside libtiledb.a.
Build dependencies:
NumPy
Cython
pybind11
scikit-build-core
C++20 compiler
CMake
Runtime Dependencies
NumPy
Simply execute the following commands:
If you wish to modify the install process, you can use these environment variables:
TILEDB_PATH: Path to the TileDB core library. If this variable is set and the library is found in the specified folder, it is not copied inside the wheel.
TILEDB_VERSION: Version of the TileDB core library that you wish to download. This version must be present in the GitHub releases.
TILEDB_HASH: SHA256 sum of the desired TileDB core library release. Only used when TILEDB_VERSION is set.
To build against libtiledb installed with conda, run:
To test your local installation, install the optional dependencies, and then use pytest:
If TileDB is installed in a non-standard location, you also need to make the dynamic linker aware of libtiledb's location. Otherwise, when importing the tiledb module you will get an error that the built extension module cannot find libtiledb's symbols:
For macOS the linker environment variable is DYLD_LIBRARY_PATH.
If you are building the extension on Windows, first install a Python distribution such as Miniconda. You can then either build TileDB from source, or download the pre-built binaries.
Once you've installed Miniconda and TileDB, execute:
Note that if you built TileDB locally from source, then replace set TILEDB_PATH=C:/path/to/TileDB with TILEDB_PATH=C:/path/to/TileDB/dist.
The repository that contains the Docker files for TileDB:
Note that this contains only the TileDB core (C/C++) library and the Python bindings. The reason we exclude all the other bindings (e.g., Java) is to keep the Docker image size relatively small.
Download the prebuilt Docker images from Dockerhub:
Install the Docker daemon from the Docker website.
Clone the TileDB-Docker repo and build the images:
There is also a tiledb:dev image if you'd like the latest and greatest (but potentially unstable) TileDB version.
To run:
If you'd like to build TileDB with optional components such as HDFS support, use the enable build argument when building the images, e.g.:
This package requires the TileDB shared library to be installed and on the system path.
Currently the following platforms are supported:
Linux
macOS
To install the Go bindings:
To install package test dependencies:
Package tests can be run with:
TileDB-Go follows semantic versioning. TileDB-Go version 0.X.Y is compatible with TileDB core version 1.X.Y.
The following TileDB core library features are missing from the Go API:
TileDB object management
To use the TileDB C API in a program, just add #include <tiledb/tiledb.h> and specify -ltiledb when compiling, e.g.:
To use the C++ API, add #include <tiledb/tiledb> to your C++ project instead. The TileDB C++ API requires a compiler with C++17 support, so your project must be compiled using the C++17 standard, e.g.:
If TileDB was installed in a non-default location on your system, use the -I and -L options:
At runtime, if TileDB is installed in a non-default location, you must make the linker aware of where the shared library resides by exporting an environment variable:
You can avoid the use of these environment variables by installing TileDB in a global (standard) location on your system, or by hard-coding the path to the TileDB library at build time by configuring the rpath, e.g.:
Building your program this way will result in a binary that will run without having to configure the LD_LIBRARY_PATH or DYLD_LIBRARY_PATH environment variables.
Alternatively, when installing to system-wide paths known to ldconfig (typically in /etc/ld.so.conf.d/* or /etc/ld.so.conf), run sudo ldconfig after installation to update the search cache.
To use TileDB from a Visual Studio C++ project, you need to add project properties telling the compiler and linker where to find the headers and libraries.
Open your project's Property Pages. Under the General options for C/C++, edit the "Additional Include Directories" property. Add a new entry pointing to your TileDB installation (either built from source or extracted from the binary release .zip file), e.g., C:\path\to\TileDB\dist\include.
Under the General options for the Linker, edit the "Additional Library Directories" property. Add a new entry pointing to your TileDB installation, e.g., C:\path\to\TileDB\dist\lib. Under the Input options for the Linker, edit "Additional Dependencies" and add tiledb.lib.
You should now be able to add #include <tiledb/tiledb.h> (C API) or #include <tiledb/tiledb> (C++ API) in your project.
When building your project in Visual Studio, ensure that the x64 build configuration is selected. Because TileDB is currently only available as a 64-bit library, applications that link with TileDB must also be 64-bit.
At runtime, the directory containing the DLLs must be in your PATH environment variable, or you will see error messages at startup that the TileDB library or its dependencies could not be located. You can do this in Visual Studio by adding PATH=C:\path\to\TileDB\dist\bin to the "Environment" setting under "Debugging" in the Property Pages. You can also do this from the Windows Control Panel, or at the command prompt like so:
Should you experience any problem with the usage (e.g., getting errors about missing .dll files when running a program), it is always a good idea to delete the build and dist directories in your TileDB repo path and restart the build from scratch, as cmake's cached state could present some unexpected problems.
TileDB includes support for CMake's find_package(). To use it, TileDB must be installed globally or CMAKE_PREFIX_PATH must be set to the TileDB installation directory.
For example, if TileDB was built with ../bootstrap and no prefix was given, then the </path/to/TileDB>/dist/lib/cmake/TileDB directory will contain the TileDBConfig.cmake file used for find_package(TileDB). In your project, you would set CMAKE_PREFIX_PATH like so:
You can also pass this like any other CMake variable on the command line when configuring your project, e.g.:
To link the executable MyExe in your project with the TileDB shared library, you would then use:
While disabled by default, TileDB can also be built as a static library. To do this, use the --enable-static-tiledb (macOS/Linux) or -EnableStaticTileDB (Windows) bootstrap flag when configuring TileDB, or use the CMake equivalent flag -DTILEDB_STATIC=ON. Then in your project simply link against the tiledb_static target instead:
Build dependencies:
.NET 7 SDK
.NET 7 is needed only to build from source; the compiled binaries support at minimum .NET 5.
Building TileDB-CSharp from source can be done using the following commands in a terminal.
As a final build step, we can verify our installation by running the unit tests:
To help get started using TileDB-CSharp, we can run the provided example project:
After the TileDB.CSharp project is built and the tests are passing, we can make a new .NET project and add a reference to TileDB.CSharp, granting us access to the TileDB-CSharp API. This results in TileDB-Project/ConsoleApp/ConsoleApp.csproj generating the following configuration:
After installing TileDB, from an R shell:
You can look at the examples in the TileDB source repository for an example project structure that links against TileDB.
The TileDB.CSharp project depends on prebuilt native TileDB binaries. During development you can provide your own native library for purposes like testing. To do that, go to the Directory.Packages.props file of your repository and set the LocalLibraryFile property to the path of your local native binary. This will bypass the standard acquisition mechanism and simply copy the library to your project's output directory.
The shipped TileDB.CSharp NuGet package supports only the official native binaries at the moment. Please reach out if you want to use TileDB from C# with custom native binaries.
After building TileDB-Java, you can run the examples located in path/to/TileDB-Java/src/main/java/examples using your IDE or from a terminal.
To run an example from the terminal, use:
You may need to explicitly define the Java library path if not using the bundled jar:
After creating some dimensions, you can create the array domain as follows:
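A minimal Python (TileDB-Py) sketch of assembling a domain from dimensions; the dimension names, domains, and tile extents are hypothetical:

```python
import numpy as np
import tiledb

d1 = tiledb.Dim(name="d1", domain=(1, 100), tile=10, dtype=np.int32)
d2 = tiledb.Dim(name="d2", domain=(1, 100), tile=10, dtype=np.int32)

# The dimension order passed here is the order used later when slicing subarrays.
dom = tiledb.Domain(d1, d2)
```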
The order of the dimensions as added to the domain is important later when slicing subarrays. Remember to give priority to more selective dimensions, in order to maximize the pruning power during slicing.
When creating the domain, the dimension names must be unique.
After installing the Go bindings, download any of the TileDB-Go examples, and then run with:
After installing TileDB and the Python bindings, from a Python shell:
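As a quick sanity check (a hedged sketch; the original snippet may differ), you can print the installed versions:

```python
import tiledb

print(tiledb.__version__)          # TileDB-Py version
print(tiledb.libtiledb.version())  # TileDB core library version
```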
In order to create an encrypted array, you simply need to pass your secret key upon the array creation:
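A hedged Python (TileDB-Py) sketch that passes the key through the configuration used at creation time; the URI, schema, and key are placeholders:

```python
import numpy as np
import tiledb

key = "0123456789abcdeF0123456789abcdeF"  # example 32-byte (256-bit) key; do not hard-code real keys
ctx = tiledb.Ctx(tiledb.Config({
    "sm.encryption_type": "AES_256_GCM",
    "sm.encryption_key": key,
}))

dom = tiledb.Domain(tiledb.Dim(name="d", domain=(1, 4), tile=2, dtype=np.int32))
schema = tiledb.ArraySchema(domain=dom, attrs=[tiledb.Attr(name="a", dtype=np.int32)])

tiledb.Array.create("encrypted_array", schema, ctx=ctx)  # hypothetical URI
```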
Creating an attribute requires specifying a datatype and, optionally, an attribute name (which must be unique; attribute names starting with __ are reserved). In the example below we create an int32 attribute called attr.
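A minimal sketch in Python (TileDB-Py):

```python
import numpy as np
import tiledb

attr = tiledb.Attr(name="attr", dtype=np.int32)  # int32 attribute named "attr"
```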
An attribute can also store a fixed number of values (of the same datatype) in a single cell, or a variable number of values. You can specify this as follows:
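A hedged Python (TileDB-Py) sketch; the attribute names are hypothetical, and the compound NumPy dtype used for the fixed multi-value case is an assumption that may vary across versions:

```python
import numpy as np
import tiledb

# Fixed number of values per cell (here: 2 float32 values), assumed to be expressed via a compound dtype.
a_fixed = tiledb.Attr(name="a_fixed", dtype=np.dtype("float32, float32"))

# Variable number of values per cell (var-sized string attribute).
a_var = tiledb.Attr(name="a_var", dtype=str)
```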
An attribute may also be nullable, which allows designating each cell as valid or null. This applies to both fixed-sized and var-sized attributes.
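A minimal Python (TileDB-Py) sketch of a nullable attribute; the name is hypothetical:

```python
import numpy as np
import tiledb

a_null = tiledb.Attr(name="a_null", dtype=np.float64, nullable=True)  # cells may be marked as null
```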
Note: nullable Python attributes should be used with the from_pandas API or Pandas series with a Pandas extension dtype (e.g., StringDtype).
Supported Attribute Datatypes:
Crossed-out datatypes are deprecated.
For fixed-sized attributes, the input fill value size should be equal to the cell size.
After creating the dimensions, the domain, and the attributes, you can create the array schema as follows:
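A minimal Python (TileDB-Py) sketch; the dimension, attribute, and array names are hypothetical:

```python
import numpy as np
import tiledb

dom = tiledb.Domain(
    tiledb.Dim(name="d1", domain=(1, 100), tile=10, dtype=np.int32),
    tiledb.Dim(name="d2", domain=(1, 100), tile=10, dtype=np.int32),
)
attrs = [tiledb.Attr(name="a", dtype=np.int32)]

# sparse=False creates a dense array schema; use sparse=True for a sparse array.
schema = tiledb.ArraySchema(domain=dom, attrs=attrs, sparse=False)
tiledb.Array.create("my_array", schema)  # hypothetical array URI
```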
When creating the array schema, the dimension and attribute names must be unique.
You can set the data tile capacity (applicable to sparse fragments), as follows:
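A hedged Python (TileDB-Py) sketch, with hypothetical names and an example capacity value:

```python
import numpy as np
import tiledb

dom = tiledb.Domain(tiledb.Dim(name="d", domain=(1, 1000), tile=100, dtype=np.int64))
schema = tiledb.ArraySchema(
    domain=dom,
    attrs=[tiledb.Attr(name="a", dtype=np.int32)],
    sparse=True,
    capacity=10000,  # each sparse data tile groups up to 10,000 non-empty cells
)
```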
Sparse arrays may allow multiple cells with the same coordinates to exist (dense arrays do not allow duplicates). By default, duplicates are not allowed. You can specify that a sparse array allows duplicates as follows:
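A hedged Python (TileDB-Py) sketch, with hypothetical names:

```python
import numpy as np
import tiledb

dom = tiledb.Domain(tiledb.Dim(name="d", domain=(1, 1000), tile=100, dtype=np.int64))
schema = tiledb.ArraySchema(
    domain=dom,
    attrs=[tiledb.Attr(name="a", dtype=np.int32)],
    sparse=True,
    allows_duplicates=True,  # permit multiple cells with the same coordinates
)
```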
When duplicates are allowed, checking for duplicates and deduplication are disabled.
You can check if the array schema is set properly as follows:
Attributes accept filters such as compressors. This is described in detail in the filters documentation.
There are situations where you may read "empty spaces" from TileDB arrays. For those empty spaces, a read query will return special fill values for the selected attributes. You can set your own fill values for these cases as follows:
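A minimal Python (TileDB-Py) sketch setting a custom fill value on an attribute; the name and value are hypothetical:

```python
import numpy as np
import tiledb

# For fixed-sized attributes, the fill value size must equal the cell size.
a = tiledb.Attr(name="a", dtype=np.int32, fill=-1)
```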
A call that sets the number of values per cell for an attribute (see above) resets the fill value of the attribute to its default. Therefore, make sure you set the fill values after deciding on the number of values this attribute will hold in each cell.
You can set the tile and cell order as follows. The tile order may be set to row-major or column-major; the cell order may be set to row-major, column-major, or Hilbert (the latter for sparse arrays only).
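A minimal Python (TileDB-Py) sketch, with hypothetical names:

```python
import numpy as np
import tiledb

dom = tiledb.Domain(
    tiledb.Dim(name="d1", domain=(1, 100), tile=10, dtype=np.int32),
    tiledb.Dim(name="d2", domain=(1, 100), tile=10, dtype=np.int32),
)
schema = tiledb.ArraySchema(
    domain=dom,
    attrs=[tiledb.Attr(name="a", dtype=np.int32)],
    tile_order="row-major",
    cell_order="col-major",
)
```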