Summary of Factors

Here you can find a summary of the most important performance factors of TileDB.

Arrays

The following factors must be considered when designing the array schema. In general, designing an effective schema that results in high performance depends on the type of data that the array will store, and the type of queries that will be subsequently issued to the array.

Dense vs. Sparse Arrays

Choosing a dense or sparse array representation depends on the particular data and use case. Sparse arrays in TileDB have additional overhead compared to dense arrays, because the coordinates of the non-empty cells must be stored explicitly. For a particular use case, this extra overhead may be outweighed by the storage savings of not storing dummy values for the empty cells of the dense alternative, in which case a sparse array would be a good choice. Sparse array reads also incur additional runtime overhead due to the potential need to sort the cells on their coordinates.
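
As a minimal sketch (using the TileDB Python API, with illustrative names), the same two-dimensional domain can be declared as either a dense or a sparse array simply by toggling the schema's sparse flag:

```python
# Hedged sketch (TileDB-Py): the same 2D domain modeled as dense vs. sparse.
import numpy as np
import tiledb

dom = tiledb.Domain(
    tiledb.Dim(name="rows", domain=(0, 9999), tile=100, dtype=np.int32),
    tiledb.Dim(name="cols", domain=(0, 9999), tile=100, dtype=np.int32),
)
attr = tiledb.Attr(name="a", dtype=np.float64)

# Dense: every cell is materialized; no coordinates are stored.
dense_schema = tiledb.ArraySchema(domain=dom, sparse=False, attrs=[attr])

# Sparse: only non-empty cells are stored, along with their explicit coordinates.
sparse_schema = tiledb.ArraySchema(domain=dom, sparse=True, attrs=[attr])
```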

Space Tile Extent

In dense arrays, the space tile shape and size determine the atomic unit of I/O and compression. All space tiles that intersect a read query’s subarray are read from disk whole, even if they only partially intersect the query. Matching the space tile shape and size closely to the typical read query can greatly improve performance. In sparse arrays, space tiles affect the order of the non-empty cells and, hence, the minimum bounding rectangle (MBR) shape and size. All tiles in a sparse array whose MBRs intersect a read query’s subarray are read from disk. Choosing the space tiles such that the shape of the resulting MBRs is similar to the typical read access pattern can improve read performance, since fewer MBRs may then overlap with the typical read query.
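
For instance, the following sketch (Python API, illustrative extents) shapes dense space tiles for a workload that typically reads one full row at a time:

```python
# Hedged sketch (TileDB-Py): shaping dense space tiles to match the read pattern.
import numpy as np
import tiledb

# Suppose typical reads fetch a single row across many columns. A 1 x 1000 tile
# means each read touches few tiles, and little of each tile is wasted.
dom = tiledb.Domain(
    tiledb.Dim(name="rows", domain=(0, 99_999), tile=1, dtype=np.int32),
    tiledb.Dim(name="cols", domain=(0, 999), tile=1000, dtype=np.int32),
)
schema = tiledb.ArraySchema(
    domain=dom, sparse=False, attrs=[tiledb.Attr(name="a", dtype=np.float32)]
)
```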

Tile Capacity

The tile capacity determines the number of cells in a data tile of a sparse fragment, which is the atomic unit of I/O and compression. If the capacity is too small, the data tiles can be too small, requiring many small, independent read operations to fulfill a particular read query. Moreover, very small tiles could lead to poor compressibility. If the capacity is too large, I/O operations would become fewer and larger (a good thing in general), but the amount of wasted work may increase due to reading more data than necessary to fulfill the typical read query.
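
A minimal sketch (Python API, illustrative values) of setting the capacity on a sparse schema:

```python
# Hedged sketch (TileDB-Py): setting the data tile capacity of a sparse array.
import numpy as np
import tiledb

dom = tiledb.Domain(
    tiledb.Dim(name="d", domain=(0, 10**9), tile=10**6, dtype=np.int64)
)
schema = tiledb.ArraySchema(
    domain=dom,
    sparse=True,
    attrs=[tiledb.Attr(name="a", dtype=np.float64)],
    capacity=100_000,  # each sparse data tile holds up to 100K cells
)
```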

Dimensions vs. Attributes

If read queries typically access the entire domain of a particular dimension, consider making that dimension an attribute instead. Dimensions should be used when read queries will specify ranges along them, which allows TileDB to internally prune cell ranges or whole tiles from the data that must be read.

Tile Filtering

Tile filtering (such as compression) can reduce persistent storage consumption, but at the expense of increased time to read/write tiles from/to the array. However, tiles on disk that are smaller due to compression can be read from disk in less time, which can improve performance. Additionally, many compressors offer different levels of compression, which can be used to fine-tune the tradeoff.
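
For example (Python API, illustrative filter choice and level), a compression filter can be attached per attribute and its level tuned:

```python
# Hedged sketch (TileDB-Py): per-attribute compression with a tunable level.
import numpy as np
import tiledb

# A higher Zstd level trades slower writes for smaller tiles on disk.
filters = tiledb.FilterList([tiledb.ZstdFilter(level=7)])
attr = tiledb.Attr(name="a", dtype=np.float64, filters=filters)
```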

Allows Duplicates

For sparse arrays, if the data is known not to contain entries with identical coordinates, or if read queries do not mind getting multiple results for the same coordinates, allowing duplicates can lead to significant performance improvements. For queries requesting results in the unordered layout, TileDB can process fragments in any order (and fewer of them at once) and skip deduplication of the results, so such queries run much faster and with lower memory overhead.
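
A minimal sketch (Python API, illustrative names and URI) of a sparse schema that allows duplicates, read back with the unordered layout:

```python
# Hedged sketch (TileDB-Py): a sparse schema that permits duplicate coordinates,
# read back with the unordered layout so no sorting/deduplication is needed.
import numpy as np
import tiledb

dom = tiledb.Domain(
    tiledb.Dim(name="d", domain=(0, 10**6), tile=10_000, dtype=np.int64)
)
schema = tiledb.ArraySchema(
    domain=dom,
    sparse=True,
    attrs=[tiledb.Attr(name="a", dtype=np.float64)],
    allows_duplicates=True,
)
# tiledb.Array.create("my_sparse_array", schema)   # array URI is illustrative
# with tiledb.open("my_sparse_array") as A:
#     results = A.query(order="U")[0:1000]         # "U" = unordered read layout
```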

Queries

The following factors impact query performance, i.e., when reading/writing data from/to the array.

Number of Array Fragments

Large numbers of fragments can slow down performance during read queries, as TileDB must fetch a lot of fragment metadata, and internally sort and determine potential overlap of cell data across all fragments. Therefore, consolidating arrays with a large number of fragments improves performance; consolidation collapses a set of fragments into a single one.
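
For example (Python API, illustrative URI), consolidation can be invoked directly:

```python
# Hedged sketch (TileDB-Py): collapse the fragments of an array into one.
# The sm.consolidation.* parameters can be passed via a tiledb.Config if needed.
import tiledb

tiledb.consolidate("my_array")
```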

Attribute Subsetting

A benefit of the column-oriented nature of attribute storage in TileDB is that if a read query does not require data for some of the attributes in the array, those attributes will not be touched or read from disk at all. Therefore, specifying only the attributes actually needed by each read query will improve performance. In the current version of TileDB, write queries must still specify values for all attributes.
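
A minimal sketch (Python API, illustrative URI and attribute name) of subsetting attributes on a read:

```python
# Hedged sketch (TileDB-Py): read only one attribute of a multi-attribute array.
import tiledb

with tiledb.open("my_array") as A:          # URI is illustrative
    data = A.query(attrs=["a1"])[0:100]     # attributes other than "a1" are never read
```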

Read Layout

When an ordered layout (row- or column-major) is used for a read query, if that layout differs from the array physical layout, TileDB must internally sort and reorganize the cell data that is read from the array into the requested order. If you do not care about the particular ordering of cells in the result buffers, you can specify a global order query layout, allowing TileDB to skip the sorting and reorganization steps.
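
For example (Python API, illustrative URI), a global-order read layout can be requested as follows:

```python
# Hedged sketch (TileDB-Py): request results in global order to skip re-sorting.
import tiledb

with tiledb.open("my_array") as A:        # URI is illustrative
    data = A.query(order="G")[0:100]      # "G" = global order; "C"/"F" = row-/column-major
```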

Write Layout

Similarly, if your specified write layout differs from the physical layout of your array (i.e., the global cell order), then TileDB must internally sort and reorganize the cell data into global order before writing to the array. If your cells are already laid out in global order, issuing a write query with global-order layout allows TileDB to skip the sorting and reorganization step. Additionally, multiple consecutive global-order writes to an open array append to a single fragment rather than creating a new fragment per write, which can improve subsequent read performance. Care must be taken with global-order writes, as it is an error to write to a subarray that is not aligned with the space tiling of the array.

Opening and Closing Arrays

Before issuing queries, an array must be “opened.” When an array is opened, TileDB reads and decodes the array and fragment data, which may require disk operations (and potentially decompression). An array can be opened once, and many queries issued to it before closing. Minimizing the number of array open and close operations can improve performance.

When reading from an array, metadata from the relevant fragments is fetched from the backend and cached in memory. The more queries you perform, the more metadata is fetched over time. To save memory, you may need to occasionally close and reopen the array.
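
A minimal sketch (Python API, illustrative URI and slices) of opening an array once for several queries, then closing it to release the cached metadata:

```python
# Hedged sketch (TileDB-Py): open once, issue several queries, then close.
import tiledb

A = tiledb.open("my_array")    # URI is illustrative; default mode is read
try:
    part1 = A[0:100]           # both queries reuse the metadata loaded so far
    part2 = A[100:200]
finally:
    A.close()                  # frees the fragment metadata cached for this array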

Configuration Parameters

The following runtime configuration parameters can tune the performance of several internal TileDB subsystems.

Cache Size

When a read query causes a tile to be read from disk, TileDB places the uncompressed tile in an in-memory LRU cache associated with the query’s context. When subsequent read queries in the same context request that tile, it may be copied directly from the cache instead of re-fetched from disk. The sm.tile_cache_size config parameter determines the overall size in bytes of the in-memory tile cache. Increasing it can lead to a higher cache hit ratio, and better performance.
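
For example (Python API, illustrative size), the tile cache can be enlarged for a read-heavy context:

```python
# Hedged sketch (TileDB-Py): enlarge the tile cache to 100 MB (value is illustrative).
import tiledb

cfg = tiledb.Config({"sm.tile_cache_size": str(100 * 1024 * 1024)})
ctx = tiledb.Ctx(cfg)
# with tiledb.open("my_array", ctx=ctx) as A:
#     ...
```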

Coordinate Deduplication

During sparse writes, setting sm.dedup_coords to true will cause TileDB to internally deduplicate cells being written so that multiple cells with the same coordinate tuples are not written to the array (which is an error). A lighter-weight check is enabled by default with sm.check_coord_dups, meaning TileDB will simply perform the check for duplicates and return an error if any are found. Disabling these checks can lead to better performance on writes.

Coordinate Out-of-bounds Check

During sparse writes, setting sm.check_coord_oob to true (default) will cause TileDB to internally check whether the given coordinates fall outside the domain or not. If you are certain that this is not possible in your application, you can set this param to false, avoiding the check and thus boosting performance.

Coordinate Global Order Check

During sparse writes in global order, setting sm.check_global_order to true (the default) will cause TileDB to internally check whether the given coordinates obey the global order. If you are certain that your application always writes cells in global order, you can set this parameter to false, avoiding the check and thus boosting performance.
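
As a combined sketch covering the write-time deduplication and checks described above (Python API), an application that already guarantees deduplicated, in-bounds, globally ordered coordinates could disable them all:

```python
# Hedged sketch (TileDB-Py): disable write-time coordinate processing for trusted input.
import tiledb

cfg = tiledb.Config({
    "sm.dedup_coords": "false",        # do not deduplicate coordinates internally
    "sm.check_coord_dups": "false",    # skip the duplicate-coordinate check
    "sm.check_coord_oob": "false",     # skip the out-of-bounds check
    "sm.check_global_order": "false",  # skip the global-order check
})
ctx = tiledb.Ctx(cfg)
```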

Consolidation Parameters

The effect of the sm.consolidation.* parameters is explained in Consolidation.

Memory Budget

Controlled by the sm.memory_budget and sm.memory_budget_var parameters, this caps the total number of bytes that can be fetched for each fixed- or var-sized attribute during reads. This can prevent out-of-memory issues when a read query overlaps with a huge number of tiles that must be fetched and decompressed in memory. For large subarrays, this may lead to incomplete queries.
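
For example (Python API, illustrative budgets):

```python
# Hedged sketch (TileDB-Py): cap the bytes fetched per attribute during reads.
import tiledb

cfg = tiledb.Config({
    "sm.memory_budget": str(500 * 1024 * 1024),      # fixed-sized attributes
    "sm.memory_budget_var": str(500 * 1024 * 1024),  # var-sized attributes
})
ctx = tiledb.Ctx(cfg)
```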

Concurrency

Controlled by the sm.compute_concurrency_level and sm.io_concurrency_level parameters. These parameters define an upper limit on the number of operating system threads to allocate for compute-bound and I/O-bound tasks, respectively. Both default to the number of logical cores on the system.

VFS Read Batching

During read queries, the VFS system will batch reads for distinct tiles that are physically close together (not necessarily adjacent) in the same file. The vfs.min_batch_size parameter sets the minimum size in bytes that a single batched read operation can be. VFS will use this parameter to group “close by” tiles into the same batch, if the new batch size is smaller than or equal to vfs.min_batch_size. This can help minimize the I/O latency that can come with numerous very small VFS read operations. Similarly, vfs.min_batch_gap defines the minimum number of bytes between the end of one batch and the start of its subsequent one. If two batches are fewer bytes apart than vfs.min_batch_gap, they get stitched into a single batch read operation. This can help better group and parallelize over “adjacent” batch read operations.
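
A minimal sketch (Python API, illustrative values) of tuning the VFS batching parameters:

```python
# Hedged sketch (TileDB-Py): tune how the VFS groups nearby tile reads.
import tiledb

cfg = tiledb.Config({
    "vfs.min_batch_size": str(20 * 1024 * 1024),  # group nearby tiles while a batch is below 20 MB
    "vfs.min_batch_gap": str(500 * 1024),         # stitch batches closer than 500 KB apart
})
ctx = tiledb.Ctx(cfg)
```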

S3 Parallelism

This is controlled by vfs.s3.max_parallel_ops. This determines the maximum number of parallel operations for s3:// URIs independently of the VFS thread pool size, allowing you to over- or under-subscribe VFS threads. Oversubscription can be helpful in some cases with S3, to help hide I/O latency.

S3 write size

Replacing vfs.min_parallel_size for S3 objects, vfs.s3.multipart_part_size controls the minimum part size of S3 multipart writes. Note that vfs.s3.multipart_part_size * vfs.s3.max_parallel_ops bytes will be buffered in memory by TileDB before actually submitting an S3 write request, at which point all of the parts of the multipart write are issued in parallel.
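
For example (Python API, illustrative values), with 10 parallel operations and a 50 MB part size, roughly 500 MB would be buffered before the multipart write is issued:

```python
# Hedged sketch (TileDB-Py): S3 write buffering. With the illustrative values below,
# about 10 * 50 MB = 500 MB is buffered before the multipart upload parts are issued.
import tiledb

cfg = tiledb.Config({
    "vfs.s3.max_parallel_ops": "10",                      # parallel S3 operations
    "vfs.s3.multipart_part_size": str(50 * 1024 * 1024),  # 50 MB per multipart part
})
ctx = tiledb.Ctx(cfg)
```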

System Parameters

Hardware Parallelism

The number of cores and hardware threads of the machine impacts the amount of parallelism TileDB can use internally to accelerate reads, writes and compression/decompression.

Storage Backend (S3, local, etc)

The different types of storage backend (S3, local disk, etc) have different throughput and latency characteristics, which can impact query time.
