During reads, TileDB computes the byte ranges that must be fetched from each attribute file and merges two byte ranges if the gap between them is not bigger than vfs.min_batch_gap and their resulting size is not bigger than vfs.min_batch_size. Then, each byte range (always corresponding to the same attribute file) becomes an IO task. These IO tasks are dispatched for concurrent execution, where the maximum level of concurrency is controlled by the sm.io_concurrency_level parameter.
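As a minimal sketch, these read-side parameters can be set through the configuration; the example below uses the TileDB Python API, and the values shown are illustrative rather than recommendations:

```python
import tiledb

# Illustrative values; the defaults are usually appropriate.
cfg = tiledb.Config({
    "vfs.min_batch_gap": str(512 * 1024),         # merge byte ranges separated by at most 512KB
    "vfs.min_batch_size": str(20 * 1024 * 1024),  # only merge if the result stays within 20MB
    "sm.io_concurrency_level": "8",               # max number of concurrent IO tasks
})
ctx = tiledb.Ctx(cfg)  # pass this context to subsequent array opens/queries
```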
Depending on the storage backend, each IO task may be further partitioned into smaller reads based on the parameters vfs.file.max_parallel_ops (for posix and Windows), vfs.s3.max_parallel_ops (for S3) and vfs.min_parallel_size. Those partitions are then read in parallel. Currently, the maximum parallel operations for HDFS is set to 1, i.e., this task parallelization step does not apply to HDFS.
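As a rough illustration of how these two knobs could jointly bound the read parallelism (a hypothetical sketch, not TileDB's exact splitting logic), the number of partitions for a single IO task might be computed as:

```python
import math

def num_read_partitions(nbytes: int, max_parallel_ops: int, min_parallel_size: int) -> int:
    """Hypothetical helper: split an IO task of `nbytes` into at most
    `max_parallel_ops` partitions, each at least `min_parallel_size` bytes
    where possible."""
    if nbytes <= min_parallel_size:
        return 1
    return min(max_parallel_ops, math.ceil(nbytes / min_parallel_size))

# e.g. a 40MB IO task with max_parallel_ops=16 and min_parallel_size=10MB
# would be split into 4 partitions read in parallel under this sketch.
print(num_read_partitions(40 * 1024**2, 16, 10 * 1024**2))  # -> 4
```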
Once the relevant tiles are fetched from storage, TileDB unfilters them (e.g., decompresses them) using nested parallelism: an outer parallel loop over the tiles being read and an inner parallel loop over the filtered chunks of each tile. The sm.compute_concurrency_level parameter impacts these nested parallel loops, although it is not recommended to modify this configuration parameter from its default setting. The nested parallelism in reads allows for maximum utilization of the available cores for filtering (e.g., decompression), in either the case where the query intersects few large tiles or many small tiles.
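The nested pattern can be sketched conceptually as follows (plain Python with a thread pool, not TileDB's actual C++ implementation):

```python
from concurrent.futures import ThreadPoolExecutor

def unfilter_chunk(chunk: bytes) -> bytes:
    # Placeholder for the real filter pipeline (e.g., decompression).
    return chunk

def unfilter_tiles(tiles: list[list[bytes]], compute_concurrency: int) -> list[list[bytes]]:
    # Both the loop over tiles and the loop over each tile's chunks are fanned
    # out over one pool whose size plays the role of sm.compute_concurrency_level
    # in this sketch, so the cores stay busy whether the query touches a few
    # large tiles or many small ones.
    with ThreadPoolExecutor(max_workers=compute_concurrency) as pool:
        futures = [[pool.submit(unfilter_chunk, chunk) for chunk in tile] for tile in tiles]
        return [[f.result() for f in per_tile] for per_tile in futures]
```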
During writes, TileDB similarly filters (e.g., compresses) the attribute tiles using nested parallelism: an outer parallel loop over the tiles being written and an inner parallel loop over the chunks of each tile. The sm.compute_concurrency_level parameter impacts these nested parallel loops as well, although it is not recommended to modify this configuration parameter from its default setting. The filtered tiles are then written to the storage backend in parallel, where the maximum level of concurrency is controlled by the sm.io_concurrency_level parameter.
For HDFS, this is the only parallelization TileDB provides for writes. For the other backends, TileDB parallelizes the writes further. For posix and Windows, TileDB partitions each write based on vfs.file.max_parallel_ops and vfs.min_parallel_size. Those partitions are then written in parallel using the VFS thread pool, whose size is controlled by vfs.io_concurrency.
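As with reads, these write-side knobs are ordinary configuration parameters; a minimal sketch using the Python API (illustrative values only, with parameter names as given in the text above):

```python
import tiledb

# Illustrative values; the defaults are usually appropriate.
cfg = tiledb.Config({
    "vfs.file.max_parallel_ops": "16",               # upper bound on partitions per file write
    "vfs.min_parallel_size": str(10 * 1024 * 1024),  # do not split writes below 10MB
    "vfs.io_concurrency": "8",                       # size of the VFS thread pool
})
ctx = tiledb.Ctx(cfg)
```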
For S3, TileDB buffers the data to be written, where the buffer size is equal to vfs.s3.max_parallel_ops * vfs.s3.multipart_part_size. When the buffer is filled, TileDB issues vfs.s3.max_parallel_ops parallel multipart upload requests to S3.
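For example, under illustrative settings (not assertions about the defaults), the S3 write buffer size follows directly from those two parameters:

```python
import tiledb

# Illustrative values only.
cfg = tiledb.Config({
    "vfs.s3.max_parallel_ops": "10",
    "vfs.s3.multipart_part_size": str(5 * 1024 * 1024),  # 5MB multipart parts
})

# Write buffer = vfs.s3.max_parallel_ops * vfs.s3.multipart_part_size
buffer_bytes = int(cfg["vfs.s3.max_parallel_ops"]) * int(cfg["vfs.s3.multipart_part_size"])
print(buffer_bytes // (1024 * 1024))  # 50 (MB); a full buffer triggers 10 parallel
                                      # multipart upload requests to S3
```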