Result Estimation

When reading from sparse arrays, or reading variable-length attributes from either dense or sparse arrays, there is no way to know how big the result will be without actually executing the query. How, then, should you size your buffers before passing them to TileDB? TileDB can return an estimated result size for any attribute or dimension. TileDB does not execute the query to compute it, so getting the estimate is very fast. This speed comes at the cost of accuracy: buffers allocated from the estimate may still be too small, leaving the query incomplete. You should therefore always check the query status, even when you allocate your buffers based on the result estimate.

You can get the result estimate as follows:

// ... create context ctx
// ... create query

// Get estimate for a fixed-length attribute or dimension (in bytes)
uint64_t size;
tiledb_query_get_est_result_size(ctx, query, "a", &size);

// Get estimate for a variable-length attribute or dimension (in bytes)
uint64_t size_off, size_val;
tiledb_query_get_est_result_size_var(ctx, query, "b", &size_off, &size_val);

The number of bytes returned is an estimate and may not be divisible by the datatype size. It is left to the user to round the value up to a multiple of the datatype size where necessary.
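The rounding step can be sketched as follows. This is a minimal illustration, not part of the TileDB API: `round_up_to_multiple` is a hypothetical helper name, and it assumes the estimate is far enough below `UINT64_MAX` that adding the padding cannot overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a TileDB function): round est_bytes up to the
 * next multiple of elem_size, the size in bytes of one value of the
 * datatype. Assumes est_bytes + elem_size does not overflow uint64_t. */
static uint64_t round_up_to_multiple(uint64_t est_bytes, uint64_t elem_size) {
  assert(elem_size > 0); /* a zero-sized element makes no sense here */
  uint64_t remainder = est_bytes % elem_size;
  if (remainder == 0)
    return est_bytes; /* already a whole number of elements */
  return est_bytes + (elem_size - remainder);
}
```

For example, if attribute "a" were assumed to store `int32_t` values, the estimate could be adjusted with `size = round_up_to_multiple(size, sizeof(int32_t));` before allocating the buffer. Even after rounding, the buffer may still turn out to be too small, so the query status should still be checked.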
