Result Estimation

When reading from sparse arrays, or reading variable-length attributes from either dense or sparse arrays, there is no way to know how large the result will be without actually executing the query. How, then, should you allocate your buffers before passing them to TileDB? TileDB can return an estimated result size for any attribute. It computes the estimate without executing the query, so the call is very fast. The trade-off is accuracy: buffers allocated from the estimate may still be too small, in which case the query returns as incomplete. You should therefore always check the query status, even when you size your buffers from the result estimate.
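The pattern described above (size the buffer from the estimate, submit, check the status, and resubmit while the query is incomplete) can be sketched as a small simulation. This is a pure-Python illustration and does not call TileDB: `SimulatedQuery`, its `est_result_size` and `submit` methods, the status strings, and `read_all` are hypothetical names chosen for this sketch, and sizes are counted in elements rather than bytes for simplicity.

```python
from dataclasses import dataclass


@dataclass
class SimulatedQuery:
    """Hypothetical stand-in for a read query; not a TileDB class."""
    data: list        # every value the query would eventually return
    estimate: int     # estimated result size in elements (may be too small)
    cursor: int = 0   # number of values delivered so far

    def est_result_size(self):
        # Cheap call: returns the estimate without reading any data.
        return self.estimate

    def submit(self, buffer_capacity):
        # Deliver at most `buffer_capacity` values and report the status.
        chunk = self.data[self.cursor:self.cursor + buffer_capacity]
        self.cursor += len(chunk)
        done = self.cursor >= len(self.data)
        return chunk, ("COMPLETE" if done else "INCOMPLETE")


def read_all(query):
    """Size the buffer from the estimate, then resubmit until complete."""
    capacity = max(1, query.est_result_size())
    results = []
    while True:
        chunk, status = query.submit(capacity)
        results.extend(chunk)
        # The estimate may have been too low, so the status is always checked.
        if status == "COMPLETE":
            return results
```

In this sketch an estimate of 4 elements against 10 actual results takes three submissions. A real reader might also grow its buffers between submissions instead of reusing the same capacity.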

You can get the result estimate as follows:

C
// ... create context ctx
// ... create query
// Get estimate for a fixed-length attribute (in bytes)
uint64_t size;
tiledb_query_get_est_result_size(ctx, query, "a", &size);
// `tiledb_query_get_est_result_size(ctx, query, TILEDB_COORDS, &size);` would also work
// Get estimate for a variable-length attribute (in bytes)
uint64_t size_off, size_val;
tiledb_query_get_est_result_size_var(ctx, query, "b", &size_off, &size_val);
C++
// ... create query
// Get estimate for a fixed-length attribute (in bytes)
uint64_t est_size = query.est_result_size("a");
// `uint64_t est_size = query.est_result_size(TILEDB_COORDS);` would also work
// Get estimate for a variable-length attribute (in bytes)
// The first returned element is for the offsets buffer, and the second
// for the variable-length values buffer
std::pair<uint64_t, uint64_t> est_size_var = query.est_result_size_var("b");
Python
# The Python API automatically allocates result buffers for
# queries, by calling the estimation and incomplete/retry APIs
# as necessary to satisfy a request.
R
# TODO: R does not support estimated result sizes
Java
// TODO: Currently not supported in Java
Go
// TODO: Currently not supported in Go
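Because the estimates above are reported in bytes, they usually need to be converted to element counts before typed buffers can be allocated. A minimal sketch of that arithmetic follows. The helper name `elements_for_estimate` is hypothetical, the byte values are made up for illustration, and the sketch assumes a 4-byte integer attribute and 64-bit unsigned offsets.

```python
import struct


def elements_for_estimate(est_bytes, fmt):
    """Return how many elements of struct format `fmt` cover `est_bytes`."""
    item_size = struct.calcsize(fmt)
    # Ceiling division, so a partial trailing element still gets a slot.
    return -(-est_bytes // item_size)


# Fixed-length attribute of 4-byte integers, estimated at 400 bytes.
num_values = elements_for_estimate(400, "<i")

# Variable-length attribute: the offsets estimate is assumed to describe
# 64-bit unsigned offsets, and the values estimate single-byte characters.
num_offsets = elements_for_estimate(80, "<Q")
num_chars = elements_for_estimate(250, "<c")

print(num_values, num_offsets, num_chars)
```

Rounding up rather than down keeps the buffer from being one element short if an estimate is not an exact multiple of the element size.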