Result Estimation

When reading from sparse arrays, or reading variable-length attributes from either dense or sparse arrays, there is no way to know how large the result will be without actually executing the query. How, then, should you size your buffers before passing them to TileDB? TileDB can return an estimated result size for any attribute or dimension. Because TileDB does not execute the query to compute the estimate, getting it is very fast. However, this speed comes at the cost of accuracy: buffers allocated from the estimate may still be too small, which leads to incomplete queries. Therefore, you should always check the query status, even if you allocate your buffers based on the result estimate.

You can get the result estimate as follows:

C
// ... create context ctx
// ... create query
// Get estimate for a fixed-length attribute or dimension (in bytes)
uint64_t size;
tiledb_query_get_est_result_size(ctx, query, "a", &size);
// Get estimate for a variable-length attribute or dimension (in bytes)
uint64_t size_off, size_val;
tiledb_query_get_est_result_size_var(ctx, query, "b", &size_off, &size_val);
C++
// ... create query
// Get estimate for a fixed-length attribute or dimension (in bytes)
uint64_t est_size = query.est_result_size("a");
// Get estimate for a variable-length attribute or dimension (in bytes)
// The first returned element is for the offsets buffer, and the second
// for the variable-length values buffer
std::pair<uint64_t, uint64_t> est_size_var = query.est_result_size_var("b");
Python
# Create query object:
with tiledb.open(uri) as A:
    iterable = A.query(return_incomplete=True).multi_index[:]
    # then call `estimated_result_sizes`, which will return an
    # OrderedDict of {'result name': estimate}
    iterable.estimated_result_sizes()
R
# ...create query object
# estimated size of a fixed-length attribute in a sparse array
sz <- tiledb_query_get_est_result_size(qryptr, "a")
# estimated size of a variable-length attribute in a dense or sparse array
sz <- tiledb_query_get_est_result_size_var(qryptr, "b")
Java
// ... create query
// Get estimate for a fixed-length attribute or dimension (in bytes)
int estSize = query.getEstResultSize("a");
// Get estimate for a variable-length attribute or dimension (in bytes)
// The first returned element is for the offsets buffer, and the second
// for the variable-length values buffer
Pair<Integer, Integer> estSizeVar = query.getEstResultSizeVar("b");
Go
// ... create context ctx
// ... create query
// Get estimate for a fixed-length attribute or dimension (in bytes)
size, _ := query.EstResultSize("a")
// Get estimate for a variable-length attribute or dimension (in bytes)
sizeOff, sizeVal, _ := query.EstResultSizeVar("b")

The number of bytes returned is an estimate and may not be divisible by the datatype size. It is left to the user to round the value up to a multiple of the datatype size where necessary.
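The rounding step can be sketched in a few lines of Python. This is a minimal illustration and not part of the TileDB API: the helper name `round_up_to_multiple` and the 1001-byte estimate are hypothetical, and the attribute is assumed to hold 4-byte `int32` values.

```python
def round_up_to_multiple(est_bytes: int, item_size: int) -> int:
    """Round est_bytes up to the nearest multiple of item_size (hypothetical helper)."""
    if item_size <= 0:
        raise ValueError("item_size must be positive")
    # Ceiling division using integer arithmetic only, then scale back to bytes
    return -(-est_bytes // item_size) * item_size


# Hypothetical estimate of 1001 bytes for an int32 attribute (4 bytes per value)
est_bytes = 1001
item_size = 4

buffer_bytes = round_up_to_multiple(est_bytes, item_size)
num_values = buffer_bytes // item_size
print(buffer_bytes, num_values)  # 1004 251
```

Rounding up rather than down avoids allocating a buffer that cuts off a partial value. The result is still only as good as the estimate, so the query status should be checked after submission.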