Using Performance Statistics
Much of performance optimization for TileDB programs comes down to minimizing wasted work. TileDB includes an internal statistics reporting system that helps identify where a program can be improved, including where work is being wasted.
The TileDB statistics can be enabled and disabled at runtime, and a report can be dumped at any point. A typical pattern is to enable the statistics immediately before submitting a query, submit the query, and then immediately dump the report. This can be done like so:
tiledb_stats_enable();
// ... create some query here
tiledb_query_submit(ctx, query);
// Dump the statistics
tiledb_stats_dump(stdout);
tiledb_stats_disable();
// You can also reset the stats as follows
tiledb_stats_reset();
tiledb::Stats::enable();
// ... create some query here
// Submit the query
query.submit();
// Dump the statistics
tiledb::Stats::dump(stdout);
tiledb::Stats::disable();
// You can also reset the stats as follows
tiledb::Stats::reset();
tiledb.stats_enable()
# Do some work
data = A[:]
# Dump the statistics
tiledb.stats_dump()
tiledb.stats_disable()
# You can also reset the stats as follows
tiledb.stats_reset()
# You can get the report for individual context and query runs
with tiledb.open(array_uri, mode="r", ctx=ctx) as A:
    # Keep a handle to the query object so its stats can be retrieved
    qry = A.query()
    data = qry[:]
    # Dump the stats for a given query call
    qry.get_stats()
    # Dump the stats for a given context
    ctx.get_stats()
# Start collecting statistics
tiledb_stats_enable()
# ... create some query here
A[1:4]
# Stop collecting statistics
tiledb_stats_disable()
# Show the statistics on the console
tiledb_stats_print()
# Save the statistics to a file
tiledb_stats_dump(my_file_name)
# You can reset the stats as follows
tiledb_stats_reset()
# Statistics can also be collected on a per-query basis
arr <- tiledb_array(uri, query_statistics = TRUE)
res <- arr[]
# we can access the 'query_statistics' attribute of the result
qstats <- attr(res, "query_statistics")
## then use jsonlite::fromJSON(qstats) to parse
# Statistics can also be retrieved for the current context
cstats <- tiledb_ctx_stats()
## then jsonlite::fromJSON(cstats)
try (Array array = new Array(ctx, "<array-uri>", TILEDB_READ);
    ArraySchema schema = array.getSchema();
    Query query = new Query(array, TILEDB_READ)) {
  query.addRange(0, 1, 2);
  query.addRange(1, 2, 4);
  query.setLayout(TILEDB_ROW_MAJOR);
  NativeArray dim1Array = new NativeArray(ctx, 6, Integer.class);
  NativeArray dim2Array = new NativeArray(ctx, 6, Integer.class);
  NativeArray a1Array = new NativeArray(ctx, 12, Character.class);
  NativeArray a2Array = new NativeArray(ctx, 6, Float.class);
  query.setBuffer("rows", dim1Array);
  query.setBuffer("cols", dim2Array);
  query.setBuffer("a1", a1Array);
  query.setBuffer("a2", a2Array);
  // Submit query
  query.submit();
  // Retrieve the statistics for this query and print them
  String stats = query.getStats();
  System.out.println(stats);
}
tiledb.StatsEnable()
// ... create some query here
// Submit the query
query.Submit()
// Dump the statistics
tiledb.StatsDumpSTDOUT()
tiledb.StatsDisable()
// You can also reset the stats as follows
tiledb.StatsReset()
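The enable, run, dump, disable pattern shown in the examples above can be wrapped so that statistics are always switched off again, even if the query raises. The following is a minimal Python sketch, not part of the TileDB API: `collect_stats` is a hypothetical helper, and it assumes only the module-level `stats_reset`, `stats_enable`, and `stats_disable` functions used in the Python example above.

```python
from contextlib import contextmanager


@contextmanager
def collect_stats(stats_api):
    """Enable statistics for the duration of a `with` block.

    `stats_api` is any object exposing stats_reset(), stats_enable() and
    stats_disable(); in practice this would be the imported `tiledb` module.
    This helper is illustrative and is not provided by TileDB itself.
    """
    stats_api.stats_reset()    # start from clean counters
    stats_api.stats_enable()
    try:
        yield stats_api
    finally:
        stats_api.stats_disable()  # runs even if the block raises
```

Assuming `tiledb` is imported and `A` is an open array, this could be used as `with collect_stats(tiledb):` around `data = A[:]`, calling `tiledb.stats_dump()` inside the block so the report is dumped before statistics are disabled, in the same order as the Python example above.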
With the dump call, a report containing the gathered statistics is printed. The report lists the values of many individual timers and counters. Typically, this summary contains enough information to make high-level performance tuning decisions. An example summary is shown below:
{
"timers": {
"Context.StorageManager.Query.Subarray.read_load_relevant_rtrees.sum": 0.000225025,
"Context.StorageManager.Query.Subarray.read_load_relevant_rtrees.avg": 0.000112513,
"Context.StorageManager.Query.Subarray.read_compute_tile_overlap.sum": 0.000541446,
"Context.StorageManager.Query.Subarray.read_compute_tile_overlap.avg": 0.000270723,
"Context.StorageManager.Query.Subarray.read_compute_tile_coords.sum": 3.001e-06,
"Context.StorageManager.Query.Subarray.read_compute_tile_coords.avg": 3.001e-06,
"Context.StorageManager.Query.Subarray.read_compute_relevant_tile_overlap.sum": 0.000177502,
"Context.StorageManager.Query.Subarray.read_compute_relevant_tile_overlap.avg": 8.8751e-05,
"Context.StorageManager.Query.Subarray.read_compute_relevant_frags.sum": 0.000110015,
"Context.StorageManager.Query.Subarray.read_compute_relevant_frags.avg": 5.50075e-05,
"Context.StorageManager.Query.Subarray.read_compute_est_result_size.sum": 0.000230768,
"Context.StorageManager.Query.Subarray.read_compute_est_result_size.avg": 2.8846e-05,
"Context.StorageManager.Query.Reader.unfilter_attr_tiles.sum": 0.000114806,
"Context.StorageManager.Query.Reader.unfilter_attr_tiles.avg": 5.7403e-05,
"Context.StorageManager.Query.Reader.read.sum": 0.00153529,
"Context.StorageManager.Query.Reader.read.avg": 0.00153529,
"Context.StorageManager.Query.Reader.load_tile_offsets.sum": 0.000285125,
"Context.StorageManager.Query.Reader.load_tile_offsets.avg": 0.000285125,
"Context.StorageManager.Query.Reader.init_state.sum": 2.1814e-05,
"Context.StorageManager.Query.Reader.init_state.avg": 2.1814e-05,
"Context.StorageManager.Query.Reader.fill_dense_coords.sum": 4.537e-06,
"Context.StorageManager.Query.Reader.fill_dense_coords.avg": 4.537e-06,
"Context.StorageManager.Query.Reader.copy_fixed_attr_values.sum": 2.6773e-05,
"Context.StorageManager.Query.Reader.copy_fixed_attr_values.avg": 1.33865e-05,
"Context.StorageManager.Query.Reader.copy_attr_values.sum": 0.000641413,
"Context.StorageManager.Query.Reader.copy_attr_values.avg": 0.000641413,
"Context.StorageManager.Query.Reader.compute_sparse_result_tiles.sum": 2.39e-06,
"Context.StorageManager.Query.Reader.compute_sparse_result_tiles.avg": 2.39e-06,
"Context.StorageManager.Query.Reader.compute_sparse_result_cell_slabs_dense.sum": 3.941e-05,
"Context.StorageManager.Query.Reader.compute_sparse_result_cell_slabs_dense.avg": 3.941e-05,
"Context.StorageManager.Query.Reader.compute_result_coords.sum": 7.273e-06,
"Context.StorageManager.Query.Reader.compute_result_coords.avg": 7.273e-06,
"Context.StorageManager.Query.Reader.attr_tiles.sum": 0.000193725,
"Context.StorageManager.Query.Reader.attr_tiles.avg": 0.000193725,
"Context.StorageManager.Query.Reader.SubarrayPartitioner.read_next_partition.sum": 0.000815125,
"Context.StorageManager.Query.Reader.SubarrayPartitioner.read_next_partition.avg": 0.000815125
},
"counters": {
"Context.StorageManager.Query.Subarray.precompute_tile_overlap.tile_overlap_byte_size": 160,
"Context.StorageManager.Query.Subarray.precompute_tile_overlap.relevant_fragment_num": 2,
"Context.StorageManager.Query.Subarray.precompute_tile_overlap.ranges_requested": 2,
"Context.StorageManager.Query.Subarray.precompute_tile_overlap.ranges_computed": 2,
"Context.StorageManager.Query.Subarray.precompute_tile_overlap.fragment_num": 2,
"Context.StorageManager.Query.Reader.result_num": 3,
"Context.StorageManager.Query.Reader.read_unfiltered_byte_num": 72,
"Context.StorageManager.Query.Reader.overlap_tile_num": 2,
"Context.StorageManager.Query.Reader.loop_num": 1,
"Context.StorageManager.Query.Reader.dim_num": 2,
"Context.StorageManager.Query.Reader.dim_fixed_num": 2,
"Context.StorageManager.Query.Reader.cell_num": 8,
"Context.StorageManager.Query.Reader.attr_num": 2,
"Context.StorageManager.Query.Reader.attr_fixed_num": 2,
"Context.StorageManager.Query.Reader.SubarrayPartitioner.compute_current_start_end.not_found": 1,
"Context.StorageManager.Query.Reader.SubarrayPartitioner.compute_current_start_end.fixed_result_size_overflow": 1
}
}
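Because the report is JSON, it can be post-processed to surface the most expensive steps. The sketch below, in Python with only the standard library, ranks the `.sum` timers found in a report string. `slowest_timers` is a hypothetical helper name, and the sample input is a short excerpt of the report above rather than a full dump. It assumes the report text is a JSON object with a top-level `timers` entry, as in the example; other TileDB versions may lay the report out differently.

```python
import json


def slowest_timers(report_text, top_n=3):
    """Return the top_n (name, value) pairs among timers whose key ends in '.sum'."""
    report = json.loads(report_text)
    timers = report.get("timers", {})
    # Keep only cumulative timers and strip the '.sum' suffix from their names
    sums = {
        name[: -len(".sum")]: value
        for name, value in timers.items()
        if name.endswith(".sum")
    }
    # Largest cumulative time first
    return sorted(sums.items(), key=lambda item: item[1], reverse=True)[:top_n]


# Excerpt of the example report above (assumed shape: {"timers": {...}, "counters": {...}})
sample_report = """
{
  "timers": {
    "Context.StorageManager.Query.Reader.read.sum": 0.00153529,
    "Context.StorageManager.Query.Reader.read.avg": 0.00153529,
    "Context.StorageManager.Query.Reader.copy_attr_values.sum": 0.000641413,
    "Context.StorageManager.Query.Reader.copy_attr_values.avg": 0.000641413,
    "Context.StorageManager.Query.Reader.init_state.sum": 2.1814e-05,
    "Context.StorageManager.Query.Reader.init_state.avg": 2.1814e-05
  },
  "counters": {
    "Context.StorageManager.Query.Reader.result_num": 3
  }
}
"""

for name, total in slowest_timers(sample_report, top_n=2):
    print(f"{name}: {total:.6f}")
```

Sorting on the `.sum` entries rather than `.avg` highlights where the total time went, which is usually the first thing to check when deciding what to tune.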
The TileDB library is built by default with statistics enabled. You can disable statistics gathering with the -DTILEDB_STATS=OFF CMake variable.