Basic Reading

To read from either a dense or a sparse array, you typically open the array in read mode and provide a subarray, any subset of the attributes (optionally including the coordinates), and the layout in which to return the results (see Reading for more details). You can read from an array as follows:

// Create TileDB context
tiledb_ctx_t* ctx;
tiledb_ctx_alloc(NULL, &ctx);

// Open a 2D array for reading
tiledb_array_t* array;
tiledb_array_alloc(ctx, "<array-uri>", &array);
tiledb_array_open(ctx, array, TILEDB_READ);

// Slice only rows 1, 2 and cols 2, 3, 4
int subarray[] = {1, 2, 2, 4};

// Prepare the buffers that will hold the results
int d1[20];
uint64_t d1_size = sizeof(d1);
int d2[20];
uint64_t d2_size = sizeof(d2);
int a[20];
uint64_t a_size = sizeof(a);

// Create query
tiledb_query_t* query;
tiledb_query_alloc(ctx, array, TILEDB_READ, &query);
tiledb_query_set_subarray(ctx, query, subarray);
tiledb_query_set_layout(ctx, query, TILEDB_ROW_MAJOR);
tiledb_query_set_data_buffer(ctx, query, "a", a, &a_size);
tiledb_query_set_data_buffer(ctx, query, "d1", d1, &d1_size);
tiledb_query_set_data_buffer(ctx, query, "d2", d2, &d2_size);

// NOTE: although not recommended (for performance reasons), 
// you can get the coordinates even when slicing dense arrays. 

// NOTE: The layout could have also been TILEDB_COL_MAJOR or
// TILEDB_GLOBAL_ORDER.

// Submit query
tiledb_query_submit(ctx, query);

// Close array
tiledb_array_close(ctx, array);

// NOTE: a_size, d1_size and d2_size now reflect the result size,
// i.e., TileDB changes those values so that you know how many
// results were retrieved (in bytes)

// Clean up
tiledb_array_free(&array);
tiledb_query_free(&query);
tiledb_ctx_free(&ctx);
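
Because the subarray bounds are inclusive and the buffer sizes are reported in bytes, it can help to convert both to cell counts when sizing buffers and interpreting results. The following is a minimal sketch in plain C; `subarray_cell_count` and `result_cell_count` are hypothetical helpers, not part of the TileDB API, and the sketch assumes `int` dimensions laid out as `{lo, hi}` pairs as in the example above.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

// Hypothetical helper (not a TileDB function): number of cells covered by a
// subarray given as inclusive {lo, hi} pairs, one pair per dimension.
uint64_t subarray_cell_count(const int* subarray, unsigned num_dims) {
  uint64_t count = 1;
  for (unsigned d = 0; d < num_dims; ++d) {
    int lo = subarray[2 * d];
    int hi = subarray[2 * d + 1];
    assert(lo <= hi);
    // Both bounds are inclusive, hence the +1.
    count *= (uint64_t)((int64_t)hi - (int64_t)lo + 1);
  }
  return count;
}

// Hypothetical helper (not a TileDB function): converts a result size in
// bytes (e.g., a_size after the query) to a number of fixed-size cells.
uint64_t result_cell_count(uint64_t size_in_bytes, size_t cell_size) {
  assert(cell_size > 0);
  assert(size_in_bytes % cell_size == 0);
  return size_in_bytes / cell_size;
}
```

For the subarray `{1, 2, 2, 4}` used above, `subarray_cell_count(subarray, 2)` gives 6 (2 rows by 3 columns), so for a dense read each buffer needs room for at least 6 values.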

Variable-length Attributes

You can read variable-length attributes (as written by the earlier example) as follows:

// ... create context ctx
// ... create query

// You need two buffers per variable-length attribute
char b_val[100];
uint64_t b_val_size = sizeof(b_val);
uint64_t b_off[20];
uint64_t b_off_size = sizeof(b_off);

// Set buffers for the variable-length attributes
tiledb_query_set_data_buffer(ctx, query, "b", b_val, &b_val_size);
tiledb_query_set_offsets_buffer(ctx, query, "b", b_off, &b_off_size);

// Submit query
tiledb_query_submit(ctx, query);

// Close array
tiledb_array_close(ctx, array);

// NOTE: b_off_size and b_val_size now reflect the result size (in bytes)
// for the offsets and values of the results on this attribute,
// i.e., TileDB changes those values so that you know how many
// results were retrieved

// Clean up
tiledb_array_free(&array);
tiledb_query_free(&query);
tiledb_ctx_free(&ctx);
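
After the query completes, the offsets buffer tells you where each variable-length cell starts inside the values buffer. The sketch below shows one way to recover the length of each cell. It assumes the default offsets configuration, where each offset is a starting byte position and there is no extra trailing offset; `var_cell_length` is a hypothetical helper, not a TileDB function.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper (not a TileDB function): byte length of
// variable-length cell `i`, given the offsets buffer, its result size in
// bytes, and the result size in bytes of the values buffer.
uint64_t var_cell_length(const uint64_t* offsets,
                         uint64_t offsets_size_bytes,
                         uint64_t values_size_bytes,
                         uint64_t i) {
  uint64_t num_cells = offsets_size_bytes / sizeof(uint64_t);
  assert(i < num_cells);
  // A cell ends where the next one starts; the last cell ends at the end
  // of the returned values.
  uint64_t end = (i + 1 < num_cells) ? offsets[i + 1] : values_size_bytes;
  assert(end >= offsets[i]);
  return end - offsets[i];
}
```

With this helper, cell `i` occupies the bytes `b_val + b_off[i]` through `b_val + b_off[i] + var_cell_length(b_off, b_off_size, b_val_size, i) - 1`.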

Fixed-length, Nullable Attributes

You can read fixed-length, nullable attributes as follows:

// Create TileDB context
tiledb_ctx_t* ctx;
tiledb_ctx_alloc(NULL, &ctx);

// Open a 2D array for reading
tiledb_array_t* array;
tiledb_array_alloc(ctx, "<array-uri>", &array);
tiledb_array_open(ctx, array, TILEDB_READ);

// Slice only rows 1, 2 and cols 2, 3, 4
int subarray[] = {1, 2, 2, 4};

// Prepare the vectors that will hold the results
int d1[20];
uint64_t d1_size = sizeof(d1);
int d2[20];
uint64_t d2_size = sizeof(d2);
int a[20];
uint64_t a_size = sizeof(a);
uint8_t a_validity[20];
uint64_t a_validity_size = sizeof(a_validity);

// Create query
tiledb_query_t* query;
tiledb_query_alloc(ctx, array, TILEDB_READ, &query);
tiledb_query_set_subarray(ctx, query, subarray);
tiledb_query_set_layout(ctx, query, TILEDB_ROW_MAJOR);
tiledb_query_set_data_buffer(ctx, query, "d1", d1, &d1_size);
tiledb_query_set_data_buffer(ctx, query, "d2", d2, &d2_size);
tiledb_query_set_data_buffer(ctx, query, "a", a, &a_size);
tiledb_query_set_validity_buffer(
    ctx, query, "a", a_validity, &a_validity_size);

// NOTE: although not recommended (for performance reasons), 
// you can get the coordinates even when slicing dense arrays. 

// NOTE: The layout could have also been TILEDB_COL_MAJOR or
// TILEDB_GLOBAL_ORDER.

// Submit query
tiledb_query_submit(ctx, query);

// Close array
tiledb_array_close(ctx, array);

// NOTE: a_size, a_validity_size, d1_size and d2_size now reflect
// the result size, i.e., TileDB changes those values so that you
// know how many results were retrieved (in bytes)

// Clean up
tiledb_array_free(&array);
tiledb_query_free(&query);
tiledb_ctx_free(&ctx);
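
The validity buffer holds one byte per result cell. As a minimal sketch of how it might be consumed, the helper below counts the non-null cells, assuming that a validity value of 0 marks a null cell and any nonzero value marks a valid one. `count_valid_cells` is a hypothetical name, not a TileDB function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not a TileDB function): counts the non-null cells
// described by a validity buffer with one byte per cell.
uint64_t count_valid_cells(const uint8_t* validity,
                           uint64_t validity_size_bytes) {
  assert(validity != NULL);
  uint64_t valid = 0;
  for (uint64_t i = 0; i < validity_size_bytes; ++i) {
    if (validity[i] != 0)  // nonzero: the cell holds a value
      ++valid;
  }
  return valid;
}
```

Calling `count_valid_cells(a_validity, a_validity_size)` after the query would give the number of cells in `a` that carry real values; the entries of `a` at null positions should not be interpreted.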

Variable-length, Nullable Attributes

You can read variable-length, nullable attributes as follows:

// ... create context ctx
// ... create query

// You need three buffers per variable-length, nullable attribute
char b_val[100];
uint64_t b_val_size = sizeof(b_val);
uint64_t b_off[20];
uint64_t b_off_size = sizeof(b_off);
uint8_t b_validity[20];
uint64_t b_validity_size = sizeof(b_validity);

// Set buffers for the variable-length, nullable attribute
tiledb_query_set_data_buffer(ctx, query, "b", b_val, &b_val_size);
tiledb_query_set_offsets_buffer(ctx, query, "b", b_off, &b_off_size);
tiledb_query_set_validity_buffer(
    ctx, query, "b", b_validity, &b_validity_size);

// Submit query
tiledb_query_submit(ctx, query);

// Close array
tiledb_array_close(ctx, array);

// NOTE: b_off_size, b_val_size, and b_validity_size now reflect
// the result size (in bytes) for the offsets, data values, and validity
// values of the results on this attribute, i.e., TileDB changes those
// values so that you know how many results were retrieved

// Clean up
tiledb_array_free(&array);
tiledb_query_free(&query);
tiledb_ctx_free(&ctx);
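
For a variable-length, nullable attribute, the three buffers are read together: check the validity byte first, then use the offsets to locate the value. The sketch below copies one cell into a NUL-terminated string. It assumes byte offsets with no extra trailing offset and a validity value of 0 for null cells; `copy_var_nullable_cell` is a hypothetical helper, not a TileDB function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper (not a TileDB function): copies cell `i` into `out`
// as a NUL-terminated string. Returns 1 if the cell holds a value and 0 if
// it is null (in which case `out` is set to the empty string).
int copy_var_nullable_cell(const char* values, uint64_t values_size_bytes,
                           const uint64_t* offsets,
                           uint64_t offsets_size_bytes,
                           const uint8_t* validity, uint64_t i,
                           char* out, size_t out_capacity) {
  uint64_t num_cells = offsets_size_bytes / sizeof(uint64_t);
  assert(i < num_cells);
  assert(out_capacity > 0);
  out[0] = '\0';
  if (validity[i] == 0)
    return 0;  // null cell: do not interpret its bytes
  // The cell ends where the next one starts, or at the end of the values.
  uint64_t end = (i + 1 < num_cells) ? offsets[i + 1] : values_size_bytes;
  uint64_t len = end - offsets[i];
  assert(len < out_capacity);  // leave room for the terminator
  memcpy(out, values + offsets[i], (size_t)len);
  out[len] = '\0';
  return 1;
}
```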

Getting the Non-empty Domain

You can get the non-empty domain of an array as follows:

// ... open array for reading

// Get non-empty domain for a dimension based on its index
int domain[2];
int is_empty;
tiledb_array_get_non_empty_domain_from_index(ctx, array, 0, domain, &is_empty);

// Or by name
tiledb_array_get_non_empty_domain_from_name(ctx, array, "dim", domain, &is_empty);

// For string dimensions, we need to first get the size of the 
// start and end of the domain range, using the dimension index
uint64_t start_size, end_size;
tiledb_array_get_non_empty_domain_var_size_from_index(
    ctx, array, 0, &start_size, &end_size, &is_empty);
// Or by dimension name
tiledb_array_get_non_empty_domain_var_size_from_name(
    ctx, array, "dim", &start_size, &end_size, &is_empty);

// Then allocate buffers of the appropriate size to hold the start and end
char start[start_size];
char end[end_size];
tiledb_array_get_non_empty_domain_var_from_index(
    ctx, array, 0, start, end, &is_empty);
// Or by dimension name
tiledb_array_get_non_empty_domain_var_from_name(
    ctx, array, "dim", start, end, &is_empty);
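
The string bounds come back as raw bytes whose lengths are given by `start_size` and `end_size`. Assuming, as the size-based interface suggests, that these bytes are not NUL-terminated, they need a terminator before being passed to functions such as `printf("%s")` or `strcmp`. The sketch below shows one way to do that; `bytes_to_cstring` is a hypothetical helper, not a TileDB function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (not a TileDB function): returns a newly allocated,
// NUL-terminated copy of the first `size` bytes of `bytes`, or NULL if the
// allocation fails. The caller must free() the result.
char* bytes_to_cstring(const char* bytes, uint64_t size) {
  assert(bytes != NULL || size == 0);
  char* s = malloc((size_t)size + 1);
  if (s == NULL)
    return NULL;
  if (size > 0)
    memcpy(s, bytes, (size_t)size);
  s[size] = '\0';
  return s;
}
```

For example, `bytes_to_cstring(start, start_size)` would yield a printable copy of the lower bound. It is worth checking `is_empty` first, since the bounds carry no meaning for an empty array.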

Reopening Arrays

Assuming an already open array, you can reopen the array at the current timestamp. This is useful when writes may have happened since you last opened the array and you want the most up-to-date view of it. Reopening is also more efficient than closing and opening the array again, because it avoids refetching fragment metadata that is already loaded. You can reopen an array as follows:

// ... create context ctx
// ... open an array for reading

tiledb_array_reopen(ctx, array);

Slicing Negative Domains

You can slice negative domains in Python as follows:

# NOTE: In `multi_index`, all ranges are inclusive
with tiledb.SparseArray(path) as A:
    print(A.multi_index[-3:3])
