Datetimes with numpy

Core TileDB relies on the high-level API of each language to make datetimes usable and interoperable with that language's own datetime objects. This page demonstrates the TileDB Python API, which accepts np.datetime64 objects directly.

The following example shows how to use np.datetime64 to create a 1D dense TileDB array with a domain of 10 years, at the day resolution.

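import numpy as np
import tiledb

# Context and array URI used by the examples on this page
ctx = tiledb.Ctx()
my_array_name = 'my_array_name'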

# Domain is 10 years, day resolution, one tile per 365 days
dim = tiledb.Dim(name='d1', ctx=ctx,
                 domain=(np.datetime64('2010-01-01'), np.datetime64('2020-01-01')),
                 tile=np.timedelta64(365, 'D'),
                 dtype=np.dtype('datetime64[D]'))
dom = tiledb.Domain(dim, ctx=ctx)
schema = tiledb.ArraySchema(ctx=ctx, domain=dom,
                            attrs=(tiledb.Attr('a1', dtype=np.float64),))

tiledb.Array.create(my_array_name, schema)

Note that the tile extent is given as an np.timedelta64, so it is expressed in the same unit (days) as the dimension itself.
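
As a rough check of how this tile extent divides the domain, the sketch below counts cells and tiles using only NumPy arithmetic. It assumes that both domain bounds are inclusive; the variable names (lo, hi, extent, ncells, ntiles) are illustrative and not part of the TileDB API.

```python
import numpy as np

# Domain bounds and tile extent, copied from the schema above
lo = np.datetime64('2010-01-01')
hi = np.datetime64('2020-01-01')
extent = np.timedelta64(365, 'D')

one_day = np.timedelta64(1, 'D')

# Subtracting two datetime64[D] values yields a timedelta64[D];
# add one because both bounds are assumed to be inclusive.
ncells = int((hi - lo) / one_day) + 1

# Ceiling division: number of tiles needed to cover every cell
ntiles = -(-ncells // int(extent / one_day))

print(ncells, ntiles)
```

Because 2012 and 2016 are leap years, 365-day tiles do not line up with calendar-year boundaries, and under the assumption above the final tile is only partly occupied.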

To write a range of values, slice the array with np.datetime64 bounds:

# Randomly generate 2 years of values for attribute 'a1'
ndays = 365 * 2
a1_vals = np.random.rand(ndays)

# Write the data at the beginning of the domain.
start = np.datetime64('2010-01-01')
end = start + np.timedelta64(ndays - 1, 'D')

with tiledb.DenseArray('my_array_name', 'w', ctx=ctx) as T:
    T[start: end] = {'a1': a1_vals}

The array can also be sliced using datetimes when reading:

# Slice a few days from the middle using two datetimes
with tiledb.DenseArray('my_array_name', 'r', attr='a1', ctx=ctx) as T:
    vals = T[np.datetime64('2010-11-01'): np.datetime64('2011-01-31')]
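
To sanity-check the size of such a read, the number of days covered by a slice can be computed from its endpoints alone. This sketch assumes that both endpoints of a datetime slice are included, which is an inference from the write example above (where end is start plus ndays - 1 days) rather than a statement from the TileDB API reference. The names expected and days are illustrative.

```python
import numpy as np

start = np.datetime64('2010-11-01')
end = np.datetime64('2011-01-31')

one_day = np.timedelta64(1, 'D')

# Number of days in the closed interval [start, end]
expected = int((end - start) / one_day) + 1

# The same days enumerated explicitly; np.arange excludes its stop value,
# so the stop is pushed one day past the end
days = np.arange(start, end + one_day)

print(expected, days[0], days[-1])
```

If the assumption holds, the length of the array read from the slice should match expected.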

TileDB internally stores datetimes as int64 values. Slicing a datetime dimension is therefore just as efficient as slicing any other integer-typed dimension.
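
The integer behind a NumPy datetime can be inspected directly, which may help when reasoning about what is stored. The sketch below shows NumPy's own representation at day resolution, a count of days since 1970-01-01. It assumes, but does not verify, that TileDB stores the same integer values.

```python
import numpy as np

d = np.datetime64('2010-01-01')   # datetime64[D]

# Reinterpret the value as the int64 count of days since the Unix epoch
as_int = d.astype('int64')
print(as_int)                     # 14610

# Round-trip: rebuild the datetime from the integer and its unit
restored = np.datetime64(int(as_int), 'D')
print(restored == d)              # True
```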
