Creating an attribute requires specifying a datatype, and optionally an attribute name (must be unique, and attribute names starting with __ are reserved). In the example below we create an int32 attribute called attr.
#include <tiledb/tiledb.h>

// Create context
tiledb_ctx_t* ctx;
tiledb_ctx_alloc(NULL, &ctx);

// Create attribute
tiledb_attribute_t* attr;
tiledb_attribute_alloc(ctx, "attr", TILEDB_INT32, &attr);

// ...

// Make sure to free the attribute and context
tiledb_attribute_free(&attr);
tiledb_ctx_free(&ctx);
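The naming rules above can be checked before an attribute is created. The following is a minimal sketch in plain Python, not TileDB code: `check_attribute_name` and `existing_names` are hypothetical names used only for illustration, and the sketch assumes uniqueness is judged against the names already used in the same array schema.

```python
def check_attribute_name(name, existing_names):
    """Return a list of problems with a proposed attribute name.

    Hypothetical helper, not a TileDB function: it only mirrors the two
    rules stated above (reserved "__" prefix, uniqueness).
    """
    problems = []
    if name.startswith("__"):
        problems.append("names starting with '__' are reserved")
    if name in existing_names:
        problems.append("name is already in use")
    return problems


print(check_attribute_name("attr", {"a1", "a2"}))    # no problems
print(check_attribute_name("__attr", {"a1", "a2"}))  # reserved prefix
print(check_attribute_name("a1", {"a1", "a2"}))      # duplicate name
```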
An attribute can also store a fixed number of values (of the same datatype) in a single cell, or a variable number of values. You can specify this as follows:
// Setting a fixed number of values
tiledb_attribute_set_cell_val_num(ctx, attr, 3);

// Setting a variable number of values
tiledb_attribute_set_cell_val_num(ctx, attr, TILEDB_VAR_NUM);

// Setting a fixed number of values
attr.set_cell_val_num(3);

// Setting a variable number of values
attr.set_cell_val_num(TILEDB_VAR_NUM);
# Setting a fixed number of values
attr = tiledb.Attr(name="attr", dtype=np.dtype('i4, i4, i4'))

# Setting a variable number of values (multiple int32 per cell)
attr = tiledb.Attr(name="attr", dtype=np.int32, var=True)

# Strings are implicitly `var=True`:
attr = tiledb.Attr(name="str_attr", dtype=str)
# is equivalent to:
attr = tiledb.Attr(name="str_attr", dtype=str, var=True)
## Attribute value counts can be retrieved from R
attr <- tiledb_attr("a1", type = "INT32")
cell_val_num(attr)

## Attribute value counts can be set via the lower level API
## set int32 attribute to length three
attr <- tiledb_attr("a1", type = "INT32")
tiledb:::libtiledb_attribute_set_cell_val_num(attr@ptr, 3)

## set char attribute to variable length which is encoded as a NA
attr <- tiledb_attr("a1", type = "CHAR")
tiledb:::libtiledb_attribute_set_cell_val_num(attr@ptr, NA)
// Setting a fixed number of values
attr.setCellValNum(3);

// Setting a variable number of values
attr.setCellValNum(TILEDB_VAR_NUM);

// Setting a fixed number of values
attr.SetCellValNum(3)

// Setting a variable number of values
attr.SetCellValNum(tiledb.TILEDB_VAR_NUM)

// Setting a fixed number of values
attr.SetCellValNum(3);

// Setting a variable number of values
attr.SetCellValNum(Attribute.VariableSized);
An attribute may also be nullable, which allows each cell to be designated as valid or null. Nullability applies to both fixed-sized and var-sized attributes.
// Setting a nullable attribute
tiledb_attribute_set_nullable(ctx, attr, 1);

// Setting a nullable attribute
attr.set_nullable(true);
Note: nullable Python attributes should be used with the from_pandas API or Pandas series with a Pandas extension dtype (e.g. StringDtype).
tiledb.Attr(name="foo", dtype=str, nullable=True)
## Create a nullable attribute
attr <- tiledb_attr("a1", type = "INT32", nullable = TRUE)
attr.setNullable(true);
// TODO
attr.SetNullable(true);
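One way to picture nullability is as a second vector of flags kept next to the cell values, with one flag per cell. The sketch below is plain Python, not TileDB code; the convention that 1 means valid and 0 means null, and the helper name `apply_validity`, are assumptions made for illustration.

```python
def apply_validity(values, validity):
    """Pair each cell value with its validity flag; null cells become None.

    Assumed convention for this sketch: 1 marks a valid cell, 0 a null one.
    """
    if len(values) != len(validity):
        raise ValueError("expected one validity flag per cell")
    return [v if flag == 1 else None for v, flag in zip(values, validity)]


cells = apply_validity([10, 20, 30, 40], [1, 0, 1, 0])
print(cells)  # [10, None, 30, None]
```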
Supported Attribute Datatypes:
Crossed data types are deprecated.
| Datatype | Description |
| --- | --- |
| TILEDB_BLOB | Opaque bytes. Note: the TILEDB_BLOB datatype does not support query conditions. |
| TILEDB_INT8 | 8-bit integer |
| TILEDB_UINT8 | 8-bit unsigned integer |
| TILEDB_INT16 | 16-bit integer |
| TILEDB_UINT16 | 16-bit unsigned integer |
| TILEDB_INT32 | 32-bit integer |
| TILEDB_UINT32 | 32-bit unsigned integer |
| TILEDB_INT64 | 64-bit integer |
| TILEDB_UINT64 | 64-bit unsigned integer |
| TILEDB_FLOAT32 | 32-bit floating point |
| TILEDB_FLOAT64 | 64-bit floating point |
| TILEDB_DATETIME_YEAR | Years |
| TILEDB_DATETIME_MONTH | Months |
| TILEDB_DATETIME_WEEK | Weeks |
| TILEDB_DATETIME_DAY | Days |
| TILEDB_DATETIME_HR | Hours |
| TILEDB_DATETIME_MIN | Minutes |
| TILEDB_DATETIME_SEC | Seconds |
| TILEDB_DATETIME_MS | Milliseconds |
| TILEDB_DATETIME_US | Microseconds |
| TILEDB_DATETIME_NS | Nanoseconds |
| TILEDB_DATETIME_PS | Picoseconds |
| TILEDB_DATETIME_FS | Femtoseconds |
| TILEDB_DATETIME_AS | Attoseconds |
| TILEDB_CHAR | Single character |
| TILEDB_STRING_ASCII | ASCII string |
| TILEDB_STRING_UTF8 | UTF-8 string |
| TILEDB_STRING_UTF16 | UTF-16 string |
| TILEDB_STRING_UTF32 | UTF-32 string |
| TILEDB_STRING_UCS2 | UCS-2 string |
| TILEDB_STRING_UCS4 | UCS-4 string |
| TILEDB_ANY | (datatype, bytelength, value) |
| Datatype | Description | Internal TILEDB Datatype Mapping |
| --- | --- | --- |
| np.int8 | 8-bit integer | TILEDB_INT8 |
| np.uint8 | 8-bit unsigned integer | TILEDB_UINT8 |
| np.int16 | 16-bit integer | TILEDB_INT16 |
| np.uint16 | 16-bit unsigned integer | TILEDB_UINT16 |
| np.int32 | 32-bit integer | TILEDB_INT32 |
| np.uint32 | 32-bit unsigned integer | TILEDB_UINT32 |
| np.int64 | 64-bit integer | TILEDB_INT64 |
| np.uint64 | 64-bit unsigned integer | TILEDB_UINT64 |
| np.float32 | 32-bit floating point | TILEDB_FLOAT32 |
| np.float64 | 64-bit floating point | TILEDB_FLOAT64 |
| "ascii" | ASCII | TILEDB_STRING_ASCII |
| np.dtype('S1') | Character | TILEDB_CHAR |
| np.dtype('U') | Unicode UTF-8 | TILEDB_STRING_UTF8 |
| "datetime64[Y]" | Years | TILEDB_DATETIME_YEAR |
| "datetime64[M]" | Months | TILEDB_DATETIME_MONTH |
| "datetime64[W]" | Weeks | TILEDB_DATETIME_WEEK |
| "datetime64[D]" | Days | TILEDB_DATETIME_DAY |
| "datetime64[h]" | Hours | TILEDB_DATETIME_HR |
| "datetime64[m]" | Minutes | TILEDB_DATETIME_MIN |
| "datetime64[s]" | Seconds | TILEDB_DATETIME_SEC |
| "datetime64[ms]" | Milliseconds | TILEDB_DATETIME_MS |
| "datetime64[us]" | Microseconds | TILEDB_DATETIME_US |
| "datetime64[ns]" | Nanoseconds | TILEDB_DATETIME_NS |
| "datetime64[ps]" | Picoseconds | TILEDB_DATETIME_PS |
| "datetime64[fs]" | Femtoseconds | TILEDB_DATETIME_FS |
| "datetime64[as]" | Attoseconds | TILEDB_DATETIME_AS |
| Datatype | Description |
| --- | --- |
| "CHAR" | Single character |
| "INT8" | 8-bit integer |
| "UINT8" | 8-bit unsigned integer |
| "INT16" | 16-bit integer |
| "UINT16" | 16-bit unsigned integer |
| "INT32" | 32-bit integer |
| "UINT32" | 32-bit unsigned integer |
| "INT64" | 64-bit integer |
| "UINT64" | 64-bit unsigned integer |
| "FLOAT32" | 32-bit floating point |
| "FLOAT64" | 64-bit floating point |
| "DATETIME_YEAR" | Years |
| "DATETIME_MONTH" | Months |
| "DATETIME_WEEK" | Weeks |
| "DATETIME_DAY" | Days |
| "DATETIME_HR" | Hours |
| "DATETIME_MIN" | Minutes |
| "DATETIME_SEC" | Seconds |
| "DATETIME_MS" | Milliseconds |
| "DATETIME_US" | Microseconds |
| "DATETIME_NS" | Nanoseconds |
| "DATETIME_PS" | Picoseconds |
| "DATETIME_FS" | Femtoseconds |
| "DATETIME_AS" | Attoseconds |
Setting Filters
Attributes accept filters such as compressors. This is described in detail here.
Setting Fill Values
There are situations where you may read "empty spaces" from TileDB arrays. For those empty spaces, a read query will return some default fill values for the selected attributes. You can set your own fill values for these cases as follows:
// Assuming an int32 attribute
int value = 0;
unsigned long long size = sizeof(value);
tiledb_attribute_set_fill_value(ctx, attr, &value, size);

// Assuming a var-sized attribute (strlen requires <string.h>)
const char* str_value = "null";
unsigned long long str_size = strlen(str_value);
tiledb_attribute_set_fill_value(ctx, attr, str_value, str_size);
# Setting a fixed-size fill value
attr = tiledb.Attr(name="attr", dtype=np.int32, fill=123)

# Setting a var-sized (string attribute) fill value
# (the np.str alias has been removed from recent NumPy releases; use str)
attr = tiledb.Attr(name="attr", dtype=str, fill="filler")

# Retrieving the value
attr.fill  # -> "filler"
# ... create int attribute attr
# set fill value to 42L
tiledb_attribute_set_fill_value(attr, 42L)

# ... create variable-sized attribute attr
# set fill value to "..."
tiledb_attribute_set_fill_value(attr, "...")
attr.setFillValue((byte) 'c');
// Assuming an int32 attribute
attribute.SetFillValue(0)

// Assuming a var-sized attribute
attribute.SetFillValue("null")
// Assuming an int32 attribute
attr.SetFillValue(0);

// Assuming a var-sized attribute
strAttr.SetFillValue("null");
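The effect of a fill value on reads can be mimicked in a few lines. This is a toy model in plain Python rather than a TileDB query: the dictionary `written` stands in for the cells that actually hold data, and `read_range` is a hypothetical helper introduced for this sketch.

```python
def read_range(written, start, end, fill):
    """Return the values for cells start..end (inclusive).

    Cells that were never written come back as the fill value.
    """
    return [written.get(i, fill) for i in range(start, end + 1)]


written = {1: 7, 2: 9, 5: 4}  # only three cells hold data

print(read_range(written, 1, 6, fill=0))   # [7, 9, 0, 0, 4, 0]
print(read_range(written, 1, 6, fill=-1))  # [7, 9, -1, -1, 4, -1]
```

Changing the fill value changes only what the empty cells report; the written cells are returned unchanged.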
Setting the number of values per cell for an attribute (see above) resets the attribute's fill value to its default. Therefore, set the fill value only after deciding how many values the attribute will hold in each cell.
For fixed-sized attributes, the input fill value size should be equal to the cell size.
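The two rules above (the fill value is reset whenever the number of values per cell changes, and a fixed-sized fill value must span exactly one cell) can be modelled as follows. `ToyAttribute` is a hypothetical stand-in written for this sketch, not the TileDB attribute class, and its all-zero default fill is an arbitrary placeholder that does not reflect TileDB's actual defaults.

```python
import struct


class ToyAttribute:
    """Toy model of a fixed-sized int32 attribute (not the TileDB class)."""

    VALUE_SIZE = struct.calcsize("<i")  # 4 bytes per int32 value

    def __init__(self):
        self.cell_val_num = 1
        self.fill = self._default_fill()

    def _default_fill(self):
        # Placeholder default chosen for this sketch only.
        return bytes(self.cell_size())

    def cell_size(self):
        return self.cell_val_num * self.VALUE_SIZE

    def set_cell_val_num(self, num):
        self.cell_val_num = num
        # Changing the value count discards any custom fill value.
        self.fill = self._default_fill()

    def set_fill_value(self, *values):
        packed = struct.pack(f"<{len(values)}i", *values)
        if len(packed) != self.cell_size():
            raise ValueError(
                f"fill value is {len(packed)} bytes, cell is {self.cell_size()} bytes"
            )
        self.fill = packed


attr = ToyAttribute()
attr.set_cell_val_num(3)      # decide on 3 values per cell first
attr.set_fill_value(1, 2, 3)  # 12 bytes, matches the 12-byte cell
print(len(attr.fill))         # 12

attr.set_cell_val_num(2)      # resets the fill value to the default
print(attr.fill == bytes(8))  # True
```

In this model, calling `set_fill_value(1)` on an attribute with three values per cell raises a `ValueError`, because 4 bytes do not fill a 12-byte cell.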