<version> is the TileDB Embedded version you are looking for (e.g.,
a2 (var-sized string). TileDB stores the values of each attribute in separate files, i.e., it is a “columnar” format (similar to Parquet). Every fixed-sized attribute generates one file, whereas every var-sized attribute generates two (one for the actual data and one for the starting positions of each var-sized value). Moreover, each fragment folder contains a fragment metadata file, which records the non-empty domain (i.e., the bounding box of the cells that were written) and other important metadata that will be explained later.
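The two-buffer layout of a var-sized attribute can be illustrated with a minimal Python sketch. This is not TileDB's on-disk code: the function names `pack_var_sized` and `unpack_var_sized` are hypothetical, and UTF-8 encoding of the strings is an assumption. One buffer holds the concatenated data and the other holds the starting position of each value.

```python
def pack_var_sized(values):
    """Split var-sized strings into a data buffer and a list of start offsets."""
    data = bytearray()
    offsets = []
    for value in values:
        offsets.append(len(data))           # starting position of this value
        data.extend(value.encode("utf-8"))  # assumption: UTF-8 encoded strings
    return bytes(data), offsets


def unpack_var_sized(data, offsets):
    """Rebuild the original strings from the two buffers."""
    ends = offsets[1:] + [len(data)]        # each value ends where the next begins
    return [data[start:end].decode("utf-8") for start, end in zip(offsets, ends)]


data, offsets = pack_var_sized(["a", "bb", "ccc"])
print(data, offsets)
print(unpack_var_sized(data, offsets))
```

Storing offsets separately means any single value can be located without scanning the data buffer.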
__meta inside the array folder, simply serialized in binary files. Those files are timestamped in the same manner as fragments, for the same reasons (immutability, concurrent writes, and time traveling). The metadata file organization is shown in the figure below.
aws s3 sync src_uri dest_uri.
1, 2, 3, which have the following little-endian representation when stored adjacent in memory:
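The byte layout can be reproduced with a short Python sketch. The element width is not stated here, so the sketch assumes 32-bit unsigned integers; the helper name `little_endian_bytes` is hypothetical.

```python
import struct


def little_endian_bytes(values):
    """Pack integers as adjacent little-endian 32-bit unsigned values."""
    # "<" selects little-endian; "I" is a 32-bit unsigned integer (assumed width).
    return struct.pack(f"<{len(values)}I", *values)


raw = little_endian_bytes([1, 2, 3])
print(raw.hex(" "))  # least significant byte of each value comes first
```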
100, 104, 108, 112, ..., then the resulting positive-delta-encoded data would be
0, 4, 4, 4, .... This encoding is advantageous because it tends to produce long runs of repeated values, which can yield better compression ratios if a compression filter is added after positive-delta encoding.
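As a rough illustration, the sketch below reproduces the example values in Python. It is not TileDB's implementation: the function names are hypothetical, and keeping the first value as a separate base (so that the first delta is 0) is an assumption chosen to match the output shown above.

```python
def positive_delta_encode(values):
    """Return (base, deltas) where each delta is the gap to the previous value."""
    if not values:
        return None, []
    base = values[0]
    deltas = []
    previous = base
    for value in values:
        delta = value - previous
        if delta < 0:
            raise ValueError("positive-delta encoding needs non-decreasing input")
        deltas.append(delta)
        previous = value
    return base, deltas


def positive_delta_decode(base, deltas):
    """Invert the encoding with a running sum starting from the base."""
    values = []
    current = base
    for delta in deltas:
        current += delta
        values.append(current)
    return values


base, deltas = positive_delta_encode([100, 104, 108, 112])
print(base, deltas)
print(positive_delta_decode(base, deltas))
```

The repeated deltas are what a subsequent compression filter, for example a run-length encoder, can exploit.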
uint64. Initially, each cell of data for that attribute requires 8 bytes of storage. However, if you know that the actual value of the attribute is often 255 or less, those cells can be stored using just a single byte in the form of a
uint8, saving 7 bytes of storage per cell. The bit-width reduction filter performs this analysis and compression automatically over windows of attribute data.
300, 350, 400, the bit-width reduction filter would first determine that the minimum value in the window was
300, and the relative cell values were
0, 50, 100. These relative values are now less than 255 and can be represented by a uint8.
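The window analysis described above can be sketched in a few lines of Python. This is an illustrative sketch, not TileDB's filter: the function name `reduce_bit_width` is hypothetical, the window is assumed to be non-empty, and the candidate widths of 1, 2, 4, and 8 bytes are an assumption.

```python
def reduce_bit_width(window):
    """Return (minimum, relative values, bytes needed per value) for one window."""
    base = min(window)
    relative = [value - base for value in window]
    largest = max(relative)
    for width in (1, 2, 4, 8):              # assumed candidate widths in bytes
        if largest < 1 << (8 * width):      # fits in an unsigned int of this width
            return base, relative, width
    raise ValueError("relative values do not fit in 64 bits")


base, relative, width = reduce_bit_width([300, 350, 400])
print(base, relative, width)
```

For the window 300, 350, 400 the relative values fit in one byte each, so only the minimum has to be kept at full width.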
__t1_t2_uuid_v.meta. The file contains the URIs of the fragments whose footers are included, along with the footers in serialized binary form. The footers contain only a small amount of information about the fragments, such as the non-empty domain and other light metadata.
t1 is the first timestamp of the first fragment whose metadata is being consolidated, and
t2 is the second timestamp of the last fragment. Upon opening an array, regardless of the number of fragments, TileDB can fetch this single small file into main memory with a single REST request. In other words, the TileDB format has mechanisms for making the communication protocol with the object store more lightweight.
t1 is the first timestamp of the first fragment being consolidated, and
t2 is the second timestamp of the last fragment.
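The naming rule can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes names of the form `__t1_t2_uuid_v` with integer timestamps, a UUID containing no underscores, and an integer format version; real fragment names may differ between format versions.

```python
import re

# Assumed layout: __<t1>_<t2>_<uuid>_<version>
FRAGMENT_NAME = re.compile(r"^__(\d+)_(\d+)_([^_]+)_(\d+)$")


def parse_fragment_name(name):
    """Split a fragment name into (t1, t2, uuid, version)."""
    match = FRAGMENT_NAME.match(name)
    if match is None:
        raise ValueError(f"not a fragment name: {name!r}")
    t1, t2, uuid, version = match.groups()
    return int(t1), int(t2), uuid, int(version)


def consolidated_timestamps(ordered_names):
    """Take t1 from the first fragment and t2 from the last one."""
    first = parse_fragment_name(ordered_names[0])
    last = parse_fragment_name(ordered_names[-1])
    return first[0], last[1]


names = ["__10_10_aaa_5", "__20_25_bbb_5", "__30_30_ccc_5"]
print(consolidated_timestamps(names))
```

The sketch expects the fragment names to be passed in write order, since it does not sort them.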
__t1_t2_uuid_v.vac, i.e., with the same name as the consolidated fragment with the added suffix
.vac. This file contains the URIs of the fragments that were consolidated. The user can choose to retain the consolidated fragments (for time traveling purposes) or vacuum them, i.e., delete them. The
.vac files are used in the vacuum process so that only consolidated fragments get deleted.
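A minimal sketch of the vacuum step on a local filesystem is shown below. It makes several assumptions that the text does not confirm: each `.vac` file is treated as plain text with one fragment folder name per line, fragments are local directories, and the function name `vacuum` is hypothetical. The actual `.vac` serialization and TileDB's vacuum API may differ.

```python
import shutil
import tempfile
from pathlib import Path


def vacuum(array_dir):
    """Delete only the fragment folders that are listed in some .vac file."""
    array_dir = Path(array_dir)
    removed = []
    for vac_file in sorted(array_dir.glob("*.vac")):
        for line in vac_file.read_text().splitlines():
            name = line.strip()             # assumed format: one name per line
            if not name:
                continue
            fragment = array_dir / name
            if fragment.is_dir():
                shutil.rmtree(fragment)
                removed.append(name)
    return removed


# Demo: two old fragments were consolidated into __1_2_c_5.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("__1_1_a_5", "__2_2_b_5", "__1_2_c_5"):
        (root / name).mkdir()
    (root / "__1_2_c_5.vac").write_text("__1_1_a_5\n__2_2_b_5\n")
    print(vacuum(root))
    print(sorted(p.name for p in root.iterdir()))
```

Because deletion is driven only by the `.vac` listing, the consolidated fragment and any fragment not listed are left untouched.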