GDAL

Tutorial on using GDAL to create and inspect TileDB arrays

Installation

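GDAL is a translator library for raster and vector geospatial data formats. It has included a TileDB raster driver since GDAL 3.0. To get a GDAL build with the TileDB driver, you can start a shell in the tiledb/geospatial Docker image, mounting a local directory at /data: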

docker run -it --rm -u 0 -v /local/path:/data tiledb/geospatial /bin/bash

To confirm that the TileDB driver is available, run:

gdalinfo --formats | grep TileDB
TileDB -raster- (rw+vs): TileDB

To list the driver's capabilities, along with its creation and open options, run:

gdalinfo --format TileDB
Format Details:
Short Name: TileDB
Long Name: TileDB
Supports: Raster
Help Topic: frmt_tiledb.html
Supports: Subdatasets
Supports: Open() - Open existing dataset.
Supports: Create() - Create writable dataset.
Supports: CreateCopy() - Create dataset by copying another.
Supports: Virtual IO - eg. /vsimem/
Creation Datatypes: Byte UInt16 Int16 UInt32 Int32 Float32 Float64 CInt16 CInt32 CFloat32 CFloat64
<CreationOptionList>
<Option name="COMPRESSION" type="string-select" description="image compression to use" default="NONE">
<Value>NONE</Value>
<Value>GZIP</Value>
<Value>ZSTD</Value>
<Value>LZ4</Value>
<Value>RLE</Value>
<Value>BZIP2</Value>
<Value>DOUBLE-DELTA</Value>
<Value>POSITIVE-DELTA</Value>
</Option>
<Option name="COMPRESSION_LEVEL" type="int" description="Compression level" />
<Option name="BLOCKXSIZE" type="int" description="Tile Width" />
<Option name="BLOCKYSIZE" type="int" description="Tile Height" />
<Option name="STATS" type="boolean" description="Dump TileDB stats" />
<Option name="TILEDB_CONFIG" type="string" description="location of configuration file for TileDB" />
<Option name="TILEDB_ATTRIBUTE" type="string" description="co-registered file to add as TileDB attributes" />
</CreationOptionList>
<OpenOptionList>
<Option name="STATS" type="boolean" description="Dump TileDB stats" />
<Option name="TILEDB_ATTRIBUTE" type="string" description="Attribute to read from each band" />
<Option name="TILEDB_CONFIG" type="string" description="location of configuration file for TileDB" />
</OpenOptionList>
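Creation options from the list above are passed as -co NAME=VALUE when a dataset is written, for example by gdal_translate (used in the next section). The following is a minimal sketch only: ingest_compressed is a hypothetical wrapper name, and the compression level and tile sizes are illustrative values, not recommendations from the driver documentation.

```shell
# Hypothetical helper: ingest a raster into a TileDB array with ZSTD
# compression and 256x256 tiles. All option values are assumptions chosen
# for illustration; adjust them to your data.
ingest_compressed() {
  src=$1     # input raster, e.g. a GeoTIFF
  array=$2   # output TileDB array name

  gdal_translate -of TileDB \
    -co COMPRESSION=ZSTD \
    -co COMPRESSION_LEVEL=5 \
    -co BLOCKXSIZE=256 \
    -co BLOCKYSIZE=256 \
    "$src" "$array"
}
```

Usage would look like: ingest_compressed input.tif my_array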

Ingesting Data

Download the sample GeoTIFF image UTM2GTIF.TIF, then run:

gdal_translate -of TileDB UTM2GTIF.TIF <array-name>

This creates a new TileDB array called <array-name> and ingests the GeoTIFF image as a 2D dense TileDB array with a single attribute that stores the greyscale value of each pixel. The array name can also be an S3 URI. In that case, create a configuration file (here called aws.config) that holds your S3 keys in the following format:

vfs.s3.aws_access_key_id xxxxxx
vfs.s3.aws_secret_access_key xxxxxx
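If your keys are already exported as the standard AWS environment variables, the file can be generated rather than edited by hand. This is a sketch under that assumption: write_tiledb_s3_config is a hypothetical helper name, and the only settings it writes are the two keys shown above.

```shell
# Hypothetical helper: write a TileDB config file from the environment.
# Assumes AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are already exported.
# The function body is a subshell, so umask and variables do not leak out.
write_tiledb_s3_config() (
  out=${1:-aws.config}   # output path, defaults to aws.config

  # Stop with an error message if either variable is unset or empty.
  : "${AWS_ACCESS_KEY_ID:?AWS_ACCESS_KEY_ID is not set}"
  : "${AWS_SECRET_ACCESS_KEY:?AWS_SECRET_ACCESS_KEY is not set}"

  # Newly created files are readable only by the current user.
  umask 077

  printf 'vfs.s3.aws_access_key_id %s\nvfs.s3.aws_secret_access_key %s\n' \
    "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY" > "$out"
)
```

Calling write_tiledb_s3_config with no argument writes aws.config in the current directory.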

Then, you can ingest directly into a TileDB array on S3 as follows:

gdal_translate -of TileDB -co TILEDB_CONFIG=aws.config UTM2GTIF.TIF <array-name>

After ingesting the array, you can get its info with:

gdalinfo <array-name>
# or, if the array is on S3: gdalinfo -oo TILEDB_CONFIG=aws.config <array-name>

Finally, you can check if the values in the TileDB array are the same as in the original GeoTIFF file:

gdallocationinfo <array-name> 10 10
Report:
Location: (10P,10L)
Band 1:
Value: 156
gdallocationinfo UTM2GTIF.TIF 10 10
Report:
Location: (10P,10L)
Band 1:
Value: 156

If the ingestion worked correctly, both commands report the same value for the same pixel.
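Checking a single pixel by eye does not scale, so the comparison can be scripted. The sketch below relies on the -valonly flag of gdallocationinfo, which prints only the pixel value. compare_pixels is a hypothetical helper name, and the "x,y" argument format is an assumption made for this example.

```shell
# Hypothetical helper: compare pixel values of two datasets at several
# pixel/line locations given as "x,y" pairs.
# Returns 0 if every location matches, 1 otherwise.
compare_pixels() {
  src=$1
  dst=$2
  shift 2
  rc=0
  for xy in "$@"; do
    x=${xy%,*}   # part before the comma (pixel)
    y=${xy#*,}   # part after the comma (line)
    a=$(gdallocationinfo -valonly "$src" "$x" "$y")
    b=$(gdallocationinfo -valonly "$dst" "$x" "$y")
    if [ "$a" = "$b" ]; then
      echo "OK ($x,$y) = $a"
    else
      echo "DIFF ($x,$y): $a vs $b"
      rc=1
    fi
  done
  return $rc
}
```

Usage would look like: compare_pixels UTM2GTIF.TIF <array-name> 10,10 50,50 100,20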