This is a simple guide that demonstrates how to use TileDB on HDFS. HDFS is a distributed Java-based filesystem for storing large amounts of data. It is the underlying distributed storage layer for the Hadoop stack.
The HDFS backend currently only works on POSIX (Linux, macOS) platforms. Windows is currently not supported.
TileDB integrates with HDFS through the libhdfs library (HDFS C-API). The HDFS backend is enabled by default, and libhdfs loading happens at runtime based on environment variables.
If the library cannot be found, or if the Hadoop library cannot locate the correct library dependencies at runtime, an error will be returned.
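A pre-flight check of the environment can surface this kind of failure before TileDB attempts to load libhdfs. The following is a minimal sketch, not part of TileDB. It assumes the relevant variables are HADOOP_HOME, JAVA_HOME, and CLASSPATH, which are the ones a libhdfs setup conventionally relies on; the list of variables is not shown in this guide, so verify the names against your TileDB and Hadoop versions. The helper name missing_hdfs_env is hypothetical.

```python
import os

# Assumption: these are the variables used to locate libhdfs and its
# Java dependencies. Adjust the tuple if your setup differs.
REQUIRED_VARS = ("HADOOP_HOME", "JAVA_HOME", "CLASSPATH")


def missing_hdfs_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_hdfs_env()
    if missing:
        print("Unset or empty variables:", ", ".join(missing))
    else:
        print("All assumed HDFS-related variables are set.")
```

Passing a plain dictionary instead of relying on os.environ makes the check easy to exercise in isolation.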
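To use HDFS with TileDB, change the URI you use to an HDFS path of the form hdfs://<authority>:<port>/path/to/array.
For instance, if you are running a local HDFS namenode on port 9000, the URI is hdfs://localhost:9000/path/to/array.
If you want to use the namenode specified in your HDFS configuration files, omit the authority and use the prefix hdfs:/// instead, as in hdfs:///path/to/array.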
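The two URI forms described here, one with an explicit namenode and one that defers to the Hadoop configuration, can be sketched with a small helper. This is an illustration under stated assumptions rather than a TileDB API: the function name hdfs_uri and the example path are hypothetical.

```python
def hdfs_uri(path, host=None, port=None):
    """Build an hdfs:// URI for an array path.

    With a host, the namenode is named explicitly (hdfs://host:port/path).
    Without one, the authority is left empty (hdfs:///path) so that the
    namenode from the Hadoop configuration files is used.
    """
    if not path.startswith("/"):
        path = "/" + path
    if host is None:
        if port is not None:
            raise ValueError("a port requires a host")
        return "hdfs://" + path
    authority = host if port is None else f"{host}:{port}"
    return f"hdfs://{authority}{path}"


if __name__ == "__main__":
    # Hypothetical array path, used only for illustration.
    print(hdfs_uri("/tiledb/my_array", "localhost", 9000))
    print(hdfs_uri("/tiledb/my_array"))
```

The resulting string is used wherever TileDB expects an array URI, for example as the argument to tiledb.open in the Python API.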
Most HDFS configuration variables are defined in Hadoop-specific XML files. TileDB allows some of these variables to be set at runtime through configuration parameters.
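As a rough sketch of how such runtime parameters might be assembled in Python, the helper below collects only the values that were actually supplied. The parameter names vfs.hdfs.username, vfs.hdfs.name_node_uri, and vfs.hdfs.kerb_ticket_cache_path are assumptions recalled from TileDB's configuration reference and are not listed in this guide; confirm them for your TileDB version. The helper names hdfs_config_params and make_ctx are hypothetical.

```python
def hdfs_config_params(username=None, name_node_uri=None,
                       kerb_ticket_cache_path=None):
    """Return a dict of HDFS-related parameters, skipping unset values."""
    candidates = {
        # Assumed parameter names; verify against your TileDB version.
        "vfs.hdfs.username": username,
        "vfs.hdfs.name_node_uri": name_node_uri,
        "vfs.hdfs.kerb_ticket_cache_path": kerb_ticket_cache_path,
    }
    return {key: value for key, value in candidates.items()
            if value is not None}


def make_ctx(params):
    """Create a TileDB context from a parameter dict.

    Requires the tiledb Python package; imported lazily so the helper
    above stays usable without it.
    """
    import tiledb
    return tiledb.Ctx(tiledb.Config(params))


if __name__ == "__main__":
    # Hypothetical username, used only for illustration.
    print(hdfs_config_params(username="alice"))
```

Anything left unset here continues to come from the Hadoop XML files.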