
Serverless UDFs

TileDB Cloud allows you to run any lambda-like user-defined function (UDF). More specifically, you write the code on your laptop using the TileDB Cloud client (see Installation), and your function is shipped to and executed on stateless TileDB Cloud REST workers. You are charged only for the time the function takes to run and for the amount of data returned to your laptop. You do not need to launch or manage any computational resources.
There are two types of supported UDFs:
  • Generic UDFs: These can include any code.
  • Array UDFs: These are UDFs that are applied to slices of one or more arrays.
Running UDFs is particularly useful if you want to perform reductions (such as a sum or an average), since the amount of data returned is very small regardless of how much data the UDF processes.
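As a rough illustration of why reductions suit UDFs, the sketch below defines an array-UDF-style function that collapses a slice to a single number and checks it locally. The attribute name `a` and the helper `run_remotely` are illustrative assumptions. The call to `tiledb.cloud.array.apply` reflects the Python client's array UDF entry point as we understand it, so confirm the signature against your installed client version before relying on it.

```python
from statistics import fmean


def mean_of_attribute(data, attr="a"):
    """Array-UDF-style reduction: collapse a slice to one number.

    `data` maps attribute names to sequences of cell values
    (an assumption about how the slice is handed to the function).
    """
    values = data[attr]
    return fmean(values) if len(values) else float("nan")


def run_remotely(array_uri, ranges):
    """Hypothetical helper, not called here.

    It needs the TileDB Cloud client installed and an authenticated session.
    """
    import tiledb.cloud  # imported lazily so the local check below runs without it

    return tiledb.cloud.array.apply(
        array_uri, mean_of_attribute, ranges, attrs=["a"]
    )


# Local check: four cells go in, one float comes out.
local_slice = {"a": [1, 2, 3, 4]}
print(mean_of_attribute(local_slice))  # 2.5
```

However many cells the slice contains, only a single float travels back to the caller, which is what keeps the returned data small.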
TileDB Cloud currently supports only Python and R UDFs, but support for more languages will be added soon.
For security, TileDB Cloud runs your UDF in a separate, dedicated container. Any array access is executed in parallel on the same REST worker but in separate containers, and the results are sent to the UDF container using zero-copy techniques for performance.
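The flow described above can be pictured with a small local analogy: the array reads run in parallel, and the UDF only sees the slices handed to it. This is not how the service is implemented. Threads stand in for the separate containers, and the array names, the in-memory data, and the functions `read_slice`, `multi_array_udf`, and `run` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for two arrays.
ARRAYS = {
    "temperatures": [20.5, 21.0, 19.5, 22.0],
    "humidity": [0.40, 0.42, 0.38, 0.45],
}


def read_slice(name, start, stop):
    """Stand-in for one array access (a separate container on the service)."""
    return ARRAYS[name][start:stop]


def multi_array_udf(temps, hums):
    """The UDF works only on the slices it receives."""
    return max(temps), max(hums)


def run(start, stop):
    # Issue both reads in parallel, then pass the results to the UDF.
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(read_slice, name, start, stop)
            for name in ("temperatures", "humidity")
        ]
        slices = [future.result() for future in futures]
    return multi_array_udf(*slices)


print(run(1, 4))  # (22.0, 0.45)
```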
We offer Python and R UDF images based on the following versions:
| Python | R |
|---|---|
| 3.9 | 4.2.2-1.2004.0 |
| 3.7.10 | |
The environment in which the UDF runs includes the following Python packages:
| Package | Version (Python 3.9) | Version (Python 3.7.10) |
|---|---|---|
| numpy | 1.23.4 | 1.21.6 |
| pandas | 1.4.0 | 1.3.5 |
| tensorflow | 2.8.1 | 2.8.1 |
| numexpr | 2.8.3 | 2.8.3 |
| numba | 0.56.3 | 0.56.3 |
| xarray | 2022.11.0 | 0.20.2 |
| tiledb | 2.14.1 | 2.14.1 |
| scipy | 1.9.3 | 1.7.3 |
| boto3 | 1.26.3 | 1.26.3 |
| tiledb-vcf | 0.22.2 | 0.22.2 |
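Because the two Python images pin different versions of some packages, it can help to have a UDF report what it actually finds at run time. The following is a minimal sketch using the standard library's `importlib.metadata`, which requires Python 3.8 or later and so applies to the 3.9 image rather than the 3.7.10 one. The function name `installed_versions` is hypothetical, and how you ship it to TileDB Cloud is left out.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


# Run locally, this reports your own environment; run as a generic UDF,
# it would report the versions in the UDF image.
print(installed_versions(["numpy", "pandas", "tiledb"]))
```

The returned dictionary is small and serializable, so comparing the remote environment with your local one costs very little.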
The environment in which the UDF runs includes the following R packages:
| Package | Version (R 4.2.2-1.2004.0) |
|---|---|
| Rcpp | 1.0.9 |
| TileDB-R | 0.86.0 |
| TileDB-SC | 0.1.3 |
| TileDB-SOMA | 0.1.14 |
| curl | 4.3.3 |
| RcppSpdlog | 0.0.9 |
| jsonlite | 1.8.3 |
| base64enc | 0.1-3 |
| R6 | 2.5.1 |
| httr | 1.4.4 |
| mmap | 0.6-19 |
| remotes | 2.4.2 |
| SeuratObject | 4.1.2 |
| Seurat | 4.2.0 |
| BiocManager | 1.30.19 |
| SingleCellExperiment | 1.20.0 |
The geospatial image is based on the Python images and includes the following packages:
| Package | Version |
|---|---|
| geos | 3.11.0 |
| geotiff | 1.7.1 |
| laszip | 3.4.3 |
| proj | 9.1.0 |
| proj-data | 1.11 |
| gdal | 3.4.1 |
| PDAL | 3.2.0 |
| rasterio | 1.2.10 |
| fiona | 1.8.18 |
| geopandas | 0.8.1 |
| scikit-mobility | 1.1.2 |
| xarray | 0.19.0 |
| tiledb-cf | 0.6.2 |
| tiledb-segy | 0.3.0 |
The genomics image is based on the Python images and includes the following packages:
| Package | Version (Python 3.9) | Version (Python 3.7.10) |
|---|---|---|
| bwa | 0.7.17 | 0.7.17 |
| java-jdk | 8.0.112 | 8.0.112 |
| picard | 2.27.4 | 2.27.4 |
| samtools | 1.16.1 | 1.16.1 |
| sra-tools | 3.0.0 | 3.0.0 |
The imaging image is based on the Python images and includes the following packages:
If you would like additional packages added to the UDF environment, please leave your suggestion on our feedback request board.
Each UDF can run with one of the following resource configurations:
| Type | CPU (max) | RAM (max) |
|---|---|---|
| standard (default) | 2 | 2 GB |
| large | 8 | 8 GB |
In the future, TileDB Cloud will offer more flexibility in choosing the types of resources to run the UDF on.
By default, all UDFs time out after 15 minutes; this value is configurable.
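To make the limits above concrete, here is a small sketch that picks the smallest configuration satisfying an estimated CPU and RAM requirement and pairs it with a timeout that defaults to 15 minutes. The helper names and the dictionary keys `resource_class` and `timeout` are hypothetical; check the client documentation for the parameter names it actually accepts.

```python
# Limits copied from the table above: name -> (max CPUs, max RAM in GB).
# Listed from smallest to largest so the first match is the smallest fit.
RESOURCE_CLASSES = {"standard": (2, 2), "large": (8, 8)}
DEFAULT_TIMEOUT_SECONDS = 15 * 60


def choose_resource_class(cpus_needed, ram_gb_needed):
    """Return the smallest configuration that covers the estimated need."""
    for name, (max_cpus, max_ram_gb) in RESOURCE_CLASSES.items():
        if cpus_needed <= max_cpus and ram_gb_needed <= max_ram_gb:
            return name
    raise ValueError(
        f"no configuration offers {cpus_needed} CPUs and {ram_gb_needed} GB RAM"
    )


def udf_settings(cpus_needed, ram_gb_needed,
                 timeout_seconds=DEFAULT_TIMEOUT_SECONDS):
    """Bundle the chosen configuration and timeout (hypothetical key names)."""
    if timeout_seconds <= 0:
        raise ValueError("timeout_seconds must be positive")
    return {
        "resource_class": choose_resource_class(cpus_needed, ram_gb_needed),
        "timeout": timeout_seconds,
    }


print(udf_settings(1, 1))
print(udf_settings(4, 6, timeout_seconds=30 * 60))
```

Failing early on a requirement that exceeds the largest configuration is a deliberate choice in this sketch: it surfaces the problem before any work is submitted.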