Serverless UDFs

Basic Usage

The example below runs a UDF in TileDB Cloud that computes the median of a randomly sized list of random numbers. The Python version uses numpy for the median.
Python

    import tiledb, tiledb.cloud, numpy, random

    def mymedian():
        # Build a list of between 1 and 48 random integers
        vals = []
        for i in range(1, random.randrange(2,50)):
            vals.append(random.randrange(0, i))
        return numpy.median(vals)

    tiledb.cloud.login(username="my_username", password="my_password")
    # or tiledb.cloud.login(token="my_token")

    # Run the function in TileDB Cloud and print the returned value
    res = tiledb.cloud.udf.exec(mymedian)
    print(res)

R

    library(tiledbcloud)

    mymedian <- function() {
      # Fill a vector of random length with random integers
      n <- sample(2:50, 1)
      vals <- vector(mode="numeric", length=n)
      for (i in 1:n) {
        vals[i] <- sample(0:i, 1)
      }
      median(vals)
    }

    tiledbcloud::login(username="my_username", password="my_password")
    # or tiledbcloud::login(api_key="my_token")

    tiledbcloud::execute_generic_udf(mymedian)
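
Because a UDF is an ordinary Python function, it can be called locally before it is submitted, which is a cheap way to catch errors. The sketch below is a variant of `mymedian` written under two assumptions made purely for illustration: it takes a hypothetical `seed` parameter so that a run can be repeated, and it uses the standard library's `statistics.median` in place of numpy so that it has no dependencies.

```python
import random
import statistics


def mymedian(seed=None):
    # `seed` is a hypothetical parameter added for repeatable local runs.
    rng = random.Random(seed)
    vals = []
    # Same shape as the example above: between 1 and 48 random draws.
    for i in range(1, rng.randrange(2, 50)):
        vals.append(rng.randrange(0, i))
    return statistics.median(vals)


# Call the function locally before submitting it.
local_result = mymedian(seed=0)
print(local_result)
```

Once the local call behaves as expected, the same function object can be passed to `tiledb.cloud.udf.exec` as shown above.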

Passing Arguments

A UDF can take any number of positional and keyword arguments. Pass them to the exec call after the function itself.
Python

    import tiledb, tiledb.cloud, numpy, random

    def multi_args(arg1, arg2, arg3=None, arg4={}):
        # These will print in the logs of the udf
        print("type(arg1)={}, arg1={}\n".format(type(arg1), arg1))
        print("type(arg2)={}, arg2={}\n".format(type(arg2), arg2))
        print("type(arg3)={}, arg3={}\n".format(type(arg3), arg3))
        print("type(arg4)={}, arg4={}\n".format(type(arg4), arg4))
        return

    tiledb.cloud.login(username="my_username", password="my_password")
    # or tiledb.cloud.login(token="my_token")

    res = tiledb.cloud.udf.exec(multi_args,
                                [1,2,3],
                                {"dictionary": "arg2_test"},
                                False,
                                arg4=True)
    print(res) # None since the function returned nothing

    # View the logs
    print(tiledb.cloud.last_udf_task().logs)
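
A mismatch between the argument list and the function signature may only surface once the task runs remotely, so it can help to check the binding locally first. The following is a minimal sketch using the standard library's `inspect.signature`. `check_args` is a hypothetical helper written for this illustration and is not part of the TileDB Cloud client; `multi_args` is reduced to its signature.

```python
import inspect


def multi_args(arg1, arg2, arg3=None, arg4=None):
    # Body omitted; only the signature matters for this check.
    return None


def check_args(func, *args, **kwargs):
    """Bind the arguments to func's signature without calling func.

    Raises TypeError if the arguments do not match the signature.
    """
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)


# The same argument list as the remote call above.
print(check_args(multi_args, [1, 2, 3], {"dictionary": "arg2_test"}, False, arg4=True))

# A missing required argument is caught locally.
try:
    check_args(multi_args, [1, 2, 3])
except TypeError as err:
    print("bad arguments:", err)
```

The helper never runs the UDF body, so it is safe to use even when the function has side effects.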

Asynchronous Execution

UDFs can also be executed asynchronously. The call returns a future, and your code blocks only when it asks for the result.
Python

    import tiledb, tiledb.cloud, numpy, random

    def mymedian():
        vals = []
        for i in range(1, random.randrange(2,50)):
            vals.append(random.randrange(0, i))
        return numpy.median(vals)

    tiledb.cloud.login(username="my_username", password="my_password")
    # or tiledb.cloud.login(token="my_token")

    # res will be a future
    res = tiledb.cloud.udf.exec_async(mymedian)

    # call res.get() to block on the results
    print(res.get())

R

    library(tiledbcloud)

    mymedian <- function() {
      n <- sample(2:50, 1)
      vals <- vector(mode="numeric", length=n)
      for (i in 1:n) {
        vals[i] <- sample(0:i, 1)
      }
      median(vals)
    }

    tiledbcloud::login(username="my_username", password="my_password")
    # or tiledbcloud::login(api_key="my_token")

    # res will be a future
    res = tiledbcloud::delayed_generic_udf(mymedian, args=list())

    # call compute(res) to block on the results
    print(compute(res))
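
One reason to use the asynchronous form is that several tasks can be in flight before any result is collected. The sketch below shows that submit-then-collect pattern locally. It is an analogy only: `LocalResult` is a hypothetical stand-in built on Python's `concurrent.futures` that mimics the `get()` method used above, and nothing here calls TileDB Cloud. It also uses `statistics.median` in place of numpy to stay dependency-free.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


class LocalResult:
    """Hypothetical stand-in for the object returned by exec_async."""

    def __init__(self, future):
        self._future = future

    def get(self):
        # Block until the task finishes, then return its value.
        return self._future.result()


def mymedian():
    vals = [random.randrange(0, i) for i in range(1, random.randrange(2, 50))]
    return statistics.median(vals)


with ThreadPoolExecutor(max_workers=4) as pool:
    # Submit all tasks first; none of these calls waits for a result.
    pending = [LocalResult(pool.submit(mymedian)) for _ in range(8)]
    # Collect afterwards; each get() blocks only until its own task is done.
    medians = [res.get() for res in pending]

print(medians)
```

With the real client, the same shape would apply: issue every `exec_async` call first, keep the returned objects in a list, and call `get()` on each one afterwards.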

Selecting Who to Charge

If you are a member of an organization, the organization is charged for your UDF by default. To charge the UDF task to yourself instead, pass one extra argument, namespace, set to your username.
Python

    import tiledb, tiledb.cloud, numpy, random

    def mymedian():
        vals = []
        for i in range(1, random.randrange(2,50)):
            vals.append(random.randrange(0, i))
        return numpy.median(vals)

    tiledb.cloud.login(username="my_username", password="my_password")
    # or tiledb.cloud.login(token="my_token")

    # namespace selects the account that is charged for this task
    res = tiledb.cloud.udf.exec(mymedian, namespace="my_username")
    print(res)

R

    library(tiledbcloud)

    mymedian <- function() {
      n <- sample(2:50, 1)
      vals <- vector(mode="numeric", length=n)
      for (i in 1:n) {
        vals[i] <- sample(0:i, 1)
      }
      median(vals)
    }

    tiledbcloud::login(username="my_username", password="my_password")
    # or tiledbcloud::login(api_key="my_token")

    # namespace selects the account that is charged for this task
    res <- tiledbcloud::execute_generic_udf(mymedian, namespace="my_username")
    print(res)

Registering a UDF

You can register a UDF (similar to arrays) as follows:
Python

    import tiledb, tiledb.cloud, numpy, random

    def mymedian():
        vals = []
        for i in range(1, random.randrange(2,50)):
            vals.append(random.randrange(0, i))
        return numpy.median(vals)

    tiledb.cloud.login(username="my_username", password="my_password")
    # or tiledb.cloud.login(token="my_token")

    # Register the function defined above under the name "my_median"
    tiledb.cloud.udf.register_generic_udf(mymedian, name="my_median", namespace="my_username")

R

    library(tiledbcloud)

    mymedian <- function() {
      n <- sample(2:50, 1)
      vals <- vector(mode="numeric", length=n)
      for (i in 1:n) {
        vals[i] <- sample(0:i, 1)
      }
      median(vals)
    }

    tiledbcloud::login(username="my_username", password="my_password")
    # or tiledbcloud::login(api_key="my_token")

    tiledbcloud::register_udf(namespace="my_namespace", type='generic', func=mymedian)

Before you can register a UDF, you need to set up the default storage path for yourself and/or your organization.

Retry Settings