Representation of the grouped TileDB arrays that constitute a TileDB-VCF dataset, which includes a sparse 3D array containing the actual variant data and a sparse 1D array containing various sample metadata and the VCF header lines. Read more about the data model here.
attrs: (list of str attrs) List of attributes to extract. Can include attributes from the VCF INFO and FORMAT fields (prefixed with info_ and fmt_, respectively) as well as any of the builtin attributes:
samples: (list of str samples) CSV list of sample names to be read
regions: (list of str regions) CSV list of genomic regions to be read
samples_file: (str filesystem location) URI of file containing sample names to be read, one per line
bed_file: (str filesystem location) URI of a BED file of genomic regions to be read
skip_check_samples: (bool) Whether to skip checking that the requested samples exist in the array
disable_progress_estimation: (bool) Whether to skip progress estimation in verbose mode. Estimating progress can affect performance or memory use in some cases.
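The info_ and fmt_ prefixes described for attrs can be illustrated with a small helper. This is only a sketch: qualify_attrs is a hypothetical name that is not part of TileDB-VCF, and the field names DP and GT are ordinary VCF keys chosen for illustration.

```python
def qualify_attrs(builtin=(), info=(), fmt=()):
    """Build an attrs list, prefixing INFO and FORMAT field names.

    Hypothetical helper: builtin names are passed through unchanged,
    INFO fields get the "info_" prefix and FORMAT fields get "fmt_".
    """
    attrs = list(builtin)
    attrs += [f"info_{name}" for name in info]
    attrs += [f"fmt_{name}" for name in fmt]
    return attrs


# Illustrative field names only; use the ones present in your own VCF files.
attrs = qualify_attrs(info=["DP"], fmt=["GT"])
print(attrs)  # ['info_DP', 'fmt_GT']
```

The resulting list is the kind of value that would be passed as attrs.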
For large datasets, a single call to read() may not be able to fit all results in memory. In that case, the returned dataframe contains as many results as possible; call continue_read() to retrieve the rest.
You can also use the Python generator version, read_iter().
Returns: Pandas DataFrame containing results.
A read is considered complete if the resulting dataframe contained all results.
Returns: (bool) True if the previous read operation was complete
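The incomplete-read pattern described above can be sketched as a loop. This is a minimal sketch under stated assumptions: the completeness check is assumed to be exposed as a method named read_completed(), read_all_batches is a hypothetical helper, and FakeDataset is a hypothetical in-memory stand-in (returning lists rather than dataframes) so the loop can run without a real dataset.

```python
def read_all_batches(ds, **read_kwargs):
    """Collect every batch of a possibly incomplete read.

    `ds` is any object exposing read(), continue_read() and
    read_completed(); the last method name is an assumption.
    """
    batches = [ds.read(**read_kwargs)]
    while not ds.read_completed():
        batches.append(ds.continue_read())
    return batches


class FakeDataset:
    """Hypothetical stand-in that returns results in fixed-size batches."""

    def __init__(self, rows, batch_size):
        self._rows = rows
        self._batch_size = batch_size
        self._pos = 0

    def _next_batch(self):
        batch = self._rows[self._pos:self._pos + self._batch_size]
        self._pos += len(batch)
        return batch

    def read(self, **kwargs):
        self._pos = 0  # a new read starts from the beginning
        return self._next_batch()

    def continue_read(self):
        return self._next_batch()

    def read_completed(self):
        return self._pos >= len(self._rows)


batches = read_all_batches(FakeDataset(list(range(5)), batch_size=2))
print(batches)  # [[0, 1], [2, 3], [4]]
```

With a real dataset each batch would be a Pandas DataFrame, so the batches could be joined with pandas.concat, or processed one at a time to keep memory use bounded.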
Counts data in a TileDB-VCF dataset.
samples: (list of str samples) CSV list of sample names to include in the count
regions: (list of str regions) CSV list of genomic regions to include in the count
Returns: Number of intersecting records in the dataset
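As a rough illustration of how the samples and regions parameters combine, the sketch below issues one count per region. It assumes the counting method is named count() and accepts samples and regions as keyword arguments; count_by_region and StubDataset are hypothetical, and the "contig:start-end" region strings are an assumed format used only for illustration.

```python
def count_by_region(ds, regions, samples=None):
    """Return {region: count} by issuing one count() call per region."""
    return {
        region: ds.count(samples=samples, regions=[region])
        for region in regions
    }


class StubDataset:
    """Hypothetical stand-in with fixed per-region record counts."""

    def __init__(self, counts):
        self._counts = counts

    def count(self, samples=None, regions=None):
        # Sum the stored counts of the requested regions; samples are ignored here.
        return sum(self._counts.get(region, 0) for region in regions or [])


stub = StubDataset({"chr1:1-100": 7, "chr2:50-80": 3})
print(count_by_region(stub, ["chr1:1-100", "chr2:50-80"]))
# {'chr1:1-100': 7, 'chr2:50-80': 3}
```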
List queryable attributes available in the VCF dataset
attr_type: (list of str attributes) The subset of attributes to retrieve: "info" or "fmt" retrieves only the attributes ingested from the VCF INFO or FORMAT fields, respectively; "builtin" retrieves the static attributes defined in TileDB-VCF's schema; "all" (the default) returns all queryable attributes
Returns: a list of strings representing the attribute names
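The relationship between the returned attribute names and the attr_type categories can be sketched by grouping names on their prefix. This is an illustrative sketch only: group_attributes is a hypothetical helper, it assumes that any name without an info_ or fmt_ prefix is a builtin attribute, and the example names are placeholders.

```python
def group_attributes(names):
    """Group attribute names into info, fmt and builtin by prefix."""
    groups = {"info": [], "fmt": [], "builtin": []}
    for name in names:
        if name.startswith("info_"):
            groups["info"].append(name)
        elif name.startswith("fmt_"):
            groups["fmt"].append(name)
        else:
            # Assumption: unprefixed names are builtin attributes.
            groups["builtin"].append(name)
    return groups


# Placeholder names for illustration.
print(group_attributes(["info_DP", "fmt_GT", "some_builtin"]))
# {'info': ['info_DP'], 'fmt': ['fmt_GT'], 'builtin': ['some_builtin']}
```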