lamindb.curators.TiledbsomaCatManager

class lamindb.curators.TiledbsomaCatManager(experiment_uri, var_index, categoricals=None, obs_columns=FieldAttr(Feature.name), organism=None, sources=None)

Bases: CatManager

Curation flow for tiledbsoma.Experiment.

Parameters:
  • experiment_uri (UPathStr | Artifact) – A local or cloud path to a tiledbsoma.Experiment.

  • var_index (dict[str, tuple[str, FieldAttr]]) – The registry fields against which the .var indices of each measurement are mapped, in the form {"measurement name": ("var column", field)}. In .standardize and .add_new_from, refer to these entries by their flattened keys ('{measurement name}__{column name in .var}'); see the output of .var_index.

  • categoricals (dict[str, FieldAttr] | None, default: None) – A dictionary mapping categorical .obs columns to a registry field.

  • obs_columns (FieldAttr, default: FieldAttr(Feature.name)) – The registry field for mapping the names of the .obs columns.

  • organism (str | None, default: None) – The organism name.

  • sources (dict[str, Record] | None, default: None) – A dictionary mapping .obs columns to Source records.
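The flattened key form described for var_index can be illustrated with a small sketch. The helper name flatten_var_index is hypothetical and not part of lamindb, and a plain string stands in for a registry field such as bt.Gene.symbol so that the snippet runs without a database.

```python
def flatten_var_index(var_index):
    """Turn {"measurement": ("var column", field)} into {"measurement__var column": field}.

    Hypothetical helper for illustration only; lamindb exposes the
    flattened mapping itself through the .var_index property.
    """
    return {
        f"{measurement}__{column}": field
        for measurement, (column, field) in var_index.items()
    }


# A string stands in for a registry field such as bt.Gene.symbol.
var_index = {"RNA": ("var_id", "bt.Gene.symbol")}
flat = flatten_var_index(var_index)
print(flat)  # {'RNA__var_id': 'bt.Gene.symbol'}
```

With the constructor arguments from the example on this page, the key to pass to .standardize or .add_new_from for the RNA measurement would therefore be "RNA__var_id".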

Example:

import lamindb as ln
import bionty as bt

curator = ln.curators.TiledbsomaCatManager(
    "./my_array_store.tiledbsoma",
    var_index={"RNA": ("var_id", bt.Gene.symbol)},
    categoricals={
        "cell_type_ontology_id": bt.CellType.ontology_id,
        "donor_id": ln.ULabel.name
    },
    organism="human",
)

Attributes

property categoricals: dict[str, DeferredAttribute]

Return the obs fields to validate against.

property non_validated: dict[str, list]

Return the non-validated features and labels.

property var_index: dict[str, DeferredAttribute]

Return the registry fields, with flattened keys, against which variable indices are validated.

Class methods

classmethod from_anndata(data, var_index, categoricals=None, obs_columns=FieldAttr(Feature.name), verbosity='hint', organism=None, sources=None)
Return type:

AnnDataCatManager

classmethod from_df(df, categoricals=None, columns=FieldAttr(Feature.name), verbosity='hint', organism=None)
Return type:

DataFrameCatManager

classmethod from_mudata(mdata, var_index, categoricals=None, verbosity='hint', organism=None)
Return type:

MuDataCatManager

classmethod from_spatialdata(sdata, var_index, categoricals=None, organism=None, sources=None, verbosity='hint', *, sample_metadata_key='sample')

classmethod from_tiledbsoma(experiment_uri, var_index, categoricals=None, obs_columns=FieldAttr(Feature.name), organism=None, sources=None)
Return type:

TiledbsomaCatManager

Methods

add_new_from(key, **kwargs)

Add validated and new categories.

Parameters:

key (str) – The key referencing the slot in the tiledbsoma store. It should be '{measurement name}__{column name in .var}' for columns in .var or a column name in .obs.

Return type:

None

lookup(public=False)

Lookup categories.

Parameters:

public (bool, default: False) – If True, the lookup is performed on the public reference.

Return type:

CurateLookup

save_artifact(*, key=None, description=None, revises=None, run=None)

Save the validated tiledbsoma store and metadata.

Parameters:
  • description (str | None, default: None) – A description of the tiledbsoma store.

  • key (str | None, default: None) – A path-like key to reference the artifact in default storage, e.g., "myfolder/mystore.tiledbsoma". Artifacts with the same key form a version family.

  • revises (Artifact | None, default: None) – Previous version of the artifact. Triggers a revision.

  • run (Run | None, default: None) – The run that creates the artifact.

Return type:

Artifact

Returns:

A saved artifact record.
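A minimal sketch of guarding a save behind validation, using only the documented validate(), non_validated and save_artifact() members. The function name save_if_clean is hypothetical. Whether save_artifact() itself rejects unvalidated data is not stated here, so the sketch checks explicitly, and it assumes that an empty non_validated dictionary means nothing is left to fix.

```python
def save_if_clean(curator, key, description=None):
    """Validate, then save only if no non-validated values remain.

    `curator` is any object exposing validate(), a `non_validated` dict,
    and save_artifact(*, key, description, ...), as documented for
    TiledbsomaCatManager. This wrapper is hypothetical.
    """
    curator.validate()
    remaining = curator.non_validated
    if remaining:
        # Assumption: an empty dict means every value passed validation.
        raise ValueError(f"Non-validated values remain for: {sorted(remaining)}")
    # save_artifact takes keyword-only arguments.
    return curator.save_artifact(key=key, description=description)
```

A call might look like save_if_clean(curator, "myfolder/mystore.tiledbsoma", description="curated store"), reusing the key format from the parameter description.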

standardize(key)

Replace synonyms with standardized values.

Modifies the dataset in place.

Parameters:

key (str) – The key referencing the slot in the tiledbsoma store. It should be '{measurement name}__{column name in .var}' for columns in .var or a column name in .obs.

validate()

Validate categories.
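The methods on this page can be combined into a curation loop: validate, standardize synonyms, register what is still unknown, then validate again. The following is a sketch under assumptions, not the library's prescribed workflow. The function name curate is hypothetical, and it assumes that every key of non_validated is accepted by standardize() and add_new_from().

```python
def curate(curator):
    """Validate, standardize synonyms, then register remaining values.

    `curator` is any object exposing validate(), standardize(key),
    add_new_from(key) and a `non_validated` dict. Returns whatever is
    still non-validated at the end (ideally an empty dict).
    """
    curator.validate()

    # First pass: replace synonyms with standardized values.
    # Copy the keys, since non_validated may change while values are fixed.
    for key in list(curator.non_validated):
        curator.standardize(key)

    # Re-validate so that non_validated reflects the standardized data.
    curator.validate()

    # Second pass: register values that are genuinely new.
    for key in list(curator.non_validated):
        curator.add_new_from(key)

    curator.validate()
    return curator.non_validated
```

In practice it is worth inspecting non_validated by hand before the second pass, because add_new_from registers whatever is left, including typos.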