lamindb.curators.SpatialDataCatManager

class lamindb.curators.SpatialDataCatManager(sdata, var_index, categoricals=None, verbosity='hint', organism=None, sources=None, exclude=None, *, sample_metadata_key='sample')

Bases: CatManager

Curation flow for a SpatialData object.

See also Curator.

Note that if genes or other measurements are removed from the SpatialData object, the object should be recreated.

In the following docstring, an accessor refers to either a .table key or the sample_metadata_key.

Parameters:
  • sdata (Any) – The SpatialData object to curate.

  • var_index (dict[str, DeferredAttribute]) – A dictionary mapping table keys to the .var indices.

  • categoricals (dict[str, dict[str, DeferredAttribute]] | None, default: None) – A nested dictionary mapping an accessor to dictionaries that map columns to a registry field.

  • organism (str | None, default: None) – The organism name.

  • sources (dict[str, dict[str, Record]] | None, default: None) – A dictionary mapping an accessor to dictionaries that map columns to Source records.

  • exclude (dict[str, dict] | None, default: None) – A dictionary mapping an accessor to dictionaries that map column names to the values to exclude from validation. When specific Source instances are pinned, they may lack default values (e.g., “unknown” or “na”); listing such values in exclude ensures they are not validated.

  • verbosity (str, default: 'hint') – The verbosity level of the logger.

  • sample_metadata_key (str | None, default: 'sample') – The key in .attrs that stores the sample-level metadata.
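The nested dictionaries passed as categoricals, sources, and exclude all share the same outer shape: accessor first, then column. The sketch below illustrates that shape and a small sanity check for accessor keys. The helper `unknown_accessors` is hypothetical and not part of lamindb, the registry fields are replaced by plain strings so that the snippet runs without a database, and the list-of-values shape shown for exclude is an assumption.

```python
def unknown_accessors(mapping, table_keys, sample_metadata_key="sample"):
    """Return accessor keys that are neither a table key nor the sample key.

    Hypothetical helper for illustration only; not part of lamindb.
    """
    allowed = set(table_keys)
    if sample_metadata_key is not None:
        allowed.add(sample_metadata_key)
    return sorted(set(mapping or {}) - allowed)


# Table keys of an imagined SpatialData object (assumption for this example).
table_keys = ["table1"]

# Strings stand in for registry fields such as bt.CellType.ontology_id.
categoricals = {
    "table1": {"cell_type_ontology_id": "bt.CellType.ontology_id"},
    "sample": {"experimental_factor": "bt.ExperimentalFactor.name"},
}

# Assumed shape: accessor -> column -> values to skip during validation.
exclude = {"sample": {"experimental_factor": ["unknown", "na"]}}

print(unknown_accessors(categoricals, table_keys))       # no unknown accessors
print(unknown_accessors(exclude, table_keys))            # no unknown accessors
print(unknown_accessors({"table_1": {}}, table_keys))    # flags the misspelled key
```

A check like this can catch a misspelled table key before the dictionaries are handed to the curator.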

Examples

>>> import lamindb as ln
>>> import bionty as bt
>>> from lamindb.curators import SpatialDataCatManager
>>> # sdata is an existing SpatialData object with a table named "table1"
>>> curator = SpatialDataCatManager(
...     sdata,
...     var_index={
...         "table1": bt.Gene.ensembl_gene_id,
...     },
...     categoricals={
...         "table1":
...             {"cell_type_ontology_id": bt.CellType.ontology_id, "donor_id": ln.ULabel.name},
...         "sample":
...             {"experimental_factor": bt.ExperimentalFactor.name},
...     },
...     organism="human",
... )

Attributes

property categoricals: dict[str, dict[str, DeferredAttribute]]

Return the categorical keys and fields to validate against.

property non_validated: dict[str, dict[str, list[str]]]

Return the non-validated features and labels.
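Given the declared type dict[str, dict[str, list[str]]], the non-validated values are nested by accessor and then by key. A minimal sketch of flattening such a mapping into rows for reporting follows. The helper `flatten_non_validated` and the example values are made up for illustration and do not come from lamindb.

```python
def flatten_non_validated(non_validated):
    """Yield (accessor, key, value) triples from a nested mapping.

    Hypothetical helper; it only relies on the documented type
    dict[str, dict[str, list[str]]].
    """
    for accessor, keys in non_validated.items():
        for key, values in keys.items():
            for value in values:
                yield accessor, key, value


# Made-up example in the documented shape.
example = {
    "table1": {"cell_type_ontology_id": ["CL:9999999"]},
    "sample": {"experimental_factor": ["scRNAseq", "visium hd"]},
}

rows = list(flatten_non_validated(example))
for accessor, key, value in rows:
    print(f"{accessor}\t{key}\t{value}")
```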

property var_index: DeferredAttribute

Return the registry fields to validate variable indices against.

Class methods

classmethod from_anndata(data, var_index, categoricals=None, obs_columns=FieldAttr(Feature.name), verbosity='hint', organism=None, sources=None)
Return type:

AnnDataCatManager

classmethod from_df(df, categoricals=None, columns=FieldAttr(Feature.name), verbosity='hint', organism=None)
Return type:

DataFrameCatManager

classmethod from_mudata(mdata, var_index, categoricals=None, verbosity='hint', organism=None)
Return type:

MuDataCatManager

classmethod from_spatialdata(sdata, var_index, categoricals=None, organism=None, sources=None, exclude=None, verbosity='hint', *, sample_metadata_key='sample')

classmethod from_tiledbsoma(experiment_uri, var_index, categoricals=None, obs_columns=FieldAttr(Feature.name), organism=None, sources=None, exclude=None)
Return type:

TiledbsomaCatManager

Methods

add_new_from(key, accessor=None, **kwargs)

Save new values of a categorical from the sample-level metadata or a table.

Parameters:
  • key (str) – The key referencing the slot in the DataFrame.

  • accessor (str | None, default: None) – The accessor key such as ‘sample’ or ‘table x’.

  • organism – The organism name.

  • **kwargs – Additional keyword arguments to pass to create new records.

Return type:

None

add_new_from_var_index(table, **kwargs)

Save new values from the .var.index of a table.

Parameters:
  • table (str) – The table key.

  • organism – The organism name.

  • **kwargs – Additional keyword arguments to pass to create new records.

Return type:

None

lookup(public=False)

Look up categories.

Parameters:
  • public (bool, default: False) – Whether the lookup is performed on the public reference.

Return type:

CurateLookup

save_artifact(*, key=None, description=None, revises=None, run=None)

Save the validated SpatialData store and metadata.

Parameters:
  • description (str | None, default: None) – A description of the dataset.

  • key (str | None, default: None) – A path-like key to reference the artifact in default storage, e.g., "myartifact.zarr". Artifacts with the same key form a version family.

  • revises (Artifact | None, default: None) – Previous version of the artifact. Triggers a revision.

  • run (Run | None, default: None) – The run that creates the artifact.

Return type:

Artifact

Returns:

A saved artifact record.

standardize(key, accessor=None)

Replace synonyms with canonical values.

Modifies the dataset in place.

Parameters:
  • key (str) – The key referencing the slot in the table or sample metadata.

  • accessor (str | None, default: None) – The accessor key such as ‘sample_key’ or ‘table_key’.

Return type:

None
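To make "replace synonyms with canonical values" and "in place" concrete, here is a conceptual sketch on a plain Python list. It is not lamindb's implementation: the function `standardize_in_place` and the synonym table are hypothetical, whereas the real method resolves synonyms against the registry field configured for the key.

```python
def standardize_in_place(values, synonyms):
    """Replace each known synonym with its canonical value, mutating `values`.

    Conceptual stand-in only; values without a known synonym are left as they are.
    """
    for i, value in enumerate(values):
        values[i] = synonyms.get(value, value)


# Hypothetical synonym table: synonym -> canonical name.
synonyms = {"T-cell": "T cell", "T lymphocyte": "T cell"}

column = ["T cell", "T-cell", "B cell", "T lymphocyte"]
standardize_in_place(column, synonyms)
print(column)  # the same list object, now holding canonical names
```

Because the data is mutated rather than copied, any other reference to the same object sees the standardized values as well.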

validate()

Validate variables and categorical observations.

This method also registers the validated records from public sources in the current instance.

Parameters:
  • organism – The organism name.

Return type:

bool

Returns:

Whether the SpatialData object is validated.
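The methods above can be combined into a curation loop. The sketch below shows one plausible order (validate, standardize, add new values, validate again, save) and uses only the method signatures documented on this page. The function `curate_and_save` is hypothetical, and it assumes that the keys nested in non_validated are accepted by standardize and add_new_from; entries that refer to the variable index may need add_new_from_var_index instead.

```python
def curate_and_save(curator, artifact_key, description=None):
    """Run a validate/standardize/add-new loop, then save the artifact.

    Hypothetical helper. `curator` is any object exposing validate(),
    non_validated, standardize(), add_new_from() and save_artifact()
    with the signatures documented above.
    """
    if not curator.validate():
        # Snapshot accessor -> keys so later changes do not affect iteration.
        pending = {acc: list(keys) for acc, keys in curator.non_validated.items()}
        for accessor, keys in pending.items():
            for key in keys:
                # First try to map synonyms onto canonical values.
                curator.standardize(key, accessor=accessor)

        if not curator.validate():
            remaining = {acc: list(keys) for acc, keys in curator.non_validated.items()}
            for accessor, keys in remaining.items():
                for key in keys:
                    # Register whatever is still unknown as new records.
                    curator.add_new_from(key, accessor=accessor)

        if not curator.validate():
            raise ValueError("SpatialData object is still not validated")

    return curator.save_artifact(key=artifact_key, description=description)
```

Registering new values indiscriminately can hide typos, so in practice it is worth inspecting non_validated before calling add_new_from.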