lamindb.core.MuDataCatManager
- class lamindb.core.MuDataCatManager(mdata, var_index, categoricals=None, verbosity='hint', organism=None, sources=None, exclude=None)
Bases: CatManager
Curation flow for a MuData object.
- Parameters:
mdata (MuData | Artifact) – The MuData object to curate.
var_index (dict[str, FieldAttr]) – The registry field for mapping the .var index for each modality. For example: {"modality_1": bt.Gene.ensembl_gene_id, "modality_2": CellMarker.name}
categoricals (dict[str, FieldAttr] | None, default: None) – A dictionary mapping .obs.columns to a registry field. Use modality keys to specify categoricals for MuData slots, such as {"rna:cell_type": bt.CellType.name}.
verbosity (str, default: 'hint') – The verbosity level.
organism (str | None, default: None) – The organism name.
sources (dict[str, Record] | None, default: None) – A dictionary mapping .obs.columns to Source records.
exclude (dict | None, default: None) – A dictionary mapping column names to values to exclude from validation. When specific Source instances are pinned and may lack default values (e.g., “unknown” or “na”), using the exclude parameter ensures they are not validated.
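The effect of the exclude parameter can be pictured as filtering values out before they are checked. The following is a minimal sketch in plain Python under that reading, not lamindb's implementation: the helper name values_to_validate is hypothetical, and accepting either a single string or a list of excluded values is an assumption.

```python
from __future__ import annotations


def values_to_validate(values: list[str], column: str, exclude: dict) -> list[str]:
    """Return the values of `column` that would still be validated.

    Hypothetical helper for illustration only. `exclude` maps column names
    to a value or a list of values that should be skipped.
    """
    excluded = exclude.get(column, [])
    if isinstance(excluded, str):
        # Assumption: a bare string means a single excluded value.
        excluded = [excluded]
    return [value for value in values if value not in excluded]


# Placeholder values such as "unknown" are skipped for the listed column only.
exclude = {"cell_type": ["unknown", "na"]}
remaining = values_to_validate(["T cell", "unknown", "B cell"], "cell_type", exclude)
```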
Examples
>>> import lamindb as ln
>>> import bionty as bt
>>> curator = ln.Curator.from_mudata(
...     mdata,
...     var_index={
...         "rna": bt.Gene.ensembl_gene_id,
...         "adt": bt.CellMarker.name
...     },
...     categoricals={
...         "cell_type_ontology_id": bt.CellType.ontology_id,
...         "donor_id": ln.ULabel.name
...     },
...     organism="human",
... )
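Modality-prefixed categoricals keys such as "rna:cell_type" can be read as a modality name plus a column name. The sketch below illustrates that convention in plain Python; it is not lamindb's implementation. The helper name split_categorical_key is hypothetical, and treating keys without a prefix as belonging to no specific modality is an assumption.

```python
from __future__ import annotations


def split_categorical_key(key: str) -> tuple[str | None, str]:
    """Split "modality:column" into (modality, column).

    Hypothetical helper for illustration only. Keys without a prefix are
    assumed to carry no modality and are returned with modality None.
    """
    if ":" in key:
        modality, column = key.split(":", 1)
        return modality, column
    return None, key


keys = ["rna:cell_type", "donor_id"]
parsed = [split_categorical_key(key) for key in keys]
```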
Attributes
- property categoricals: dict
Return the obs fields to validate against.
- property non_validated: dict[str, dict[str, list[str]]]
Return the non-validated features and labels.
- property var_index: DeferredAttribute
Return the registry field to validate the variables index against.
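The nested type of non_validated, dict[str, dict[str, list[str]]], can be flattened for reporting. The sketch below shows one way to walk such a structure. The function name flatten_non_validated is hypothetical, the sample dictionary is made-up illustrative data, and the reading of the outer keys as modality names and the inner keys as column or index keys is an assumption that the documentation does not spell out.

```python
from __future__ import annotations


def flatten_non_validated(
    non_validated: dict[str, dict[str, list[str]]],
) -> list[tuple[str, str, str]]:
    """Flatten the nested mapping into (outer_key, key, value) rows."""
    rows = []
    for outer_key, by_key in non_validated.items():
        for key, values in by_key.items():
            for value in values:
                rows.append((outer_key, key, value))
    return rows


# Made-up data with the same shape as the property's type annotation.
example = {
    "rna": {"cell_type": ["T-cell", "b cell"]},
    "obs": {"donor_id": ["D-17"]},
}
rows = flatten_non_validated(example)
```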
Class methods
- classmethod from_anndata(data, var_index, categoricals=None, obs_columns=FieldAttr(Feature.name), verbosity='hint', organism=None, sources=None)
- Return type: AnnDataCatManager
- classmethod from_df(df, categoricals=None, columns=FieldAttr(Feature.name), verbosity='hint', organism=None)
- Return type:
- classmethod from_mudata(mdata, var_index, categoricals=None, verbosity='hint', organism=None)
- Return type:
- classmethod from_spatialdata(sdata, var_index, categoricals=None, organism=None, sources=None, exclude=None, verbosity='hint', *, sample_metadata_key='sample')
- classmethod from_tiledbsoma(experiment_uri, var_index, categoricals=None, obs_columns=FieldAttr(Feature.name), organism=None, sources=None, exclude=None)
- Return type:
Methods
- add_new_from(key, modality=None, **kwargs)
Add validated & new categories.
- Parameters:
key (str) – The key referencing the slot in the DataFrame.
modality (str | None, default: None) – The modality name.
organism – The organism name.
**kwargs – Additional keyword arguments to pass to create new records.
- add_new_from_columns(modality, column_names=None, **kwargs)
- add_new_from_var_index(modality, **kwargs)
Update variable records.
- Parameters:
modality (str) – The modality name.
organism – The organism name.
**kwargs – Additional keyword arguments to pass to create new records.
- lookup(public=False)
Look up categories.
- Parameters:
public (bool, default: False) – Perform lookup on public source ontologies.
- Return type:
- save_artifact(*, key=None, description=None, revises=None, run=None)
Save an annotated artifact.
- Parameters:
key (str | None, default: None) – A path-like key to reference the artifact in default storage, e.g., "myfolder/myfile.fcs". Artifacts with the same key form a version family.
description (str | None, default: None) – A description.
revises (Artifact | None, default: None) – Previous version of the artifact. An alternative to passing key to trigger a new version.
run (Run | None, default: None) – The run that creates the artifact.
- Return type:
- Returns:
A saved artifact record.
- standardize(key, modality=None)
Replace synonyms with standardized values.
- Parameters:
key (str) – The key referencing the slot in the MuData.
modality (str | None, default: None) – The modality name.
Modifies the dataset in place.
- validate()
Validate categories.
- Return type: bool
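The methods documented above suggest a curation loop: validate, standardize synonyms, register remaining categories, then validate again. The sketch below assumes that order, which this page does not prescribe. It uses only the validate, standardize, and add_new_from signatures listed here, and it accepts any object exposing them. The function name curate_categoricals and the keys argument are hypothetical.

```python
from __future__ import annotations


def curate_categoricals(curator, keys: list[tuple[str, str | None]]) -> bool:
    """Run a validate / standardize / add-new pass over (key, modality) pairs.

    `curator` is assumed to expose validate(), standardize(key, modality=...)
    and add_new_from(key, modality=...) as documented for MuDataCatManager.
    Returns the result of the last validate() call.
    """
    if curator.validate():
        return True

    # First try to map synonyms onto standardized values (in-place change).
    for key, modality in keys:
        curator.standardize(key, modality=modality)
    if curator.validate():
        return True

    # Assumption: anything still unvalidated should be registered as new.
    # In practice, review curator.non_validated before doing this.
    for key, modality in keys:
        curator.add_new_from(key, modality=modality)
    return curator.validate()
```

Once this returns True, curator.save_artifact(key=...) would be the natural next step; whether new categories should be registered automatically is a judgment call for each dataset.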