lamindb.core.MuDataCurator
- class lamindb.core.MuDataCurator(mdata, var_index, categoricals=None, using_key='default', verbosity='hint', organism=None, sources=None, exclude=None)
Bases: object
Curation flow for a MuData object.
See also: Curator.
Note that if genes or other measurements are removed from the MuData object, the object should be recreated using from_mudata().
- Parameters:
mdata (MuData) – The MuData object to curate.
var_index (dict[str, dict[str, DeferredAttribute]]) – The registry field for mapping the .var index for each modality. For example: {"modality_1": bt.Gene.ensembl_gene_id, "modality_2": ln.CellMarker.name}
categoricals (dict[str, DeferredAttribute] | None, default: None) – A dictionary mapping .obs.columns to a registry field. Use modality keys to specify categoricals for MuData slots, such as "rna:cell_type": bt.CellType.name.
using_key (str, default: 'default') – A reference LaminDB instance.
verbosity (str, default: 'hint') – The verbosity level.
organism (str | None, default: None) – The organism name.
sources (dict[str, Record] | None, default: None) – A dictionary mapping .obs.columns to Source records.
exclude (dict | None, default: None) – A dictionary mapping column names to values to exclude.
Examples
>>> import lamindb as ln
>>> import bionty as bt
>>> curate = ln.Curator.from_mudata(
...     mdata,
...     var_index={
...         "rna": bt.Gene.ensembl_gene_id,
...         "adt": ln.CellMarker.name
...     },
...     categoricals={
...         "cell_type_ontology_id": bt.CellType.ontology_id,
...         "donor_id": ln.ULabel.name
...     },
...     organism="human",
... )
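The categoricals parameter accepts keys such as "rna:cell_type" to address a column in one modality's .obs. As a minimal sketch of how such keys could be read, the helper below splits a key into its modality and column parts. The name split_categorical_key is hypothetical and not part of lamindb, and the treatment of keys without a colon (taken here to refer to the top-level .obs) is an assumption, not documented behavior.

```python
from typing import Optional, Tuple


def split_categorical_key(key: str) -> Tuple[Optional[str], str]:
    """Split a "modality:column" key into (modality, column).

    Hypothetical helper for illustration only; not part of lamindb.
    Assumption: a key without ":" refers to the top-level .obs,
    so the modality is returned as None.
    """
    if ":" in key:
        modality, column = key.split(":", 1)
        return modality, column
    return None, key


# Keys mirror the ones used on this page; registry fields are omitted on purpose.
keys = ["rna:cell_type", "cell_type_ontology_id", "donor_id"]
parsed = {key: split_categorical_key(key) for key in keys}
for key, (modality, column) in parsed.items():
    print(f"{key!r} -> modality={modality!r}, column={column!r}")
```

Splitting on the first colon only keeps any further colons inside the column name, which seems the safer choice if column names are not otherwise restricted.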
Attributes
- property categoricals: dict
Return the obs fields to validate against.
- property var_index: DeferredAttribute
Return the registry field to validate the variable index against.
Methods
- add_new_from(key, modality=None, organism=None, **kwargs)
Add validated & new categories.
- Parameters:
key (str) – The key referencing the slot in the DataFrame.
modality (str | None, default: None) – The modality name.
organism (str | None, default: None) – The organism name.
**kwargs – Additional keyword arguments to pass to the registry model.
- add_new_from_columns(modality, column_names=None, organism=None, **kwargs)
Update column records.
- Parameters:
modality (str) – The modality name.
column_names (list[str] | None, default: None) – The column names to save.
organism (str | None, default: None) – The organism name.
**kwargs – Additional keyword arguments to pass to the registry model.
- add_new_from_var_index(modality, organism=None, **kwargs)
Update variable records.
- Parameters:
modality (str) – The modality name.
organism (str | None, default: None) – The organism name.
**kwargs – Additional keyword arguments to pass to the registry model.
- add_validated_from(key, modality=None, organism=None)
Add validated categories.
- Parameters:
key (str) – The key referencing the slot in the DataFrame.
modality (str | None, default: None) – The modality name.
organism (str | None, default: None) – The organism name.
- add_validated_from_var_index(modality, organism=None)
Add validated variable records.
- Parameters:
modality (str) – The modality name.
organism (str | None, default: None) – The organism name.
- lookup(using_key=None)
Look up categories.
- Parameters:
using_key (str | None, default: None) – The instance where the lookup is performed. If None (default), the lookup is performed on the instance specified in the "using_key" parameter of the validator. If "public", the lookup is performed on the public reference.
- Return type:
- save_artifact(description=None, **kwargs)
Save the validated MuData and metadata.
- Parameters:
description (str | None, default: None) – Description of the MuData object.
**kwargs – Object level metadata.
- Return type:
- Returns: A saved artifact record.
- validate(organism=None)
Validate categories.
- Return type: bool