lamindb.core.CatManager
- class lamindb.core.CatManager(*, dataset, categoricals, sources, organism, exclude, columns_field=None)
Bases:
object
Manage valid categoricals by updating registries.
A CatManager object makes it easy to validate, standardize & annotate datasets.
Example:
>>> cat_manager = ln.CatManager(
...     dataset,
...     # define validation criteria as mappings
...     columns=Feature.name,  # map column names
...     categoricals={"perturbation": ULabel.name},  # map categories
... )
>>> cat_manager.validate()  # validate the dataframe
>>> artifact = cat_manager.save_artifact(description="my RNA-seq")
>>> artifact.describe()  # see annotations
cat_manager.validate() maps values within df according to the mapping criteria and logs validated & problematic values.
If you find non-validated values, you have several options:
- new values found in the data can be registered using add_new_from()
- non-validated values can be accessed using the non_validated property and addressed manually
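The idea behind validation can be sketched in plain Python. This is a conceptual illustration only, not lamindb's implementation: the function find_non_validated, the toy column data, and the allowed-value sets are all hypothetical. It collects, per categorical column, the values that are missing from the allowed set, in the same dict[str, list[str]] shape that non_validated is documented to return.

```python
def find_non_validated(columns, allowed):
    """Return, per categorical column, the values not found in its allowed set.

    Hypothetical helper for illustration; not part of lamindb.
    """
    non_validated = {}
    for name, valid_values in allowed.items():
        unknown = []
        for value in columns.get(name, []):
            # Record each unknown value once, in order of first appearance.
            if value not in valid_values and value not in unknown:
                unknown.append(value)
        if unknown:
            non_validated[name] = unknown
    return non_validated


# Hypothetical toy data: one categorical column and its registered values.
columns = {"perturbation": ["DMSO", "IFNG", "dmso", "IFNG"]}
allowed = {"perturbation": {"DMSO", "IFNG"}}

non_validated = find_non_validated(columns, allowed)
is_valid = not non_validated  # the dataset counts as valid only if nothing is left over
```

In this toy model, an empty result plays the role of a successful validation, and any remaining entries are the values you would either register or fix by hand.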
Attributes
- property categoricals: dict
Return the column fields to validate against.
- property non_validated: dict[str, list[str]]
Return the non-validated features and labels.
Class methods
- classmethod from_anndata(data, var_index, categoricals=None, obs_columns=FieldAttr(Feature.name), verbosity='hint', organism=None, sources=None)
- Return type:
AnnDataCatManager
- classmethod from_df(df, categoricals=None, columns=FieldAttr(Feature.name), verbosity='hint', organism=None)
- Return type:
- classmethod from_mudata(mdata, var_index, categoricals=None, verbosity='hint', organism=None)
- Return type:
- classmethod from_spatialdata(sdata, var_index, categoricals=None, organism=None, sources=None, exclude=None, verbosity='hint', *, sample_metadata_key='sample')
- classmethod from_tiledbsoma(experiment_uri, var_index, categoricals=None, obs_columns=FieldAttr(Feature.name), organism=None, sources=None, exclude=None)
- Return type:
Methods
- save_artifact(*, key=None, description=None, revises=None, run=None)
Save an annotated artifact.
- Parameters:
key (str | None, default: None) – A path-like key to reference the artifact in default storage, e.g., "myfolder/myfile.fcs". Artifacts with the same key form a version family.
description (str | None, default: None) – A description.
revises (Artifact | None, default: None) – Previous version of the artifact. An alternative to passing key to trigger a new version.
run (Run | None, default: None) – The run that creates the artifact.
- Return type:
- Returns:
A saved artifact record.
- standardize(key)
Replace synonyms with standardized values.
Modifies the dataset in place.
- Parameters:
key (str) – The name of the column to standardize.
- Return type:
None
- Returns:
None
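The in-place behavior described for standardize can be illustrated with a small plain-Python sketch. This is not lamindb's implementation: the function standardize_column, the synonym table, and the toy column are hypothetical. It shows the general pattern of replacing synonyms with standardized names directly in the existing data and returning None.

```python
def standardize_column(columns, key, synonyms):
    """Replace known synonyms in columns[key] with standardized names, in place.

    Hypothetical helper for illustration; not part of lamindb.
    """
    values = columns[key]
    for i, value in enumerate(values):
        # Values without a known synonym mapping are left unchanged.
        values[i] = synonyms.get(value, value)


# Hypothetical toy data: a column containing two synonyms.
columns = {"cell_type": ["T cell", "T-cell", "B lymphocyte"]}
synonyms = {"T-cell": "T cell", "B lymphocyte": "B cell"}

result = standardize_column(columns, "cell_type", synonyms)
```

Because the list is mutated rather than copied, callers see the standardized values through their existing reference to the data, which mirrors the "returns None" contract above.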
- validate()
Validate dataset.
This method also registers the validated records in the current instance.
- Return type:
bool
- Returns:
The boolean True if the dataset is validated. Otherwise, a string with the error message.