lamindb.curators.AnnDataCurator

class lamindb.curators.AnnDataCurator(dataset, schema)

Bases: Curator

Curator for an AnnData object.

See also Curator and Schema.

Added in version 1.1.0.

Parameters:
  • dataset (AnnData | Artifact) – The AnnData-like object to validate & annotate.

  • schema (Schema) – A Schema object that defines the validation constraints.

Example:

import lamindb as ln
import bionty as bt

# define valid labels
cell_medium = ln.ULabel(name="CellMedium", is_type=True).save()
ln.ULabel(name="DMSO", type=cell_medium).save()
ln.ULabel(name="IFNG", type=cell_medium).save()
bt.CellType.from_source(name="B cell").save()
bt.CellType.from_source(name="T cell").save()

# define obs schema
obs_schema = ln.Schema(
    name="small_dataset1_obs_level_metadata",
    features=[
        ln.Feature(name="cell_medium", dtype="cat[ULabel[CellMedium]]").save(),
        ln.Feature(name="sample_note", dtype=str).save(),
        ln.Feature(name="cell_type_by_expert", dtype=bt.CellType).save(),
        ln.Feature(name="cell_type_by_model", dtype=bt.CellType).save(),
    ],
).save()

# define var schema
var_schema = ln.Schema(
    name="scRNA_seq_var_schema",
    itype=bt.Gene.ensembl_gene_id,
    dtype="num",
).save()

# define composite schema
anndata_schema = ln.Schema(
    name="small_dataset1_anndata_schema",
    otype="AnnData",
    components={"obs": obs_schema, "var": var_schema},
).save()

# curate an AnnData
adata = ln.core.datasets.small_dataset1(otype="AnnData")
curator = ln.curators.AnnDataCurator(adata, anndata_schema)
artifact = curator.save_artifact(key="example_datasets/dataset1.h5ad")
assert artifact.schema == anndata_schema

Methods

save_artifact(*, key=None, description=None, revises=None, run=None)

Save an annotated artifact.

Parameters:
  • key (default: None) – A path-like key to reference the artifact in default storage, e.g., "myfolder/myfile.fcs". Artifacts with the same key form a version family.

  • description (default: None) – A description.

  • revises (default: None) – Previous version of the artifact. An alternative to passing key for triggering a new version.

  • run (default: None) – The run that creates the artifact.

Returns:

A saved artifact record.

validate()

Validate dataset.

Raises:

lamindb.errors.ValidationError – If validation fails.

Return type:

None
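
Because validate() returns None and signals failure only by raising, a caller who wants to save just the datasets that pass has to wrap the call in a try/except. The following is a minimal sketch of that pattern, not part of lamindb: the helper name validate_then_save and its validation_error parameter are hypothetical, and the helper only assumes an object exposing the validate() and save_artifact() methods documented above.

```python
from typing import Any, Optional, Type


def validate_then_save(
    curator: Any,
    validation_error: Type[BaseException],
    **save_kwargs: Any,
) -> Optional[Any]:
    """Call curator.validate() and save an artifact only if it passes.

    Hypothetical helper: `curator` is any object with validate() and
    save_artifact(); `validation_error` is the exception type that
    validate() raises on failure.
    """
    try:
        # validate() returns None on success and raises on failure
        curator.validate()
    except validation_error as exc:
        # report the problem and skip saving
        print(f"validation failed: {exc}")
        return None
    # forward key, description, revises, run unchanged
    return curator.save_artifact(**save_kwargs)


# Intended use with the example above (assumes a configured lamindb instance):
#
#   from lamindb.errors import ValidationError
#   artifact = validate_then_save(
#       curator, ValidationError, key="example_datasets/dataset1.h5ad"
#   )
```

Passing the exception type as an argument keeps the sketch importable without lamindb; in real code, catching lamindb.errors.ValidationError directly would be the more natural choice.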