lamindb.curators.core.ComponentCurator

class lamindb.curators.core.ComponentCurator(dataset, schema, slot=None)

Bases: Curator

Curator for DataFrame.

Provides all key functionality to validate pandas DataFrames. This class is not user-facing. Use DataFrameCurator instead, which extends this class with functionality to validate the attrs slot.

Parameters:
  • dataset (DataFrame | Artifact) – The DataFrame-like object to validate & annotate.

  • schema (Schema) – A Schema object that defines the validation constraints.

  • slot (str | None, default: None) – The slot this curator occupies in a composite curator for a composite data structure.

Attributes

property cat: DataFrameCatManager

Manage categoricals by updating registries.

Methods

standardize()

Standardize the dataset.

  • Adds missing columns for features

  • Fills missing values for features with default values

Return type:

None
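The two standardization steps can be illustrated with a minimal pure-Python sketch. It is not lamindb's implementation: the column-oriented dict stands in for a DataFrame, and `standardize_in_place`, `table`, and `defaults` are hypothetical names chosen for illustration. The sketch assumes the dataset is modified in place, which is consistent with the `None` return type.

```python
from typing import Any


def standardize_in_place(table: dict[str, list[Any]], defaults: dict[str, Any]) -> None:
    """Add missing feature columns and fill None values with defaults.

    `table` maps column names to equally long lists of values (a stand-in
    for a DataFrame); `defaults` maps feature names to default values.
    """
    n_rows = max((len(values) for values in table.values()), default=0)
    for feature, default in defaults.items():
        if feature not in table:
            # Step 1: add a missing column, populated with the default value.
            table[feature] = [default] * n_rows
        else:
            # Step 2: fill missing (None) entries with the default value.
            table[feature] = [default if v is None else v for v in table[feature]]


# Hypothetical data: one existing column with a gap, one column absent.
table = {"cell_type": ["T cell", None, "B cell"]}
defaults = {"cell_type": "unknown", "treatment": "control"}
result = standardize_in_place(table, defaults)
```

After the call, `table` holds both columns with no missing entries, and `result` is `None`.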

validate()

Validate dataset against Schema.

Raises:

lamindb.errors.ValidationError – If validation fails.

Return type:

None
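The contract described above (return `None` on success, raise on failure) can be sketched without lamindb as follows. The `ValidationError` class here is a local stand-in for `lamindb.errors.ValidationError`, and `validate_table` and `allowed` are hypothetical names. The actual checks performed against a Schema are not specified in this section, so the permitted-values check is an assumption made for illustration.

```python
from typing import Any


class ValidationError(Exception):
    """Local stand-in for lamindb.errors.ValidationError (illustration only)."""


def validate_table(table: dict[str, list[Any]], allowed: dict[str, set[Any]]) -> None:
    """Raise ValidationError if a column is missing or holds a value that is not permitted."""
    problems = []
    for feature, permitted in allowed.items():
        if feature not in table:
            problems.append(f"missing column: {feature}")
            continue
        invalid = sorted({v for v in table[feature] if v not in permitted})
        if invalid:
            problems.append(f"{feature}: invalid values {invalid}")
    if problems:
        raise ValidationError("; ".join(problems))


allowed = {"cell_type": {"T cell", "B cell"}}

# A valid table passes silently and returns None.
ok_result = validate_table({"cell_type": ["T cell", "B cell"]}, allowed)

# A table with a misspelled category raises.
try:
    validate_table({"cell_type": ["T cell", "Tcell"]}, allowed)
    error_message = None
except ValidationError as exc:
    error_message = str(exc)
```

Because failure is signalled by an exception rather than a return value, callers that want to continue after a failed validation need a `try`/`except` block as shown.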

save_artifact(*, key=None, description=None, revises=None, run=None)

Save an annotated artifact.

Parameters:
  • key (str | None, default: None) – A path-like key to reference artifact in default storage, e.g., "myfolder/myfile.fcs". Artifacts with the same key form a version family.

  • description (str | None, default: None) – A description.

  • revises (Artifact | None, default: None) – Previous version of the artifact. An alternative to passing key to trigger a new version.

  • run (Run | None, default: None) – The run that creates the artifact.

Return type:

Artifact

Returns:

A saved artifact record.