lamindb.core.HasFeatures

class lamindb.core.HasFeatures

Bases: object

Base class for records that can be linked to features, in particular Artifact & Collection.

Attributes

features(host): FeatureManager = <class 'lnschema_core.models.FeatureManager'>

Feature manager.

Features denote dataset dimensions, i.e., the variables that measure labels & numbers.

Annotate with features & values:

artifact.features.add_values({
    "species": organism,  # here, organism is an Organism record
    "scientist": ["Barbara McClintock", "Edgar Anderson"],
    "temperature": 27.6,
    "study": "Study 0: initial plant gathering",
})

Query for features & values:

ln.Artifact.features.filter(scientist="Barbara McClintock")

Features may or may not be part of the artifact content in storage. For instance, the Annotate flow validates the columns of a DataFrame-like artifact and annotates it with features corresponding to these columns. artifact.features.add_values, by contrast, does not validate the content of the artifact.
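The distinction between content-derived and external annotation can be illustrated with a small toy model. This is only a sketch: ToyArtifact, annotate_from_columns, and the registered set are hypothetical names invented for this example, not lamindb API, and the validation rule (every column must match a registered feature name) is an assumption about what "validating columns" could mean.

```python
# Toy model for illustration only; none of these names are lamindb API.
from dataclasses import dataclass, field


@dataclass
class ToyArtifact:
    """Stand-in for an artifact: stored column names plus feature annotations."""

    columns: list[str]
    feature_values: dict[str, object] = field(default_factory=dict)

    def add_values(self, values: dict[str, object]) -> None:
        # Annotate freely: nothing here inspects self.columns.
        self.feature_values.update(values)


def annotate_from_columns(artifact: ToyArtifact, registered: set[str]) -> list[str]:
    """Check every stored column against registered feature names.

    Raises ValueError if a column has no matching feature; otherwise
    returns the column names that would be linked as features.
    """
    unknown = [c for c in artifact.columns if c not in registered]
    if unknown:
        raise ValueError(f"columns without a registered feature: {unknown}")
    return list(artifact.columns)


artifact = ToyArtifact(columns=["species", "temperature"])

# Content-derived annotation: the columns themselves are validated.
linked = annotate_from_columns(artifact, registered={"species", "temperature", "study"})

# External annotation: "study" is not a column, and no content check runs.
artifact.add_values({"study": "Study 0: initial plant gathering"})
```

In this sketch, "study" ends up as an annotation even though it never appears in the stored columns, which mirrors the point that add_values does not look at the artifact content.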

property labels: LabelManager

Label manager.

To annotate with labels, you typically use the registry-specific accessors, for instance ulabels:

candidate_marker_study = ln.ULabel(name="Candidate marker study").save()
artifact.ulabels.add(candidate_marker_study)

Similarly, you query based on these accessors:

ln.Artifact.filter(ulabels__name="Candidate marker study").all()

The .labels accessor allows you to associate labels of any registry with features:

study = ln.Feature(name="study", dtype="cat").save()
artifact.labels.add(candidate_marker_study, study)
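As a rough mental model of what associating labels of any registry with features means, the sketch below groups labels under feature names in memory. ToyLabel and ToyLabelManager are hypothetical stand-ins written for this example; they are not how lamindb stores these links, and the label name "maize" is made up for illustration.

```python
# Toy model for illustration only; not lamindb's LabelManager.
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyLabel:
    registry: str  # name of the registry the label comes from, e.g. "ULabel"
    name: str


class ToyLabelManager:
    """Keeps labels from any registry grouped by the feature they annotate."""

    def __init__(self) -> None:
        self._by_feature: dict[str, list[ToyLabel]] = defaultdict(list)

    def add(self, label: ToyLabel, feature: str) -> None:
        # Link the label under the feature, skipping exact duplicates.
        if label not in self._by_feature[feature]:
            self._by_feature[feature].append(label)

    def get(self, feature: str) -> list[ToyLabel]:
        # Return a copy so callers cannot mutate the internal state.
        return list(self._by_feature.get(feature, []))


labels = ToyLabelManager()
labels.add(ToyLabel("ULabel", "Candidate marker study"), "study")
labels.add(ToyLabel("Organism", "maize"), "species")  # hypothetical label
labels.add(ToyLabel("ULabel", "Candidate marker study"), "study")  # duplicate, ignored
```

The point of the sketch is only that one manager can hold labels from different registries, each filed under the feature it describes.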

Methods

describe(print_types=False)

Describe the relations of a data record.

Examples

>>> ln.Artifact(ln.core.datasets.file_jpg_paradisi05(), description="paradisi05").save()
>>> artifact = ln.Artifact.filter(description="paradisi05").one()
>>> ln.save(ln.ULabel.from_values(["image", "benchmark", "example"], field="name"))
>>> ulabels = ln.ULabel.filter(name__in=["image", "benchmark", "example"]).all()
>>> artifact.ulabels.set(ulabels)
>>> artifact.describe()


view_lineage(with_children=True)

Graph of data flow.

Return type:

None

Notes

For more information, see the use case: Data lineage.

Examples

>>> collection.view_lineage()
>>> artifact.view_lineage()