lamindb.Transform
- class lamindb.Transform(name: str, key: str | None = None, version: str | None = None, type: TransformType | None = None, is_new_version_of: Transform | None = None)
Bases: Record, HasParents, IsVersioned
Data transformations.
A transform can refer to a Python function, a script, a notebook, or a pipeline. If you execute a transform, you generate a run (Run). A run has input and output data.
A pipeline is typically created with a workflow tool (Nextflow, Snakemake, Prefect, Flyte, MetaFlow, redun, Airflow, …) and stored in a versioned repository.
Transforms are versioned so that a given transform maps 1:1 to a specific version of code.
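The relationship between a transform and its runs can be pictured with a small stand-alone sketch. It does not use lamindb; ToyTransform, ToyRun, and execute are hypothetical names invented here purely to illustrate that one versioned transform can be executed many times, and that each execution yields a separate run with its own inputs and outputs.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ToyTransform:
    """Hypothetical stand-in for a versioned piece of code (not lamindb's class)."""
    name: str
    version: str
    type: str = "pipeline"


@dataclass
class ToyRun:
    """Hypothetical stand-in for one execution of a transform."""
    transform: ToyTransform
    started_at: datetime
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def execute(transform, inputs):
    """Pretend to execute a transform: every call produces a new run."""
    run = ToyRun(transform, datetime.now(timezone.utc), inputs=list(inputs))
    # The "work" is faked: each input name simply gets a suffix.
    run.outputs = [f"{name}.processed" for name in run.inputs]
    return run


cellranger = ToyTransform(name="Cell Ranger", version="7.2.0")
run_a = execute(cellranger, ["sample1.fastq"])
run_b = execute(cellranger, ["sample2.fastq"])

# Two distinct runs, both pointing at the same transform version.
print(run_a.transform is run_b.transform)
print(run_a.outputs, run_b.outputs)
```

In this toy model the transform is immutable (frozen) while runs accumulate, which mirrors the idea that a transform pins a code version and runs record what that code was applied to.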
Can I sync transforms to git?
If you switch on sync_git_repo, a script-like transform is synced to its hashed state in a git repository upon calling ln.track().
The definition of transforms and runs is consistent with the OpenLineage specification, where a Transform record would be called a “job” and a Run record a “run”.
- Parameters:
name (str) – A name or title.
key (str | None = None) – A short name or path-like semantic key.
version (str | None = None) – A version.
type (TransformType | None = "pipeline") – Either 'notebook', 'pipeline' or 'script'.
is_new_version_of (Transform | None = None) – An old version of the transform.
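To make the OpenLineage correspondence mentioned above more concrete, here is a rough stand-alone sketch that packs a transform name, a version, and a run's inputs and outputs into a dictionary shaped like an OpenLineage run event. It does not call lamindb or any OpenLineage client. The helper name, the namespace, and the producer URL are hypothetical placeholders, and the field names reflect a reading of the OpenLineage run event layout that should be checked against the specification before use.

```python
import json
import uuid
from datetime import datetime, timezone


def to_openlineage_style_event(transform_name, transform_version, input_names, output_names):
    """Build a dict resembling an OpenLineage run event (illustrative only)."""
    namespace = "toy-namespace"  # hypothetical namespace, chosen for this example
    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        # The transform plays the role of the OpenLineage "job".
        "job": {"namespace": namespace, "name": f"{transform_name}@{transform_version}"},
        # One execution of the transform plays the role of the "run".
        "run": {"runId": str(uuid.uuid4())},
        "inputs": [{"namespace": namespace, "name": n} for n in input_names],
        "outputs": [{"namespace": namespace, "name": n} for n in output_names],
        "producer": "https://example.com/toy-producer",  # placeholder URL
    }


event = to_openlineage_style_event(
    "Cell Ranger", "7.2.0", ["sample1.fastq"], ["counts.h5"]
)
print(json.dumps(event, indent=2))
```

Encoding the version into the job name is just one possible convention for this sketch; nothing in the source says lamindb exports events this way.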
Notes
Examples
Create a transform for a pipeline:
>>> transform = ln.Transform(name="Cell Ranger", version="7.2.0", type="pipeline")
>>> transform.save()
Create a transform from a notebook:
>>> ln.track()
View parents of a transform:
>>> transform.view_parents()
Attributes
Fields
- version: str
Version (default None).
Defines version of a family of records characterized by the same stem_uid.
Consider using semantic versioning with Python versioning.
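As an illustration of why structured version strings help, the following stand-alone sketch groups toy records by stem_uid and picks the highest version in each family. The records, the stem_uid values, and the version_key helper are invented for this example; it assumes purely numeric dot-separated versions and is not how lamindb itself compares versions.

```python
from collections import defaultdict


def version_key(version):
    """Turn '7.10.0' into (7, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical records: two versions in one family, one in another.
records = [
    {"stem_uid": "abc123", "version": "7.2.0"},
    {"stem_uid": "abc123", "version": "7.10.0"},
    {"stem_uid": "xyz789", "version": "1.0"},
]

# Group versions by the family they belong to.
families = defaultdict(list)
for record in records:
    families[record["stem_uid"]].append(record["version"])

# Pick the numerically highest version per family.
latest = {stem: max(versions, key=version_key) for stem, versions in families.items()}
print(latest)

# A plain string comparison would rank "7.2.0" above "7.10.0".
print(max(families["abc123"]))
```

The last line shows the pitfall: compared as plain strings, "7.2.0" sorts after "7.10.0", which is why a numeric key (or a dedicated version parser) is worth using.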
- id: int
Internal id, valid only in one DB instance.
- uid: str
Universal id.
- name: str
A name or title, for instance a pipeline name or a notebook title.
- key: str
A key for concise reference & versioning (optional).
- description: str
A description (optional).
- type: str
Transform type (default "pipeline").
- reference: str
Reference for the transform, e.g., a URL.
- reference_type: str
Type of reference, e.g., ‘url’ or ‘doi’.
- ulabels: ULabel
Accessor to the related objects manager on the forward and reverse sides of a many-to-many relation.
In the example:
class Pizza(Model):
    toppings = ManyToManyField(Topping, related_name='pizzas')
Pizza.toppings and Topping.pizzas are ManyToManyDescriptor instances.
Most of the implementation is delegated to a dynamically defined manager class built by create_forward_many_to_many_manager() defined below.
- parents: Transform
Parent transforms (predecessors) in data flow.
These are auto-populated whenever a transform loads an artifact or collection as run input.
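The way parent links can follow from run inputs is sketched below in plain Python. This is a toy model, not lamindb's implementation: the runs list, the artifact names, and the transform names are made up, and the rule applied is simply "if a run reads something another transform produced, that transform becomes a parent".

```python
# Hypothetical run log: which transform read and wrote which artifacts.
runs = [
    {"transform": "Cell Ranger", "inputs": ["sample1.fastq"], "outputs": ["counts.h5"]},
    {"transform": "QC notebook", "inputs": ["counts.h5"], "outputs": ["filtered.h5"]},
    {"transform": "Clustering", "inputs": ["filtered.h5"], "outputs": ["clusters.csv"]},
]

# Map each artifact to the transform that produced it.
producer_of = {
    output: run["transform"] for run in runs for output in run["outputs"]
}

# A transform's parents are the producers of the artifacts its runs consumed.
parents = {}
for run in runs:
    found = {producer_of[name] for name in run["inputs"] if name in producer_of}
    parents.setdefault(run["transform"], set()).update(found)

for transform, its_parents in parents.items():
    print(transform, "<-", sorted(its_parents))
```

Here "Cell Ranger" ends up with no parents because its input was not produced by any recorded transform, while each downstream step gains exactly one predecessor.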
- created_at: datetime
Time of creation of record.
- updated_at: datetime
Time of last update to record.
Methods
- delete()
- Return type: None
- get_type_display(*, field=<django.db.models.fields.CharField: type>)