lamindb.Run

class lamindb.Run(transform: Transform, name: str | None = None, description: str | None = None, entrypoint: str | None = None, params: dict | None = None, reference: str | None = None, reference_type: str | None = None, initiated_by_run: Run | None = None, plan: Artifact | None = None)

Bases: SQLRecord, TracksUpdates

Runs of transforms such as the executions of a script.

Parameters:
  • transform (Transform) – A data transformation object.

  • name (str | None, default: None) – A name.

  • params (dict | None, default: None) – A dictionary of parameters.

  • reference (str | None, default: None) – For instance, an external ID or URL.

  • reference_type (str | None, default: None) – For instance, redun_id, nextflow_id or url.

  • initiated_by_run (Run | None, default: None) – The run that triggers this run.

See also

track()

Globally track a script or notebook run.

step()

Track a function execution with this decorator.

Examples

Create a run record:

ln.Transform(key="Cell Ranger", version="7.2.0", kind="pipeline").save()
transform = ln.Transform.get(key="Cell Ranger", version="7.2.0")
run = ln.Run(transform)

Track a global run of a notebook or script:

ln.track()
ln.context.run  # global run object

You can pass parameters to Run(transform, params=params) or add them later:

run.params = {
    "learning_rate": 0.01,
    "input_dir": "s3://my-bucket/mydataset",
    "downsample": True,
    "preprocess_params": {
        "normalization_type": "cool",
        "subset_highlyvariable": True,
    },
}
run.save()
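
Because params holds plain JSON values, a round trip through the standard json module can catch non-serializable entries before saving. This is a minimal sketch that does not touch lamindb; check_json_params is a hypothetical helper introduced only for illustration:

```python
import json


def check_json_params(params: dict) -> dict:
    """Hypothetical helper (not part of lamindb): round-trip params through JSON.

    Raises TypeError if any value is not JSON-serializable.
    """
    return json.loads(json.dumps(params))


params = {
    "learning_rate": 0.01,
    "input_dir": "s3://my-bucket/mydataset",
    "downsample": True,
    "preprocess_params": {
        "normalization_type": "cool",
        "subset_highlyvariable": True,
    },
}
checked = check_json_params(params)  # the round trip leaves the dict unchanged
```

A value such as a Python set would raise TypeError here, which is a cheap signal that it needs converting (for example to a list) first.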

In contrast to .params, features are indexed in the Feature registry and can reference relational categorical values. If you want to link feature values, use:

run.features.set_values({
    "experiment": "My experiment 1",
})

Guide: Track parameters & features

Attributes

property features: FeatureManager

Manage annotations with features.

For examples, see Run or FeatureManager.

property status: Literal['scheduled', 'restarted', 'started', 'completed', 'errored', 'aborted']

Run status.

Get the status of the run:

status      code   description
scheduled   -3     run is scheduled
restarted   -2     run was restarted
started     -1     run has started
completed    0     run completed successfully
errored      1     run ended with an error
aborted      2     run was aborted

The database stores the run status as an integer code in field _status_code.
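
The mapping between status names and integer codes can be written down as a plain dictionary. The sketch below copies the values from the table above; the helper functions are hypothetical and are not part of lamindb, and the observation that non-negative codes mark ended runs is read off the table rather than stated by the library:

```python
# Status names and integer codes, copied from the table above.
STATUS_CODES = {
    "scheduled": -3,
    "restarted": -2,
    "started": -1,
    "completed": 0,
    "errored": 1,
    "aborted": 2,
}
CODE_TO_STATUS = {code: name for name, code in STATUS_CODES.items()}


def status_from_code(code: int) -> str:
    """Hypothetical helper: translate a stored integer code into its status name."""
    return CODE_TO_STATUS[code]


def has_ended(status: str) -> bool:
    """Hypothetical helper: in the table, codes >= 0 belong to runs that have ended."""
    return STATUS_CODES[status] >= 0
```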

Example

See the status of a run:

run.status
#> 'completed'

Query by status:

ln.Run.filter(status="completed").to_dataframe()

Simple fields

uid: str

Universal id, valid across DB instances.

name: str | None

An optional name for this run.

description: str | None

An optional description for this run.

entrypoint: str | None

The entrypoint of the transform.

This could be a function name or the entry point of a CLI or workflow manager.

started_at: datetime

The time this run started.

finished_at: datetime | None

The time this run finished or aborted.

params: dict

Parameters (plain JSON values).

reference: str | None

A reference like a URL or an external ID such as from a workflow manager.

reference_type: str | None

The type of the reference such as a workflow manager execution ID.

cli_args: str | None

CLI arguments if the run was invoked from the command line.

created_at: datetime

The time of creation of this run.

is_locked: bool

Whether the object is locked for edits.

updated_at: datetime

Time of last update to record.

Relational fields

branch: Branch

The branch on which the object is defined.

created_on: Branch

The branch on which the object was created.

space: Space

The space in which the object is defined.

transform: Transform

The transform that is being run ← runs.

report: Artifact | None

The report of this run such as an .html or .txt file.

environment: Artifact | None

The computational environment for this run.

For instance, Dockerfile, docker image, requirements.txt, environment.yml, etc.

plan: Artifact | None

The (agent) plan for this run.

Also see: initiated_by_run.

created_by: User

The creator of this run ← created_runs.

initiated_by_run: Run | None

The run that initiated this run ← initiated_runs.

json_values: RelatedManager[JsonValue]

Feature-indexed JSON values ← runs.

ulabels: RelatedManager[ULabel]

The ulabels annotating this run ← runs.

linked_in_records: RelatedManager[Record]

This run is linked in these records as a value ← linked_runs.

artifacts: RelatedManager[Artifact]

The artifacts annotated by this run ← runs.

linked_artifacts: RelatedManager[Artifact]

The artifacts linked by this run through the run’s features ← artifact.

initiated_runs: RelatedManager[Run]

The runs that were initiated by this run.

values_artifact
output_artifacts: RelatedManager[Artifact]

The artifacts created in this run ← run.

This does not include recreated artifacts, which are tracked via recreated_artifacts.

If you want to query created + recreated artifacts, use query_output_artifacts() instead.

input_artifacts: RelatedManager[Artifact]

The artifacts serving as input for this run ← input_of_runs.

recreated_artifacts: RelatedManager[Artifact]

The output artifacts that were recreated by this run ← recreating_runs.

Artifacts are recreated if they trigger a hash lookup match for an existing artifact.

output_collections: RelatedManager[Collection]

The collections created in this run ← run.

input_collections: RelatedManager[Collection]

The collections serving as input for this run ← input_of_runs.

recreated_collections: RelatedManager[Collection]

The output collections that were recreated by this run ← recreating_runs.

Collections are recreated if they trigger a hash lookup match for an existing collection.

output_records: RelatedManager[Record]

The records created in this run ← run.

input_records: RelatedManager[Record]

The records serving as input for this run ← input_of_runs.

records: RelatedManager[Record]

The records annotating this run ← runs.

projects: RelatedManager[Project]

The projects annotating this run ← runs.

ablocks: RelatedManager[RunBlock]

Attached blocks ← run.

Class methods

classmethod filter(*queries, **expressions)

Query records.

Parameters:
  • queries – One or multiple Q objects.

  • expressions – Fields and values passed as Django query expressions.

Return type:

QuerySet

Examples

ln.Project(name="my label").save()
ln.Project.filter(name__startswith="my").to_dataframe()

classmethod get(idlike=None, **expressions)

Get a single record.

Parameters:
  • idlike (int | str | None, default: None) – Either a uid stub, uid or an integer id.

  • expressions – Fields and values passed as Django query expressions.

Raises:

lamindb.errors.ObjectDoesNotExist – In case no matching record is found.

Return type:

SQLRecord

Examples

record = ln.Record.get("FvtpPJLJ")
record = ln.Record.get(name="my-label")

classmethod to_dataframe(include=None, features=False, limit=100)

Evaluate and convert to pd.DataFrame.

By default, this returns up to 100 rows for a fast overview. Pass limit=None to fetch all matching records.

By default, maps simple fields and foreign keys onto DataFrame columns.

Guide: Query & search registries

Parameters:
  • include (str | list[str] | None, default: None) – Related data to include as columns. Takes strings of form "records__name", "cell_types__name", etc. or a list of such strings. For Artifact, Record, and Run, can also pass "features" to include features with data types pointing to entities in the core schema. If "privates", includes private fields (fields starting with _).

  • features (bool | list[str], default: False) – Configure the features to include. Can be a feature name or a list of such names. If "queryset", infers the features used within the current queryset. Only available for Artifact, Record, and Run.

  • limit (int, default: 100) – Maximum number of rows to display. Defaults to 100. If None, includes all results.

  • order_by – Field name to order the records by. Prefix with ‘-’ for descending order. Defaults to ‘-id’ to get the most recent records. This argument is ignored if the queryset is already ordered or if the specified field does not exist.

Return type:

DataFrame
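
The interplay of limit and order_by described above can be pictured with plain Python lists. This is only a sketch of the documented behavior (up to 100 rows, most recent first via '-id'), not lamindb's implementation; preview and the row dictionaries are hypothetical:

```python
def preview(rows, limit=100, order_by="-id"):
    """Hypothetical sketch: order rows by a field, then truncate to limit."""
    descending = order_by.startswith("-")  # a leading '-' means descending order
    field = order_by.lstrip("-")
    ordered = sorted(rows, key=lambda row: row[field], reverse=descending)
    return ordered if limit is None else ordered[:limit]


rows = [{"id": i} for i in range(1, 251)]  # 250 toy records
first_page = preview(rows)                 # 100 rows, highest id first
everything = preview(rows, limit=None)     # all 250 rows
```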

Examples

Include the name of the creator:

ln.Record.to_dataframe(include="created_by__name")

Include features:

ln.Artifact.to_dataframe(include="features")

Include selected features:

ln.Artifact.to_dataframe(features=["cell_type_by_expert", "cell_type_by_model"])

classmethod search(string, *, field=None, limit=20, case_sensitive=False)

Search.

Parameters:
  • string (str) – The input string to match against the field ontology values.

  • field (str | DeferredAttribute | None, default: None) – The field or fields to search. Search all string fields by default.

  • limit (int | None, default: 20) – Maximum number of top results to return.

  • case_sensitive (bool, default: False) – Whether the match is case sensitive.

Return type:

QuerySet

Returns:

A sorted QuerySet of search results.

See also

filter() lookup()

Examples

records = ln.Record.from_values(["Label1", "Label2", "Label3"], field="name").save()
ln.Record.search("Label2")

classmethod lookup(field=None, return_field=None)

Return an auto-complete object for a field.

Parameters:
  • field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to first string field.

  • return_field (str | DeferredAttribute | None, default: None) – The field to return. If None, returns the whole record.

  • keep – When multiple records are found for a lookup, how to return the records: "first" returns the first record, "last" returns the last record, and False returns all records.

Return type:

NamedTuple

Returns:

A NamedTuple of lookup information of the field values with a dictionary converter.
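
To make the idea of an auto-complete object with a dictionary converter concrete, here is a small standard-library sketch. It is not lamindb's implementation: make_lookup and its normalization rule (lowercase, non-word characters replaced by underscores) are assumptions chosen so that "ADGB-DT" becomes the attribute adgb_dt, and it assumes the input values are unique and start with a letter:

```python
import re
from collections import namedtuple


def make_lookup(values):
    """Hypothetical sketch of an auto-complete object built from string values."""

    def to_identifier(value):
        # Turn a value into a valid attribute name, e.g. "ADGB-DT" -> "adgb_dt".
        return re.sub(r"\W", "_", value).lower()

    fields = [to_identifier(value) for value in values]
    mapping = dict(zip(values, values))  # original value -> looked-up value
    Lookup = namedtuple("Lookup", fields + ["dict"])
    # The extra "dict" field holds a callable that returns the dictionary view.
    return Lookup(*values, dict=lambda: mapping)


lookup = make_lookup(["ADGB-DT", "TP53"])
by_attribute = lookup.adgb_dt          # attribute access with the normalized name
by_key = lookup.dict()["ADGB-DT"]      # dictionary access with the original value
```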

See also

search()

Examples

Look up via auto-complete on "." (attribute access):

import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt

Look up via auto-complete in dictionary:

lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']

Look up via a specific field:

lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745

Return a specific field value instead of the full record:

lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")

classmethod connect(instance)

Query a non-default LaminDB instance.

Parameters:

instance (str | None) – An instance identifier of form “account_handle/instance_name”.

Return type:

QuerySet

Examples

ln.Record.connect("account_handle/instance_name").search("label7", field="name")

Methods

query_output_artifacts(include_recreated=True)

Query output artifacts including recreated ones.

This runs the following query under the hood:

ln.Artifact.filter(ln.Q(run=self) | ln.Q(recreating_runs=self)).distinct()

Parameters:

include_recreated (bool, default: True) – If True, return both originally created and recreated artifacts. If False, return only originally created artifacts.

Return type:

QuerySet

Returns:

A queryset of Artifact objects.

See also

output_artifacts

QuerySet of originally created artifacts.

recreated_artifacts

QuerySet of recreated artifacts.
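
The effect of combining the two conditions with | and then calling distinct() can be illustrated with plain lists. This is a toy model of the semantics only; query_outputs and the artifact names are hypothetical, and no database is involved:

```python
def query_outputs(created, recreated, include_recreated=True):
    """Hypothetical sketch: union of created and recreated outputs without duplicates."""
    combined = list(created)
    if include_recreated:
        combined += list(recreated)
    # dict.fromkeys drops duplicates while keeping first-seen order,
    # playing the role of .distinct() in the query above.
    return list(dict.fromkeys(combined))


created = ["artifact_a", "artifact_b"]
recreated = ["artifact_b", "artifact_c"]
all_outputs = query_outputs(created, recreated)
only_created = query_outputs(created, recreated, include_recreated=False)
```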

restore()

Restore from trash onto the main branch.

Does not restore descendant objects if the object is HasType with is_type = True.

Return type:

None

delete(permanent=None, **kwargs)

Delete object.

If object is HasType with is_type = True, deletes all descendant objects, too.

Parameters:

permanent (bool | None, default: None) – Whether to permanently delete the object (skips trash). If None, performs soft delete if the object is not already in the trash.

Returns:

When permanent=True, returns Django’s delete return value – a tuple of (deleted_count, {registry_name: count}). Otherwise returns None.
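
The soft-delete logic described above can be sketched with a toy class. ToyRecord is hypothetical and is not a lamindb class. The sketch assumes that calling delete() with permanent=None on an object already in the trash removes it permanently, which the description implies but does not state outright:

```python
class ToyRecord:
    """Hypothetical stand-in for a record, used only to illustrate delete semantics."""

    def __init__(self):
        self.in_trash = False
        self.exists = True

    def delete(self, permanent=None):
        if permanent is None:
            # Assumption: soft delete first; a second call on a trashed object is permanent.
            permanent = self.in_trash
        if permanent:
            self.exists = False
            # Mimics the shape of Django's return value: (deleted_count, {registry_name: count}).
            return (1, {"ToyRecord": 1})
        self.in_trash = True  # soft delete: move to trash
        return None

    def restore(self):
        self.in_trash = False  # bring the object back from the trash


record = ToyRecord()
first_result = record.delete()   # soft delete, returns None
second_result = record.delete()  # already in trash, so deleted permanently in this sketch
```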

Examples

For any SQLRecord object sqlrecord, call:

sqlrecord.delete()

save(*args, **kwargs)

Save.

Always saves to the default database.

Return type:

TypeVar(T, bound= SQLRecord)

classmethod describe(include=None)

Describe record including relations.

Parameters:
  • return_str (bool, default: False) – Return a string instead of printing.

  • include (None | Literal['comments'], default: None) – Include additional content. Use "comments" to display readme and comment blocks.

Return type:

None | str

refresh_from_db(using=None, fields=None, from_queryset=None)

Reload field values from the database.

By default, the reloading happens from the database this instance was loaded from, or by the read router if this instance wasn’t loaded from any database. The using parameter will override the default.

Fields can be used to specify which fields to reload. The fields should be an iterable of field attnames. If fields is None, then all non-deferred fields are reloaded.

When accessing deferred fields of an instance, the deferred loading of the field will call this method.