Manage notebooks, scripts & workflows

This guide walks you from tracking data lineage in a notebook to tracking parameters in workflows.

Note: To run the examples, create a lamindb instance if you don’t have one yet:

!lamin init --storage ./test-track
Hide code cell output
 initialized lamindb: testuser1/test-track

Manage notebooks and scripts

Call track() to save your notebook or script as a transform and start tracking inputs & outputs of a run.

import lamindb as ln

ln.track()  # initiate a tracked notebook/script run

# your code automatically tracks inputs & outputs

ln.finish()  # mark run as finished, save execution report, source code & environment

You’ll find your notebooks and scripts in the Transform registry, along with pipelines & functions:

transform = ln.Transform.get(key="my_analyses/my_notebook.ipynb")
transform.source_code             # source code
transform.runs.to_dataframe()     # all runs in a dataframe
transform.latest_run.report       # report of latest run
transform.latest_run.environment  # environment of latest run

You can use the CLI to load a transform into your current (development) directory:

lamin load --key my_analyses/my_notebook.ipynb

Here is how you’d load the notebook from the video into your local directory:

lamin load https://lamin.ai/laminlabs/lamindata/transform/F4L3oC6QsZvQ

Organize local development

If no development directory is set, script & notebook keys equal their filenames. Otherwise, they represent the relative path in the development directory.

The exception is packaged source code, whose keys have the form pypackages/{package_name}/path/to/file.py.

To set the development directory to your shell’s current working directory, run:

lamin settings set dev-dir .

You can see the current status by running:

lamin info

Use projects

You can link the entities created during a run to a project.

import lamindb as ln

my_project = ln.Project(name="My project").save()  # create & save a project
ln.track(project="My project")  # pass project
with open("sample.fasta", "w") as f:  # create a dataset
    f.write(">seq1\nACGT\n")
ln.Artifact("sample.fasta", key="sample.fasta").save()  # auto-labeled by project
Hide code cell output
 connected lamindb: testuser1/test-track
 created Transform('BEtg2ltMShat0000', key='track.ipynb'), started new Run('vDRBdU6TY1bcH2wH') at 2026-04-27 13:57:14 UTC
 notebook imports: lamindb
 recommendation: to identify the notebook across renames, pass the uid: ln.track("BEtg2ltMShat", project="My project")
Artifact(uid='fPz2blEYkLOWarVD0000', key='sample.fasta', description=None, suffix='.fasta', kind=None, otype=None, size=11, hash='83rEPcAoBHmYiIuyBYrFKg', n_files=None, n_observations=None, branch_id=1, created_on_id=1, space_id=1, storage_id=1, run_id=1, schema_id=None, created_by_id=1, created_at=2026-04-27 13:57:15 UTC, is_locked=False, version_tag=None, is_latest=True)

Filter entities by project, e.g., artifacts:

ln.Artifact.filter(projects=my_project).to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
1 fPz2blEYkLOWarVD0000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg None None ... True False 2026-04-27 13:57:15.848000+00:00 1 1 1 1 1 None 1

1 rows × 21 columns

Access entities linked to a project:

my_project.artifacts.to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
1 fPz2blEYkLOWarVD0000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg None None ... True False 2026-04-27 13:57:15.848000+00:00 1 1 1 1 1 None 1

1 rows × 21 columns

The same works for my_project.transforms or my_project.runs.

Use spaces

You can write the entities created during a run into a space that you configure on LaminHub. This is particularly useful if you want to restrict access to a space. Note that this doesn’t affect bionty entities, which should typically remain commonly accessible.

ln.track(space="Our team space")

Sync code with git

To sync scripts or workflows with their corresponding files in a git repo, either export an environment variable:

export LAMINDB_SYNC_GIT_REPO=<YOUR-GIT-REPO-URL>

Or set the following setting:

ln.settings.sync_git_repo = <YOUR-GIT-REPO-URL>

If you work on a single project in your lamindb instance, it makes sense to set LaminDB’s dev-dir to the root of the local git repo clone.

dbs/
  project1/
    .git/
    script1.py
    notebook1.ipynb
  ...

If you work on multiple projects in your lamindb instance, you can use the dev-dir as the local root and nest git repositories in it.

dbs/
  database1/
    repo1/
      .git/
    repo2/
      .git/
  ...

Track agent plans

Saving an agent plan automatically tags it with artifact.kind = "plan" and infers a key starting with .plans/:

lamin save /path/to/.cursor/plans/my_task.plan.md
lamin save /path/to/.claude/plans/my_task.md

Link an agent plan against a run:

ln.track(plan=".plans/my-agent-plan.md")

This links the plan artifact to the run in the same way that the transform, an initiating run (initiated_by_run), and the report & environment artifacts are linked to it.

While transform acts as the deterministic source code for the run and initiated_by_run enables higher-level runs in workflow orchestration, the agent plan complements these by linking a plan that steers a non-deterministic agent.

Manage workflows

Here we’ll manage workflows with lamindb’s flow() and step() decorators, which work out-of-the-box with the majority of Python workflow managers:

| tool | workflow decorator | step/task decorator | notes |
|---|---|---|---|
| lamindb | @flow | @step | inspired by prefect |
| prefect | @flow | @task | two decorators |
| redun | @task (on main) | @task | single decorator for everything |
| dagster | @job or @asset | @op or @asset | asset-centric; @asset is primary |
| flyte | @workflow | @task | also @dynamic for runtime DAGs |
| airflow | @dag | @task | TaskFlow API (modern); also supports operators |
| zenml | @pipeline | @step | inspired by prefect |

If you’re looking for more in-depth examples or for integrating with non-decorator-based workflow managers such as Nextflow or Snakemake, see Manage computational pipelines.

| tool | workflow | step/task | notes |
|---|---|---|---|
| nextflow | workflow keyword | process keyword | groovy-based DSL |
| snakemake | rule keyword | rule keyword | file-based DSL |
| metaflow | FlowSpec | @step | class-based |
| kedro | Pipeline() | node() | function-based |

A one-step workflow

Decorate a function with flow() to track it as a workflow:

my_workflow.py
import lamindb as ln


@ln.flow()
def ingest_dataset(key: str) -> ln.Artifact:
    df = ln.examples.datasets.mini_immuno.get_dataset1()
    artifact = ln.Artifact.from_dataframe(df, key=key).save()
    return artifact


if __name__ == "__main__":
    ingest_dataset(key="my_analysis/dataset.parquet")

Let’s run the workflow:

!python scripts/my_workflow.py
Hide code cell output
 connected lamindb: testuser1/test-track
 created Transform('iXxjoxmPO2RK0000', key='my_workflow.py'), started new Run('PyXOM89rven017sk', entrypoint='ingest_dataset') at 2026-04-27 13:57:18 UTC
→ params: key='my_analysis/dataset.parquet'
 recommendation: to identify the script across renames, pass the uid: @ln.flow(uid="iXxjoxmPO2RK")

Query the workflow via its filename:

transform = ln.Transform.get(key="my_workflow.py")
transform.describe()
Hide code cell output
Transform: my_workflow.py (0000)
├── uid: iXxjoxmPO2RK0000                                     
hash: uJ3fsnfaNN6EZ7Q0d8SQtw         type: script         
branch: main                         space: all           
created_at: 2026-04-27 13:57:18 UTC  created_by: testuser1
└── source_code: 
    import lamindb as ln
    
    
    @ln.flow()
    def ingest_dataset(key: str) -> ln.Artifact:
        df = ln.examples.datasets.mini_immuno.get_dataset1()
        artifact = ln.Artifact.from_dataframe(df, key=key).save()
        return artifact
    
    
    if __name__ == "__main__":
        ingest_dataset(key="my_analysis/dataset.parquet")

The run stored the parameter value for key:

transform.latest_run.describe()
Hide code cell output
Run: PyXOM89 (my_workflow.py)
├── uid: PyXOM89rven017sk                transform: my_workflow.py (0000)    
started_at: 2026-04-27 13:57:18 UTC  finished_at: 2026-04-27 13:57:22 UTC
status: completed                                                        
branch: main                         space: all                          
created_at: 2026-04-27 13:57:18 UTC  created_by: testuser1               
├── report: DsCrhbm
→ connected lamindb: testuser1/test-track
→ created Transform('iXxjoxmPO2RK0000', key='my_workflow.py'), started new Run(' …
→ params: key='my_analysis/dataset.parquet'
• recommendation: to identify the script across renames, pass the uid: @ln.flow( …
├── environment: IJ4G64f
aiobotocore==3.5.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.5
aioitertools==0.13.0
│ …
└── Params
    └── key: my_analysis/dataset.parquet

It links output artifacts:

transform.latest_run.output_artifacts.to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
3 3t5USfr0z3FvHDyf0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 RBXAgQ4vMiLAszcX7rfYtg None 3 ... True False 2026-04-27 13:57:22.508000+00:00 1 1 1 1 2 None 1

1 rows × 21 columns

You can query for all runs that ran with that parameter:

ln.Run.filter(
    params__key="my_analysis/dataset.parquet",
).to_dataframe()
Hide code cell output
uid name description entrypoint started_at finished_at params reference reference_type cli_args ... created_at branch_id created_on_id space_id transform_id report_id environment_id plan_id created_by_id initiated_by_run_id
id
2 PyXOM89rven017sk None None ingest_dataset 2026-04-27 13:57:18.207460+00:00 2026-04-27 13:57:22.517143+00:00 {'key': 'my_analysis/dataset.parquet'} None None None ... 2026-04-27 13:57:18.560000+00:00 1 1 1 2 4 2 None 1 None

1 rows × 21 columns

You can also pass complex parameters and features; see Track parameters & features.

A multi-step workflow

Here, the workflow calls an additional processing step:

my_workflow_with_step.py
import lamindb as ln


@ln.step()
def subset_dataframe(
    artifact: ln.Artifact,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> ln.Artifact:
    df = artifact.load()
    new_data = df.iloc[:subset_rows, :subset_cols]
    new_key = artifact.key.replace(".parquet", "_subsetted.parquet")
    return ln.Artifact.from_dataframe(new_data, key=new_key).save()


@ln.flow()
def ingest_dataset(key: str, subset: bool = False) -> ln.Artifact:
    df = ln.examples.datasets.mini_immuno.get_dataset1()
    artifact = ln.Artifact.from_dataframe(df, key=key).save()
    if subset:
        artifact = subset_dataframe(artifact)
    return artifact


if __name__ == "__main__":
    ingest_dataset(key="my_analysis/dataset.parquet", subset=True)

Let’s run the workflow:

!python scripts/my_workflow_with_step.py
Hide code cell output
 connected lamindb: testuser1/test-track
 created Transform('GGTRHw67pbNY0000', key='my_workflow_with_step.py'), started new Run('vmIJT0N6d5gibhhM', entrypoint='ingest_dataset') at 2026-04-27 13:57:24 UTC
→ params: key='my_analysis/dataset.parquet', subset=True
 recommendation: to identify the script across renames, pass the uid: @ln.flow(uid="GGTRHw67pbNY")
 returning artifact with same hash: Artifact(uid='3t5USfr0z3FvHDyf0000', key='my_analysis/dataset.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=10354, hash='RBXAgQ4vMiLAszcX7rfYtg', n_files=None, n_observations=3, branch_id=1, created_on_id=1, space_id=1, storage_id=1, run_id=2, schema_id=None, created_by_id=1, created_at=2026-04-27 13:57:22 UTC, is_locked=False, version_tag=None, is_latest=True); to track this artifact as an input, use: ln.Artifact.get()
 loaded Transform('GGTRHw67pbNY0000', key='my_workflow_with_step.py'), started new Run('pHmx6bz50zJ926YC', entrypoint='subset_dataframe') at 2026-04-27 13:57:26 UTC
→ params: artifact='Artifact[3t5USfr0z3FvHDyf0000]', subset_rows=2, subset_cols=2

The lineage of the subsetted artifact resolves the subsetting step:

subsetted_artifact = ln.Artifact.get(key="my_analysis/dataset_subsetted.parquet")
subsetted_artifact.view_lineage()
Hide code cell output
_images/f81442a5265fb155ee988eb4540ed37e73543498c7f18f163f54980f259d890d.svg

This is the run that created the subsetted_artifact:

subsetted_artifact.run
Hide code cell output
Run(uid='pHmx6bz50zJ926YC', name=None, description=None, entrypoint='subset_dataframe', started_at=2026-04-27 13:57:26 UTC, finished_at=2026-04-27 13:57:27 UTC, params={'artifact': 'Artifact[3t5USfr0z3FvHDyf0000]', 'subset_rows': 2, 'subset_cols': 2}, reference=None, reference_type=None, cli_args=None, branch_id=1, created_on_id=1, space_id=1, transform_id=3, report_id=None, environment_id=None, plan_id=None, created_by_id=1, initiated_by_run_id=3, created_at=2026-04-27 13:57:26 UTC, is_locked=False)

This is the initiating run that triggered the function call:

subsetted_artifact.run.initiated_by_run
Hide code cell output
Run(uid='vmIJT0N6d5gibhhM', name=None, description=None, entrypoint='ingest_dataset', started_at=2026-04-27 13:57:24 UTC, finished_at=2026-04-27 13:57:27 UTC, params={'key': 'my_analysis/dataset.parquet', 'subset': True}, reference=None, reference_type=None, cli_args=None, branch_id=1, created_on_id=1, space_id=1, transform_id=3, report_id=6, environment_id=2, plan_id=None, created_by_id=1, initiated_by_run_id=None, created_at=2026-04-27 13:57:24 UTC, is_locked=False)

These are the parameters of the run:

subsetted_artifact.run.params
Hide code cell output
{'artifact': 'Artifact[3t5USfr0z3FvHDyf0000]',
 'subset_rows': 2,
 'subset_cols': 2}

These are the input artifacts:

subsetted_artifact.run.input_artifacts.to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
3 3t5USfr0z3FvHDyf0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 RBXAgQ4vMiLAszcX7rfYtg None 3 ... True False 2026-04-27 13:57:22.508000+00:00 1 1 1 1 2 None 1

1 rows × 21 columns

These are the output artifacts:

subsetted_artifact.run.output_artifacts.to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
5 ZqVhHgdcN7Yu8HkP0000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3696 mrK4zo5LCWcAYIcQz6myJg None 2 ... True False 2026-04-27 13:57:27.459000+00:00 1 1 1 1 4 None 1

1 rows × 21 columns

A workflow with CLI arguments

Let’s use click to parse CLI arguments:

my_workflow_with_click.py
import click
import lamindb as ln


@click.command()
@click.option("--key", required=True)
@ln.flow()
def main(key: str):
    df = ln.examples.datasets.mini_immuno.get_dataset2()
    ln.Artifact.from_dataframe(df, key=key).save()


if __name__ == "__main__":
    main()

Let’s run the workflow:

!python scripts/my_workflow_with_click.py --key my_analysis/dataset2.parquet
Hide code cell output
 connected lamindb: testuser1/test-track
 script invoked with: --key my_analysis/dataset2.parquet
 created Transform('yQYLC5VFcpwW0000', key='my_workflow_with_click.py'), started new Run('dLYK0NEQ9CF5FCiI', entrypoint='main') at 2026-04-27 13:57:29 UTC
→ params: key='my_analysis/dataset2.parquet'
 recommendation: to identify the script across renames, pass the uid: @ln.flow(uid="yQYLC5VFcpwW")

CLI arguments are tracked and accessible via run.cli_args:

run = ln.Run.filter(transform__key="my_workflow_with_click.py").first()
run.describe()
Hide code cell output
Run: dLYK0NE (my_workflow_with_click.py)
├── uid: dLYK0NEQ9CF5FCiI                transform: my_workflow_with_click.py (0000)
started_at: 2026-04-27 13:57:29 UTC  finished_at: 2026-04-27 13:57:31 UTC       
status: completed                                                               
branch: main                         space: all                                 
created_at: 2026-04-27 13:57:29 UTC  created_by: testuser1                      
├── cli_args: 
--key my_analysis/dataset2.parquet
├── report: bAZVd1G
→ connected lamindb: testuser1/test-track
→ created Transform('yQYLC5VFcpwW0000', key='my_workflow_with_click.py'), starte …
→ params: key='my_analysis/dataset2.parquet'
• recommendation: to identify the script across renames, pass the uid: @ln.flow( …
├── environment: IJ4G64f
aiobotocore==3.5.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.5
aioitertools==0.13.0
│ …
└── Params
    └── key: my_analysis/dataset2.parquet

Note that it doesn’t matter whether you use click, argparse, or any other CLI argument parser.

Track parameters & features

We just saw that the function decorators @ln.flow() and @ln.step() track parameter values automatically. Here is how to pass parameters to ln.track():

run_track_with_params.py
import argparse
import lamindb as ln

if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--input-dir", type=str)
    p.add_argument("--downsample", action="store_true")
    p.add_argument("--learning-rate", type=float)
    args = p.parse_args()
    params = {
        "input_dir": args.input_dir,
        "learning_rate": args.learning_rate,
        "preprocess_params": {
            "downsample": args.downsample,
            "normalization": "the_good_one",
        },
    }
    ln.track(params=params)

    # your code

    ln.finish()

Run the script:

!python scripts/run_track_with_params.py  --input-dir ./mydataset --learning-rate 0.01 --downsample
Hide code cell output
 connected lamindb: testuser1/test-track
 script invoked with: --input-dir ./mydataset --learning-rate 0.01 --downsample
 created Transform('h2DmA6dIRHhb0000', key='run_track_with_params.py'), started new Run('uQVVwWHM68bbLFaC') at 2026-04-27 13:57:33 UTC
→ params: input_dir='./mydataset', learning_rate=0.01, preprocess_params={'downsample': True, 'normalization': 'the_good_one'}
 recommendation: to identify the script across renames, pass the uid: ln.track("h2DmA6dIRHhb", params={...})

Query for all runs that match certain parameters:

ln.Run.filter(
    params__learning_rate=0.01,
    params__preprocess_params__downsample=True,
).to_dataframe()
Hide code cell output
uid name description entrypoint started_at finished_at params reference reference_type cli_args ... created_at branch_id created_on_id space_id transform_id report_id environment_id plan_id created_by_id initiated_by_run_id
id
6 uQVVwWHM68bbLFaC None None None 2026-04-27 13:57:33.412953+00:00 2026-04-27 13:57:34.840568+00:00 {'input_dir': './mydataset', 'learning_rate': ... None None --input-dir ./mydataset --learning-rate 0.01 -... ... 2026-04-27 13:57:33.751000+00:00 1 1 1 5 9 2 None 1 None

1 rows × 21 columns
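The params__ lookup with double underscores traverses the nested JSON stored in run.params. Its semantics can be illustrated with a small helper — a hypothetical json_lookup sketch of the lookup rule, not lamindb’s actual query engine:

```python
def json_lookup(params: dict, lookup: str):
    # resolve a Django-style "a__b__c" path against a nested dict
    value = params
    for part in lookup.split("__"):
        value = value[part]
    return value


# the params dict stored by the run above
run_params = {
    "input_dir": "./mydataset",
    "learning_rate": 0.01,
    "preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}

assert json_lookup(run_params, "learning_rate") == 0.01
assert json_lookup(run_params, "preprocess_params__downsample") is True
```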

Describe & get parameters:

run = ln.Run.filter(params__learning_rate=0.01).order_by("-started_at").first()
run.describe()
run.params
Hide code cell output
Run: uQVVwWH (run_track_with_params.py)
├── uid: uQVVwWHM68bbLFaC                transform: run_track_with_params.py (0000)
started_at: 2026-04-27 13:57:33 UTC  finished_at: 2026-04-27 13:57:34 UTC      
status: completed                                                              
branch: main                         space: all                                
created_at: 2026-04-27 13:57:33 UTC  created_by: testuser1                     
├── cli_args: 
--input-dir ./mydataset --learning-rate 0.01 --downsample
├── report: rHu7Wgy
→ connected lamindb: testuser1/test-track
→ created Transform('h2DmA6dIRHhb0000', key='run_track_with_params.py'), started …
→ params: input_dir='./mydataset', learning_rate=0.01, preprocess_params={'downs …
• recommendation: to identify the script across renames, pass the uid: ln.track( …
├── environment: IJ4G64f
aiobotocore==3.5.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.5
aioitertools==0.13.0
│ …
└── Params
    ├── input_dir: ./mydataset
    ├── learning_rate: 0.01
    └── preprocess_params: {'downsample': True, 'normalization': 'the_good_one'}
{'input_dir': './mydataset',
 'learning_rate': 0.01,
 'preprocess_params': {'downsample': True, 'normalization': 'the_good_one'}}

You can also access the CLI arguments used to start the run directly:

run.cli_args
Hide code cell output
'--input-dir ./mydataset --learning-rate 0.01 --downsample'

You can also track run features in analogy to artifact features.

In contrast to params, features are validated against the Feature registry and allow you to express relationships with entities in your registries.
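As a rough plain-Python analogy (not lamindb code — feature_registry and validate_features are made up for illustration), params behave like a free-form dict, while features are checked against a registry of known names and dtypes:

```python
# a toy "registry" of allowed feature names and their dtypes
feature_registry = {"s3_folder": str, "experiment": str}


def validate_features(features: dict) -> None:
    # reject names that were never defined in the registry,
    # and values that don't match the registered dtype
    for name, value in features.items():
        if name not in feature_registry:
            raise KeyError(f"feature '{name}' is not registered")
        if not isinstance(value, feature_registry[name]):
            raise TypeError(f"feature '{name}' expects {feature_registry[name].__name__}")


validate_features({"s3_folder": "s3://my-bucket/my-folder"})  # passes
try:
    validate_features({"unknown_feature": 1})
except KeyError:
    pass  # unregistered names are rejected
```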

Let’s first define labels & features.

experiment_type = ln.Record(name="Experiment", is_type=True).save()
experiment_label = ln.Record(name="Experiment1", type=experiment_type).save()
ln.Feature(name="s3_folder", dtype=str).save()
ln.Feature(name="experiment", dtype=experiment_type).save()
Hide code cell output
Feature(uid='lGyuxjix0HIW', is_type=False, name='experiment', _dtype_str='cat[Record[thGuHxnZtq5noOOq]]', unit=None, description=None, array_rank=0, array_size=0, array_shape=None, synonyms=None, default_value=None, nullable=True, coerce=None, branch_id=1, created_on_id=1, space_id=1, created_by_id=1, run_id=1, type_id=None, created_at=2026-04-27 13:57:35 UTC, is_locked=False)
!python scripts/run_track_with_features_and_params.py  --s3-folder s3://my-bucket/my-folder --experiment Experiment1
Hide code cell output
 connected lamindb: testuser1/test-track
 script invoked with: --s3-folder s3://my-bucket/my-folder --experiment Experiment1
 created Transform('BToZpZluOusa0000', key='run_track_with_features_and_params.py'), started new Run('GfqLVT9tyniORuBL') at 2026-04-27 13:57:36 UTC
→ params: example_param=42
→ features: s3_folder='s3://my-bucket/my-folder', experiment='Experiment1'
 recommendation: to identify the script across renames, pass the uid: ln.track("BToZpZluOusa", params={...})
ln.Run.filter(s3_folder="s3://my-bucket/my-folder").to_dataframe()
Hide code cell output
uid name description entrypoint started_at finished_at params reference reference_type cli_args ... created_at branch_id created_on_id space_id transform_id report_id environment_id plan_id created_by_id initiated_by_run_id
id
7 GfqLVT9tyniORuBL None None None 2026-04-27 13:57:36.479554+00:00 2026-04-27 13:57:38.423530+00:00 {'example_param': 42} None None --s3-folder s3://my-bucket/my-folder --experim... ... 2026-04-27 13:57:36.821000+00:00 1 1 1 6 10 2 None 1 None

1 rows × 21 columns

Describe & get feature values.

run2 = ln.Run.filter(
    s3_folder="s3://my-bucket/my-folder", experiment="Experiment1"
).last()
run2.describe()
run2.features.get_values()
Hide code cell output
Run: GfqLVT9 (run_track_with_features_and_params.py)
├── uid: GfqLVT9tyniORuBL                transform: run_track_with_features_and_params.py (0000)
started_at: 2026-04-27 13:57:36 UTC  finished_at: 2026-04-27 13:57:38 UTC                   
status: completed                                                                           
branch: main                         space: all                                             
created_at: 2026-04-27 13:57:36 UTC  created_by: testuser1                                  
├── cli_args: 
--s3-folder s3://my-bucket/my-folder --experiment Experiment1
├── report: jWeJqpP
→ connected lamindb: testuser1/test-track
→ created Transform('BToZpZluOusa0000', key='run_track_with_features_and_params. …
→ params: example_param=42
→ features: s3_folder='s3://my-bucket/my-folder', experiment='Experiment1'
│ …
├── environment: IJ4G64f
aiobotocore==3.5.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.5
aioitertools==0.13.0
│ …
├── Params
│   └── example_param: 42
└── Features
    └── experiment                     Record[Experiment]                   Experiment1                            
        s3_folder                      str                                  s3://my-bucket/my-folder               
{'experiment': 'Experiment1', 's3_folder': 's3://my-bucket/my-folder'}

Manage functions in scripts and notebooks

If you want more fine-grained data lineage tracking in a script or notebook where you called ln.track(), you can also use the step() decorator.

In a notebook

@ln.step()
def subset_dataframe(
    input_artifact_key: str,
    output_artifact_key: str,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> None:
    artifact = ln.Artifact.get(key=input_artifact_key)
    dataset = artifact.load()
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    ln.Artifact.from_dataframe(new_data, key=output_artifact_key).save()

Prepare a test dataset:

df = ln.examples.datasets.mini_immuno.get_dataset1(otype="DataFrame")
input_artifact_key = "my_analysis/dataset.parquet"
artifact = ln.Artifact.from_dataframe(df, key=input_artifact_key).save()
Hide code cell output
 returning artifact with same hash: Artifact(uid='3t5USfr0z3FvHDyf0000', key='my_analysis/dataset.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=10354, hash='RBXAgQ4vMiLAszcX7rfYtg', n_files=None, n_observations=3, branch_id=1, created_on_id=1, space_id=1, storage_id=1, run_id=2, schema_id=None, created_by_id=1, created_at=2026-04-27 13:57:22 UTC, is_locked=False, version_tag=None, is_latest=True); to track this artifact as an input, use: ln.Artifact.get()

Run the function with default params:

output_artifact_key = input_artifact_key.replace(".parquet", "_subsetted.parquet")
subset_dataframe(input_artifact_key, output_artifact_key, subset_rows=1)
Hide code cell output
 ignoring transform with same filename in different folder:
    BEtg2ltMShat0000 → track.ipynb
 created Transform('bmzCfW6Qq0fO0000', key='track.ipynb'), started new Run('AramO3SZQIjwwlMO', entrypoint='subset_dataframe') at 2026-04-27 13:57:39 UTC
→ params: input_artifact_key='my_analysis/dataset.parquet', output_artifact_key='my_analysis/dataset_subsetted.parquet', subset_rows=1, subset_cols=2
 creating new artifact version for key 'my_analysis/dataset_subsetted.parquet' in storage '/home/runner/work/lamindb/lamindb/docs/test-track'

Query for the output:

subsetted_artifact = ln.Artifact.get(key=output_artifact_key)
subsetted_artifact.view_lineage()
Hide code cell output
_images/7f7db3317fae31db44cb36d5b95cc2b5bbf2707e3e1217108350c616940699ed.svg

Re-run the function with a different parameter:

subsetted_artifact = subset_dataframe(
    input_artifact_key, output_artifact_key, subset_cols=3
)
subsetted_artifact = ln.Artifact.get(key=output_artifact_key)
subsetted_artifact.view_lineage()
Hide code cell output
 loaded Transform('bmzCfW6Qq0fO0000', key='track.ipynb'), started new Run('7u1DhZBOoAdL1zLF', entrypoint='subset_dataframe') at 2026-04-27 13:57:40 UTC
→ params: input_artifact_key='my_analysis/dataset.parquet', output_artifact_key='my_analysis/dataset_subsetted.parquet', subset_rows=2, subset_cols=3
 creating new artifact version for key 'my_analysis/dataset_subsetted.parquet' in storage '/home/runner/work/lamindb/lamindb/docs/test-track'
_images/f5ec021e6a9f1ae6d4df38d2586526e444aa51d255c43afe12998097655ee986.svg

We created a new run:

subsetted_artifact.run
Hide code cell output
Run(uid='7u1DhZBOoAdL1zLF', name=None, description=None, entrypoint='subset_dataframe', started_at=2026-04-27 13:57:40 UTC, finished_at=2026-04-27 13:57:41 UTC, params={'input_artifact_key': 'my_analysis/dataset.parquet', 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet', 'subset_rows': 2, 'subset_cols': 3}, reference=None, reference_type=None, cli_args=None, branch_id=1, created_on_id=1, space_id=1, transform_id=7, report_id=None, environment_id=None, plan_id=None, created_by_id=1, initiated_by_run_id=1, created_at=2026-04-27 13:57:40 UTC, is_locked=False)

With new parameters:

subsetted_artifact.run.params
Hide code cell output
{'input_artifact_key': 'my_analysis/dataset.parquet',
 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet',
 'subset_rows': 2,
 'subset_cols': 3}

And a new version of the output artifact:

subsetted_artifact.run.output_artifacts.to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
12 ZqVhHgdcN7Yu8HkP0002 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 4314 FUJlSl4q2wA97xdIAfEyNA None 2 ... True False 2026-04-27 13:57:41.587000+00:00 1 1 1 1 9 None 1

1 rows × 21 columns

In a script

run_script_with_step.py
import argparse
import lamindb as ln


@ln.step()
def subset_dataframe(
    artifact: ln.Artifact,
    subset_rows: int = 2,
    subset_cols: int = 2,
    run: ln.Run | None = None,
) -> ln.Artifact:
    dataset = artifact.load(is_run_input=run)
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    new_key = artifact.key.replace(".parquet", "_subsetted.parquet")
    return ln.Artifact.from_dataframe(new_data, key=new_key, run=run).save()


if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--subset", action="store_true")
    args = p.parse_args()

    params = {"is_subset": args.subset}

    ln.track(params=params)

    if args.subset:
        df = ln.examples.datasets.mini_immuno.get_dataset1(otype="DataFrame")
        artifact = ln.Artifact.from_dataframe(
            df, key="my_analysis/dataset.parquet"
        ).save()
        subsetted_artifact = subset_dataframe(artifact)

    ln.finish()
!python scripts/run_script_with_step.py --subset
Hide code cell output
 connected lamindb: testuser1/test-track
 script invoked with: --subset
 created Transform('sy8fUo2rCZWR0000', key='run_script_with_step.py'), started new Run('exwZ2i3oWN6wx9NE') at 2026-04-27 13:57:42 UTC
→ params: is_subset=True
 recommendation: to identify the script across renames, pass the uid: ln.track("sy8fUo2rCZWR", params={...})
 returning artifact with same hash: Artifact(uid='3t5USfr0z3FvHDyf0000', key='my_analysis/dataset.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=10354, hash='RBXAgQ4vMiLAszcX7rfYtg', n_files=None, n_observations=3, branch_id=1, created_on_id=1, space_id=1, storage_id=1, run_id=2, schema_id=None, created_by_id=1, created_at=2026-04-27 13:57:22 UTC, is_locked=False, version_tag=None, is_latest=True); to track this artifact as an input, use: ln.Artifact.get()
 script invoked with: --subset
 loaded Transform('sy8fUo2rCZWR0000', key='run_script_with_step.py'), started new Run('YLjmY4ltm5FE2vvf', entrypoint='subset_dataframe') at 2026-04-27 13:57:45 UTC
→ params: artifact='Artifact[3t5USfr0z3FvHDyf0000]', subset_rows=2, subset_cols=2
 returning artifact with same hash: Artifact(uid='ZqVhHgdcN7Yu8HkP0000', key='my_analysis/dataset_subsetted.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=3696, hash='mrK4zo5LCWcAYIcQz6myJg', n_files=None, n_observations=2, branch_id=1, created_on_id=1, space_id=1, storage_id=1, run_id=4, schema_id=None, created_by_id=1, created_at=2026-04-27 13:57:27 UTC, is_locked=False, version_tag=None, is_latest=False); to track this artifact as an input, use: ln.Artifact.get()
! you are saving to a non-latest version of the artifact
The database

See the state of the database after we ran these different examples:

ln.view()
Hide code cell output
Artifact
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
12 ZqVhHgdcN7Yu8HkP0002 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 4314 FUJlSl4q2wA97xdIAfEyNA None 2.0 ... True False 2026-04-27 13:57:41.587000+00:00 1 1 1 1 9 None 1
11 ZqVhHgdcN7Yu8HkP0001 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3669 JYhR-3HY-PKvUOlDFPop9g None 1.0 ... False False 2026-04-27 13:57:40.692000+00:00 1 1 1 1 8 None 1
7 tzbU1HH4gPR9vwkM0000 my_analysis/dataset2.parquet None .parquet dataset DataFrame 7054 79Sz30IvW4Jg424PTXRrcg None 3.0 ... True False 2026-04-27 13:57:31.784000+00:00 1 1 1 1 5 None 1
5 ZqVhHgdcN7Yu8HkP0000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3696 mrK4zo5LCWcAYIcQz6myJg None 2.0 ... False False 2026-04-27 13:57:27.459000+00:00 1 1 1 1 4 None 1
3 3t5USfr0z3FvHDyf0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 RBXAgQ4vMiLAszcX7rfYtg None 3.0 ... True False 2026-04-27 13:57:22.508000+00:00 1 1 1 1 2 None 1
1 fPz2blEYkLOWarVD0000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg None NaN ... True False 2026-04-27 13:57:15.848000+00:00 1 1 1 1 1 None 1

6 rows × 21 columns

Feature
uid name _dtype_str unit description array_rank array_size array_shape synonyms default_value ... coerce is_locked is_type created_at branch_id created_on_id space_id created_by_id run_id type_id
id
2 lGyuxjix0HIW experiment cat[Record[thGuHxnZtq5noOOq]] None None 0 0 None None None ... None False False 2026-04-27 13:57:35.279000+00:00 1 1 1 1 1 None
1 pBqJevGDrKEZ s3_folder str None None 0 0 None None None ... None False False 2026-04-27 13:57:35.268000+00:00 1 1 1 1 1 None

2 rows × 21 columns

JsonValue
value hash is_locked created_at branch_id created_on_id space_id created_by_id run_id feature_id
id
1 s3://my-bucket/my-folder E-3iWq1AziFBjh_cbyr5ZA False 2026-04-27 13:57:37.302000+00:00 1 1 1 1 None 1
Project
uid name description abbr url start_date end_date is_locked is_type created_at branch_id created_on_id space_id created_by_id run_id type_id
id
1 EaRtZ2EL3bKD My project None None None None None False False 2026-04-27 13:57:13.316000+00:00 1 1 1 1 None None
Record
uid name description reference reference_type extra_data is_locked is_type created_at branch_id created_on_id space_id created_by_id type_id schema_id run_id
id
2 fUwryeJ1I6R6Hi7h Experiment1 None None None None False False 2026-04-27 13:57:35.260000+00:00 1 1 1 1 1.0 None 1
1 thGuHxnZtq5noOOq Experiment None None None None False True 2026-04-27 13:57:35.253000+00:00 1 1 1 1 NaN None 1
! truncated query result to limit=7 Run objects
Run
uid name description entrypoint started_at finished_at params reference reference_type cli_args ... created_at branch_id created_on_id space_id transform_id report_id environment_id plan_id created_by_id initiated_by_run_id
id
11 YLjmY4ltm5FE2vvf None None subset_dataframe 2026-04-27 13:57:45.292081+00:00 2026-04-27 13:57:46.105888+00:00 {'artifact': 'Artifact[3t5USfr0z3FvHDyf0000]',... None None --subset ... 2026-04-27 13:57:45.293000+00:00 1 1 1 8 NaN NaN None 1 10.0
10 exwZ2i3oWN6wx9NE None None None 2026-04-27 13:57:42.874997+00:00 2026-04-27 13:57:46.107957+00:00 {'is_subset': True} None None --subset ... 2026-04-27 13:57:43.224000+00:00 1 1 1 8 13.0 2.0 None 1 NaN
9 7u1DhZBOoAdL1zLF None None subset_dataframe 2026-04-27 13:57:40.767244+00:00 2026-04-27 13:57:41.595936+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None None ... 2026-04-27 13:57:40.768000+00:00 1 1 1 7 NaN NaN None 1 1.0
8 AramO3SZQIjwwlMO None None subset_dataframe 2026-04-27 13:57:39.858987+00:00 2026-04-27 13:57:40.700971+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None None ... 2026-04-27 13:57:39.860000+00:00 1 1 1 7 NaN NaN None 1 1.0
7 GfqLVT9tyniORuBL None None None 2026-04-27 13:57:36.479554+00:00 2026-04-27 13:57:38.423530+00:00 {'example_param': 42} None None --s3-folder s3://my-bucket/my-folder --experim... ... 2026-04-27 13:57:36.821000+00:00 1 1 1 6 10.0 2.0 None 1 NaN
6 uQVVwWHM68bbLFaC None None None 2026-04-27 13:57:33.412953+00:00 2026-04-27 13:57:34.840568+00:00 {'input_dir': './mydataset', 'learning_rate': ... None None --input-dir ./mydataset --learning-rate 0.01 -... ... 2026-04-27 13:57:33.751000+00:00 1 1 1 5 9.0 2.0 None 1 NaN
5 dLYK0NEQ9CF5FCiI None None main 2026-04-27 13:57:29.414069+00:00 2026-04-27 13:57:31.789881+00:00 {'key': 'my_analysis/dataset2.parquet'} None None --key my_analysis/dataset2.parquet ... 2026-04-27 13:57:29.757000+00:00 1 1 1 4 8.0 2.0 None 1 NaN

7 rows × 21 columns

Storage
uid root description type region instance_uid is_locked created_at branch_id created_on_id space_id created_by_id run_id
id
1 C2g4bO5awSCT /home/runner/work/lamindb/lamindb/docs/test-track None local None 73KPGC58ahU9 False 2026-04-27 13:57:11.886000+00:00 1 1 1 1 None
! truncated query result to limit=7 Transform objects
Transform
uid key description kind source_code hash reference reference_type version_tag is_latest is_locked created_at branch_id created_on_id space_id environment_id plan_id created_by_id
id
8 sy8fUo2rCZWR0000 run_script_with_step.py CLI: run_script_with_step.py script import argparse\nimport lamindb as ln\n\n\n@ln... HJbjZyWWczP-VmzKQsSORg None None None True False 2026-04-27 13:57:42.872000+00:00 1 1 1 None None 1
7 bmzCfW6Qq0fO0000 track.ipynb None function @ln.step()\ndef subset_dataframe(\n input_a... 5kfRAQLCPwxrvAjspfdp2Q None None None True False 2026-04-27 13:57:39.854000+00:00 1 1 1 None None 1
6 BToZpZluOusa0000 run_track_with_features_and_params.py CLI: run_track_with_features_and_params.py script import argparse\nimport lamindb as ln\n\n\nif ... 9MjLyvM1QzE2nPIPDRzBwg None None None True False 2026-04-27 13:57:36.477000+00:00 1 1 1 None None 1
5 h2DmA6dIRHhb0000 run_track_with_params.py CLI: run_track_with_params.py script import argparse\nimport lamindb as ln\n\nif __... 5RBz7zJICeKE1OSmg7gEdQ None None None True False 2026-04-27 13:57:33.410000+00:00 1 1 1 None None 1
4 yQYLC5VFcpwW0000 my_workflow_with_click.py CLI: my_workflow_with_click.py script import click\nimport lamindb as ln\n\n\n@click... 0eX8wmaAWkuuAvACWwL1Xg None None None True False 2026-04-27 13:57:29.264000+00:00 1 1 1 None None 1
3 GGTRHw67pbNY0000 my_workflow_with_step.py None script import lamindb as ln\n\n\n@ln.step()\ndef subs... Ncx6UswxtCN3FZD86kgcVQ None None None True False 2026-04-27 13:57:24.207000+00:00 1 1 1 None None 1
2 iXxjoxmPO2RK0000 my_workflow.py None script import lamindb as ln\n\n\n@ln.step()\ndef inge... uJ3fsnfaNN6EZ7Q0d8SQtw None None None True False 2026-04-27 13:57:18.203000+00:00 1 1 1 None None 1

Using transform versions as templates

A transform acts as a template once you load it with lamin load. Suppose you run:

lamin load https://lamin.ai/account/instance/transform/Akd7gx7Y9oVO0000

When you run the loaded notebook or script, a new version of the transform is created automatically, and you can browse all versions via the version dropdown in the UI.

Additionally, you can:

  • label using ULabel or Record, e.g., transform.records.add(template_label)

  • tag with an indicative version string, e.g., transform.version = "T1"; transform.save()

Saving a notebook as an artifact

Sometimes you might want to save a notebook as an artifact rather than as a transform. Pass --registry artifact to lamin save:

lamin save template1.ipynb --key templates/template1.ipynb --description "Template for analysis type 1" --registry artifact

A few checks at the end of this notebook:

# `run` is the run object created by an earlier example in this notebook
assert run.params == {
    "input_dir": "./mydataset",
    "learning_rate": 0.01,
    "preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}, run.params
assert my_project.artifacts.exists()
assert my_project.transforms.exists()
assert my_project.runs.exists()