Manage notebooks, scripts & workflows

If you don’t have a lamindb instance, here’s how to create one:

!lamin init --storage ./test-track
Hide code cell output
 initialized lamindb: testuser1/test-track

Manage notebooks and scripts

Call track() to save your notebook or script as a transform and start tracking inputs & outputs of a run.

import lamindb as ln

ln.track()  # initiate a tracked notebook/script run

# your code automatically tracks inputs & outputs

ln.finish()  # mark run as finished, save execution report, source code & environment

You'll find your notebooks and scripts in the Transform registry along with pipelines & functions:

transform = ln.Transform.get(key="my_analyses/my_notebook.ipynb")
transform.source_code             # source code
transform.runs.to_dataframe()     # all runs in a dataframe
transform.latest_run.report       # report of latest run
transform.latest_run.environment  # environment of latest run

You can use the CLI to load a transform into your current (development) directory:

lamin load --key my_analyses/my_notebook.ipynb

If your instance is connected to LaminHub, you can search or filter the transform page and explore data lineage.

Here is how you'd load an example notebook from LaminHub into your local directory:

lamin load https://lamin.ai/laminlabs/lamindata/transform/F4L3oC6QsZvQ

Organize local development

If no development directory is set, script & notebook keys equal their filenames. Otherwise, they represent the relative path in the development directory.

The exception is packaged source code, whose keys have the form pypackages/{package_name}/path/to/file.py.
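
To illustrate how keys are derived (with hypothetical paths; lamindb computes this internally):

```python
from pathlib import Path

# Hypothetical paths for illustration; substitute your own dev-dir & files.
dev_dir = Path("/home/user/repos")
notebook = Path("/home/user/repos/my_analyses/my_notebook.ipynb")

# With a development directory set, the key is the path relative to it.
key = notebook.relative_to(dev_dir).as_posix()
print(key)  # my_analyses/my_notebook.ipynb

# Without a development directory, the key is just the filename.
print(notebook.name)  # my_notebook.ipynb
```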

To set the development directory to your shell's current working directory, run:

lamin settings set dev-dir .

You can see the current status by running:

lamin info

Use projects

You can link the entities created during a run to a project.

import lamindb as ln

my_project = ln.Project(name="My project").save()  # create & save a project
ln.track(project="My project")  # pass project
open("sample.fasta", "w").write(">seq1\nACGT\n")  # create a dataset
ln.Artifact("sample.fasta", key="sample.fasta").save()  # auto-labeled by project
Hide code cell output
 connected lamindb: testuser1/test-track
 created Transform('GqSRFu3oF2pG0000', key='track.ipynb'), started new Run('4eRaimHCB7ma3Hmk') at 2026-02-02 09:04:23 UTC
 notebook imports: lamindb==2.1.0
 recommendation: to identify the notebook across renames, pass the uid: ln.track("GqSRFu3oF2pG", project="My project")
Artifact(uid='ZBUW3Q0sRjiMmylP0000', version_tag=None, is_latest=True, key='sample.fasta', description=None, suffix='.fasta', kind=None, otype=None, size=11, hash='83rEPcAoBHmYiIuyBYrFKg', n_files=None, n_observations=None, branch_id=1, space_id=1, storage_id=3, run_id=1, schema_id=None, created_by_id=3, created_at=2026-02-02 09:04:25 UTC, is_locked=False)

Filter entities by project, e.g., artifacts:

ln.Artifact.filter(projects=my_project).to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
1 ZBUW3Q0sRjiMmylP0000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg None None None True False 2026-02-02 09:04:25.201000+00:00 1 1 3 1 None 3

Access entities linked to a project:

my_project.artifacts.to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
1 ZBUW3Q0sRjiMmylP0000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg None None None True False 2026-02-02 09:04:25.201000+00:00 1 1 3 1 None 3

The same works for my_project.transforms or my_project.runs.

Use spaces

You can write the entities created during a run into a space that you configure on LaminHub. This is particularly useful if you want to restrict access to a space. Note that this doesn’t affect bionty entities, which should typically remain commonly accessible.

ln.track(space="Our team space")

Sync code with git

To sync scripts or workflows with their corresponding files in a git repo, either export an environment variable:

export LAMINDB_SYNC_GIT_REPO=<YOUR-GIT-REPO-URL>

Or set the following setting:

ln.settings.sync_git_repo = <YOUR-GIT-REPO-URL>

If you work on a single project in your lamindb instance, it makes sense to set LaminDB’s dev-dir to the root of the local git repo clone.

dbs/
  project1/
    .git/
    script1.py
    notebook1.ipynb
  ...

If you work on multiple projects in your lamindb instance, you can use the dev-dir as the local root and nest git repositories in it.

dbs/
  database1/
    repo1/
      .git/
    repo2/
      .git/
  ...

Manage workflows

Here we’ll manage workflows with lamindb’s flow() and step() decorators, which work out-of-the-box with the majority of Python workflow managers:

| tool | workflow decorator | step/task decorator | notes |
|---|---|---|---|
| lamindb | @flow | @step | inspired by prefect |
| prefect | @flow | @task | two decorators |
| redun | @task (on main) | @task | single decorator for everything |
| dagster | @job or @asset | @op or @asset | asset-centric; @asset is primary |
| flyte | @workflow | @task | also @dynamic for runtime DAGs |
| airflow | @dag | @task | TaskFlow API (modern); also supports operators |
| zenml | @pipeline | @step | inspired by prefect |

If you’re looking for more in-depth examples or for integrating with non-decorator-based workflow managers such as Nextflow or Snakemake, see Manage computational pipelines.

| tool | workflow | step/task | notes |
|---|---|---|---|
| nextflow | workflow keyword | process keyword | groovy-based DSL |
| snakemake | rule keyword | rule keyword | file-based DSL |
| metaflow | FlowSpec | @step | class-based |
| kedro | Pipeline() | node() | function-based |
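
To build intuition for what such decorators record, here is a minimal pure-Python sketch of parameter capture. This is not lamindb's implementation, just the general mechanism of binding arguments (including defaults) before the function runs:

```python
import functools
import inspect

# Minimal sketch of how a step-style decorator can capture bound parameter
# values, including defaults -- the general mechanism, not lamindb's code.
def step(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = inspect.signature(func).bind(*args, **kwargs)
        bound.apply_defaults()
        wrapper.last_params = dict(bound.arguments)  # what a tracker would store
        return func(*args, **kwargs)
    return wrapper

@step
def subset(key: str, subset_rows: int = 2) -> str:
    return f"{key}[:{subset_rows}]"

subset("my_analysis/dataset.parquet", subset_rows=1)
print(subset.last_params)  # {'key': 'my_analysis/dataset.parquet', 'subset_rows': 1}
```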

A one-step workflow

Decorate a function with flow() to track it as a workflow:

my_workflow.py
import lamindb as ln


@ln.flow()
def ingest_dataset(key: str) -> ln.Artifact:
    df = ln.examples.datasets.mini_immuno.get_dataset1()
    artifact = ln.Artifact.from_dataframe(df, key=key).save()
    return artifact


if __name__ == "__main__":
    ingest_dataset(key="my_analysis/dataset.parquet")

Let’s run the workflow:

!python scripts/my_workflow.py
Hide code cell output
 connected lamindb: testuser1/test-track
 created Transform('KYWVhGrdlkjh0000', key='my_workflow.py'), started new Run('UUvwYh6Skd7lk2qO') at 2026-02-02 09:04:27 UTC
→ params: key='my_analysis/dataset.parquet'
 recommendation: to identify the script across renames, pass the uid: ln.track("KYWVhGrdlkjh", params={...})
 writing the in-memory object into cache

Query the workflow via its filename:

transform = ln.Transform.get(key="my_workflow.py")
transform.describe()
Hide code cell output
Transform: my_workflow.py (0000)
├── uid: KYWVhGrdlkjh0000                                     
hash: uJ3fsnfaNN6EZ7Q0d8SQtw         type: script         
branch: main                         space: all           
created_at: 2026-02-02 09:04:27 UTC  created_by: testuser1
└── source_code: 
    import lamindb as ln
    
    
    @ln.flow()
    def ingest_dataset(key: str) -> ln.Artifact:
        df = ln.examples.datasets.mini_immuno.get_dataset1()
        artifact = ln.Artifact.from_dataframe(df, key=key).save()
        return artifact
    
    
    if __name__ == "__main__":
        ingest_dataset(key="my_analysis/dataset.parquet")

The run stored the parameter value for key:

transform.latest_run.describe()
Hide code cell output
Run: UUvwYh6 (my_workflow.py)
├── uid: UUvwYh6Skd7lk2qO                transform: my_workflow.py (0000)    
started_at: 2026-02-02 09:04:27 UTC  finished_at: 2026-02-02 09:04:28 UTC
status: completed                                                        
branch: main                         space: all                          
created_at: 2026-02-02 09:04:27 UTC  created_by: testuser1               
├── report: XBZvSUN
→ connected lamindb: testuser1/test-track
→ created Transform('KYWVhGrdlkjh0000', key='my_workflow.py'), started new Run(' …
→ params: key='my_analysis/dataset.parquet'
• recommendation: to identify the script across renames, pass the uid: ln.track( …
│ …
├── environment: dNtQwru
aiobotocore==2.26.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.3
aioitertools==0.13.0
│ …
└── Params
    └── key: my_analysis/dataset.parquet

It links output artifacts:

transform.latest_run.output_artifacts.to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
3 dUV2fQQCkIjI0lbJ0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 xnLdi2kUCdOAe61uR8O7CA None 3 None True False 2026-02-02 09:04:28.832000+00:00 1 1 3 2 None 3

You can query for all runs that ran with that parameter:

ln.Run.filter(
    params__key="my_analysis/dataset.parquet",
).to_dataframe()
Hide code cell output
uid name entrypoint started_at finished_at params reference reference_type cli_args is_locked created_at branch_id space_id transform_id report_id environment_id created_by_id initiated_by_run_id
id
2 UUvwYh6Skd7lk2qO None ingest_dataset 2026-02-02 09:04:27.676842+00:00 2026-02-02 09:04:28.836383+00:00 {'key': 'my_analysis/dataset.parquet'} None None None False 2026-02-02 09:04:27.678000+00:00 1 1 2 4 2 3 None

You can also pass complex parameters and features, see: Track parameters & features.

A multi-step workflow

Here, the workflow calls an additional processing step:

my_workflow_with_step.py
import lamindb as ln


@ln.step()
def subset_dataframe(
    artifact: ln.Artifact,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> ln.Artifact:
    df = artifact.load()
    new_data = df.iloc[:subset_rows, :subset_cols]
    new_key = artifact.key.replace(".parquet", "_subsetted.parquet")
    return ln.Artifact.from_dataframe(new_data, key=new_key).save()


@ln.flow()
def ingest_dataset(key: str, subset: bool = False) -> ln.Artifact:
    df = ln.examples.datasets.mini_immuno.get_dataset1()
    artifact = ln.Artifact.from_dataframe(df, key=key).save()
    if subset:
        artifact = subset_dataframe(artifact)
    return artifact


if __name__ == "__main__":
    ingest_dataset(key="my_analysis/dataset.parquet", subset=True)

Let’s run the workflow:

!python scripts/my_workflow_with_step.py
Hide code cell output
 connected lamindb: testuser1/test-track
 created Transform('EyLEa4lR3VVR0000', key='my_workflow_with_step.py'), started new Run('O1zaOPB2uoWFM4XZ') at 2026-02-02 09:04:31 UTC
→ params: key='my_analysis/dataset.parquet', subset=True
 recommendation: to identify the script across renames, pass the uid: ln.track("EyLEa4lR3VVR", params={...})
 writing the in-memory object into cache
 returning artifact with same hash: Artifact(uid='dUV2fQQCkIjI0lbJ0000', version_tag=None, is_latest=True, key='my_analysis/dataset.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=10354, hash='xnLdi2kUCdOAe61uR8O7CA', n_files=None, n_observations=3, branch_id=1, space_id=1, storage_id=3, run_id=2, schema_id=None, created_by_id=3, created_at=2026-02-02 09:04:28 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()
 loaded Transform('EyLEa4lR3VVR0000', key='my_workflow_with_step.py'), started new Run('TuOvcqvvFQiulTbH') at 2026-02-02 09:04:32 UTC
→ params: artifact='Artifact[dUV2fQQCkIjI0lbJ0000]', subset_rows=2, subset_cols=2
 recommendation: to identify the script across renames, pass the uid: ln.track("EyLEa4lR3VVR", params={...})
 writing the in-memory object into cache

The lineage of the subsetted artifact resolves the subsetting step:

subsetted_artifact = ln.Artifact.get(key="my_analysis/dataset_subsetted.parquet")
subsetted_artifact.view_lineage()
Hide code cell output
(lineage graph)

This is the run that created the subsetted_artifact:

subsetted_artifact.run
Hide code cell output
Run(uid='TuOvcqvvFQiulTbH', name=None, entrypoint='subset_dataframe', started_at=2026-02-02 09:04:32 UTC, finished_at=2026-02-02 09:04:33 UTC, params={'artifact': 'Artifact[dUV2fQQCkIjI0lbJ0000]', 'subset_rows': 2, 'subset_cols': 2}, reference=None, reference_type=None, cli_args=None, branch_id=1, space_id=1, transform_id=3, report_id=None, environment_id=2, created_by_id=3, initiated_by_run_id=3, created_at=2026-02-02 09:04:32 UTC, is_locked=False)

This is the initiating run that triggered the function call:

subsetted_artifact.run.initiated_by_run
Hide code cell output
Run(uid='O1zaOPB2uoWFM4XZ', name=None, entrypoint='ingest_dataset', started_at=2026-02-02 09:04:31 UTC, finished_at=2026-02-02 09:04:33 UTC, params={'key': 'my_analysis/dataset.parquet', 'subset': True}, reference=None, reference_type=None, cli_args=None, branch_id=1, space_id=1, transform_id=3, report_id=6, environment_id=2, created_by_id=3, initiated_by_run_id=None, created_at=2026-02-02 09:04:31 UTC, is_locked=False)

These are the parameters of the run:

subsetted_artifact.run.params
Hide code cell output
{'artifact': 'Artifact[dUV2fQQCkIjI0lbJ0000]',
 'subset_rows': 2,
 'subset_cols': 2}

These are the input artifacts:

subsetted_artifact.run.input_artifacts.to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
3 dUV2fQQCkIjI0lbJ0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 xnLdi2kUCdOAe61uR8O7CA None 3 None True False 2026-02-02 09:04:28.832000+00:00 1 1 3 2 None 3

These are the output artifacts:

subsetted_artifact.run.output_artifacts.to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
5 Ox8uaCIVr0VnZKHN0000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3696 IHbRUtKzvpoCjL0E6F72qw None 2 None True False 2026-02-02 09:04:33.680000+00:00 1 1 3 4 None 3

A workflow with CLI arguments

Let’s use click to parse CLI arguments:

my_workflow_with_click.py
import click
import lamindb as ln


@click.command()
@click.option("--key", required=True)
@ln.flow()
def main(key: str):
    df = ln.examples.datasets.mini_immuno.get_dataset2()
    ln.Artifact.from_dataframe(df, key=key).save()


if __name__ == "__main__":
    main()

Let’s run the workflow:

!python scripts/my_workflow_with_click.py --key my_analysis/dataset2.parquet
Hide code cell output
 connected lamindb: testuser1/test-track
 script invoked with: --key my_analysis/dataset2.parquet
 created Transform('bDpEWkJtV6Xe0000', key='my_workflow_with_click.py'), started new Run('3aTAu3ngnVYVVhOZ') at 2026-02-02 09:04:36 UTC
→ params: key='my_analysis/dataset2.parquet'
 recommendation: to identify the script across renames, pass the uid: ln.track("bDpEWkJtV6Xe", params={...})
 writing the in-memory object into cache

CLI arguments are tracked and accessible via run.cli_args:

run = ln.Run.filter(transform__key="my_workflow_with_click.py").first()
run.describe()
Hide code cell output
Run: 3aTAu3n (my_workflow_with_click.py)
├── uid: 3aTAu3ngnVYVVhOZ                transform: my_workflow_with_click.py (0000)    
                                     |   description: CLI: my_workflow_with_click.py
started_at: 2026-02-02 09:04:36 UTC  finished_at: 2026-02-02 09:04:37 UTC           
status: completed                                                                   
branch: main                         space: all                                     
created_at: 2026-02-02 09:04:36 UTC  created_by: testuser1                          
├── cli_args: 
--key my_analysis/dataset2.parquet
├── report: afuT48E
→ connected lamindb: testuser1/test-track
→ created Transform('bDpEWkJtV6Xe0000', key='my_workflow_with_click.py'), starte …
→ params: key='my_analysis/dataset2.parquet'
• recommendation: to identify the script across renames, pass the uid: ln.track( …
│ …
├── environment: dNtQwru
aiobotocore==2.26.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.3
aioitertools==0.13.0
│ …
└── Params
    └── key: my_analysis/dataset2.parquet

Note that it doesn’t matter whether you use click, argparse, or any other CLI argument parser.
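
For instance, the click-based option parsing above could equally be written with argparse; only the parsing changes, and @ln.flow() would decorate main() the same way (a sketch with the lamindb calls omitted):

```python
import argparse

# argparse equivalent of the click example's CLI parsing; the decorated
# workflow function would consume args.key (lamindb calls omitted here).
def parse_args(argv=None):
    p = argparse.ArgumentParser()
    p.add_argument("--key", required=True)
    return p.parse_args(argv)

args = parse_args(["--key", "my_analysis/dataset2.parquet"])
print(args.key)  # my_analysis/dataset2.parquet
```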

Track parameters & features

We just saw that the function decorators @ln.flow() and @ln.step() track parameter values automatically. Here is how to pass parameters to ln.track():

run_track_with_params.py
import argparse
import lamindb as ln

if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--input-dir", type=str)
    p.add_argument("--downsample", action="store_true")
    p.add_argument("--learning-rate", type=float)
    args = p.parse_args()
    params = {
        "input_dir": args.input_dir,
        "learning_rate": args.learning_rate,
        "preprocess_params": {
            "downsample": args.downsample,
            "normalization": "the_good_one",
        },
    }
    ln.track(params=params)

    # your code

    ln.finish()

Run the script:

!python scripts/run_track_with_params.py  --input-dir ./mydataset --learning-rate 0.01 --downsample
Hide code cell output
 connected lamindb: testuser1/test-track
 script invoked with: --input-dir ./mydataset --learning-rate 0.01 --downsample
 created Transform('YclFwMY23Khi0000', key='run_track_with_params.py'), started new Run('XZ9A4lXtuyZR8FvM') at 2026-02-02 09:04:41 UTC
→ params: input_dir='./mydataset', learning_rate=0.01, preprocess_params={'downsample': True, 'normalization': 'the_good_one'}
 recommendation: to identify the script across renames, pass the uid: ln.track("YclFwMY23Khi", params={...})

Query for all runs that match certain parameters:

ln.Run.filter(
    params__learning_rate=0.01,
    params__preprocess_params__downsample=True,
).to_dataframe()
Hide code cell output
uid name entrypoint started_at finished_at params reference reference_type cli_args is_locked created_at branch_id space_id transform_id report_id environment_id created_by_id initiated_by_run_id
id
6 XZ9A4lXtuyZR8FvM None None 2026-02-02 09:04:41.034414+00:00 2026-02-02 09:04:42.175919+00:00 {'input_dir': './mydataset', 'learning_rate': ... None None --input-dir ./mydataset --learning-rate 0.01 -... False 2026-02-02 09:04:41.035000+00:00 1 1 5 9 2 3 None
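
The double-underscore syntax drills into the stored params dict. Conceptually, the lookup works like this pure-Python sketch (not lamindb's query engine):

```python
# Sketch of double-underscore lookup semantics: each "__"-separated part
# descends one level into the nested params dict.
def matches(params: dict, lookup: str, value) -> bool:
    node = params
    for part in lookup.split("__"):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return node == value

params = {
    "input_dir": "./mydataset",
    "learning_rate": 0.01,
    "preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}
print(matches(params, "preprocess_params__downsample", True))  # True
print(matches(params, "learning_rate", 0.01))                  # True
print(matches(params, "preprocess_params__missing", True))     # False
```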

Describe & get parameters:

run = ln.Run.filter(params__learning_rate=0.01).order_by("-started_at").first()
run.describe()
run.params
Hide code cell output
Run: XZ9A4lX (run_track_with_params.py)
├── uid: XZ9A4lXtuyZR8FvM                transform: run_track_with_params.py (0000)    
                                     |   description: CLI: run_track_with_params.py
started_at: 2026-02-02 09:04:41 UTC  finished_at: 2026-02-02 09:04:42 UTC          
status: completed                                                                  
branch: main                         space: all                                    
created_at: 2026-02-02 09:04:41 UTC  created_by: testuser1                         
├── cli_args: 
--input-dir ./mydataset --learning-rate 0.01 --downsample
├── report: vZpTg78
→ connected lamindb: testuser1/test-track
→ created Transform('YclFwMY23Khi0000', key='run_track_with_params.py'), started …
→ params: input_dir='./mydataset', learning_rate=0.01, preprocess_params={'downs …
• recommendation: to identify the script across renames, pass the uid: ln.track( …
├── environment: dNtQwru
aiobotocore==2.26.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.3
aioitertools==0.13.0
│ …
└── Params
    ├── input_dir: ./mydataset
    ├── learning_rate: 0.01
    └── preprocess_params: {'downsample': True, 'normalization': 'the_good_one'}
{'input_dir': './mydataset',
 'learning_rate': 0.01,
 'preprocess_params': {'downsample': True, 'normalization': 'the_good_one'}}

You can also access the CLI arguments used to start the run directly:

run.cli_args
Hide code cell output
'--input-dir ./mydataset --learning-rate 0.01 --downsample'

You can also track run features in analogy to artifact features.

In contrast to params, features are validated against the Feature registry and allow you to express relationships with entities in your registries.

Let’s first define labels & features.

experiment_type = ln.Record(name="Experiment", is_type=True).save()
experiment_label = ln.Record(name="Experiment1", type=experiment_type).save()
ln.Feature(name="s3_folder", dtype=str).save()
ln.Feature(name="experiment", dtype=experiment_type).save()
Hide code cell output
Feature(uid='4tk4kl7CDL5F', is_type=False, name='experiment', _dtype_str='cat[Record[xjhPFj0N2yZsHOB4]]', unit=None, description=None, array_rank=0, array_size=0, array_shape=None, synonyms=None, default_value=None, nullable=True, coerce=None, branch_id=1, space_id=1, created_by_id=3, run_id=1, type_id=None, created_at=2026-02-02 09:04:42 UTC, is_locked=False)
Now run a script that passes both features and params:

!python scripts/run_track_with_features_and_params.py  --s3-folder s3://my-bucket/my-folder --experiment Experiment1
Hide code cell output
 connected lamindb: testuser1/test-track
 script invoked with: --s3-folder s3://my-bucket/my-folder --experiment Experiment1
 created Transform('K7ZABrKhUE3f0000', key='run_track_with_features_and_params.py'), started new Run('IDshWVEdnlIBADr7') at 2026-02-02 09:04:45 UTC
→ params: example_param=42
→ features: s3_folder='s3://my-bucket/my-folder', experiment='Experiment1'
 recommendation: to identify the script across renames, pass the uid: ln.track("K7ZABrKhUE3f", params={...})
Filter runs by feature value:

ln.Run.filter(s3_folder="s3://my-bucket/my-folder").to_dataframe()
Hide code cell output
uid name entrypoint started_at finished_at params reference reference_type cli_args is_locked created_at branch_id space_id transform_id report_id environment_id created_by_id initiated_by_run_id
id
7 IDshWVEdnlIBADr7 None None 2026-02-02 09:04:45.436744+00:00 2026-02-02 09:04:46.804844+00:00 {'example_param': 42} None None --s3-folder s3://my-bucket/my-folder --experim... False 2026-02-02 09:04:45.438000+00:00 1 1 6 10 2 3 None

Describe & get feature values.

run2 = ln.Run.filter(
    s3_folder="s3://my-bucket/my-folder", experiment="Experiment1"
).last()
run2.describe()
run2.features.get_values()
Hide code cell output
Run: IDshWVE (run_track_with_features_and_params.py)
├── uid: IDshWVEdnlIBADr7                transform: run_track_with_features_and_params.py (0000)    
                                     |   description: CLI: run_track_with_features_and_params.py
started_at: 2026-02-02 09:04:45 UTC  finished_at: 2026-02-02 09:04:46 UTC                       
status: completed                                                                               
branch: main                         space: all                                                 
created_at: 2026-02-02 09:04:45 UTC  created_by: testuser1                                      
├── cli_args: 
--s3-folder s3://my-bucket/my-folder --experiment Experiment1
├── report: T7BXmZL
→ connected lamindb: testuser1/test-track
→ created Transform('K7ZABrKhUE3f0000', key='run_track_with_features_and_params. …
→ params: example_param=42
→ features: s3_folder='s3://my-bucket/my-folder', experiment='Experiment1'
│ …
├── environment: dNtQwru
aiobotocore==2.26.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.3
aioitertools==0.13.0
│ …
├── Params
│   └── example_param: 42
└── Features
    └── experiment                     Record[Experiment]                   Experiment1                            
        s3_folder                      str                                  s3://my-bucket/my-folder               
{'experiment': 'Experiment1', 's3_folder': 's3://my-bucket/my-folder'}

Manage functions in scripts and notebooks

If you want more fine-grained data lineage tracking in a script or notebook where you called ln.track(), you can also use the step() decorator.

In a notebook

@ln.step()
def subset_dataframe(
    input_artifact_key: str,
    output_artifact_key: str,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> None:
    artifact = ln.Artifact.get(key=input_artifact_key)
    dataset = artifact.load()
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    ln.Artifact.from_dataframe(new_data, key=output_artifact_key).save()

Prepare a test dataset:

df = ln.examples.datasets.mini_immuno.get_dataset1(otype="DataFrame")
input_artifact_key = "my_analysis/dataset.parquet"
artifact = ln.Artifact.from_dataframe(df, key=input_artifact_key).save()
Hide code cell output
 writing the in-memory object into cache
 returning artifact with same hash: Artifact(uid='dUV2fQQCkIjI0lbJ0000', version_tag=None, is_latest=True, key='my_analysis/dataset.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=10354, hash='xnLdi2kUCdOAe61uR8O7CA', n_files=None, n_observations=3, branch_id=1, space_id=1, storage_id=3, run_id=2, schema_id=None, created_by_id=3, created_at=2026-02-02 09:04:28 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()

Run the function with default params:

output_artifact_key = input_artifact_key.replace(".parquet", "_subsetted.parquet")
subset_dataframe(input_artifact_key, output_artifact_key, subset_rows=1)
Hide code cell output
 ignoring transform with same filename in different folder:
    GqSRFu3oF2pG0000 → track.ipynb
 created Transform('zJUcL6jNopDr0000', key='track.ipynb'), started new Run('0HMyp8RSLC6xXIm9') at 2026-02-02 09:04:47 UTC
→ params: input_artifact_key='my_analysis/dataset.parquet', output_artifact_key='my_analysis/dataset_subsetted.parquet', subset_rows=1, subset_cols=2
 writing the in-memory object into cache
 creating new artifact version for key 'my_analysis/dataset_subsetted.parquet' in storage '/home/runner/work/lamindb/lamindb/docs/test-track'

Query for the output:

subsetted_artifact = ln.Artifact.get(key=output_artifact_key)
subsetted_artifact.view_lineage()
Hide code cell output
(lineage graph)

Re-run the function with a different parameter:

subsetted_artifact = subset_dataframe(
    input_artifact_key, output_artifact_key, subset_cols=3
)
subsetted_artifact = ln.Artifact.get(key=output_artifact_key)
subsetted_artifact.view_lineage()
Hide code cell output
 loaded Transform('zJUcL6jNopDr0000', key='track.ipynb'), started new Run('tKEL0nC8EEAEMnr4') at 2026-02-02 09:04:48 UTC
→ params: input_artifact_key='my_analysis/dataset.parquet', output_artifact_key='my_analysis/dataset_subsetted.parquet', subset_rows=2, subset_cols=3
 writing the in-memory object into cache
 creating new artifact version for key 'my_analysis/dataset_subsetted.parquet' in storage '/home/runner/work/lamindb/lamindb/docs/test-track'
(lineage graph)

We created a new run:

subsetted_artifact.run
Hide code cell output
Run(uid='tKEL0nC8EEAEMnr4', name=None, entrypoint='subset_dataframe', started_at=2026-02-02 09:04:48 UTC, finished_at=2026-02-02 09:04:49 UTC, params={'input_artifact_key': 'my_analysis/dataset.parquet', 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet', 'subset_rows': 2, 'subset_cols': 3}, reference=None, reference_type=None, cli_args=None, branch_id=1, space_id=1, transform_id=7, report_id=None, environment_id=None, created_by_id=3, initiated_by_run_id=1, created_at=2026-02-02 09:04:48 UTC, is_locked=False)

With new parameters:

subsetted_artifact.run.params
Hide code cell output
{'input_artifact_key': 'my_analysis/dataset.parquet',
 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet',
 'subset_rows': 2,
 'subset_cols': 3}

And a new version of the output artifact:

subsetted_artifact.run.output_artifacts.to_dataframe()
Hide code cell output
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
12 Ox8uaCIVr0VnZKHN0002 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 4314 PwHr73jTbnTE-aEhk75hag None 2 None True False 2026-02-02 09:04:49.079000+00:00 1 1 3 9 None 3

In a script

run_script_with_step.py
import argparse
import lamindb as ln


@ln.step()
def subset_dataframe(
    artifact: ln.Artifact,
    subset_rows: int = 2,
    subset_cols: int = 2,
    run: ln.Run | None = None,
) -> ln.Artifact:
    dataset = artifact.load(is_run_input=run)
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    new_key = artifact.key.replace(".parquet", "_subsetted.parquet")
    return ln.Artifact.from_dataframe(new_data, key=new_key, run=run).save()


if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--subset", action="store_true")
    args = p.parse_args()

    params = {"is_subset": args.subset}

    ln.track(params=params)

    if args.subset:
        df = ln.examples.datasets.mini_immuno.get_dataset1(otype="DataFrame")
        artifact = ln.Artifact.from_dataframe(
            df, key="my_analysis/dataset.parquet"
        ).save()
        subsetted_artifact = subset_dataframe(artifact)

    ln.finish()
Run the script:

!python scripts/run_script_with_step.py --subset
Hide code cell output
 connected lamindb: testuser1/test-track
 script invoked with: --subset
 created Transform('y2J2nRqJCsNu0000', key='run_script_with_step.py'), started new Run('mv2aBFsnXopmPRUS') at 2026-02-02 09:04:51 UTC
→ params: is_subset=True
 recommendation: to identify the script across renames, pass the uid: ln.track("y2J2nRqJCsNu", params={...})
 writing the in-memory object into cache
 returning artifact with same hash: Artifact(uid='dUV2fQQCkIjI0lbJ0000', version_tag=None, is_latest=True, key='my_analysis/dataset.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=10354, hash='xnLdi2kUCdOAe61uR8O7CA', n_files=None, n_observations=3, branch_id=1, space_id=1, storage_id=3, run_id=2, schema_id=None, created_by_id=3, created_at=2026-02-02 09:04:28 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()
 script invoked with: --subset
 loaded Transform('y2J2nRqJCsNu0000', key='run_script_with_step.py'), started new Run('DGWYRc8ICFBvG7CR') at 2026-02-02 09:04:53 UTC
→ params: artifact='Artifact[dUV2fQQCkIjI0lbJ0000]', subset_rows=2, subset_cols=2
 recommendation: to identify the script across renames, pass the uid: ln.track("y2J2nRqJCsNu", params={...})
 writing the in-memory object into cache
 returning artifact with same hash: Artifact(uid='Ox8uaCIVr0VnZKHN0000', version_tag=None, is_latest=False, key='my_analysis/dataset_subsetted.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=3696, hash='IHbRUtKzvpoCjL0E6F72qw', n_files=None, n_observations=2, branch_id=1, space_id=1, storage_id=3, run_id=4, schema_id=None, created_by_id=3, created_at=2026-02-02 09:04:33 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()
View the contents of the instance:

ln.view()
Hide code cell output
Artifact
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
12 Ox8uaCIVr0VnZKHN0002 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 4314 PwHr73jTbnTE-aEhk75hag None 2.0 None True False 2026-02-02 09:04:49.079000+00:00 1 1 3 9 None 3
11 Ox8uaCIVr0VnZKHN0001 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3669 mmgoaxh-b6n5jIsE-MPRlQ None 1.0 None False False 2026-02-02 09:04:48.195000+00:00 1 1 3 8 None 3
7 oTxShqMTJRVwnd5b0000 my_analysis/dataset2.parquet None .parquet dataset DataFrame 7054 zuZ1rL4nKln6JhkAMpzocQ None 3.0 None True False 2026-02-02 09:04:37.954000+00:00 1 1 3 5 None 3
5 Ox8uaCIVr0VnZKHN0000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3696 IHbRUtKzvpoCjL0E6F72qw None 2.0 None False False 2026-02-02 09:04:33.680000+00:00 1 1 3 4 None 3
3 dUV2fQQCkIjI0lbJ0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 xnLdi2kUCdOAe61uR8O7CA None 3.0 None True False 2026-02-02 09:04:28.832000+00:00 1 1 3 2 None 3
1 ZBUW3Q0sRjiMmylP0000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg None NaN None True False 2026-02-02 09:04:25.201000+00:00 1 1 3 1 None 3
Feature
uid name _dtype_str unit description array_rank array_size array_shape synonyms default_value nullable coerce is_locked is_type created_at branch_id space_id created_by_id run_id type_id
id
2 4tk4kl7CDL5F experiment cat[Record[xjhPFj0N2yZsHOB4]] None None 0 0 None None None True None False False 2026-02-02 09:04:42.793000+00:00 1 1 3 1 None
1 PwJagoeIZSyz s3_folder str None None 0 0 None None None True None False False 2026-02-02 09:04:42.784000+00:00 1 1 3 1 None
JsonValue
value hash is_locked created_at branch_id space_id created_by_id run_id feature_id
id
1 s3://my-bucket/my-folder E-3iWq1AziFBjh_cbyr5ZA False 2026-02-02 09:04:45.684000+00:00 1 1 3 None 1
Project
uid name description abbr url start_date end_date is_locked is_type created_at branch_id space_id created_by_id run_id type_id
id
1 b2Ls25myGJOQ My project None None None None None False False 2026-02-02 09:04:22.631000+00:00 1 1 3 None None
Record
uid name description reference reference_type extra_data is_locked is_type created_at branch_id space_id created_by_id type_id schema_id run_id
id
2 UEXCKDd4Q4OBFhMj Experiment1 None None None None False False 2026-02-02 09:04:42.776000+00:00 1 1 3 1.0 None 1
1 xjhPFj0N2yZsHOB4 Experiment None None None None False True 2026-02-02 09:04:42.770000+00:00 1 1 3 NaN None 1
Run
uid name entrypoint started_at finished_at params reference reference_type cli_args is_locked created_at branch_id space_id transform_id report_id environment_id created_by_id initiated_by_run_id
id
11 DGWYRc8ICFBvG7CR None subset_dataframe 2026-02-02 09:04:53.036178+00:00 2026-02-02 09:04:53.878222+00:00 {'artifact': 'Artifact[dUV2fQQCkIjI0lbJ0000]',... None None --subset False 2026-02-02 09:04:53.037000+00:00 1 1 8 NaN 2.0 3 10.0
10 mv2aBFsnXopmPRUS None None 2026-02-02 09:04:51.867653+00:00 2026-02-02 09:04:53.880440+00:00 {'is_subset': True} None None --subset False 2026-02-02 09:04:51.869000+00:00 1 1 8 13.0 2.0 3 NaN
9 tKEL0nC8EEAEMnr4 None subset_dataframe 2026-02-02 09:04:48.269396+00:00 2026-02-02 09:04:49.088670+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None None False 2026-02-02 09:04:48.270000+00:00 1 1 7 NaN NaN 3 1.0
8 0HMyp8RSLC6xXIm9 None subset_dataframe 2026-02-02 09:04:47.385920+00:00 2026-02-02 09:04:48.204813+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None None False 2026-02-02 09:04:47.386000+00:00 1 1 7 NaN NaN 3 1.0
7 IDshWVEdnlIBADr7 None None 2026-02-02 09:04:45.436744+00:00 2026-02-02 09:04:46.804844+00:00 {'example_param': 42} None None --s3-folder s3://my-bucket/my-folder --experim... False 2026-02-02 09:04:45.438000+00:00 1 1 6 10.0 2.0 3 NaN
6 XZ9A4lXtuyZR8FvM None None 2026-02-02 09:04:41.034414+00:00 2026-02-02 09:04:42.175919+00:00 {'input_dir': './mydataset', 'learning_rate': ... None None --input-dir ./mydataset --learning-rate 0.01 -... False 2026-02-02 09:04:41.035000+00:00 1 1 5 9.0 2.0 3 NaN
5 3aTAu3ngnVYVVhOZ None main 2026-02-02 09:04:36.797605+00:00 2026-02-02 09:04:37.960115+00:00 {'key': 'my_analysis/dataset2.parquet'} None None --key my_analysis/dataset2.parquet False 2026-02-02 09:04:36.799000+00:00 1 1 4 8.0 2.0 3 NaN
Storage
uid root description type region instance_uid is_locked created_at branch_id space_id created_by_id run_id
id
3 ITfIVTF2OdiE /home/runner/work/lamindb/lamindb/docs/test-track None local None 73KPGC58ahU9 False 2026-02-02 09:04:19.371000+00:00 1 1 3 None
Transform
uid key description kind source_code hash reference reference_type version_tag is_latest is_locked created_at branch_id space_id environment_id created_by_id
id
8 y2J2nRqJCsNu0000 run_script_with_step.py CLI: run_script_with_step.py script import argparse\nimport lamindb as ln\n\n\n@ln... HJbjZyWWczP-VmzKQsSORg None None None True False 2026-02-02 09:04:51.865000+00:00 1 1 None 3
7 zJUcL6jNopDr0000 track.ipynb None function @ln.step()\ndef subset_dataframe(\n input_a... 5kfRAQLCPwxrvAjspfdp2Q None None None True False 2026-02-02 09:04:47.381000+00:00 1 1 None 3
6 K7ZABrKhUE3f0000 run_track_with_features_and_params.py CLI: run_track_with_features_and_params.py script import argparse\nimport lamindb as ln\n\n\nif ... 9MjLyvM1QzE2nPIPDRzBwg None None None True False 2026-02-02 09:04:45.434000+00:00 1 1 None 3
5 YclFwMY23Khi0000 run_track_with_params.py CLI: run_track_with_params.py script import argparse\nimport lamindb as ln\n\nif __... 5RBz7zJICeKE1OSmg7gEdQ None None None True False 2026-02-02 09:04:41.032000+00:00 1 1 None 3
4 bDpEWkJtV6Xe0000 my_workflow_with_click.py CLI: my_workflow_with_click.py script import click\nimport lamindb as ln\n\n\n@click... 0eX8wmaAWkuuAvACWwL1Xg None None None True False 2026-02-02 09:04:36.793000+00:00 1 1 None 3
3 EyLEa4lR3VVR0000 my_workflow_with_step.py None script import lamindb as ln\n\n\[email protected]()\ndef subs... Ncx6UswxtCN3FZD86kgcVQ None None None True False 2026-02-02 09:04:31.751000+00:00 1 1 None 3
2 KYWVhGrdlkjh0000 my_workflow.py None script import lamindb as ln\n\n\[email protected]()\ndef inge... uJ3fsnfaNN6EZ7Q0d8SQtw None None None True False 2026-02-02 09:04:27.674000+00:00 1 1 None 3

The database

Here is the state of the database after running the different examples above:

ln.view()
Hide code cell output
Artifact
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
12 Ox8uaCIVr0VnZKHN0002 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 4314 PwHr73jTbnTE-aEhk75hag None 2.0 None True False 2026-02-02 09:04:49.079000+00:00 1 1 3 9 None 3
11 Ox8uaCIVr0VnZKHN0001 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3669 mmgoaxh-b6n5jIsE-MPRlQ None 1.0 None False False 2026-02-02 09:04:48.195000+00:00 1 1 3 8 None 3
7 oTxShqMTJRVwnd5b0000 my_analysis/dataset2.parquet None .parquet dataset DataFrame 7054 zuZ1rL4nKln6JhkAMpzocQ None 3.0 None True False 2026-02-02 09:04:37.954000+00:00 1 1 3 5 None 3
5 Ox8uaCIVr0VnZKHN0000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3696 IHbRUtKzvpoCjL0E6F72qw None 2.0 None False False 2026-02-02 09:04:33.680000+00:00 1 1 3 4 None 3
3 dUV2fQQCkIjI0lbJ0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 xnLdi2kUCdOAe61uR8O7CA None 3.0 None True False 2026-02-02 09:04:28.832000+00:00 1 1 3 2 None 3
1 ZBUW3Q0sRjiMmylP0000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg None NaN None True False 2026-02-02 09:04:25.201000+00:00 1 1 3 1 None 3
Feature
uid name _dtype_str unit description array_rank array_size array_shape synonyms default_value nullable coerce is_locked is_type created_at branch_id space_id created_by_id run_id type_id
id
2 4tk4kl7CDL5F experiment cat[Record[xjhPFj0N2yZsHOB4]] None None 0 0 None None None True None False False 2026-02-02 09:04:42.793000+00:00 1 1 3 1 None
1 PwJagoeIZSyz s3_folder str None None 0 0 None None None True None False False 2026-02-02 09:04:42.784000+00:00 1 1 3 1 None
JsonValue
value hash is_locked created_at branch_id space_id created_by_id run_id feature_id
id
1 s3://my-bucket/my-folder E-3iWq1AziFBjh_cbyr5ZA False 2026-02-02 09:04:45.684000+00:00 1 1 3 None 1
Project
uid name description abbr url start_date end_date is_locked is_type created_at branch_id space_id created_by_id run_id type_id
id
1 b2Ls25myGJOQ My project None None None None None False False 2026-02-02 09:04:22.631000+00:00 1 1 3 None None
Record
uid name description reference reference_type extra_data is_locked is_type created_at branch_id space_id created_by_id type_id schema_id run_id
id
2 UEXCKDd4Q4OBFhMj Experiment1 None None None None False False 2026-02-02 09:04:42.776000+00:00 1 1 3 1.0 None 1
1 xjhPFj0N2yZsHOB4 Experiment None None None None False True 2026-02-02 09:04:42.770000+00:00 1 1 3 NaN None 1
Run
uid name entrypoint started_at finished_at params reference reference_type cli_args is_locked created_at branch_id space_id transform_id report_id environment_id created_by_id initiated_by_run_id
id
11 DGWYRc8ICFBvG7CR None subset_dataframe 2026-02-02 09:04:53.036178+00:00 2026-02-02 09:04:53.878222+00:00 {'artifact': 'Artifact[dUV2fQQCkIjI0lbJ0000]',... None None --subset False 2026-02-02 09:04:53.037000+00:00 1 1 8 NaN 2.0 3 10.0
10 mv2aBFsnXopmPRUS None None 2026-02-02 09:04:51.867653+00:00 2026-02-02 09:04:53.880440+00:00 {'is_subset': True} None None --subset False 2026-02-02 09:04:51.869000+00:00 1 1 8 13.0 2.0 3 NaN
9 tKEL0nC8EEAEMnr4 None subset_dataframe 2026-02-02 09:04:48.269396+00:00 2026-02-02 09:04:49.088670+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None None False 2026-02-02 09:04:48.270000+00:00 1 1 7 NaN NaN 3 1.0
8 0HMyp8RSLC6xXIm9 None subset_dataframe 2026-02-02 09:04:47.385920+00:00 2026-02-02 09:04:48.204813+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None None False 2026-02-02 09:04:47.386000+00:00 1 1 7 NaN NaN 3 1.0
7 IDshWVEdnlIBADr7 None None 2026-02-02 09:04:45.436744+00:00 2026-02-02 09:04:46.804844+00:00 {'example_param': 42} None None --s3-folder s3://my-bucket/my-folder --experim... False 2026-02-02 09:04:45.438000+00:00 1 1 6 10.0 2.0 3 NaN
6 XZ9A4lXtuyZR8FvM None None 2026-02-02 09:04:41.034414+00:00 2026-02-02 09:04:42.175919+00:00 {'input_dir': './mydataset', 'learning_rate': ... None None --input-dir ./mydataset --learning-rate 0.01 -... False 2026-02-02 09:04:41.035000+00:00 1 1 5 9.0 2.0 3 NaN
5 3aTAu3ngnVYVVhOZ None main 2026-02-02 09:04:36.797605+00:00 2026-02-02 09:04:37.960115+00:00 {'key': 'my_analysis/dataset2.parquet'} None None --key my_analysis/dataset2.parquet False 2026-02-02 09:04:36.799000+00:00 1 1 4 8.0 2.0 3 NaN
Storage
uid root description type region instance_uid is_locked created_at branch_id space_id created_by_id run_id
id
3 ITfIVTF2OdiE /home/runner/work/lamindb/lamindb/docs/test-track None local None 73KPGC58ahU9 False 2026-02-02 09:04:19.371000+00:00 1 1 3 None
Transform
uid key description kind source_code hash reference reference_type version_tag is_latest is_locked created_at branch_id space_id environment_id created_by_id
id
8 y2J2nRqJCsNu0000 run_script_with_step.py CLI: run_script_with_step.py script import argparse\nimport lamindb as ln\n\n\n@ln... HJbjZyWWczP-VmzKQsSORg None None None True False 2026-02-02 09:04:51.865000+00:00 1 1 None 3
7 zJUcL6jNopDr0000 track.ipynb None function @ln.step()\ndef subset_dataframe(\n input_a... 5kfRAQLCPwxrvAjspfdp2Q None None None True False 2026-02-02 09:04:47.381000+00:00 1 1 None 3
6 K7ZABrKhUE3f0000 run_track_with_features_and_params.py CLI: run_track_with_features_and_params.py script import argparse\nimport lamindb as ln\n\n\nif ... 9MjLyvM1QzE2nPIPDRzBwg None None None True False 2026-02-02 09:04:45.434000+00:00 1 1 None 3
5 YclFwMY23Khi0000 run_track_with_params.py CLI: run_track_with_params.py script import argparse\nimport lamindb as ln\n\nif __... 5RBz7zJICeKE1OSmg7gEdQ None None None True False 2026-02-02 09:04:41.032000+00:00 1 1 None 3
4 bDpEWkJtV6Xe0000 my_workflow_with_click.py CLI: my_workflow_with_click.py script import click\nimport lamindb as ln\n\n\n@click... 0eX8wmaAWkuuAvACWwL1Xg None None None True False 2026-02-02 09:04:36.793000+00:00 1 1 None 3
3 EyLEa4lR3VVR0000 my_workflow_with_step.py None script import lamindb as ln\n\n\[email protected]()\ndef subs... Ncx6UswxtCN3FZD86kgcVQ None None None True False 2026-02-02 09:04:31.751000+00:00 1 1 None 3
2 KYWVhGrdlkjh0000 my_workflow.py None script import lamindb as ln\n\n\[email protected]()\ndef inge... uJ3fsnfaNN6EZ7Q0d8SQtw None None None True False 2026-02-02 09:04:27.674000+00:00 1 1 None 3

Manage notebook templates

A notebook acts like a template when you load it with lamin load. Say you run:

lamin load https://lamin.ai/account/instance/transform/Akd7gx7Y9oVO0000

When you run the returned notebook, a new version is created automatically, and you can browse versions via the version dropdown in the UI.

Additionally, you can:

  • label using ULabel or Record, e.g., transform.records.add(template_label)

  • tag with an indicative version string, e.g., transform.version = "T1"; transform.save()

Saving a notebook as an artifact

Sometimes you might want to save a notebook as an artifact rather than a transform; pass --registry artifact to do so:

lamin save template1.ipynb --key templates/template1.ipynb --description "Template for analysis type 1" --registry artifact

A few checks at the end of this notebook:

assert run.params == {
    "input_dir": "./mydataset",
    "learning_rate": 0.01,
    "preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}, run.params
assert my_project.artifacts.exists()
assert my_project.transforms.exists()
assert my_project.runs.exists()