Manage notebooks, scripts & workflows

This guide walks you from tracking data lineage in a notebook to tracking parameters in workflows.

Note: Running the examples requires a lamindb instance. If you don’t have one yet, create it:

!lamin init --storage ./test-track
 initialized lamindb: testuser1/test-track

Manage notebooks and scripts

Call track() to save your notebook or script as a transform and start tracking inputs & outputs of a run.

import lamindb as ln

ln.track()  # initiate a tracked notebook/script run

# your code automatically tracks inputs & outputs

ln.finish()  # mark run as finished, save execution report, source code & environment

You’ll find your notebooks and scripts in the Transform registry, along with pipelines & functions:

transform = ln.Transform.get(key="my_analyses/my_notebook.ipynb")
transform.source_code             # source code
transform.runs.to_dataframe()     # all runs in a dataframe
transform.latest_run.report       # report of latest run
transform.latest_run.environment  # environment of latest run

You can use the CLI to load a transform into your current (development) directory:

lamin load --key my_analyses/my_notebook.ipynb

Here is how you’d load the notebook from the video into your local directory:

lamin load https://lamin.ai/laminlabs/lamindata/transform/F4L3oC6QsZvQ

Organize local development

If no development directory is set, script & notebook keys equal their filenames. Otherwise, keys are the paths relative to the development directory.

The exception is packaged source code, whose keys have the form pypackages/{package_name}/path/to/file.py.
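
For illustration, here is roughly how such a key relates to the development directory, sketched with pathlib and made-up paths (not a lamindb API call):

```python
from pathlib import Path

# hypothetical paths for illustration:
# with a dev-dir set, a transform key is the file path relative to it
dev_dir = Path("/home/user/my_analyses")
script = dev_dir / "subfolder/my_notebook.ipynb"
key = script.relative_to(dev_dir).as_posix()
print(key)  # subfolder/my_notebook.ipynb
```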

To set the development directory to your current working directory, run:

lamin settings set dev-dir .

You can see the current status by running:

lamin info

Use projects

You can link the entities created during a run to a project.

import lamindb as ln

my_project = ln.Project(name="My project").save()  # create & save a project
ln.track(project="My project")  # pass project
open("sample.fasta", "w").write(">seq1\nACGT\n")  # create a dataset
ln.Artifact("sample.fasta", key="sample.fasta").save()  # auto-labeled by project
 connected lamindb: testuser1/test-track
 created Transform('FgIYAnaNRjW10000', key='track.ipynb'), started new Run('re90G6DAJ8zYaTjA') at 2026-02-11 19:55:36 UTC
 notebook imports: lamindb==2.1.2
 recommendation: to identify the notebook across renames, pass the uid: ln.track("FgIYAnaNRjW1", project="My project")
Artifact(uid='gRr5E7LMMBfDHaFL0000', version_tag=None, is_latest=True, key='sample.fasta', description=None, suffix='.fasta', kind=None, otype=None, size=11, hash='83rEPcAoBHmYiIuyBYrFKg', n_files=None, n_observations=None, branch_id=1, space_id=1, storage_id=3, run_id=1, schema_id=None, created_by_id=3, created_at=2026-02-11 19:55:40 UTC, is_locked=False)

Filter entities by project, e.g., artifacts:

ln.Artifact.filter(projects=my_project).to_dataframe()
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
1 gRr5E7LMMBfDHaFL0000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg None None None True False 2026-02-11 19:55:40.307000+00:00 1 1 3 1 None 3

Access entities linked to a project:

my_project.artifacts.to_dataframe()
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
1 gRr5E7LMMBfDHaFL0000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg None None None True False 2026-02-11 19:55:40.307000+00:00 1 1 3 1 None 3

The same works for my_project.transforms or my_project.runs.

Use spaces

You can write the entities created during a run into a space that you configure on LaminHub. This is particularly useful if you want to restrict access to a space. Note that this doesn’t affect bionty entities, which should typically remain commonly accessible.

ln.track(space="Our team space")

Sync code with git

To sync scripts or workflows with their corresponding files in a git repo, either export an environment variable:

export LAMINDB_SYNC_GIT_REPO=<YOUR-GIT-REPO-URL>

Or set the following setting:

ln.settings.sync_git_repo = <YOUR-GIT-REPO-URL>

If you work on a single project in your lamindb instance, it makes sense to set LaminDB’s dev-dir to the root of the local git repo clone.

dbs/
  project1/
    .git/
    script1.py
    notebook1.ipynb
  ...

If you work on multiple projects in your lamindb instance, you can use the dev-dir as the local root and nest git repositories in it.

dbs/
  database1/
    repo1/
      .git/
    repo2/
      .git/
  ...

Manage workflows

Here we’ll manage workflows with lamindb’s flow() and step() decorators, which work out of the box with the majority of Python workflow managers:

| tool | workflow decorator | step/task decorator | notes |
|---|---|---|---|
| lamindb | @flow | @step | inspired by prefect |
| prefect | @flow | @task | two decorators |
| redun | @task (on main) | @task | single decorator for everything |
| dagster | @job or @asset | @op or @asset | asset-centric; @asset is primary |
| flyte | @workflow | @task | also @dynamic for runtime DAGs |
| airflow | @dag | @task | TaskFlow API (modern); also supports operators |
| zenml | @pipeline | @step | inspired by prefect |

If you’re looking for more in-depth examples or for integrating with non-decorator-based workflow managers such as Nextflow or Snakemake, see Manage computational pipelines.

| tool | workflow | step/task | notes |
|---|---|---|---|
| nextflow | workflow keyword | process keyword | Groovy-based DSL |
| snakemake | rule keyword | rule keyword | file-based DSL |
| metaflow | FlowSpec | @step | class-based |
| kedro | Pipeline() | node() | function-based |

A one-step workflow

Decorate a function with flow() to track it as a workflow:

my_workflow.py
import lamindb as ln


@ln.flow()
def ingest_dataset(key: str) -> ln.Artifact:
    df = ln.examples.datasets.mini_immuno.get_dataset1()
    artifact = ln.Artifact.from_dataframe(df, key=key).save()
    return artifact


if __name__ == "__main__":
    ingest_dataset(key="my_analysis/dataset.parquet")

Let’s run the workflow:

!python scripts/my_workflow.py
 connected lamindb: testuser1/test-track
 created Transform('hugGnf3HXMcE0000', key='my_workflow.py'), started new Run('owv1BDhmrX1HmQ0J') at 2026-02-11 19:55:41 UTC
→ params: key='my_analysis/dataset.parquet'
 recommendation: to identify the script across renames, pass the uid: ln.track("hugGnf3HXMcE", params={...})
 writing the in-memory object into cache

Query the workflow via its filename:

transform = ln.Transform.get(key="my_workflow.py")
transform.describe()
Transform: my_workflow.py (0000)
├── uid: hugGnf3HXMcE0000                                     
hash: uJ3fsnfaNN6EZ7Q0d8SQtw         type: script         
branch: main                         space: all           
created_at: 2026-02-11 19:55:41 UTC  created_by: testuser1
└── source_code: 
    import lamindb as ln
    
    
    @ln.flow()
    def ingest_dataset(key: str) -> ln.Artifact:
        df = ln.examples.datasets.mini_immuno.get_dataset1()
        artifact = ln.Artifact.from_dataframe(df, key=key).save()
        return artifact
    
    
    if __name__ == "__main__":
        ingest_dataset(key="my_analysis/dataset.parquet")

The run stored the parameter value for key:

transform.latest_run.describe()
Run: owv1BDh (my_workflow.py)
├── uid: owv1BDhmrX1HmQ0J                transform: my_workflow.py (0000)    
started_at: 2026-02-11 19:55:41 UTC  finished_at: 2026-02-11 19:55:43 UTC
status: completed                                                        
branch: main                         space: all                          
created_at: 2026-02-11 19:55:41 UTC  created_by: testuser1               
├── report: 3Gl9EmT
→ connected lamindb: testuser1/test-track
→ created Transform('hugGnf3HXMcE0000', key='my_workflow.py'), started new Run(' …
→ params: key='my_analysis/dataset.parquet'
• recommendation: to identify the script across renames, pass the uid: ln.track( …
│ …
├── environment: LRnFKYr
aiobotocore==2.26.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.3
aioitertools==0.13.0
│ …
└── Params
    └── key: my_analysis/dataset.parquet

It links output artifacts:

transform.latest_run.output_artifacts.to_dataframe()
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
3 R7MgprqH6wdrDY0h0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 xnLdi2kUCdOAe61uR8O7CA None 3 None True False 2026-02-11 19:55:43.686000+00:00 1 1 3 2 None 3

You can query for all runs that ran with that parameter:

ln.Run.filter(
    params__key="my_analysis/dataset.parquet",
).to_dataframe()
uid name entrypoint started_at finished_at params reference reference_type cli_args is_locked created_at branch_id space_id transform_id report_id environment_id created_by_id initiated_by_run_id
id
2 owv1BDhmrX1HmQ0J None ingest_dataset 2026-02-11 19:55:41.433997+00:00 2026-02-11 19:55:43.691054+00:00 {'key': 'my_analysis/dataset.parquet'} None None None False 2026-02-11 19:55:41.728000+00:00 1 1 2 4 2 3 None

You can also pass complex parameters and features, see: Track parameters & features.

A multi-step workflow

Here, the workflow calls an additional processing step:

my_workflow_with_step.py
import lamindb as ln


@ln.step()
def subset_dataframe(
    artifact: ln.Artifact,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> ln.Artifact:
    df = artifact.load()
    new_data = df.iloc[:subset_rows, :subset_cols]
    new_key = artifact.key.replace(".parquet", "_subsetted.parquet")
    return ln.Artifact.from_dataframe(new_data, key=new_key).save()


@ln.flow()
def ingest_dataset(key: str, subset: bool = False) -> ln.Artifact:
    df = ln.examples.datasets.mini_immuno.get_dataset1()
    artifact = ln.Artifact.from_dataframe(df, key=key).save()
    if subset:
        artifact = subset_dataframe(artifact)
    return artifact


if __name__ == "__main__":
    ingest_dataset(key="my_analysis/dataset.parquet", subset=True)

Let’s run the workflow:

!python scripts/my_workflow_with_step.py
 connected lamindb: testuser1/test-track
 created Transform('mqdUNA1MQDxB0000', key='my_workflow_with_step.py'), started new Run('wzxS2sllQNQ5Srj1') at 2026-02-11 19:55:45 UTC
→ params: key='my_analysis/dataset.parquet', subset=True
 recommendation: to identify the script across renames, pass the uid: ln.track("mqdUNA1MQDxB", params={...})
 writing the in-memory object into cache
 returning artifact with same hash: Artifact(uid='R7MgprqH6wdrDY0h0000', version_tag=None, is_latest=True, key='my_analysis/dataset.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=10354, hash='xnLdi2kUCdOAe61uR8O7CA', n_files=None, n_observations=3, branch_id=1, space_id=1, storage_id=3, run_id=2, schema_id=None, created_by_id=3, created_at=2026-02-11 19:55:43 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()
 loaded Transform('mqdUNA1MQDxB0000', key='my_workflow_with_step.py'), started new Run('ilkQudwJlRdK8ajT') at 2026-02-11 19:55:47 UTC
→ params: artifact='Artifact[R7MgprqH6wdrDY0h0000]', subset_rows=2, subset_cols=2
 recommendation: to identify the script across renames, pass the uid: ln.track("mqdUNA1MQDxB", params={...})
 writing the in-memory object into cache

The lineage of the subsetted artifact resolves the subsetting step:

subsetted_artifact = ln.Artifact.get(key="my_analysis/dataset_subsetted.parquet")
subsetted_artifact.view_lineage()
(lineage graph)

This is the run that created the subsetted_artifact:

subsetted_artifact.run
Run(uid='ilkQudwJlRdK8ajT', name=None, entrypoint='subset_dataframe', started_at=2026-02-11 19:55:47 UTC, finished_at=2026-02-11 19:55:48 UTC, params={'artifact': 'Artifact[R7MgprqH6wdrDY0h0000]', 'subset_rows': 2, 'subset_cols': 2}, reference=None, reference_type=None, cli_args=None, branch_id=1, space_id=1, transform_id=3, report_id=None, environment_id=2, created_by_id=3, initiated_by_run_id=3, created_at=2026-02-11 19:55:47 UTC, is_locked=False)

This is the initiating run that triggered the function call:

subsetted_artifact.run.initiated_by_run
Run(uid='wzxS2sllQNQ5Srj1', name=None, entrypoint='ingest_dataset', started_at=2026-02-11 19:55:45 UTC, finished_at=2026-02-11 19:55:48 UTC, params={'key': 'my_analysis/dataset.parquet', 'subset': True}, reference=None, reference_type=None, cli_args=None, branch_id=1, space_id=1, transform_id=3, report_id=6, environment_id=2, created_by_id=3, initiated_by_run_id=None, created_at=2026-02-11 19:55:45 UTC, is_locked=False)

These are the parameters of the run:

subsetted_artifact.run.params
{'artifact': 'Artifact[R7MgprqH6wdrDY0h0000]',
 'subset_rows': 2,
 'subset_cols': 2}

These are the input artifacts:

subsetted_artifact.run.input_artifacts.to_dataframe()
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
3 R7MgprqH6wdrDY0h0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 xnLdi2kUCdOAe61uR8O7CA None 3 None True False 2026-02-11 19:55:43.686000+00:00 1 1 3 2 None 3

These are the output artifacts:

subsetted_artifact.run.output_artifacts.to_dataframe()
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
5 Kv1prUInlxOSFAZH0000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3696 IHbRUtKzvpoCjL0E6F72qw None 2 None True False 2026-02-11 19:55:48.282000+00:00 1 1 3 4 None 3
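
The subsetting inside subset_dataframe is plain pandas; a standalone sketch on a made-up frame (not the mini_immuno dataset) shows the slicing and the key derivation:

```python
import pandas as pd

# a made-up 3x3 frame standing in for the loaded artifact
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

subset_rows, subset_cols = 2, 2
# same positional slicing as in subset_dataframe
new_data = df.iloc[:subset_rows, :subset_cols]
print(new_data.shape)  # (2, 2)

# same key derivation as in the step
new_key = "my_analysis/dataset.parquet".replace(".parquet", "_subsetted.parquet")
print(new_key)  # my_analysis/dataset_subsetted.parquet
```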

A workflow with CLI arguments

Let’s use click to parse CLI arguments:

my_workflow_with_click.py
import click
import lamindb as ln


@click.command()
@click.option("--key", required=True)
@ln.flow()
def main(key: str):
    df = ln.examples.datasets.mini_immuno.get_dataset2()
    ln.Artifact.from_dataframe(df, key=key).save()


if __name__ == "__main__":
    main()

Let’s run the workflow:

!python scripts/my_workflow_with_click.py --key my_analysis/dataset2.parquet
 connected lamindb: testuser1/test-track
 script invoked with: --key my_analysis/dataset2.parquet
 created Transform('P5qADgQv6vVS0000', key='my_workflow_with_click.py'), started new Run('4XGC3FWREPVIG0RC') at 2026-02-11 19:55:49 UTC
→ params: key='my_analysis/dataset2.parquet'
 recommendation: to identify the script across renames, pass the uid: ln.track("P5qADgQv6vVS", params={...})
 writing the in-memory object into cache

CLI arguments are tracked and accessible via run.cli_args:

run = ln.Run.filter(transform__key="my_workflow_with_click.py").first()
run.describe()
Run: 4XGC3FW (my_workflow_with_click.py)
├── uid: 4XGC3FWREPVIG0RC                transform: my_workflow_with_click.py (0000)    
                                     |   description: CLI: my_workflow_with_click.py
started_at: 2026-02-11 19:55:49 UTC  finished_at: 2026-02-11 19:55:52 UTC           
status: completed                                                                   
branch: main                         space: all                                     
created_at: 2026-02-11 19:55:50 UTC  created_by: testuser1                          
├── cli_args: 
--key my_analysis/dataset2.parquet
├── report: xpj6jrQ
→ connected lamindb: testuser1/test-track
→ created Transform('P5qADgQv6vVS0000', key='my_workflow_with_click.py'), starte …
→ params: key='my_analysis/dataset2.parquet'
• recommendation: to identify the script across renames, pass the uid: ln.track( …
│ …
├── environment: LRnFKYr
aiobotocore==2.26.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.3
aioitertools==0.13.0
│ …
└── Params
    └── key: my_analysis/dataset2.parquet

Note that it doesn’t matter whether you use click, argparse, or any other CLI argument parser.
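
For instance, a minimal argparse sketch of the same --key interface (parse_args receives an explicit list here so the snippet runs standalone; the lamindb calls from the original script are omitted):

```python
import argparse

# argparse instead of click, parsing the same --key option
parser = argparse.ArgumentParser()
parser.add_argument("--key", required=True)
args = parser.parse_args(["--key", "my_analysis/dataset2.parquet"])
print(args.key)  # my_analysis/dataset2.parquet
```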

Track parameters & features

We just saw that the function decorators @ln.flow() and @ln.step() track parameter values automatically. Here is how to pass parameters to ln.track():

run_track_with_params.py
import argparse
import lamindb as ln

if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--input-dir", type=str)
    p.add_argument("--downsample", action="store_true")
    p.add_argument("--learning-rate", type=float)
    args = p.parse_args()
    params = {
        "input_dir": args.input_dir,
        "learning_rate": args.learning_rate,
        "preprocess_params": {
            "downsample": args.downsample,
            "normalization": "the_good_one",
        },
    }
    ln.track(params=params)

    # your code

    ln.finish()

Run the script:

!python scripts/run_track_with_params.py  --input-dir ./mydataset --learning-rate 0.01 --downsample
 connected lamindb: testuser1/test-track
 script invoked with: --input-dir ./mydataset --learning-rate 0.01 --downsample
 created Transform('ba0Gki8was1H0000', key='run_track_with_params.py'), started new Run('L7zDIDax0ZoKgiVI') at 2026-02-11 19:55:53 UTC
→ params: input_dir='./mydataset', learning_rate=0.01, preprocess_params={'downsample': True, 'normalization': 'the_good_one'}
 recommendation: to identify the script across renames, pass the uid: ln.track("ba0Gki8was1H", params={...})

Query for all runs that match certain parameters:

ln.Run.filter(
    params__learning_rate=0.01,
    params__preprocess_params__downsample=True,
).to_dataframe()
uid name entrypoint started_at finished_at params reference reference_type cli_args is_locked created_at branch_id space_id transform_id report_id environment_id created_by_id initiated_by_run_id
id
6 L7zDIDax0ZoKgiVI None None 2026-02-11 19:55:53.534586+00:00 2026-02-11 19:55:54.950247+00:00 {'input_dir': './mydataset', 'learning_rate': ... None None --input-dir ./mydataset --learning-rate 0.01 -... False 2026-02-11 19:55:53.831000+00:00 1 1 5 9 2 3 None
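
The double-underscore lookup walks the nested params dict; here is a pure-Python sketch of the matching semantics (an illustration only, not lamindb’s actual query implementation):

```python
# walk a nested dict along a "__"-separated lookup path and compare
def matches(params: dict, lookup: str, value) -> bool:
    obj = params
    for part in lookup.split("__"):
        if not isinstance(obj, dict) or part not in obj:
            return False
        obj = obj[part]
    return obj == value

params = {
    "input_dir": "./mydataset",
    "learning_rate": 0.01,
    "preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}
print(matches(params, "learning_rate", 0.01))                  # True
print(matches(params, "preprocess_params__downsample", True))  # True
```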

Describe & get parameters:

run = ln.Run.filter(params__learning_rate=0.01).order_by("-started_at").first()
run.describe()
run.params
Run: L7zDIDa (run_track_with_params.py)
├── uid: L7zDIDax0ZoKgiVI                transform: run_track_with_params.py (0000)    
                                     |   description: CLI: run_track_with_params.py
started_at: 2026-02-11 19:55:53 UTC  finished_at: 2026-02-11 19:55:54 UTC          
status: completed                                                                  
branch: main                         space: all                                    
created_at: 2026-02-11 19:55:53 UTC  created_by: testuser1                         
├── cli_args: 
--input-dir ./mydataset --learning-rate 0.01 --downsample
├── report: P5ifmw8
→ connected lamindb: testuser1/test-track
→ created Transform('ba0Gki8was1H0000', key='run_track_with_params.py'), started …
→ params: input_dir='./mydataset', learning_rate=0.01, preprocess_params={'downs …
• recommendation: to identify the script across renames, pass the uid: ln.track( …
├── environment: LRnFKYr
aiobotocore==2.26.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.3
aioitertools==0.13.0
│ …
└── Params
    ├── input_dir: ./mydataset
    ├── learning_rate: 0.01
    └── preprocess_params: {'downsample': True, 'normalization': 'the_good_one'}
{'input_dir': './mydataset',
 'learning_rate': 0.01,
 'preprocess_params': {'downsample': True, 'normalization': 'the_good_one'}}

You can also access the CLI arguments used to start the run directly:

run.cli_args
'--input-dir ./mydataset --learning-rate 0.01 --downsample'
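
Since run.cli_args is stored as a plain string, the standard library’s shlex can split it back into an argv-style list if needed (generic Python, not a lamindb API):

```python
import shlex

# recover an argv-style list from the stored CLI-argument string
cli_args = "--input-dir ./mydataset --learning-rate 0.01 --downsample"
argv = shlex.split(cli_args)
print(argv)
# ['--input-dir', './mydataset', '--learning-rate', '0.01', '--downsample']
```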

You can also track run features in analogy to artifact features.

In contrast to params, features are validated against the Feature registry and allow you to express relationships with entities in your registries.

Let’s first define labels & features.

experiment_type = ln.Record(name="Experiment", is_type=True).save()
experiment_label = ln.Record(name="Experiment1", type=experiment_type).save()
ln.Feature(name="s3_folder", dtype=str).save()
ln.Feature(name="experiment", dtype=experiment_type).save()
Feature(uid='rDXqPgUAUn14', is_type=False, name='experiment', _dtype_str='cat[Record[o0svcUzSV57P6wFD]]', unit=None, description=None, array_rank=0, array_size=0, array_shape=None, synonyms=None, default_value=None, nullable=True, coerce=None, branch_id=1, space_id=1, created_by_id=3, run_id=1, type_id=None, created_at=2026-02-11 19:55:56 UTC, is_locked=False)
Now run a script that passes both features and params to ln.track():

!python scripts/run_track_with_features_and_params.py  --s3-folder s3://my-bucket/my-folder --experiment Experiment1
 connected lamindb: testuser1/test-track
 script invoked with: --s3-folder s3://my-bucket/my-folder --experiment Experiment1
 created Transform('zizzfw7OPhDC0000', key='run_track_with_features_and_params.py'), started new Run('VwFqyBEOsmmu8Hdu') at 2026-02-11 19:55:57 UTC
→ params: example_param=42
→ features: s3_folder='s3://my-bucket/my-folder', experiment='Experiment1'
 recommendation: to identify the script across renames, pass the uid: ln.track("zizzfw7OPhDC", params={...})
Filter runs by a feature value:

ln.Run.filter(s3_folder="s3://my-bucket/my-folder").to_dataframe()
uid name entrypoint started_at finished_at params reference reference_type cli_args is_locked created_at branch_id space_id transform_id report_id environment_id created_by_id initiated_by_run_id
id
7 VwFqyBEOsmmu8Hdu None None 2026-02-11 19:55:57.479767+00:00 2026-02-11 19:55:59.284339+00:00 {'example_param': 42} None None --s3-folder s3://my-bucket/my-folder --experim... False 2026-02-11 19:55:57.779000+00:00 1 1 6 10 2 3 None

Describe & get feature values:

run2 = ln.Run.filter(
    s3_folder="s3://my-bucket/my-folder", experiment="Experiment1"
).last()
run2.describe()
run2.features.get_values()
Run: VwFqyBE (run_track_with_features_and_params.py)
├── uid: VwFqyBEOsmmu8Hdu                transform: run_track_with_features_and_params.py (0000)    
                                     |   description: CLI: run_track_with_features_and_params.py
started_at: 2026-02-11 19:55:57 UTC  finished_at: 2026-02-11 19:55:59 UTC                       
status: completed                                                                               
branch: main                         space: all                                                 
created_at: 2026-02-11 19:55:57 UTC  created_by: testuser1                                      
├── cli_args: 
--s3-folder s3://my-bucket/my-folder --experiment Experiment1
├── report: LC5OCHS
→ connected lamindb: testuser1/test-track
→ created Transform('zizzfw7OPhDC0000', key='run_track_with_features_and_params. …
→ params: example_param=42
→ features: s3_folder='s3://my-bucket/my-folder', experiment='Experiment1'
│ …
├── environment: LRnFKYr
aiobotocore==2.26.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.3
aioitertools==0.13.0
│ …
├── Params
│   └── example_param: 42
└── Features
    └── experiment                     Record[Experiment]                   Experiment1                            
        s3_folder                      str                                  s3://my-bucket/my-folder               
{'experiment': 'Experiment1', 's3_folder': 's3://my-bucket/my-folder'}

Manage functions in scripts and notebooks

If you want more fine-grained data lineage tracking in a script or notebook where you call ln.track(), you can also use the step() decorator.

In a notebook

@ln.step()
def subset_dataframe(
    input_artifact_key: str,
    output_artifact_key: str,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> None:
    artifact = ln.Artifact.get(key=input_artifact_key)
    dataset = artifact.load()
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    ln.Artifact.from_dataframe(new_data, key=output_artifact_key).save()

Prepare a test dataset:

df = ln.examples.datasets.mini_immuno.get_dataset1(otype="DataFrame")
input_artifact_key = "my_analysis/dataset.parquet"
artifact = ln.Artifact.from_dataframe(df, key=input_artifact_key).save()
 writing the in-memory object into cache
 returning artifact with same hash: Artifact(uid='R7MgprqH6wdrDY0h0000', version_tag=None, is_latest=True, key='my_analysis/dataset.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=10354, hash='xnLdi2kUCdOAe61uR8O7CA', n_files=None, n_observations=3, branch_id=1, space_id=1, storage_id=3, run_id=2, schema_id=None, created_by_id=3, created_at=2026-02-11 19:55:43 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()

Run the function, overriding one default parameter:

output_artifact_key = input_artifact_key.replace(".parquet", "_subsetted.parquet")
subset_dataframe(input_artifact_key, output_artifact_key, subset_rows=1)
 ignoring transform with same filename in different folder:
    FgIYAnaNRjW10000 → track.ipynb
 created Transform('CMvCLjM6ZdG30000', key='track.ipynb'), started new Run('KWdrzO2xAsAgMVAl') at 2026-02-11 19:56:00 UTC
→ params: input_artifact_key='my_analysis/dataset.parquet', output_artifact_key='my_analysis/dataset_subsetted.parquet', subset_rows=1, subset_cols=2
 writing the in-memory object into cache
 creating new artifact version for key 'my_analysis/dataset_subsetted.parquet' in storage '/home/runner/work/lamindb/lamindb/docs/test-track'

Query for the output:

subsetted_artifact = ln.Artifact.get(key=output_artifact_key)
subsetted_artifact.view_lineage()
(lineage graph)

Re-run the function with a different parameter:

subsetted_artifact = subset_dataframe(
    input_artifact_key, output_artifact_key, subset_cols=3
)
subsetted_artifact = ln.Artifact.get(key=output_artifact_key)
subsetted_artifact.view_lineage()
 loaded Transform('CMvCLjM6ZdG30000', key='track.ipynb'), started new Run('a3iKkFfKlX8O8bmu') at 2026-02-11 19:56:01 UTC
→ params: input_artifact_key='my_analysis/dataset.parquet', output_artifact_key='my_analysis/dataset_subsetted.parquet', subset_rows=2, subset_cols=3
 writing the in-memory object into cache
 creating new artifact version for key 'my_analysis/dataset_subsetted.parquet' in storage '/home/runner/work/lamindb/lamindb/docs/test-track'
(lineage graph)

We created a new run:

subsetted_artifact.run
Run(uid='a3iKkFfKlX8O8bmu', name=None, entrypoint='subset_dataframe', started_at=2026-02-11 19:56:01 UTC, finished_at=2026-02-11 19:56:02 UTC, params={'input_artifact_key': 'my_analysis/dataset.parquet', 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet', 'subset_rows': 2, 'subset_cols': 3}, reference=None, reference_type=None, cli_args=None, branch_id=1, space_id=1, transform_id=7, report_id=None, environment_id=None, created_by_id=3, initiated_by_run_id=1, created_at=2026-02-11 19:56:01 UTC, is_locked=False)

With new parameters:

subsetted_artifact.run.params
{'input_artifact_key': 'my_analysis/dataset.parquet',
 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet',
 'subset_rows': 2,
 'subset_cols': 3}

And a new version of the output artifact:

subsetted_artifact.run.output_artifacts.to_dataframe()
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
12 Kv1prUInlxOSFAZH0002 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 4314 PwHr73jTbnTE-aEhk75hag None 2 None True False 2026-02-11 19:56:02.161000+00:00 1 1 3 9 None 3

In a script

run_script_with_step.py
import argparse
import lamindb as ln


@ln.step()
def subset_dataframe(
    artifact: ln.Artifact,
    subset_rows: int = 2,
    subset_cols: int = 2,
    run: ln.Run | None = None,
) -> ln.Artifact:
    dataset = artifact.load(is_run_input=run)
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    new_key = artifact.key.replace(".parquet", "_subsetted.parquet")
    return ln.Artifact.from_dataframe(new_data, key=new_key, run=run).save()


if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--subset", action="store_true")
    args = p.parse_args()

    params = {"is_subset": args.subset}

    ln.track(params=params)

    if args.subset:
        df = ln.examples.datasets.mini_immuno.get_dataset1(otype="DataFrame")
        artifact = ln.Artifact.from_dataframe(
            df, key="my_analysis/dataset.parquet"
        ).save()
        subsetted_artifact = subset_dataframe(artifact)

    ln.finish()
!python scripts/run_script_with_step.py --subset
 connected lamindb: testuser1/test-track
 script invoked with: --subset
 created Transform('fuNHAEO1YduA0000', key='run_script_with_step.py'), started new Run('hfJJjsdbBDkjRvPf') at 2026-02-11 19:56:03 UTC
→ params: is_subset=True
 recommendation: to identify the script across renames, pass the uid: ln.track("fuNHAEO1YduA", params={...})
 writing the in-memory object into cache
 returning artifact with same hash: Artifact(uid='R7MgprqH6wdrDY0h0000', version_tag=None, is_latest=True, key='my_analysis/dataset.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=10354, hash='xnLdi2kUCdOAe61uR8O7CA', n_files=None, n_observations=3, branch_id=1, space_id=1, storage_id=3, run_id=2, schema_id=None, created_by_id=3, created_at=2026-02-11 19:55:43 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()
 script invoked with: --subset
 loaded Transform('fuNHAEO1YduA0000', key='run_script_with_step.py'), started new Run('hDMyOX8eRwolc55d') at 2026-02-11 19:56:05 UTC
→ params: artifact='Artifact[R7MgprqH6wdrDY0h0000]', subset_rows=2, subset_cols=2
 recommendation: to identify the script across renames, pass the uid: ln.track("fuNHAEO1YduA", params={...})
 writing the in-memory object into cache
 returning artifact with same hash: Artifact(uid='Kv1prUInlxOSFAZH0000', version_tag=None, is_latest=False, key='my_analysis/dataset_subsetted.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=3696, hash='IHbRUtKzvpoCjL0E6F72qw', n_files=None, n_observations=2, branch_id=1, space_id=1, storage_id=3, run_id=4, schema_id=None, created_by_id=3, created_at=2026-02-11 19:55:48 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()
View all records in the database:

ln.view()
Artifact
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
12 Kv1prUInlxOSFAZH0002 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 4314 PwHr73jTbnTE-aEhk75hag None 2.0 None True False 2026-02-11 19:56:02.161000+00:00 1 1 3 9 None 3
11 Kv1prUInlxOSFAZH0001 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3669 mmgoaxh-b6n5jIsE-MPRlQ None 1.0 None False False 2026-02-11 19:56:01.360000+00:00 1 1 3 8 None 3
7 SrJ8R89fabdfrc9V0000 my_analysis/dataset2.parquet None .parquet dataset DataFrame 7054 zuZ1rL4nKln6JhkAMpzocQ None 3.0 None True False 2026-02-11 19:55:52.101000+00:00 1 1 3 5 None 3
5 Kv1prUInlxOSFAZH0000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3696 IHbRUtKzvpoCjL0E6F72qw None 2.0 None False False 2026-02-11 19:55:48.282000+00:00 1 1 3 4 None 3
3 R7MgprqH6wdrDY0h0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 xnLdi2kUCdOAe61uR8O7CA None 3.0 None True False 2026-02-11 19:55:43.686000+00:00 1 1 3 2 None 3
1 gRr5E7LMMBfDHaFL0000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg None NaN None True False 2026-02-11 19:55:40.307000+00:00 1 1 3 1 None 3
Feature
uid name _dtype_str unit description array_rank array_size array_shape synonyms default_value nullable coerce is_locked is_type created_at branch_id space_id created_by_id run_id type_id
id
2 rDXqPgUAUn14 experiment cat[Record[o0svcUzSV57P6wFD]] None None 0 0 None None None True None False False 2026-02-11 19:55:56.430000+00:00 1 1 3 1 None
1 CuA4MBRyscle s3_folder str None None 0 0 None None None True None False False 2026-02-11 19:55:56.421000+00:00 1 1 3 1 None
JsonValue
value hash is_locked created_at branch_id space_id created_by_id run_id feature_id
id
1 s3://my-bucket/my-folder E-3iWq1AziFBjh_cbyr5ZA False 2026-02-11 19:55:58.186000+00:00 1 1 3 None 1
Project
uid name description abbr url start_date end_date is_locked is_type created_at branch_id space_id created_by_id run_id type_id
id
1 VEyWijX2tw3P My project None None None None None False False 2026-02-11 19:55:35.423000+00:00 1 1 3 None None
Record
uid name description reference reference_type extra_data is_locked is_type created_at branch_id space_id created_by_id type_id schema_id run_id
id
2 yP6sSGuukrKI9e2a Experiment1 None None None None False False 2026-02-11 19:55:56.414000+00:00 1 1 3 1.0 None 1
1 o0svcUzSV57P6wFD Experiment None None None None False True 2026-02-11 19:55:56.407000+00:00 1 1 3 NaN None 1
Run
uid name entrypoint started_at finished_at params reference reference_type cli_args is_locked created_at branch_id space_id transform_id report_id environment_id created_by_id initiated_by_run_id
id
11 hDMyOX8eRwolc55d None subset_dataframe 2026-02-11 19:56:05.503331+00:00 2026-02-11 19:56:06.243805+00:00 {'artifact': 'Artifact[R7MgprqH6wdrDY0h0000]',... None None --subset False 2026-02-11 19:56:05.504000+00:00 1 1 8 NaN 2.0 3 10.0
10 hfJJjsdbBDkjRvPf None None 2026-02-11 19:56:03.286825+00:00 2026-02-11 19:56:06.246455+00:00 {'is_subset': True} None None --subset False 2026-02-11 19:56:03.584000+00:00 1 1 8 13.0 2.0 3 NaN
9 a3iKkFfKlX8O8bmu None subset_dataframe 2026-02-11 19:56:01.425864+00:00 2026-02-11 19:56:02.168730+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None None False 2026-02-11 19:56:01.426000+00:00 1 1 7 NaN NaN 3 1.0
8 KWdrzO2xAsAgMVAl None subset_dataframe 2026-02-11 19:56:00.624316+00:00 2026-02-11 19:56:01.367386+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None None False 2026-02-11 19:56:00.624000+00:00 1 1 7 NaN NaN 3 1.0
7 VwFqyBEOsmmu8Hdu None None 2026-02-11 19:55:57.479767+00:00 2026-02-11 19:55:59.284339+00:00 {'example_param': 42} None None --s3-folder s3://my-bucket/my-folder --experim... False 2026-02-11 19:55:57.779000+00:00 1 1 6 10.0 2.0 3 NaN
6 L7zDIDax0ZoKgiVI None None 2026-02-11 19:55:53.534586+00:00 2026-02-11 19:55:54.950247+00:00 {'input_dir': './mydataset', 'learning_rate': ... None None --input-dir ./mydataset --learning-rate 0.01 -... False 2026-02-11 19:55:53.831000+00:00 1 1 5 9.0 2.0 3 NaN
5 4XGC3FWREPVIG0RC None main 2026-02-11 19:55:49.882160+00:00 2026-02-11 19:55:52.106013+00:00 {'key': 'my_analysis/dataset2.parquet'} None None --key my_analysis/dataset2.parquet False 2026-02-11 19:55:50.182000+00:00 1 1 4 8.0 2.0 3 NaN
Storage
uid root description type region instance_uid is_locked created_at branch_id space_id created_by_id run_id
id
3 uq74RnwQEHvV /home/runner/work/lamindb/lamindb/docs/test-track None local None 73KPGC58ahU9 False 2026-02-11 19:55:34.106000+00:00 1 1 3 None
Transform
uid key description kind source_code hash reference reference_type version_tag is_latest is_locked created_at branch_id space_id environment_id created_by_id
id
8 fuNHAEO1YduA0000 run_script_with_step.py CLI: run_script_with_step.py script import argparse\nimport lamindb as ln\n\n\n@ln... HJbjZyWWczP-VmzKQsSORg None None None True False 2026-02-11 19:56:03.284000+00:00 1 1 None 3
7 CMvCLjM6ZdG30000 track.ipynb None function @ln.step()\ndef subset_dataframe(\n input_a... 5kfRAQLCPwxrvAjspfdp2Q None None None True False 2026-02-11 19:56:00.620000+00:00 1 1 None 3
6 zizzfw7OPhDC0000 run_track_with_features_and_params.py CLI: run_track_with_features_and_params.py script import argparse\nimport lamindb as ln\n\n\nif ... 9MjLyvM1QzE2nPIPDRzBwg None None None True False 2026-02-11 19:55:57.477000+00:00 1 1 None 3
5 ba0Gki8was1H0000 run_track_with_params.py CLI: run_track_with_params.py script import argparse\nimport lamindb as ln\n\nif __... 5RBz7zJICeKE1OSmg7gEdQ None None None True False 2026-02-11 19:55:53.532000+00:00 1 1 None 3
4 P5qADgQv6vVS0000 my_workflow_with_click.py CLI: my_workflow_with_click.py script import click\nimport lamindb as ln\n\n\n@click... 0eX8wmaAWkuuAvACWwL1Xg None None None True False 2026-02-11 19:55:49.879000+00:00 1 1 None 3
3 mqdUNA1MQDxB0000 my_workflow_with_step.py None script import lamindb as ln\n\n\[email protected]()\ndef subs... Ncx6UswxtCN3FZD86kgcVQ None None None True False 2026-02-11 19:55:45.132000+00:00 1 1 None 3
2 hugGnf3HXMcE0000 my_workflow.py None script import lamindb as ln\n\n\[email protected]()\ndef inge... uJ3fsnfaNN6EZ7Q0d8SQtw None None None True False 2026-02-11 19:55:41.430000+00:00 1 1 None 3

The database

See the state of the database after we ran these different examples:

ln.view()
Hide code cell output
Artifact
uid key description suffix kind otype size hash n_files n_observations version_tag is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
12 Kv1prUInlxOSFAZH0002 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 4314 PwHr73jTbnTE-aEhk75hag None 2.0 None True False 2026-02-11 19:56:02.161000+00:00 1 1 3 9 None 3
11 Kv1prUInlxOSFAZH0001 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3669 mmgoaxh-b6n5jIsE-MPRlQ None 1.0 None False False 2026-02-11 19:56:01.360000+00:00 1 1 3 8 None 3
7 SrJ8R89fabdfrc9V0000 my_analysis/dataset2.parquet None .parquet dataset DataFrame 7054 zuZ1rL4nKln6JhkAMpzocQ None 3.0 None True False 2026-02-11 19:55:52.101000+00:00 1 1 3 5 None 3
5 Kv1prUInlxOSFAZH0000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3696 IHbRUtKzvpoCjL0E6F72qw None 2.0 None False False 2026-02-11 19:55:48.282000+00:00 1 1 3 4 None 3
3 R7MgprqH6wdrDY0h0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 xnLdi2kUCdOAe61uR8O7CA None 3.0 None True False 2026-02-11 19:55:43.686000+00:00 1 1 3 2 None 3
1 gRr5E7LMMBfDHaFL0000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg None NaN None True False 2026-02-11 19:55:40.307000+00:00 1 1 3 1 None 3
Feature
uid name _dtype_str unit description array_rank array_size array_shape synonyms default_value nullable coerce is_locked is_type created_at branch_id space_id created_by_id run_id type_id
id
2 rDXqPgUAUn14 experiment cat[Record[o0svcUzSV57P6wFD]] None None 0 0 None None None True None False False 2026-02-11 19:55:56.430000+00:00 1 1 3 1 None
1 CuA4MBRyscle s3_folder str None None 0 0 None None None True None False False 2026-02-11 19:55:56.421000+00:00 1 1 3 1 None
JsonValue
value hash is_locked created_at branch_id space_id created_by_id run_id feature_id
id
1 s3://my-bucket/my-folder E-3iWq1AziFBjh_cbyr5ZA False 2026-02-11 19:55:58.186000+00:00 1 1 3 None 1
Project
uid name description abbr url start_date end_date is_locked is_type created_at branch_id space_id created_by_id run_id type_id
id
1 VEyWijX2tw3P My project None None None None None False False 2026-02-11 19:55:35.423000+00:00 1 1 3 None None
Record
uid name description reference reference_type extra_data is_locked is_type created_at branch_id space_id created_by_id type_id schema_id run_id
id
2 yP6sSGuukrKI9e2a Experiment1 None None None None False False 2026-02-11 19:55:56.414000+00:00 1 1 3 1.0 None 1
1 o0svcUzSV57P6wFD Experiment None None None None False True 2026-02-11 19:55:56.407000+00:00 1 1 3 NaN None 1
Run
uid name entrypoint started_at finished_at params reference reference_type cli_args is_locked created_at branch_id space_id transform_id report_id environment_id created_by_id initiated_by_run_id
id
11 hDMyOX8eRwolc55d None subset_dataframe 2026-02-11 19:56:05.503331+00:00 2026-02-11 19:56:06.243805+00:00 {'artifact': 'Artifact[R7MgprqH6wdrDY0h0000]',... None None --subset False 2026-02-11 19:56:05.504000+00:00 1 1 8 NaN 2.0 3 10.0
10 hfJJjsdbBDkjRvPf None None 2026-02-11 19:56:03.286825+00:00 2026-02-11 19:56:06.246455+00:00 {'is_subset': True} None None --subset False 2026-02-11 19:56:03.584000+00:00 1 1 8 13.0 2.0 3 NaN
9 a3iKkFfKlX8O8bmu None subset_dataframe 2026-02-11 19:56:01.425864+00:00 2026-02-11 19:56:02.168730+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None None False 2026-02-11 19:56:01.426000+00:00 1 1 7 NaN NaN 3 1.0
8 KWdrzO2xAsAgMVAl None subset_dataframe 2026-02-11 19:56:00.624316+00:00 2026-02-11 19:56:01.367386+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None None False 2026-02-11 19:56:00.624000+00:00 1 1 7 NaN NaN 3 1.0
7 VwFqyBEOsmmu8Hdu None None 2026-02-11 19:55:57.479767+00:00 2026-02-11 19:55:59.284339+00:00 {'example_param': 42} None None --s3-folder s3://my-bucket/my-folder --experim... False 2026-02-11 19:55:57.779000+00:00 1 1 6 10.0 2.0 3 NaN
6 L7zDIDax0ZoKgiVI None None 2026-02-11 19:55:53.534586+00:00 2026-02-11 19:55:54.950247+00:00 {'input_dir': './mydataset', 'learning_rate': ... None None --input-dir ./mydataset --learning-rate 0.01 -... False 2026-02-11 19:55:53.831000+00:00 1 1 5 9.0 2.0 3 NaN
5 4XGC3FWREPVIG0RC None main 2026-02-11 19:55:49.882160+00:00 2026-02-11 19:55:52.106013+00:00 {'key': 'my_analysis/dataset2.parquet'} None None --key my_analysis/dataset2.parquet False 2026-02-11 19:55:50.182000+00:00 1 1 4 8.0 2.0 3 NaN
Storage
uid root description type region instance_uid is_locked created_at branch_id space_id created_by_id run_id
id
3 uq74RnwQEHvV /home/runner/work/lamindb/lamindb/docs/test-track None local None 73KPGC58ahU9 False 2026-02-11 19:55:34.106000+00:00 1 1 3 None
Transform
uid key description kind source_code hash reference reference_type version_tag is_latest is_locked created_at branch_id space_id environment_id created_by_id
id
8 fuNHAEO1YduA0000 run_script_with_step.py CLI: run_script_with_step.py script import argparse\nimport lamindb as ln\n\n\n@ln... HJbjZyWWczP-VmzKQsSORg None None None True False 2026-02-11 19:56:03.284000+00:00 1 1 None 3
7 CMvCLjM6ZdG30000 track.ipynb None function @ln.step()\ndef subset_dataframe(\n input_a... 5kfRAQLCPwxrvAjspfdp2Q None None None True False 2026-02-11 19:56:00.620000+00:00 1 1 None 3
6 zizzfw7OPhDC0000 run_track_with_features_and_params.py CLI: run_track_with_features_and_params.py script import argparse\nimport lamindb as ln\n\n\nif ... 9MjLyvM1QzE2nPIPDRzBwg None None None True False 2026-02-11 19:55:57.477000+00:00 1 1 None 3
5 ba0Gki8was1H0000 run_track_with_params.py CLI: run_track_with_params.py script import argparse\nimport lamindb as ln\n\nif __... 5RBz7zJICeKE1OSmg7gEdQ None None None True False 2026-02-11 19:55:53.532000+00:00 1 1 None 3
4 P5qADgQv6vVS0000 my_workflow_with_click.py CLI: my_workflow_with_click.py script import click\nimport lamindb as ln\n\n\n@click... 0eX8wmaAWkuuAvACWwL1Xg None None None True False 2026-02-11 19:55:49.879000+00:00 1 1 None 3
3 mqdUNA1MQDxB0000 my_workflow_with_step.py None script import lamindb as ln\n\n\[email protected]()\ndef subs... Ncx6UswxtCN3FZD86kgcVQ None None None True False 2026-02-11 19:55:45.132000+00:00 1 1 None 3
2 hugGnf3HXMcE0000 my_workflow.py None script import lamindb as ln\n\n\[email protected]()\ndef inge... uJ3fsnfaNN6EZ7Q0d8SQtw None None None True False 2026-02-11 19:55:41.430000+00:00 1 1 None 3

Using transform versions as templates

A transform acts like a template when you load it with lamin load. Say you run:

lamin load https://lamin.ai/account/instance/transform/Akd7gx7Y9oVO0000

When you run the loaded notebook or script, a new version is created automatically; you can browse all versions via the version dropdown in the UI.

Additionally, you can:

  • label the transform using ULabel or Record, e.g., transform.records.add(template_label)

  • tag it with an indicative version string, e.g., transform.version_tag = "T1"; transform.save()

Saving a notebook as an artifact

Sometimes you may want to save a notebook as an artifact rather than as a transform. You can do so from the CLI:

lamin save template1.ipynb --key templates/template1.ipynb --description "Template for analysis type 1" --registry artifact

A few checks at the end of this notebook:

# `run` refers to a run captured earlier in this notebook
assert run.params == {
    "input_dir": "./mydataset",
    "learning_rate": 0.01,
    "preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}, run.params
# everything created while the project was passed to ln.track() is linked to it
assert my_project.artifacts.exists()
assert my_project.transforms.exists()
assert my_project.runs.exists()