Track notebooks, scripts & functions

For tracking pipelines, see: docs:pipelines.

# pip install lamindb
!lamin init --storage ./test-track
 initialized lamindb: testuser1/test-track

Track a notebook or script

Call track() to register your notebook or script as a transform and start capturing inputs & outputs of a run.

import lamindb as ln

ln.track()  # initiate a tracked notebook/script run

# your code automatically tracks inputs & outputs

ln.finish()  # mark run as finished, save execution report, source code & environment

Here is how a notebook with a run report looks on the hub.

Explore it here.

You'll find your notebooks and scripts in the Transform registry (along with pipelines & functions). The Run registry stores executions. You can use all the usual ways of querying to obtain one or several transform records, e.g.:

transform = ln.Transform.get(key="my_analyses/my_notebook.ipynb")
transform.source_code  # source code
transform.runs  # all runs
transform.latest_run.report  # report of latest run
transform.latest_run.environment  # environment of latest run

To load a notebook or script from the hub, search or filter the transform page and use the CLI.

lamin load https://lamin.ai/laminlabs/lamindata/transform/13VINnFk89PE

Organize local development

If no development directory is set, script & notebook keys equal their filenames. Otherwise, they equal the path relative to the development directory.
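The key derivation described above can be sketched in plain Python (the paths are hypothetical examples, not part of the doc):

```python
from pathlib import Path

# hypothetical development directory and script location
dev_dir = Path("/home/user/analyses")
script = Path("/home/user/analyses/project1/my_script.py")

# with a dev-dir set, the key is the path relative to it
key = script.relative_to(dev_dir).as_posix()
print(key)  # → project1/my_script.py

# without a dev-dir, the key is just the filename
print(script.name)  # → my_script.py
```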

To set the development directory to your shell's current working directory, run:

lamin settings set dev-dir .

You can see the current status by running:

lamin info

Sync scripts with git

To sync scripts with a git repo, either export an environment variable:

export LAMINDB_SYNC_GIT_REPO=<YOUR-GIT-REPO-URL>

Or set the following setting:

ln.settings.sync_git_repo = <YOUR-GIT-REPO-URL>

If you work on a single project in your lamindb instance, it makes sense to set LaminDB's dev-dir to the root of the local git repo clone. If you work on multiple projects, you can use the dev-dir as the local root and nest git repositories in it.

Use projects

You can link the entities created during a run to a project.

import lamindb as ln

my_project = ln.Project(name="My project").save()  # create a project

ln.track(project="My project")  # auto-link entities to "My project"

ln.Artifact(
    ln.examples.datasets.file_fcs(), key="my_file.fcs"
).save()  # save an artifact
 connected lamindb: testuser1/test-track
 created Transform('Gz67KQC61bkz0000', key='track.ipynb'), started new Run('3bLYytDb9x59PZQ9') at 2025-11-14 00:09:40 UTC
 notebook imports: lamindb==1.16.1
 recommendation: to identify the notebook across renames, pass the uid: ln.track("Gz67KQC61bkz", project="My project")
Artifact(uid='faY6InD33gqFvzHP0000', version=None, is_latest=True, key='my_file.fcs', description=None, suffix='.fcs', kind=None, otype=None, size=19330507, hash='rCPvmZB19xs4zHZ7p_-Wrg', n_files=None, n_observations=None, branch_id=1, space_id=1, storage_id=1, run_id=1, schema_id=None, created_by_id=1, created_at=2025-11-14 00:09:43 UTC, is_locked=False)

Filter entities by project, e.g., artifacts:

ln.Artifact.filter(projects=my_project).to_dataframe()
uid key description suffix kind otype size hash n_files n_observations version is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
1 faY6InD33gqFvzHP0000 my_file.fcs None .fcs None None 19330507 rCPvmZB19xs4zHZ7p_-Wrg None None None True False 2025-11-14 00:09:43.817000+00:00 1 1 1 1 None 1

Access entities linked to a project.

display(my_project.artifacts.to_dataframe())
display(my_project.transforms.to_dataframe())
display(my_project.runs.to_dataframe())
uid key description suffix kind otype size hash n_files n_observations version is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
1 faY6InD33gqFvzHP0000 my_file.fcs None .fcs None None 19330507 rCPvmZB19xs4zHZ7p_-Wrg None None None True False 2025-11-14 00:09:43.817000+00:00 1 1 1 1 None 1
uid key description type source_code hash reference reference_type version is_latest is_locked created_at branch_id space_id created_by_id _template_id
id
1 Gz67KQC61bkz0000 track.ipynb Track notebooks, scripts & functions notebook None None None None None True False 2025-11-14 00:09:40.976000+00:00 1 1 1 None
uid name started_at finished_at params reference reference_type is_locked created_at branch_id space_id transform_id report_id _logfile_id environment_id created_by_id initiated_by_run_id
id
1 3bLYytDb9x59PZQ9 None 2025-11-14 00:09:40.985637+00:00 None None None None False 2025-11-14 00:09:40.986000+00:00 1 1 1 None None None 1 None

Use spaces

You can write the entities created during a run into a space that you configure on LaminHub. This is particularly useful if you want to restrict access to a space. Note that this doesn't affect bionty entities, which should typically be commonly accessible.

ln.track(space="Our team space")

Track parameters & features

In addition to tracking source code, run reports & environments, you can track run parameters & features.

Let’s look at the following script, which has a few parameters.

run_track_with_params.py
import argparse
import lamindb as ln

if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--input-dir", type=str)
    p.add_argument("--downsample", action="store_true")
    p.add_argument("--learning-rate", type=float)
    args = p.parse_args()
    params = {
        "input_dir": args.input_dir,
        "learning_rate": args.learning_rate,
        "preprocess_params": {
            "downsample": args.downsample,
            "normalization": "the_good_one",
        },
    }
    ln.track(params=params)

    # your code

    ln.finish()

Run the script.

!python scripts/run_track_with_params.py  --input-dir ./mydataset --learning-rate 0.01 --downsample
 connected lamindb: testuser1/test-track
 script invoked with: --input-dir ./mydataset --learning-rate 0.01 --downsample
 created Transform('E56SfrCn1PsO0000', key='run_track_with_params.py'), started new Run('VWkp1ai5B8lIBBF5') at 2025-11-14 00:09:46 UTC
→ params: input_dir='./mydataset', learning_rate=0.01, preprocess_params={'downsample': True, 'normalization': 'the_good_one'}
 recommendation: to identify the script across renames, pass the uid: ln.track("E56SfrCn1PsO", params={...})

Query for all runs that match certain parameters:

ln.Run.filter(
    params__learning_rate=0.01,
    params__preprocess_params__downsample=True,
).to_dataframe()
uid name started_at finished_at params reference reference_type is_locked created_at branch_id space_id transform_id report_id _logfile_id environment_id created_by_id initiated_by_run_id
id
2 VWkp1ai5B8lIBBF5 None 2025-11-14 00:09:46.580372+00:00 2025-11-14 00:09:47.701835+00:00 {'input_dir': './mydataset', 'learning_rate': ... None None False 2025-11-14 00:09:46.581000+00:00 1 1 2 3 None 2 1 None
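The double-underscore syntax in the filter above walks the nested params JSON, Django-style. In plain Python the same lookup amounts to the following (the `get_nested` helper is made up for illustration):

```python
def get_nested(params: dict, lookup: str):
    """Resolve a Django-style 'a__b__c' lookup path against a nested dict."""
    value = params
    for part in lookup.split("__"):
        value = value[part]
    return value

params = {
    "input_dir": "./mydataset",
    "learning_rate": 0.01,
    "preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}
print(get_nested(params, "preprocess_params__downsample"))  # → True
```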

Describe & get parameters:

run = ln.Run.filter(params__learning_rate=0.01).order_by("-started_at").first()
run.describe()
run.params
Run: VWkp1ai (run_track_with_params.py)
├── uid: VWkp1ai5B8lIBBF5                transform: run_track_with_params.py (0000)
started_at: 2025-11-14 00:09:46 UTC  finished_at: 2025-11-14 00:09:47 UTC      
status: completed                                                              
branch: main                         space: all                                
created_at: 2025-11-14 00:09:46 UTC  created_by: testuser1                     
├── cli_args: 
--input-dir ./mydataset --learning-rate 0.01 --downsample
├── report: KkEHM4L
→ connected lamindb: testuser1/test-track
→ created Transform('E56SfrCn1PsO0000', key='run_track_with_params.py'), started …
→ params: input_dir='./mydataset', learning_rate=0.01, preprocess_params={'downs …
• recommendation: to identify the script across renames, pass the uid: ln.track( …
├── environment: qJzZ8KE
aiobotocore==2.25.2
aiohappyeyeballs==2.6.1
aiohttp==3.13.2
aioitertools==0.13.0
│ …
└── Params
    ├── input_dir: ./mydataset
    ├── learning_rate: 0.01
    └── preprocess_params: {'downsample': True, 'normalization': 'the_good_one'}
{'input_dir': './mydataset',
 'learning_rate': 0.01,
 'preprocess_params': {'downsample': True, 'normalization': 'the_good_one'}}

You can also access the CLI arguments used to start the run directly:

run.cli_args
'--input-dir ./mydataset --learning-rate 0.01 --downsample'

You can also track run features in analogy to artifact features.

In contrast to params, features are validated against the Feature registry and let you express relationships with entities in your registries.

Let’s first define labels & features.

experiment_type = ln.Record(name="Experiment", is_type=True).save()
experiment_label = ln.Record(name="Experiment1", type=experiment_type).save()
ln.Feature(name="s3_folder", dtype=str).save()
ln.Feature(name="experiment", dtype=experiment_type).save()
Feature(uid='5p0NT5CDrK5P', name='experiment', dtype='cat[Record[Experiment]]', is_type=None, unit=None, description=None, array_rank=0, array_size=0, array_shape=None, proxy_dtype=None, synonyms=None, branch_id=1, space_id=1, created_by_id=1, run_id=1, type_id=None, created_at=2025-11-14 00:09:48 UTC, is_locked=False)
!python scripts/run_track_with_features_and_params.py  --s3-folder s3://my-bucket/my-folder --experiment Experiment1
 connected lamindb: testuser1/test-track
 script invoked with: --s3-folder s3://my-bucket/my-folder --experiment Experiment1
 created Transform('56plLJkzwtzK0000', key='run_track_with_features_and_params.py'), started new Run('k7G0TPJw6GmILmzd') at 2025-11-14 00:09:50 UTC
→ params: example_param=42
→ features: s3_folder='s3://my-bucket/my-folder', experiment='Experiment1'
 recommendation: to identify the script across renames, pass the uid: ln.track("56plLJkzwtzK", params={...})
ln.Run.filter(s3_folder="s3://my-bucket/my-folder").to_dataframe()
uid name started_at finished_at params reference reference_type is_locked created_at branch_id space_id transform_id report_id _logfile_id environment_id created_by_id initiated_by_run_id
id
3 k7G0TPJw6GmILmzd None 2025-11-14 00:09:50.774497+00:00 2025-11-14 00:09:51.884975+00:00 {'example_param': 42} None None False 2025-11-14 00:09:50.775000+00:00 1 1 3 4 None 2 1 None

Describe & get feature values.

run2 = ln.Run.filter(
    s3_folder="s3://my-bucket/my-folder", experiment="Experiment1"
).last()
run2.describe()
run2.features.get_values()
Run: k7G0TPJ (run_track_with_features_and_params.py)
├── uid: k7G0TPJw6GmILmzd                transform: run_track_with_features_and_params.py (0000)
started_at: 2025-11-14 00:09:50 UTC  finished_at: 2025-11-14 00:09:51 UTC                   
status: completed                                                                           
branch: main                         space: all                                             
created_at: 2025-11-14 00:09:50 UTC  created_by: testuser1                                  
├── cli_args: 
--s3-folder s3://my-bucket/my-folder --experiment Experiment1
├── report: JX1xSxP
→ connected lamindb: testuser1/test-track
→ created Transform('56plLJkzwtzK0000', key='run_track_with_features_and_params. …
→ params: example_param=42
→ features: s3_folder='s3://my-bucket/my-folder', experiment='Experiment1'
│ …
├── environment: qJzZ8KE
aiobotocore==2.25.2
aiohappyeyeballs==2.6.1
aiohttp==3.13.2
aioitertools==0.13.0
│ …
├── Params
│   └── example_param: 42
└── Features
    └── experiment                      Record[Experiment]                 Experiment1                             
        s3_folder                       str                                s3://my-bucket/my-folder                
{'experiment': 'Experiment1', 's3_folder': 's3://my-bucket/my-folder'}

Track functions

If you want more fine-grained data lineage tracking, use the tracked() decorator.

@ln.tracked()
def subset_dataframe(
    input_artifact_key: str,
    output_artifact_key: str,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> None:
    artifact = ln.Artifact.get(key=input_artifact_key)
    dataset = artifact.load()
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    ln.Artifact.from_dataframe(new_data, key=output_artifact_key).save()

Prepare a test dataset:

df = ln.examples.datasets.mini_immuno.get_dataset1(otype="DataFrame")
input_artifact_key = "my_analysis/dataset.parquet"
artifact = ln.Artifact.from_dataframe(df, key=input_artifact_key).save()
 writing the in-memory object into cache

Run the function with default params:

ouput_artifact_key = input_artifact_key.replace(".parquet", "_subsetted.parquet")
subset_dataframe(input_artifact_key, ouput_artifact_key)
 writing the in-memory object into cache

Query for the output:

subsetted_artifact = ln.Artifact.get(key=ouput_artifact_key)
subsetted_artifact.view_lineage()
_images/4dbaaf99315d470310be684ce53b5d8b31f087fd5dda294479fb4f82d3721b88.svg

This is the run that created the subsetted_artifact:

subsetted_artifact.run
Run(uid='LyRcr7niQoFQAPuG', name=None, started_at=2025-11-14 00:09:52 UTC, finished_at=2025-11-14 00:09:52 UTC, params={'input_artifact_key': 'my_analysis/dataset.parquet', 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet', 'subset_rows': 2, 'subset_cols': 2}, reference=None, reference_type=None, branch_id=1, space_id=1, transform_id=4, report_id=None, environment_id=None, created_by_id=1, initiated_by_run_id=1, created_at=2025-11-14 00:09:52 UTC, is_locked=False)

This is the function that created it:

subsetted_artifact.run.transform
Transform(uid='0W5o4dv3lb8t0000', version=None, is_latest=True, key='track.ipynb/subset_dataframe.py', description=None, type='function', hash='CUqkJpolJY1Q1tqyCoWIWg', reference=None, reference_type=None, branch_id=1, space_id=1, created_by_id=1, created_at=2025-11-14 00:09:52 UTC, is_locked=False)

This is the source code of this function:

subsetted_artifact.run.transform.source_code
'@ln.tracked()\ndef subset_dataframe(\n    input_artifact_key: str,\n    output_artifact_key: str,\n    subset_rows: int = 2,\n    subset_cols: int = 2,\n) -> None:\n    artifact = ln.Artifact.get(key=input_artifact_key)\n    dataset = artifact.load()\n    new_data = dataset.iloc[:subset_rows, :subset_cols]\n    ln.Artifact.from_dataframe(new_data, key=output_artifact_key).save()\n'

These are all versions of this function:

subsetted_artifact.run.transform.versions.to_dataframe()
uid key description type source_code hash reference reference_type version is_latest is_locked created_at branch_id space_id created_by_id _template_id
id
4 0W5o4dv3lb8t0000 track.ipynb/subset_dataframe.py None function @ln.tracked()\ndef subset_dataframe(\n inpu... CUqkJpolJY1Q1tqyCoWIWg None None None True False 2025-11-14 00:09:52.489000+00:00 1 1 1 None

This is the initiating run that triggered the function call:

subsetted_artifact.run.initiated_by_run
Run(uid='3bLYytDb9x59PZQ9', name=None, started_at=2025-11-14 00:09:40 UTC, finished_at=None, params=None, reference=None, reference_type=None, branch_id=1, space_id=1, transform_id=1, report_id=None, environment_id=None, created_by_id=1, initiated_by_run_id=None, created_at=2025-11-14 00:09:40 UTC, is_locked=False)

This is the transform of the initiating run:

subsetted_artifact.run.initiated_by_run.transform
Transform(uid='Gz67KQC61bkz0000', version=None, is_latest=True, key='track.ipynb', description='Track notebooks, scripts & functions', type='notebook', hash=None, reference=None, reference_type=None, branch_id=1, space_id=1, created_by_id=1, created_at=2025-11-14 00:09:40 UTC, is_locked=False)

These are the parameters of the run:

subsetted_artifact.run.params
{'input_artifact_key': 'my_analysis/dataset.parquet',
 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet',
 'subset_rows': 2,
 'subset_cols': 2}

These are the input artifacts:

subsetted_artifact.run.input_artifacts.to_dataframe()
uid key description suffix kind otype size hash n_files n_observations version is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
5 ohM7UtLDryuBcENu0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 ug6ICnjB8oyqescoUDbYKg None 3 None True False 2025-11-14 00:09:52.466000+00:00 1 1 1 1 None 1

These are the output artifacts:

subsetted_artifact.run.output_artifacts.to_dataframe()
uid key description suffix kind otype size hash n_files n_observations version is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
6 btfxqsuE0bd5k3Ny0000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3696 siPfGX_YztG7sm3oNnHRUw None 2 None True False 2025-11-14 00:09:52.515000+00:00 1 1 1 4 None 1

Re-run the function with a different parameter:

subsetted_artifact = subset_dataframe(
    input_artifact_key, ouput_artifact_key, subset_cols=3
)
subsetted_artifact = ln.Artifact.get(key=ouput_artifact_key)
subsetted_artifact.view_lineage()
 writing the in-memory object into cache
 creating new artifact version for key 'my_analysis/dataset_subsetted.parquet' in storage '/home/runner/work/lamindb/lamindb/docs/test-track'
_images/0922c0563dd52d125d91b551fca21097bdc86432ae8e4141c867e39de5d4555e.svg

We created a new run:

subsetted_artifact.run
Run(uid='5ZkbG4WbPwN2eSDd', name=None, started_at=2025-11-14 00:09:52 UTC, finished_at=2025-11-14 00:09:52 UTC, params={'input_artifact_key': 'my_analysis/dataset.parquet', 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet', 'subset_rows': 2, 'subset_cols': 3}, reference=None, reference_type=None, branch_id=1, space_id=1, transform_id=4, report_id=None, environment_id=None, created_by_id=1, initiated_by_run_id=1, created_at=2025-11-14 00:09:52 UTC, is_locked=False)

With new parameters:

subsetted_artifact.run.params
{'input_artifact_key': 'my_analysis/dataset.parquet',
 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet',
 'subset_rows': 2,
 'subset_cols': 3}

And a new version of the output artifact:

subsetted_artifact.run.output_artifacts.to_dataframe()
uid key description suffix kind otype size hash n_files n_observations version is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
7 btfxqsuE0bd5k3Ny0001 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 4314 L3pK_0XXK30OIkqCY2_H9w None 2 None True False 2025-11-14 00:09:52.928000+00:00 1 1 1 5 None 1

See the state of the database:

ln.view()
Artifact
uid key description suffix kind otype size hash n_files n_observations version is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
7 btfxqsuE0bd5k3Ny0001 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 4314 L3pK_0XXK30OIkqCY2_H9w None 2.0 None True False 2025-11-14 00:09:52.928000+00:00 1 1 1 5 None 1
6 btfxqsuE0bd5k3Ny0000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3696 siPfGX_YztG7sm3oNnHRUw None 2.0 None False False 2025-11-14 00:09:52.515000+00:00 1 1 1 4 None 1
5 ohM7UtLDryuBcENu0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 ug6ICnjB8oyqescoUDbYKg None 3.0 None True False 2025-11-14 00:09:52.466000+00:00 1 1 1 1 None 1
1 faY6InD33gqFvzHP0000 my_file.fcs None .fcs None None 19330507 rCPvmZB19xs4zHZ7p_-Wrg None NaN None True False 2025-11-14 00:09:43.817000+00:00 1 1 1 1 None 1
Feature
uid name dtype is_type unit description array_rank array_size array_shape proxy_dtype synonyms is_locked created_at branch_id space_id created_by_id run_id type_id
id
2 5p0NT5CDrK5P experiment cat[Record[Experiment]] None None None 0 0 None None None False 2025-11-14 00:09:48.264000+00:00 1 1 1 1 None
1 Ab6chMrC6SX7 s3_folder str None None None 0 0 None None None False 2025-11-14 00:09:48.254000+00:00 1 1 1 1 None
FeatureValue
value hash is_locked created_at branch_id space_id created_by_id run_id feature_id
id
1 s3://my-bucket/my-folder E-3iWq1AziFBjh_cbyr5ZA False 2025-11-14 00:09:50.794000+00:00 1 1 1 None 1
Project
uid name description is_type abbr url start_date end_date is_locked created_at branch_id space_id created_by_id run_id type_id
id
1 j9R1SWCpZz41 My project None False None None None None False 2025-11-14 00:09:40.063000+00:00 1 1 1 None None
Record
uid name is_type description reference reference_type is_locked created_at branch_id space_id created_by_id type_id schema_id run_id
id
2 ObkZ847MyZpxYN6e Experiment1 False None None None False 2025-11-14 00:09:48.247000+00:00 1 1 1 1.0 None 1
1 0QkhnZ09WAY1CFRg Experiment True None None None False 2025-11-14 00:09:48.239000+00:00 1 1 1 NaN None 1
Run
uid name started_at finished_at params reference reference_type is_locked created_at branch_id space_id transform_id report_id _logfile_id environment_id created_by_id initiated_by_run_id
id
5 5ZkbG4WbPwN2eSDd None 2025-11-14 00:09:52.906449+00:00 2025-11-14 00:09:52.936915+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None False 2025-11-14 00:09:52.907000+00:00 1 1 4 NaN None NaN 1 1.0
4 LyRcr7niQoFQAPuG None 2025-11-14 00:09:52.494283+00:00 2025-11-14 00:09:52.522082+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None False 2025-11-14 00:09:52.494000+00:00 1 1 4 NaN None NaN 1 1.0
3 k7G0TPJw6GmILmzd None 2025-11-14 00:09:50.774497+00:00 2025-11-14 00:09:51.884975+00:00 {'example_param': 42} None None False 2025-11-14 00:09:50.775000+00:00 1 1 3 4.0 None 2.0 1 NaN
2 VWkp1ai5B8lIBBF5 None 2025-11-14 00:09:46.580372+00:00 2025-11-14 00:09:47.701835+00:00 {'input_dir': './mydataset', 'learning_rate': ... None None False 2025-11-14 00:09:46.581000+00:00 1 1 2 3.0 None 2.0 1 NaN
1 3bLYytDb9x59PZQ9 None 2025-11-14 00:09:40.985637+00:00 NaT None None None False 2025-11-14 00:09:40.986000+00:00 1 1 1 NaN None NaN 1 NaN
Storage
uid root description type region instance_uid is_locked created_at branch_id space_id created_by_id run_id
id
1 sAOSHaHJBZ3o /home/runner/work/lamindb/lamindb/docs/test-track None local None 73KPGC58ahU9 False 2025-11-14 00:09:36.891000+00:00 1 1 1 None
Transform
uid key description type source_code hash reference reference_type version is_latest is_locked created_at branch_id space_id created_by_id _template_id
id
4 0W5o4dv3lb8t0000 track.ipynb/subset_dataframe.py None function @ln.tracked()\ndef subset_dataframe(\n inpu... CUqkJpolJY1Q1tqyCoWIWg None None None True False 2025-11-14 00:09:52.489000+00:00 1 1 1 None
3 56plLJkzwtzK0000 run_track_with_features_and_params.py None script import argparse\nimport lamindb as ln\n\n\nif ... 9MjLyvM1QzE2nPIPDRzBwg None None None True False 2025-11-14 00:09:50.772000+00:00 1 1 1 None
2 E56SfrCn1PsO0000 run_track_with_params.py None script import argparse\nimport lamindb as ln\n\nif __... 5RBz7zJICeKE1OSmg7gEdQ None None None True False 2025-11-14 00:09:46.577000+00:00 1 1 1 None
1 Gz67KQC61bkz0000 track.ipynb Track notebooks, scripts & functions notebook None None None None None True False 2025-11-14 00:09:40.976000+00:00 1 1 1 None

In a script

run_workflow.py
import argparse
import lamindb as ln


@ln.tracked()
def subset_dataframe(
    artifact: ln.Artifact,
    subset_rows: int = 2,
    subset_cols: int = 2,
    run: ln.Run | None = None,
) -> ln.Artifact:
    dataset = artifact.load(is_run_input=run)
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    new_key = artifact.key.replace(".parquet", "_subsetted.parquet")
    return ln.Artifact.from_dataframe(new_data, key=new_key, run=run).save()


if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--subset", action="store_true")
    args = p.parse_args()

    params = {"is_subset": args.subset}

    ln.track(params=params)

    if args.subset:
        df = ln.examples.datasets.mini_immuno.get_dataset1(otype="DataFrame")
        artifact = ln.Artifact.from_dataframe(
            df, key="my_analysis/dataset.parquet"
        ).save()
        subsetted_artifact = subset_dataframe(artifact)

    ln.finish()
!python scripts/run_workflow.py --subset
 connected lamindb: testuser1/test-track
 script invoked with: --subset
 created Transform('Ao4H3f2xkXuQ0000', key='run_workflow.py'), started new Run('NNoXiQUvESmFDuJB') at 2025-11-14 00:09:55 UTC
→ params: is_subset=True
 recommendation: to identify the script across renames, pass the uid: ln.track("Ao4H3f2xkXuQ", params={...})
 writing the in-memory object into cache
 returning artifact with same hash: Artifact(uid='ohM7UtLDryuBcENu0000', version=None, is_latest=True, key='my_analysis/dataset.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=10354, hash='ug6ICnjB8oyqescoUDbYKg', n_files=None, n_observations=3, branch_id=1, space_id=1, storage_id=1, run_id=1, schema_id=None, created_by_id=1, created_at=2025-11-14 00:09:52 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()
! cannot infer feature type of: None, returning '?
! skipping param run because dtype not JSON serializable
 writing the in-memory object into cache
 returning artifact with same hash: Artifact(uid='btfxqsuE0bd5k3Ny0001', version=None, is_latest=True, key='my_analysis/dataset_subsetted.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=4314, hash='L3pK_0XXK30OIkqCY2_H9w', n_files=None, n_observations=2, branch_id=1, space_id=1, storage_id=1, run_id=5, schema_id=None, created_by_id=1, created_at=2025-11-14 00:09:52 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()
ln.view()
Artifact
uid key description suffix kind otype size hash n_files n_observations version is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
7 btfxqsuE0bd5k3Ny0001 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 4314 L3pK_0XXK30OIkqCY2_H9w None 2.0 None True False 2025-11-14 00:09:52.928000+00:00 1 1 1 5 None 1
6 btfxqsuE0bd5k3Ny0000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3696 siPfGX_YztG7sm3oNnHRUw None 2.0 None False False 2025-11-14 00:09:52.515000+00:00 1 1 1 4 None 1
5 ohM7UtLDryuBcENu0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 10354 ug6ICnjB8oyqescoUDbYKg None 3.0 None True False 2025-11-14 00:09:52.466000+00:00 1 1 1 1 None 1
1 faY6InD33gqFvzHP0000 my_file.fcs None .fcs None None 19330507 rCPvmZB19xs4zHZ7p_-Wrg None NaN None True False 2025-11-14 00:09:43.817000+00:00 1 1 1 1 None 1
Feature
uid name dtype is_type unit description array_rank array_size array_shape proxy_dtype synonyms is_locked created_at branch_id space_id created_by_id run_id type_id
id
2 5p0NT5CDrK5P experiment cat[Record[Experiment]] None None None 0 0 None None None False 2025-11-14 00:09:48.264000+00:00 1 1 1 1 None
1 Ab6chMrC6SX7 s3_folder str None None None 0 0 None None None False 2025-11-14 00:09:48.254000+00:00 1 1 1 1 None
FeatureValue
value hash is_locked created_at branch_id space_id created_by_id run_id feature_id
id
1 s3://my-bucket/my-folder E-3iWq1AziFBjh_cbyr5ZA False 2025-11-14 00:09:50.794000+00:00 1 1 1 None 1
Project
uid name description is_type abbr url start_date end_date is_locked created_at branch_id space_id created_by_id run_id type_id
id
1 j9R1SWCpZz41 My project None False None None None None False 2025-11-14 00:09:40.063000+00:00 1 1 1 None None
Record
uid name is_type description reference reference_type is_locked created_at branch_id space_id created_by_id type_id schema_id run_id
id
2 ObkZ847MyZpxYN6e Experiment1 False None None None False 2025-11-14 00:09:48.247000+00:00 1 1 1 1.0 None 1
1 0QkhnZ09WAY1CFRg Experiment True None None None False 2025-11-14 00:09:48.239000+00:00 1 1 1 NaN None 1
Run
uid name started_at finished_at params reference reference_type is_locked created_at branch_id space_id transform_id report_id _logfile_id environment_id created_by_id initiated_by_run_id
id
7 r0pDpN5a6vG2AUVH None 2025-11-14 00:09:56.829252+00:00 2025-11-14 00:09:56.850457+00:00 {'artifact': 'Artifact[ohM7UtLDryuBcENu0000]',... None None False 2025-11-14 00:09:56.830000+00:00 1 1 6 NaN None NaN 1 6.0
6 NNoXiQUvESmFDuJB None 2025-11-14 00:09:55.722677+00:00 2025-11-14 00:09:56.852208+00:00 {'is_subset': True} None None False 2025-11-14 00:09:55.723000+00:00 1 1 5 8.0 None 2.0 1 NaN
5 5ZkbG4WbPwN2eSDd None 2025-11-14 00:09:52.906449+00:00 2025-11-14 00:09:52.936915+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None False 2025-11-14 00:09:52.907000+00:00 1 1 4 NaN None NaN 1 1.0
4 LyRcr7niQoFQAPuG None 2025-11-14 00:09:52.494283+00:00 2025-11-14 00:09:52.522082+00:00 {'input_artifact_key': 'my_analysis/dataset.pa... None None False 2025-11-14 00:09:52.494000+00:00 1 1 4 NaN None NaN 1 1.0
3 k7G0TPJw6GmILmzd None 2025-11-14 00:09:50.774497+00:00 2025-11-14 00:09:51.884975+00:00 {'example_param': 42} None None False 2025-11-14 00:09:50.775000+00:00 1 1 3 4.0 None 2.0 1 NaN
2 VWkp1ai5B8lIBBF5 None 2025-11-14 00:09:46.580372+00:00 2025-11-14 00:09:47.701835+00:00 {'input_dir': './mydataset', 'learning_rate': ... None None False 2025-11-14 00:09:46.581000+00:00 1 1 2 3.0 None 2.0 1 NaN
1 3bLYytDb9x59PZQ9 None 2025-11-14 00:09:40.985637+00:00 NaT None None None False 2025-11-14 00:09:40.986000+00:00 1 1 1 NaN None NaN 1 NaN
Storage
uid root description type region instance_uid is_locked created_at branch_id space_id created_by_id run_id
id
1 sAOSHaHJBZ3o /home/runner/work/lamindb/lamindb/docs/test-track None local None 73KPGC58ahU9 False 2025-11-14 00:09:36.891000+00:00 1 1 1 None
Transform
uid key description type source_code hash reference reference_type version is_latest is_locked created_at branch_id space_id created_by_id _template_id
id
6 5kI0sKLtWyBc0000 run_workflow.py/subset_dataframe.py None function @ln.tracked()\ndef subset_dataframe(\n arti... 9NYMDP5l5Iuu9F8VrO3vWQ None None None True False 2025-11-14 00:09:56.827000+00:00 1 1 1 None
5 Ao4H3f2xkXuQ0000 run_workflow.py None script import argparse\nimport lamindb as ln\n\n\n@ln... fwij4oyLV27mmm9f2GVY_A None None None True False 2025-11-14 00:09:55.720000+00:00 1 1 1 None
4 0W5o4dv3lb8t0000 track.ipynb/subset_dataframe.py None function @ln.tracked()\ndef subset_dataframe(\n inpu... CUqkJpolJY1Q1tqyCoWIWg None None None True False 2025-11-14 00:09:52.489000+00:00 1 1 1 None
3 56plLJkzwtzK0000 run_track_with_features_and_params.py None script import argparse\nimport lamindb as ln\n\n\nif ... 9MjLyvM1QzE2nPIPDRzBwg None None None True False 2025-11-14 00:09:50.772000+00:00 1 1 1 None
2 E56SfrCn1PsO0000 run_track_with_params.py None script import argparse\nimport lamindb as ln\n\nif __... 5RBz7zJICeKE1OSmg7gEdQ None None None True False 2025-11-14 00:09:46.577000+00:00 1 1 1 None
1 Gz67KQC61bkz0000 track.ipynb Track notebooks, scripts & functions notebook None None None None None True False 2025-11-14 00:09:40.976000+00:00 1 1 1 None

Manage notebook templates

A notebook acts like a template when you load it with lamin load. Say you run:

lamin load https://lamin.ai/account/instance/transform/Akd7gx7Y9oVO0000

When you run the returned notebook, a new version is created automatically, and you can browse versions via the version dropdown on the UI.

Additionally, you can:

  • label it using Record, e.g., transform.records.add(template_label)

  • tag it with an indicative version string, e.g., transform.version = "T1"; transform.save()
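Put together, a hedged sketch of both steps (the uid comes from the URL above; the label name is a placeholder and is assumed to already exist in the Record registry):

```python
import lamindb as ln

transform = ln.Transform.get("Akd7gx7Y9oVO0000")  # uid from the URL above
template_label = ln.Record.get(name="template")   # hypothetical pre-existing label
transform.records.add(template_label)             # label the transform
transform.version = "T1"                          # tag with an indicative version string
transform.save()
```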

Saving a notebook as an artifact

Sometimes you might want to save a notebook as an artifact. This is how you can do it:

lamin save template1.ipynb --key templates/template1.ipynb --description "Template for analysis type 1" --registry artifact

A few checks at the end of this notebook:

assert run.params == {
    "input_dir": "./mydataset",
    "learning_rate": 0.01,
    "preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}, run.params
assert my_project.artifacts.exists()
assert my_project.transforms.exists()
assert my_project.runs.exists()