Track notebooks, scripts & functions

For tracking pipelines, see: Pipelines – workflow managers.

# pip install 'lamindb[jupyter]'
!lamin init --storage ./test-track
 initialized lamindb: testuser1/test-track

Track a notebook or script

Call track() to register your notebook or script as a transform and start capturing inputs & outputs of a run.

import lamindb as ln

ln.track()  # initiate a tracked notebook/script run

# your code automatically tracks inputs & outputs

ln.finish()  # mark run as finished, save execution report, source code & environment

Here is how a notebook with a run report looks on the hub.

Explore it here.

You can find your notebooks and scripts in the Transform registry (along with pipelines & functions); the Run registry stores their executions. All the usual query methods return one or multiple transform records, e.g.:

transform = ln.Transform.get(key="my_analyses/my_notebook.ipynb")
transform.source_code  # source code
transform.runs  # all runs
transform.latest_run.report  # report of latest run
transform.latest_run.environment  # environment of latest run

To load a notebook or script from the hub, search or filter the transform page and use the CLI.

lamin load https://lamin.ai/laminlabs/lamindata/transform/13VINnFk89PE

Use projects

You can link the entities created during a run to a project.

import lamindb as ln

my_project = ln.Project(name="My project").save()  # create a project

ln.track(project="My project")  # auto-link entities to "My project"

ln.Artifact(ln.core.datasets.file_fcs(), key="my_file.fcs").save()  # save an artifact
 connected lamindb: testuser1/test-track
 created Transform('TKQXkTqKXPCp0000'), started new Run('uvUULVsx...') at 2025-05-29 10:19:45 UTC
 notebook imports: lamindb==1.6a1
 recommendation: to identify the notebook across renames, pass the uid: ln.track("TKQXkTqKXPCp", project="My project")
Artifact(uid='gKt5dYYnOTPHda4F0000', is_latest=True, key='my_file.fcs', suffix='.fcs', size=19330507, hash='rCPvmZB19xs4zHZ7p_-Wrg', branch_id=1, space_id=1, storage_id=1, run_id=1, created_by_id=1, created_at=2025-05-29 10:19:47 UTC)

Filter entities by project, e.g., artifacts:

ln.Artifact.filter(projects=my_project).df()
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux branch_id
id
1 gKt5dYYnOTPHda4F0000 my_file.fcs None .fcs None None 19330507 rCPvmZB19xs4zHZ7p_-Wrg None None md5 True False 1 1 None None True 1 2025-05-29 10:19:47.927000+00:00 1 None 1

Access entities linked to a project.

display(my_project.artifacts.df())
display(my_project.transforms.df())
display(my_project.runs.df())
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux branch_id
id
1 gKt5dYYnOTPHda4F0000 my_file.fcs None .fcs None None 19330507 rCPvmZB19xs4zHZ7p_-Wrg None None md5 True False 1 1 None None True 1 2025-05-29 10:19:47.927000+00:00 1 None 1
uid key description type source_code hash reference reference_type space_id _template_id version is_latest created_at created_by_id _aux branch_id
id
1 TKQXkTqKXPCp0000 track.ipynb Track notebooks, scripts & functions notebook None None None None 1 None None True 2025-05-29 10:19:45.913000+00:00 1 None 1
uid name started_at finished_at reference reference_type _is_consecutive _status_code space_id transform_id report_id _logfile_id environment_id initiated_by_run_id created_at created_by_id _aux branch_id
id
1 uvUULVsxq1nLvvaS None 2025-05-29 10:19:45.924069+00:00 None None None None 0 1 1 None None None None 2025-05-29 10:19:45.924000+00:00 1 None 1

Use spaces

You can write the entities created during a run into a space that you configure on LaminHub. This is particularly useful if you want to restrict access to a space. Note that this doesn’t affect bionty entities, which should typically be commonly accessible.

ln.track(space="Our team space")

Track parameters

In addition to tracking source code, run reports & environments, you can track run parameters.

Track run parameters

First, define valid parameters, e.g.:

ln.Feature(name="input_dir", dtype=str).save()
ln.Feature(name="learning_rate", dtype=float).save()
ln.Feature(name="preprocess_params", dtype="dict").save()
Feature(uid='aOqd7QnDraQe', name='preprocess_params', dtype='dict', array_rank=0, array_size=0, branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-05-29 10:19:48 UTC)
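
The role of these definitions can be illustrated with a minimal pure-Python sketch. It is not lamindb's implementation: `DEFINED_FEATURES`, `ParamValidationError`, and `validate_params` are hypothetical names, and the type check is an assumption made for illustration. The sketch only shows the idea that top-level parameter names must be known before a run accepts them, while the contents of a dictionary-valued parameter are not inspected.

```python
# Hypothetical stand-in for the Feature registry: parameter name -> expected Python type.
DEFINED_FEATURES = {
    "input_dir": str,
    "learning_rate": float,
    "preprocess_params": dict,
}


class ParamValidationError(Exception):
    """Illustrative error type; this is not lamindb's ValidationError."""


def validate_params(params: dict) -> None:
    """Check top-level names and types only; nested dict contents are not inspected."""
    for name, value in params.items():
        if name not in DEFINED_FEATURES:
            raise ParamValidationError(f"parameter {name!r} is not defined")
        expected = DEFINED_FEATURES[name]
        if not isinstance(value, expected):
            raise ParamValidationError(
                f"parameter {name!r} expects {expected.__name__}, "
                f"got {type(value).__name__}"
            )


# All three top-level names are defined, so this call passes silently.
validate_params(
    {
        "input_dir": "./mydataset",
        "learning_rate": 0.01,
        "preprocess_params": {"downsample": True},  # nested keys are not checked
    }
)

try:
    validate_params({"batch_size": 32})  # "batch_size" was never defined
except ParamValidationError as error:
    print(error)
```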

If you hadn’t defined these parameters, you’d get a ValidationError in the following script.

run-track-with-params.py
import argparse
import lamindb as ln

if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--input-dir", type=str)
    p.add_argument("--downsample", action="store_true")
    p.add_argument("--learning-rate", type=float)
    args = p.parse_args()
    params = {
        "input_dir": args.input_dir,
        "learning_rate": args.learning_rate,
        "preprocess_params": {
            "downsample": args.downsample,  # nested parameter names & values in dictionaries are not validated
            "normalization": "the_good_one",
        },
    }
    ln.track(params=params)

    # your code

    ln.finish()

Run the script.

!python scripts/run-track-with-params.py  --input-dir ./mydataset --learning-rate 0.01 --downsample
 connected lamindb: testuser1/test-track
 created Transform('i5F0Ukhy6xrg0000'), started new Run('ywkultuc...') at 2025-05-29 10:19:50 UTC
→ params: input_dir=./mydataset, learning_rate=0.01, preprocess_params={'downsample': True, 'normalization': 'the_good_one'}
 recommendation: to identify the script across renames, pass the uid: ln.track("i5F0Ukhy6xrg", params={...})
 finished Run('ywkultuc') after 1s at 2025-05-29 10:19:51 UTC
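
To see how the command-line flags end up as the logged params, the argument parsing can be reproduced on its own, without lamindb. The sketch below reuses the parser from the script and passes the same flags as a list, which is how `argparse` lets you simulate a command line.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--input-dir", type=str)
parser.add_argument("--downsample", action="store_true")
parser.add_argument("--learning-rate", type=float)

# Passing a list to parse_args() simulates the command line used above.
args = parser.parse_args(
    ["--input-dir", "./mydataset", "--learning-rate", "0.01", "--downsample"]
)

params = {
    "input_dir": args.input_dir,  # "--input-dir" becomes the attribute input_dir
    "learning_rate": args.learning_rate,  # converted to a float by type=float
    "preprocess_params": {
        "downsample": args.downsample,  # True because the flag was passed
        "normalization": "the_good_one",
    },
}
print(params)
```

Omitting `--downsample` would set `args.downsample` to `False`, and omitting `--input-dir` or `--learning-rate` would leave the corresponding value as `None`.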

Query by run parameters

Query for all runs that match certain parameters:

ln.Run.filter(
    learning_rate=0.01, input_dir="./mydataset", preprocess_params__downsample=True
).df()
uid name started_at finished_at reference reference_type _is_consecutive _status_code space_id transform_id report_id _logfile_id environment_id initiated_by_run_id created_at created_by_id _aux branch_id
id
2 ywkultucvUzn6UdO None 2025-05-29 10:19:50.653376+00:00 2025-05-29 10:19:51.745997+00:00 None None True 0 1 2 3 None 2 None 2025-05-29 10:19:50.654000+00:00 1 None 1

Note that:

  • preprocess_params__downsample=True traverses the dictionary preprocess_params to find the key "downsample" and match it to True

  • nested keys like "downsample" in a dictionary do not appear in Feature and hence, do not get validated
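
The double-underscore traversal can be sketched in plain Python. This is only an illustration of the lookup idea, not how lamindb or its database backend resolves the query; `lookup` and `matches` are hypothetical helpers, and the sketch assumes that parameter names themselves contain no double underscore.

```python
from typing import Any

_MISSING = object()  # sentinel for "path does not exist"


def lookup(values: dict, path: str) -> Any:
    """Follow a double-underscore path such as 'preprocess_params__downsample'."""
    current: Any = values
    for part in path.split("__"):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current


def matches(values: dict, **conditions: Any) -> bool:
    """Return True if every condition path resolves to the expected value."""
    return all(
        lookup(values, path) == expected for path, expected in conditions.items()
    )


run_params = {
    "input_dir": "./mydataset",
    "learning_rate": 0.01,
    "preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}

print(
    matches(
        run_params,
        learning_rate=0.01,
        input_dir="./mydataset",
        preprocess_params__downsample=True,
    )
)
print(matches(run_params, preprocess_params__downsample=False))
```

The first call prints `True` and the second prints `False`, because the nested value stored under "downsample" is `True`.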

Access parameters of a run

Below is how you get the parameter values that were used for a given run.

run = ln.Run.filter(learning_rate=0.01).order_by("-started_at").first()
run.features.get_values()
{'input_dir': './mydataset',
 'learning_rate': 0.01,
 'preprocess_params': {'downsample': True, 'normalization': 'the_good_one'}}

Here is how it looks on the hub: [image]

Explore parameter values

If you want to query all parameter values together with other feature values, use FeatureValue.

ln.models.FeatureValue.df(include=["feature__name", "created_by__handle"])
id  value                                              hash                    feature__name      created_by__handle
1   ./mydataset                                        71I4KdtOlqWZYoR9KaVTvw  input_dir          testuser1
2   0.01                                               BIF-_RHBU2Sm7COXgAOIYg  learning_rate      testuser1
3   {'downsample': True, 'normalization': 'the_goo...  4ehQH8UO25aNM181K_gloQ  preprocess_params  testuser1

Track functions

If you want more fine-grained data lineage tracking, use the tracked() decorator.

In a notebook

ln.Feature(name="subset_rows", dtype="int").save()  # define parameters
ln.Feature(name="subset_cols", dtype="int").save()
ln.Feature(name="input_artifact_key", dtype="str").save()
ln.Feature(name="output_artifact_key", dtype="str").save()
Feature(uid='eIxOu3OUEEOP', name='output_artifact_key', dtype='str', array_rank=0, array_size=0, branch_id=1, space_id=1, created_by_id=1, run_id=1, created_at=2025-05-29 10:19:52 UTC)

Define a function and decorate it with tracked():

@ln.tracked()
def subset_dataframe(
    input_artifact_key: str,
    output_artifact_key: str,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> None:
    artifact = ln.Artifact.get(key=input_artifact_key)
    dataset = artifact.load()
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    ln.Artifact.from_df(new_data, key=output_artifact_key).save()

Prepare a test dataset:

df = ln.core.datasets.small_dataset1(otype="DataFrame")
input_artifact_key = "my_analysis/dataset.parquet"
artifact = ln.Artifact.from_df(df, key=input_artifact_key).save()

Run the function with default params:

ouput_artifact_key = input_artifact_key.replace(".parquet", "_subsetted.parquet")
subset_dataframe(input_artifact_key, ouput_artifact_key)

Query for the output:

subsetted_artifact = ln.Artifact.get(key=ouput_artifact_key)
subsetted_artifact.view_lineage()
[Figure: lineage graph of subsetted_artifact]

This is the run that created the subsetted_artifact:

subsetted_artifact.run
Run(uid='tZUhOLq62659MKuT', started_at=2025-05-29 10:19:52 UTC, finished_at=2025-05-29 10:19:52 UTC, branch_id=1, space_id=1, transform_id=3, created_by_id=1, initiated_by_run_id=1, created_at=2025-05-29 10:19:52 UTC)

This is the function that created it:

subsetted_artifact.run.transform
Transform(uid='YrSTlpY0RCk00000', is_latest=True, key='track.ipynb/subset_dataframe.py', type='function', hash='F_wwrfFs6zmzMGVilG2Prg', branch_id=1, space_id=1, created_by_id=1, created_at=2025-05-29 10:19:52 UTC)

This is the source code of this function:

subsetted_artifact.run.transform.source_code
'@ln.tracked()\ndef subset_dataframe(\n    input_artifact_key: str,\n    output_artifact_key: str,\n    subset_rows: int = 2,\n    subset_cols: int = 2,\n) -> None:\n    artifact = ln.Artifact.get(key=input_artifact_key)\n    dataset = artifact.load()\n    new_data = dataset.iloc[:subset_rows, :subset_cols]\n    ln.Artifact.from_df(new_data, key=output_artifact_key).save()\n'

These are all versions of this function:

subsetted_artifact.run.transform.versions.df()
uid key description type source_code hash reference reference_type space_id _template_id version is_latest created_at created_by_id _aux branch_id
id
3 YrSTlpY0RCk00000 track.ipynb/subset_dataframe.py None function @ln.tracked()\ndef subset_dataframe(\n inpu... F_wwrfFs6zmzMGVilG2Prg None None 1 None None True 2025-05-29 10:19:52.330000+00:00 1 None 1

This is the initiating run that triggered the function call:

subsetted_artifact.run.initiated_by_run
Run(uid='uvUULVsxq1nLvvaS', started_at=2025-05-29 10:19:45 UTC, branch_id=1, space_id=1, transform_id=1, created_by_id=1, created_at=2025-05-29 10:19:45 UTC)

This is the transform of the initiating run:

subsetted_artifact.run.initiated_by_run.transform
Transform(uid='TKQXkTqKXPCp0000', is_latest=True, key='track.ipynb', description='Track notebooks, scripts & functions', type='notebook', branch_id=1, space_id=1, created_by_id=1, created_at=2025-05-29 10:19:45 UTC)

These are the parameters of the run:

subsetted_artifact.run.features.get_values()
{'input_artifact_key': 'my_analysis/dataset.parquet',
 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet',
 'subset_cols': 2,
 'subset_rows': 2}
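
Note that subset_rows and subset_cols were recorded even though the call relied on their defaults. How a decorator can capture defaults along with explicit arguments is sketched below in plain Python. This is not lamindb's implementation of tracked(): `capture_params`, `captured_calls`, and `subset` are hypothetical names, and the in-memory list merely stands in for run records.

```python
import functools
import inspect

captured_calls = []  # hypothetical in-memory log standing in for run records


def capture_params(func):
    """Record the fully resolved arguments of every call, defaults included."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()  # fill in parameters the caller did not pass
        captured_calls.append(
            {"function": func.__name__, "params": dict(bound.arguments)}
        )
        return func(*args, **kwargs)

    return wrapper


@capture_params
def subset(
    input_key: str, output_key: str, subset_rows: int = 2, subset_cols: int = 2
) -> None:
    pass  # placeholder for the actual work


subset("in.parquet", "out.parquet")  # relies on both defaults
subset("in.parquet", "out.parquet", subset_cols=3)  # overrides one default

for call in captured_calls:
    print(call)
```

Binding the arguments against the function signature is what makes the record independent of whether a value was passed positionally, by keyword, or not at all.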

These are the input artifacts:

subsetted_artifact.run.input_artifacts.df()
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux branch_id
id
4 gHp3TtsX3RNT9U1d0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 9108 D2ZSlO6x7-OIfdf0MkTzRQ None 3 md5 True False 1 1 None None True 1 2025-05-29 10:19:52.309000+00:00 1 None 1

These are the output artifacts:

subsetted_artifact.run.output_artifacts.df()
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux branch_id
id
5 UscBcrAtzZpUzKV80000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3238 dNHL-WWN3PCVS9pyW8pKHA None 2 md5 True False 1 1 None None True 3 2025-05-29 10:19:52.382000+00:00 1 None 1

Re-run the function with a different parameter:

subset_dataframe(input_artifact_key, ouput_artifact_key, subset_cols=3)  # returns None, so there is nothing to assign
subsetted_artifact = ln.Artifact.get(key=ouput_artifact_key)
subsetted_artifact.view_lineage()
 creating new artifact version for key='my_analysis/dataset_subsetted.parquet' (storage: '/home/runner/work/lamindb/lamindb/docs/test-track')
[Figure: lineage graph of the new version of subsetted_artifact]

We created a new run:

subsetted_artifact.run
Run(uid='24AT5Jq51dbCiiw5', started_at=2025-05-29 10:19:52 UTC, finished_at=2025-05-29 10:19:53 UTC, branch_id=1, space_id=1, transform_id=3, created_by_id=1, initiated_by_run_id=1, created_at=2025-05-29 10:19:52 UTC)

With new parameters:

subsetted_artifact.run.features.get_values()
{'input_artifact_key': 'my_analysis/dataset.parquet',
 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet',
 'subset_cols': 3,
 'subset_rows': 2}

And a new version of the output artifact:

subsetted_artifact.run.output_artifacts.df()
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux branch_id
id
6 UscBcrAtzZpUzKV80001 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3852 gBAN-0lok9-D61VnFuHrAA None 2 md5 True False 1 1 None None True 4 2025-05-29 10:19:53.042000+00:00 1 None 1
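
Judging from the outputs on this page, the two versions of the output artifact share all but the last four characters of their uid (UscBcrAtzZpUzKV80000 and UscBcrAtzZpUzKV80001), and the uid that track() recommends passing is the transform uid without its last four characters. Under that assumption, a small helper can tell whether two records are versions of the same entity. `split_uid` and `same_entity` are hypothetical helpers, not part of lamindb, and the suffix length of 4 is inferred from the outputs rather than taken from the library's documentation.

```python
def split_uid(uid: str, suffix_length: int = 4) -> tuple[str, str]:
    """Split a uid into (stem, version suffix); the suffix length is an assumption."""
    return uid[:-suffix_length], uid[-suffix_length:]


def same_entity(uid_a: str, uid_b: str) -> bool:
    """Two uids are treated as versions of one entity if their stems match."""
    return split_uid(uid_a)[0] == split_uid(uid_b)[0]


first_version = "UscBcrAtzZpUzKV80000"
second_version = "UscBcrAtzZpUzKV80001"
input_artifact = "gHp3TtsX3RNT9U1d0000"

print(split_uid(first_version))
print(split_uid(second_version))
print(same_entity(first_version, second_version))
print(same_entity(first_version, input_artifact))
```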

See the state of the database:

ln.view()
Artifact
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux branch_id
id
6 UscBcrAtzZpUzKV80001 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3852 gBAN-0lok9-D61VnFuHrAA None 2.0 md5 True False 1 1 None None True 4.0 2025-05-29 10:19:53.042000+00:00 1 None 1
5 UscBcrAtzZpUzKV80000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3238 dNHL-WWN3PCVS9pyW8pKHA None 2.0 md5 True False 1 1 None None False 3.0 2025-05-29 10:19:52.382000+00:00 1 None 1
4 gHp3TtsX3RNT9U1d0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 9108 D2ZSlO6x7-OIfdf0MkTzRQ None 3.0 md5 True False 1 1 None None True 1.0 2025-05-29 10:19:52.309000+00:00 1 None 1
3 r3dQJYCiYFwVTcEY0000 None log streams of run ywkultucvUzn6UdO .txt __lamindb_run__ None 0 1B2M2Y8AsgTpgAmY7PhCfg None NaN md5 True False 1 1 None None True NaN 2025-05-29 10:19:51.750000+00:00 1 None 1
2 I0nWY20NcECnGNG80000 None requirements.txt .txt __lamindb_run__ None 3992 0xW_aLpu4vmJOrumPkmK3Q None NaN md5 True False 1 1 None None True NaN 2025-05-29 10:19:51.743000+00:00 1 None 1
1 gKt5dYYnOTPHda4F0000 my_file.fcs None .fcs None None 19330507 rCPvmZB19xs4zHZ7p_-Wrg None NaN md5 True False 1 1 None None True 1.0 2025-05-29 10:19:47.927000+00:00 1 None 1
Feature
uid name dtype is_type unit description array_rank array_size array_shape proxy_dtype synonyms _expect_many _curation space_id type_id run_id created_at created_by_id _aux branch_id
id
7 eIxOu3OUEEOP output_artifact_key str None None None 0 0 None None None None None 1 None 1 2025-05-29 10:19:52.254000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
6 9W2Dv2m2n3JP input_artifact_key str None None None 0 0 None None None None None 1 None 1 2025-05-29 10:19:52.245000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
5 N3jku7V4oY3l subset_cols int None None None 0 0 None None None None None 1 None 1 2025-05-29 10:19:52.233000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
4 9oV4wAkzf6rM subset_rows int None None None 0 0 None None None None None 1 None 1 2025-05-29 10:19:52.223000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
3 aOqd7QnDraQe preprocess_params dict None None None 0 0 None None None None None 1 None 1 2025-05-29 10:19:48.087000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
2 4GpwrS78pQD9 learning_rate float None None None 0 0 None None None None None 1 None 1 2025-05-29 10:19:48.078000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
1 lSa8RKg4djld input_dir str None None None 0 0 None None None None None 1 None 1 2025-05-29 10:19:48.069000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
FeatureValue
value hash space_id feature_id run_id created_at created_by_id _aux branch_id
id
1 ./mydataset 71I4KdtOlqWZYoR9KaVTvw 1 1 NaN 2025-05-29 10:19:50.670000+00:00 1 None 1
2 0.01 BIF-_RHBU2Sm7COXgAOIYg 1 2 NaN 2025-05-29 10:19:50.672000+00:00 1 None 1
3 {'downsample': True, 'normalization': 'the_goo... 4ehQH8UO25aNM181K_gloQ 1 3 NaN 2025-05-29 10:19:50.674000+00:00 1 None 1
4 2 yB5yjZ1ML2NvBn-JzBSGLA 1 4 1.0 2025-05-29 10:19:52.352000+00:00 1 None 1
5 2 yB5yjZ1ML2NvBn-JzBSGLA 1 5 1.0 2025-05-29 10:19:52.354000+00:00 1 None 1
6 my_analysis/dataset.parquet 1ImgyYl4KlCl3XCd-aQE9Q 1 6 1.0 2025-05-29 10:19:52.357000+00:00 1 None 1
7 my_analysis/dataset_subsetted.parquet G9luXJ51Hi4-Csrifos0Lw 1 7 1.0 2025-05-29 10:19:52.359000+00:00 1 None 1
Project
uid name is_type abbr url start_date end_date _status_code space_id type_id run_id created_at created_by_id _aux branch_id
id
1 iLouzfX8BtU2 My project False None None None None 0 1 None None 2025-05-29 10:19:45.045000+00:00 1 None 1
Run
uid name started_at finished_at reference reference_type _is_consecutive _status_code space_id transform_id report_id _logfile_id environment_id initiated_by_run_id created_at created_by_id _aux branch_id
id
1 uvUULVsxq1nLvvaS None 2025-05-29 10:19:45.924069+00:00 NaT None None None 0 1 1 NaN None NaN NaN 2025-05-29 10:19:45.924000+00:00 1 None 1
2 ywkultucvUzn6UdO None 2025-05-29 10:19:50.653376+00:00 2025-05-29 10:19:51.745997+00:00 None None True 0 1 2 3.0 None 2.0 NaN 2025-05-29 10:19:50.654000+00:00 1 None 1
3 tZUhOLq62659MKuT None 2025-05-29 10:19:52.335063+00:00 2025-05-29 10:19:52.387657+00:00 None None None 0 1 3 NaN None NaN 1.0 2025-05-29 10:19:52.335000+00:00 1 None 1
4 24AT5Jq51dbCiiw5 None 2025-05-29 10:19:52.997163+00:00 2025-05-29 10:19:53.048450+00:00 None None None 0 1 3 NaN None NaN 1.0 2025-05-29 10:19:52.997000+00:00 1 None 1
Storage
uid root description type region instance_uid space_id run_id created_at created_by_id _aux branch_id
id
1 EWfiiSeyG5xo /home/runner/work/lamindb/lamindb/docs/test-track None local None 73KPGC58ahU9 1 None 2025-05-29 10:19:41.976000+00:00 1 None 1
Transform
uid key description type source_code hash reference reference_type space_id _template_id version is_latest created_at created_by_id _aux branch_id
id
3 YrSTlpY0RCk00000 track.ipynb/subset_dataframe.py None function @ln.tracked()\ndef subset_dataframe(\n inpu... F_wwrfFs6zmzMGVilG2Prg None None 1 None None True 2025-05-29 10:19:52.330000+00:00 1 None 1
2 i5F0Ukhy6xrg0000 run-track-with-params.py run-track-with-params.py script import argparse\nimport lamindb as ln\n\nif __... nRUs3ZjuVTbKtBmSXpVQ5A None None 1 None None True 2025-05-29 10:19:50.650000+00:00 1 None 1
1 TKQXkTqKXPCp0000 track.ipynb Track notebooks, scripts & functions notebook None None None None 1 None None True 2025-05-29 10:19:45.913000+00:00 1 None 1

In a script

run-workflow.py
import argparse
import lamindb as ln

ln.Param(name="run_workflow_subset", dtype=bool).save()


@ln.tracked()
def subset_dataframe(
    artifact: ln.Artifact,
    subset_rows: int = 2,
    subset_cols: int = 2,
    run: ln.Run | None = None,
) -> ln.Artifact:
    dataset = artifact.load(is_run_input=run)
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    new_key = artifact.key.replace(".parquet", "_subsetted.parquet")
    return ln.Artifact.from_df(new_data, key=new_key, run=run).save()


if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--subset", action="store_true")
    args = p.parse_args()

    params = {"run_workflow_subset": args.subset}

    ln.track(params=params)

    if args.subset:
        df = ln.core.datasets.small_dataset1(otype="DataFrame")
        artifact = ln.Artifact.from_df(df, key="my_analysis/dataset.parquet").save()
        subsetted_artifact = subset_dataframe(artifact)

    ln.finish()

Run the script.

!python scripts/run-workflow.py --subset
 connected lamindb: testuser1/test-track
 created Transform('mJyK6DMmFSjh0000'), started new Run('X6Ok6k5K...') at 2025-05-29 10:19:56 UTC
→ params: run_workflow_subset=True
 recommendation: to identify the script across renames, pass the uid: ln.track("mJyK6DMmFSjh", params={...})
 returning existing artifact with same hash: Artifact(uid='gHp3TtsX3RNT9U1d0000', is_latest=True, key='my_analysis/dataset.parquet', suffix='.parquet', kind='dataset', otype='DataFrame', size=9108, hash='D2ZSlO6x7-OIfdf0MkTzRQ', n_observations=3, branch_id=1, space_id=1, storage_id=1, run_id=1, created_by_id=1, created_at=2025-05-29 10:19:52 UTC); to track this artifact as an input, use: ln.Artifact.get()
 returning existing artifact with same hash: Artifact(uid='UscBcrAtzZpUzKV80001', is_latest=True, key='my_analysis/dataset_subsetted.parquet', suffix='.parquet', kind='dataset', otype='DataFrame', size=3852, hash='gBAN-0lok9-D61VnFuHrAA', n_observations=2, branch_id=1, space_id=1, storage_id=1, run_id=4, created_by_id=1, created_at=2025-05-29 10:19:53 UTC); to track this artifact as an input, use: ln.Artifact.get()
 returning existing artifact with same hash: Artifact(uid='r3dQJYCiYFwVTcEY0000', is_latest=True, description='log streams of run ywkultucvUzn6UdO', suffix='.txt', kind='__lamindb_run__', size=0, hash='1B2M2Y8AsgTpgAmY7PhCfg', branch_id=1, space_id=1, storage_id=1, created_by_id=1, created_at=2025-05-29 10:19:51 UTC); to track this artifact as an input, use: ln.Artifact.get()
! updated description from log streams of run ywkultucvUzn6UdO to log streams of run X6Ok6k5KLWCH3YVi
 finished Run('X6Ok6k5K') after 0s at 2025-05-29 10:19:57 UTC

See the state of the database:

ln.view()
Artifact
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux branch_id
id
6 UscBcrAtzZpUzKV80001 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3852 gBAN-0lok9-D61VnFuHrAA None 2.0 md5 True False 1 1 None None True 4.0 2025-05-29 10:19:53.042000+00:00 1 None 1
5 UscBcrAtzZpUzKV80000 my_analysis/dataset_subsetted.parquet None .parquet dataset DataFrame 3238 dNHL-WWN3PCVS9pyW8pKHA None 2.0 md5 True False 1 1 None None False 3.0 2025-05-29 10:19:52.382000+00:00 1 None 1
4 gHp3TtsX3RNT9U1d0000 my_analysis/dataset.parquet None .parquet dataset DataFrame 9108 D2ZSlO6x7-OIfdf0MkTzRQ None 3.0 md5 True False 1 1 None None True 1.0 2025-05-29 10:19:52.309000+00:00 1 None 1
3 r3dQJYCiYFwVTcEY0000 None log streams of run X6Ok6k5KLWCH3YVi .txt __lamindb_run__ None 0 1B2M2Y8AsgTpgAmY7PhCfg None NaN md5 True False 1 1 None None True NaN 2025-05-29 10:19:51.750000+00:00 1 None 1
2 I0nWY20NcECnGNG80000 None requirements.txt .txt __lamindb_run__ None 3992 0xW_aLpu4vmJOrumPkmK3Q None NaN md5 True False 1 1 None None True NaN 2025-05-29 10:19:51.743000+00:00 1 None 1
1 gKt5dYYnOTPHda4F0000 my_file.fcs None .fcs None None 19330507 rCPvmZB19xs4zHZ7p_-Wrg None NaN md5 True False 1 1 None None True 1.0 2025-05-29 10:19:47.927000+00:00 1 None 1
Feature
uid name dtype is_type unit description array_rank array_size array_shape proxy_dtype synonyms _expect_many _curation space_id type_id run_id created_at created_by_id _aux branch_id
id
8 nfNwjo69qpTA run_workflow_subset bool None None None 0 0 None None None None None 1 None NaN 2025-05-29 10:19:55.800000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
7 eIxOu3OUEEOP output_artifact_key str None None None 0 0 None None None None None 1 None 1.0 2025-05-29 10:19:52.254000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
6 9W2Dv2m2n3JP input_artifact_key str None None None 0 0 None None None None None 1 None 1.0 2025-05-29 10:19:52.245000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
5 N3jku7V4oY3l subset_cols int None None None 0 0 None None None None None 1 None 1.0 2025-05-29 10:19:52.233000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
4 9oV4wAkzf6rM subset_rows int None None None 0 0 None None None None None 1 None 1.0 2025-05-29 10:19:52.223000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
3 aOqd7QnDraQe preprocess_params dict None None None 0 0 None None None None None 1 None 1.0 2025-05-29 10:19:48.087000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
2 4GpwrS78pQD9 learning_rate float None None None 0 0 None None None None None 1 None 1.0 2025-05-29 10:19:48.078000+00:00 1 {'af': {'0': None, '1': True, '2': False}} 1
FeatureValue
value hash space_id feature_id run_id created_at created_by_id _aux branch_id
id
1 ./mydataset 71I4KdtOlqWZYoR9KaVTvw 1 1 NaN 2025-05-29 10:19:50.670000+00:00 1 None 1
2 0.01 BIF-_RHBU2Sm7COXgAOIYg 1 2 NaN 2025-05-29 10:19:50.672000+00:00 1 None 1
3 {'downsample': True, 'normalization': 'the_goo... 4ehQH8UO25aNM181K_gloQ 1 3 NaN 2025-05-29 10:19:50.674000+00:00 1 None 1
4 2 yB5yjZ1ML2NvBn-JzBSGLA 1 4 1.0 2025-05-29 10:19:52.352000+00:00 1 None 1
5 2 yB5yjZ1ML2NvBn-JzBSGLA 1 5 1.0 2025-05-29 10:19:52.354000+00:00 1 None 1
6 my_analysis/dataset.parquet 1ImgyYl4KlCl3XCd-aQE9Q 1 6 1.0 2025-05-29 10:19:52.357000+00:00 1 None 1
7 my_analysis/dataset_subsetted.parquet G9luXJ51Hi4-Csrifos0Lw 1 7 1.0 2025-05-29 10:19:52.359000+00:00 1 None 1
Project
uid name is_type abbr url start_date end_date _status_code space_id type_id run_id created_at created_by_id _aux branch_id
id
1 iLouzfX8BtU2 My project False None None None None 0 1 None None 2025-05-29 10:19:45.045000+00:00 1 None 1
Run
uid name started_at finished_at reference reference_type _is_consecutive _status_code space_id transform_id report_id _logfile_id environment_id initiated_by_run_id created_at created_by_id _aux branch_id
id
1 uvUULVsxq1nLvvaS None 2025-05-29 10:19:45.924069+00:00 NaT None None None 0 1 1 NaN None NaN NaN 2025-05-29 10:19:45.924000+00:00 1 None 1
2 ywkultucvUzn6UdO None 2025-05-29 10:19:50.653376+00:00 2025-05-29 10:19:51.745997+00:00 None None True 0 1 2 3.0 None 2.0 NaN 2025-05-29 10:19:50.654000+00:00 1 None 1
3 tZUhOLq62659MKuT None 2025-05-29 10:19:52.335063+00:00 2025-05-29 10:19:52.387657+00:00 None None None 0 1 3 NaN None NaN 1.0 2025-05-29 10:19:52.335000+00:00 1 None 1
4 24AT5Jq51dbCiiw5 None 2025-05-29 10:19:52.997163+00:00 2025-05-29 10:19:53.048450+00:00 None None None 0 1 3 NaN None NaN 1.0 2025-05-29 10:19:52.997000+00:00 1 None 1
5 X6Ok6k5KLWCH3YVi None 2025-05-29 10:19:56.290934+00:00 2025-05-29 10:19:57.201598+00:00 None None True 0 1 4 3.0 None 2.0 NaN 2025-05-29 10:19:56.291000+00:00 1 None 1
6 NlWF9g0ls2s4kJs5 None 2025-05-29 10:19:57.157657+00:00 2025-05-29 10:19:57.197733+00:00 None None None 0 1 5 NaN None NaN 5.0 2025-05-29 10:19:57.158000+00:00 1 None 1
Storage
uid root description type region instance_uid space_id run_id created_at created_by_id _aux branch_id
id
1 EWfiiSeyG5xo /home/runner/work/lamindb/lamindb/docs/test-track None local None 73KPGC58ahU9 1 None 2025-05-29 10:19:41.976000+00:00 1 None 1
Transform
uid key description type source_code hash reference reference_type space_id _template_id version is_latest created_at created_by_id _aux branch_id
id
5 ZzhWkBHdVJk30000 run-workflow.py/subset_dataframe.py None function @ln.tracked()\ndef subset_dataframe(\n arti... Dqbr_hMfHs17EhbPXP_PyQ None None 1 None None True 2025-05-29 10:19:57.155000+00:00 1 None 1
4 mJyK6DMmFSjh0000 run-workflow.py run-workflow.py script import argparse\nimport lamindb as ln\n\nln.Pa... yqr8j5hTUulVRzv4J-o9SQ None None 1 None None True 2025-05-29 10:19:56.288000+00:00 1 None 1
3 YrSTlpY0RCk00000 track.ipynb/subset_dataframe.py None function @ln.tracked()\ndef subset_dataframe(\n inpu... F_wwrfFs6zmzMGVilG2Prg None None 1 None None True 2025-05-29 10:19:52.330000+00:00 1 None 1
2 i5F0Ukhy6xrg0000 run-track-with-params.py run-track-with-params.py script import argparse\nimport lamindb as ln\n\nif __... nRUs3ZjuVTbKtBmSXpVQ5A None None 1 None None True 2025-05-29 10:19:50.650000+00:00 1 None 1
1 TKQXkTqKXPCp0000 track.ipynb Track notebooks, scripts & functions notebook None None None None 1 None None True 2025-05-29 10:19:45.913000+00:00 1 None 1

Sync scripts with git

To sync with your git commit, add the following line to your script:

ln.settings.sync_git_repo = <YOUR-GIT-REPO-URL>

synced-with-git.py
import lamindb as ln

ln.settings.sync_git_repo = "https://github.com/..."
ln.track()
# your code
ln.finish()

You’ll now see a clickable GitHub emoji on the hub.

Manage notebook templates

A notebook acts like a template when you load it with lamin load. Suppose you run:

lamin load https://lamin.ai/account/instance/transform/Akd7gx7Y9oVO0000

When you run the returned notebook, you automatically create a new version, which you can browse via the version dropdown in the UI.

Additionally, you can:

  • label it using ULabel, e.g., transform.ulabels.add(template_label)

  • tag it with an indicative version string, e.g., transform.version = "T1"; transform.save()

Save a notebook as an artifact

Sometimes you might want to save a notebook as an artifact. This is how you can do it:

lamin save template1.ipynb --key templates/template1.ipynb --description "Template for analysis type 1" --registry artifact
assert run.features.get_values() == {
    "input_dir": "./mydataset",
    "learning_rate": 0.01,
    "preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}

assert my_project.artifacts.exists()
assert my_project.transforms.exists()
assert my_project.runs.exists()

# clean up test instance
!rm -r ./test-track
!lamin delete --force test-track
 deleting instance testuser1/test-track