Track notebooks, scripts & functions¶
For tracking pipelines, see docs:pipelines.
# pip install lamindb
!lamin init --storage ./test-track
Show code cell output
→ initialized lamindb: testuser1/test-track
Track a notebook or script¶
Call track() to register your notebook or script as a transform and start capturing inputs & outputs of a run.
import lamindb as ln
ln.track() # initiate a tracked notebook/script run
# your code automatically tracks inputs & outputs
ln.finish() # mark run as finished, save execution report, source code & environment
Here is how a notebook with a run report looks on the hub.
Explore it here.
You find your notebooks and scripts in the Transform registry (along with pipelines & functions). Run stores executions.
You can use all the usual ways of querying to obtain one or multiple transform records, e.g.:
transform = ln.Transform.get(key="my_analyses/my_notebook.ipynb")
transform.source_code # source code
transform.runs # all runs
transform.latest_run.report # report of latest run
transform.latest_run.environment # environment of latest run
To load a notebook or script from the hub, search or filter the transform page and use the CLI.
lamin load https://lamin.ai/laminlabs/lamindata/transform/13VINnFk89PE
Organize local development¶
If no development directory is set, script & notebook keys equal their filenames. Otherwise, they equal the path relative to the development directory.
To set the development directory to your shell's current working directory, run:
lamin settings set dev-dir .
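As a sketch of the key mapping (paths below are hypothetical, for illustration only): once a development directory is set, a transform's key is the file path relative to that directory, which can be mimicked with the standard library:

```python
from pathlib import Path

# hypothetical paths, for illustration only
dev_dir = Path("/home/user/analyses")
notebook_path = dev_dir / "project1/my_notebook.ipynb"

# with a dev-dir set, the transform key is the path relative to it
key = notebook_path.relative_to(dev_dir).as_posix()
print(key)  # project1/my_notebook.ipynb
```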
You can see the current status by running:
lamin info
Sync scripts with git¶
To sync scripts with a git repo, either export an environment variable:
export LAMINDB_SYNC_GIT_REPO=<YOUR-GIT-REPO-URL>
Or set the following setting:
ln.settings.sync_git_repo = <YOUR-GIT-REPO-URL>
If you work on a single project in your lamindb instance, it makes sense to set LaminDB’s dev-dir to the root of the local git repo clone.
If you work on multiple projects in your lamindb instance, you can use the dev-dir as the local root and nest git repositories in it.
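For instance, the multi-project setup could look like this (directory and file names are hypothetical):

```
dev-dir/                  # local root, set via `lamin settings set dev-dir .`
├── project-a/            # git repo clone
│   └── analysis.ipynb    # transform key: project-a/analysis.ipynb
└── project-b/            # git repo clone
    └── train.py          # transform key: project-b/train.py
```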
Use projects¶
You can link the entities created during a run to a project.
import lamindb as ln
my_project = ln.Project(name="My project").save() # create a project
ln.track(project="My project") # auto-link entities to "My project"
ln.Artifact(
    ln.examples.datasets.file_fcs(), key="my_file.fcs"
).save()  # save an artifact
Show code cell output
→ connected lamindb: testuser1/test-track
→ created Transform('NsUIB9UUK0LO0000', key='track.ipynb'), started new Run('GH1KOCjpxPWDC7G2') at 2025-11-05 21:33:13 UTC
→ notebook imports: lamindb==1.15.0
• recommendation: to identify the notebook across renames, pass the uid: ln.track("NsUIB9UUK0LO", project="My project")
Artifact(uid='ZNpsjozpAihYHBUv0000', version=None, is_latest=True, key='my_file.fcs', description=None, suffix='.fcs', kind=None, otype=None, size=19330507, hash='rCPvmZB19xs4zHZ7p_-Wrg', n_files=None, n_observations=None, branch_id=1, space_id=1, storage_id=1, run_id=1, schema_id=None, created_by_id=1, created_at=2025-11-05 21:33:15 UTC, is_locked=False)
Filter entities by project, e.g., artifacts:
ln.Artifact.filter(projects=my_project).to_dataframe()
Show code cell output
| uid | key | description | suffix | kind | otype | size | hash | n_files | n_observations | version | is_latest | is_locked | created_at | branch_id | space_id | storage_id | run_id | schema_id | created_by_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||||||
| 1 | ZNpsjozpAihYHBUv0000 | my_file.fcs | None | .fcs | None | None | 19330507 | rCPvmZB19xs4zHZ7p_-Wrg | None | None | None | True | False | 2025-11-05 21:33:15.431000+00:00 | 1 | 1 | 1 | 1 | None | 1 |
Access entities linked to a project.
display(my_project.artifacts.to_dataframe())
display(my_project.transforms.to_dataframe())
display(my_project.runs.to_dataframe())
Show code cell output
| uid | key | description | suffix | kind | otype | size | hash | n_files | n_observations | version | is_latest | is_locked | created_at | branch_id | space_id | storage_id | run_id | schema_id | created_by_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||||||
| 1 | ZNpsjozpAihYHBUv0000 | my_file.fcs | None | .fcs | None | None | 19330507 | rCPvmZB19xs4zHZ7p_-Wrg | None | None | None | True | False | 2025-11-05 21:33:15.431000+00:00 | 1 | 1 | 1 | 1 | None | 1 |
| uid | key | description | type | source_code | hash | reference | reference_type | version | is_latest | is_locked | created_at | branch_id | space_id | created_by_id | _template_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||
| 1 | NsUIB9UUK0LO0000 | track.ipynb | Track notebooks, scripts & functions | notebook | None | None | None | None | None | True | False | 2025-11-05 21:33:13.177000+00:00 | 1 | 1 | 1 | None |
| uid | name | started_at | finished_at | params | reference | reference_type | is_locked | created_at | branch_id | space_id | transform_id | report_id | _logfile_id | environment_id | created_by_id | initiated_by_run_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | |||||||||||||||||
| 1 | GH1KOCjpxPWDC7G2 | None | 2025-11-05 21:33:13.185094+00:00 | None | None | None | None | False | 2025-11-05 21:33:13.185000+00:00 | 1 | 1 | 1 | None | None | None | 1 | None |
Use spaces¶
You can write the entities created during a run into a space that you configure on LaminHub. This is particularly useful if you want to restrict access to a space. Note that this doesn't affect bionty entities, which should typically remain commonly accessible.
ln.track(space="Our team space")
Track parameters & features¶
In addition to tracking source code, run reports & environments, you can track run parameters & features.
Let’s look at the following script, which has a few parameters.
import argparse
import lamindb as ln

if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--input-dir", type=str)
    p.add_argument("--downsample", action="store_true")
    p.add_argument("--learning-rate", type=float)
    args = p.parse_args()
    params = {
        "input_dir": args.input_dir,
        "learning_rate": args.learning_rate,
        "preprocess_params": {
            "downsample": args.downsample,
            "normalization": "the_good_one",
        },
    }
    ln.track(params=params)
    # your code
    ln.finish()
Run the script.
!python scripts/run_track_with_params.py --input-dir ./mydataset --learning-rate 0.01 --downsample
Show code cell output
→ connected lamindb: testuser1/test-track
→ created Transform('jTSyIY1c5q580000', key='run_track_with_params.py'), started new Run('HbyLpugWPxxF8c1n') at 2025-11-05 21:33:17 UTC
→ params: input_dir='./mydataset', learning_rate=0.01, preprocess_params={'downsample': True, 'normalization': 'the_good_one'}
• recommendation: to identify the script across renames, pass the uid: ln.track("jTSyIY1c5q58", params={...})
Query for all runs that match certain parameters:
ln.Run.filter(
    params__learning_rate=0.01,
    params__preprocess_params__downsample=True,
).to_dataframe()
Show code cell output
| uid | name | started_at | finished_at | params | reference | reference_type | is_locked | created_at | branch_id | space_id | transform_id | report_id | _logfile_id | environment_id | created_by_id | initiated_by_run_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | |||||||||||||||||
| 2 | HbyLpugWPxxF8c1n | None | 2025-11-05 21:33:17.903804+00:00 | 2025-11-05 21:33:19.027712+00:00 | {'input_dir': './mydataset', 'learning_rate': ... | None | None | False | 2025-11-05 21:33:17.904000+00:00 | 1 | 1 | 2 | 3 | None | 2 | 1 | None |
Describe & get parameters:
run = ln.Run.filter(params__learning_rate=0.01).order_by("-started_at").first()
run.describe()
run.params
Show code cell output
Run: HbyLpug (run_track_with_params.py)
├── uid: HbyLpugWPxxF8c1n    transform: run_track_with_params.py (0000)
│   started_at: 2025-11-05 21:33:17 UTC    finished_at: 2025-11-05 21:33:19 UTC
│   status: completed
│   branch: main    space: all
│   created_at: 2025-11-05 21:33:17 UTC    created_by: testuser1
├── Params
│   ├── input_dir: ./mydataset
│   ├── learning_rate: 0.01
│   └── preprocess_params: {'downsample': True, 'normalization': 'the_good_one'}
├── report: xNkImBr
│   │ → connected lamindb: testuser1/test-track
│   │ → created Transform('jTSyIY1c5q580000', key='run_track_with_params.py'), started …
│   │ → params: input_dir='./mydataset', learning_rate=0.01, preprocess_params={'downs …
│   │ • recommendation: to identify the script across renames, pass the uid: ln.track( …
└── environment: AjdcpwV
    │ aiobotocore==2.25.1
    │ aiohappyeyeballs==2.6.1
    │ aiohttp==3.13.2
    │ aioitertools==0.12.0
    │ …
{'input_dir': './mydataset',
'learning_rate': 0.01,
'preprocess_params': {'downsample': True, 'normalization': 'the_good_one'}}
You can also track run features in analogy to artifact features.
In contrast to params, features are validated against the Feature registry and allow you to express relationships with entities in your registries.
Let’s first define labels & features.
experiment_type = ln.Record(name="Experiment", is_type=True).save()
experiment_label = ln.Record(name="Experiment1", type=experiment_type).save()
ln.Feature(name="s3_folder", dtype=str).save()
ln.Feature(name="experiment", dtype=experiment_type).save()
Show code cell output
Feature(uid='rfawEgXITG7Y', name='experiment', dtype='cat[Record[Experiment]]', is_type=None, unit=None, description=None, array_rank=0, array_size=0, array_shape=None, proxy_dtype=None, synonyms=None, branch_id=1, space_id=1, created_by_id=1, run_id=1, type_id=None, created_at=2025-11-05 21:33:19 UTC, is_locked=False)
!python scripts/run_track_with_features_and_params.py --s3-folder s3://my-bucket/my-folder --experiment Experiment1
Show code cell output
→ connected lamindb: testuser1/test-track
→ created Transform('qfXlgJu7IUvm0000', key='run_track_with_features_and_params.py'), started new Run('cMLJQuzQGVF7ujZ0') at 2025-11-05 21:33:21 UTC
→ params: example_param=42
→ features: s3_folder='s3://my-bucket/my-folder', experiment='Experiment1'
• recommendation: to identify the script across renames, pass the uid: ln.track("qfXlgJu7IUvm", params={...})
ln.Run.filter(s3_folder="s3://my-bucket/my-folder").to_dataframe()
Show code cell output
| uid | name | started_at | finished_at | params | reference | reference_type | is_locked | created_at | branch_id | space_id | transform_id | report_id | _logfile_id | environment_id | created_by_id | initiated_by_run_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | |||||||||||||||||
| 3 | cMLJQuzQGVF7ujZ0 | None | 2025-11-05 21:33:21.964465+00:00 | 2025-11-05 21:33:23.052524+00:00 | {'example_param': 42} | None | None | False | 2025-11-05 21:33:21.965000+00:00 | 1 | 1 | 3 | 4 | None | 2 | 1 | None |
Describe & get feature values.
run2 = ln.Run.filter(
    s3_folder="s3://my-bucket/my-folder", experiment="Experiment1"
).last()
run2.describe()
run2.features.get_values()
Show code cell output
Run: cMLJQuz (run_track_with_features_and_params.py)
├── uid: cMLJQuzQGVF7ujZ0    transform: run_track_with_features_and_params.py (0000)
│   started_at: 2025-11-05 21:33:21 UTC    finished_at: 2025-11-05 21:33:23 UTC
│   status: completed
│   branch: main    space: all
│   created_at: 2025-11-05 21:33:21 UTC    created_by: testuser1
├── Params
│   └── example_param: 42
├── Features
│   └── experiment    Record[Experiment]    Experiment1
│       s3_folder     str                   s3://my-bucket/my-folder
├── report: WN2A3bn
│   │ → connected lamindb: testuser1/test-track
│   │ → created Transform('qfXlgJu7IUvm0000', key='run_track_with_features_and_params. …
│   │ → params: example_param=42
│   │ → features: s3_folder='s3://my-bucket/my-folder', experiment='Experiment1'
│   │ …
└── environment: AjdcpwV
    │ aiobotocore==2.25.1
    │ aiohappyeyeballs==2.6.1
    │ aiohttp==3.13.2
    │ aioitertools==0.12.0
    │ …
{'experiment': 'Experiment1', 's3_folder': 's3://my-bucket/my-folder'}
Track functions¶
If you want more fine-grained data lineage tracking, use the tracked() decorator.
@ln.tracked()
def subset_dataframe(
    input_artifact_key: str,
    output_artifact_key: str,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> None:
    artifact = ln.Artifact.get(key=input_artifact_key)
    dataset = artifact.load()
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    ln.Artifact.from_dataframe(new_data, key=output_artifact_key).save()
Prepare a test dataset:
df = ln.examples.datasets.mini_immuno.get_dataset1(otype="DataFrame")
input_artifact_key = "my_analysis/dataset.parquet"
artifact = ln.Artifact.from_dataframe(df, key=input_artifact_key).save()
→ writing the in-memory object into cache
Run the function with default params:
ouput_artifact_key = input_artifact_key.replace(".parquet", "_subsetted.parquet")
subset_dataframe(input_artifact_key, ouput_artifact_key)
Show code cell output
→ writing the in-memory object into cache
Query for the output:
subsetted_artifact = ln.Artifact.get(key=ouput_artifact_key)
subsetted_artifact.view_lineage()
This is the run that created the subsetted_artifact:
subsetted_artifact.run
Run(uid='TZ78Fuha16Et2dVF', name=None, started_at=2025-11-05 21:33:23 UTC, finished_at=2025-11-05 21:33:23 UTC, params={'input_artifact_key': 'my_analysis/dataset.parquet', 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet', 'subset_rows': 2, 'subset_cols': 2}, reference=None, reference_type=None, branch_id=1, space_id=1, transform_id=4, report_id=None, environment_id=None, created_by_id=1, initiated_by_run_id=1, created_at=2025-11-05 21:33:23 UTC, is_locked=False)
This is the function that created it:
subsetted_artifact.run.transform
Transform(uid='fxXqxn3AbO8W0000', version=None, is_latest=True, key='track.ipynb/subset_dataframe.py', description=None, type='function', hash='CUqkJpolJY1Q1tqyCoWIWg', reference=None, reference_type=None, branch_id=1, space_id=1, created_by_id=1, created_at=2025-11-05 21:33:23 UTC, is_locked=False)
This is the source code of this function:
subsetted_artifact.run.transform.source_code
'@ln.tracked()\ndef subset_dataframe(\n input_artifact_key: str,\n output_artifact_key: str,\n subset_rows: int = 2,\n subset_cols: int = 2,\n) -> None:\n artifact = ln.Artifact.get(key=input_artifact_key)\n dataset = artifact.load()\n new_data = dataset.iloc[:subset_rows, :subset_cols]\n ln.Artifact.from_dataframe(new_data, key=output_artifact_key).save()\n'
These are all versions of this function:
subsetted_artifact.run.transform.versions.to_dataframe()
| uid | key | description | type | source_code | hash | reference | reference_type | version | is_latest | is_locked | created_at | branch_id | space_id | created_by_id | _template_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||
| 4 | fxXqxn3AbO8W0000 | track.ipynb/subset_dataframe.py | None | function | @ln.tracked()\ndef subset_dataframe(\n inpu... | CUqkJpolJY1Q1tqyCoWIWg | None | None | None | True | False | 2025-11-05 21:33:23.658000+00:00 | 1 | 1 | 1 | None |
This is the initiating run that triggered the function call:
subsetted_artifact.run.initiated_by_run
Run(uid='GH1KOCjpxPWDC7G2', name=None, started_at=2025-11-05 21:33:13 UTC, finished_at=None, params=None, reference=None, reference_type=None, branch_id=1, space_id=1, transform_id=1, report_id=None, environment_id=None, created_by_id=1, initiated_by_run_id=None, created_at=2025-11-05 21:33:13 UTC, is_locked=False)
This is the transform of the initiating run:
subsetted_artifact.run.initiated_by_run.transform
Transform(uid='NsUIB9UUK0LO0000', version=None, is_latest=True, key='track.ipynb', description='Track notebooks, scripts & functions', type='notebook', hash=None, reference=None, reference_type=None, branch_id=1, space_id=1, created_by_id=1, created_at=2025-11-05 21:33:13 UTC, is_locked=False)
These are the parameters of the run:
subsetted_artifact.run.params
{'input_artifact_key': 'my_analysis/dataset.parquet',
'output_artifact_key': 'my_analysis/dataset_subsetted.parquet',
'subset_rows': 2,
'subset_cols': 2}
These are the input artifacts:
subsetted_artifact.run.input_artifacts.to_dataframe()
| uid | key | description | suffix | kind | otype | size | hash | n_files | n_observations | version | is_latest | is_locked | created_at | branch_id | space_id | storage_id | run_id | schema_id | created_by_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||||||
| 5 | 8Evkq51yeMs59ehI0000 | my_analysis/dataset.parquet | None | .parquet | dataset | DataFrame | 9868 | wvfEBPwHL3XHiAb-o8fU6Q | None | 3 | None | True | False | 2025-11-05 21:33:23.636000+00:00 | 1 | 1 | 1 | 1 | None | 1 |
These are the output artifacts:
subsetted_artifact.run.output_artifacts.to_dataframe()
| uid | key | description | suffix | kind | otype | size | hash | n_files | n_observations | version | is_latest | is_locked | created_at | branch_id | space_id | storage_id | run_id | schema_id | created_by_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||||||
| 6 | XPpcBJ7ChYBYcurQ0000 | my_analysis/dataset_subsetted.parquet | None | .parquet | dataset | DataFrame | 3238 | UM8d9C-x_2fbc_46BScp8A | None | 2 | None | True | False | 2025-11-05 21:33:23.681000+00:00 | 1 | 1 | 1 | 4 | None | 1 |
Re-run the function with a different parameter:
subsetted_artifact = subset_dataframe(
    input_artifact_key, ouput_artifact_key, subset_cols=3
)
subsetted_artifact = ln.Artifact.get(key=ouput_artifact_key)
subsetted_artifact.view_lineage()
Show code cell output
→ writing the in-memory object into cache
→ creating new artifact version for key 'my_analysis/dataset_subsetted.parquet' in storage '/home/runner/work/lamindb/lamindb/docs/test-track'
We created a new run:
subsetted_artifact.run
Run(uid='dPdmxb0XgHfkBR0F', name=None, started_at=2025-11-05 21:33:24 UTC, finished_at=2025-11-05 21:33:24 UTC, params={'input_artifact_key': 'my_analysis/dataset.parquet', 'output_artifact_key': 'my_analysis/dataset_subsetted.parquet', 'subset_rows': 2, 'subset_cols': 3}, reference=None, reference_type=None, branch_id=1, space_id=1, transform_id=4, report_id=None, environment_id=None, created_by_id=1, initiated_by_run_id=1, created_at=2025-11-05 21:33:24 UTC, is_locked=False)
With new parameters:
subsetted_artifact.run.params
{'input_artifact_key': 'my_analysis/dataset.parquet',
'output_artifact_key': 'my_analysis/dataset_subsetted.parquet',
'subset_rows': 2,
'subset_cols': 3}
And a new version of the output artifact:
subsetted_artifact.run.output_artifacts.to_dataframe()
| uid | key | description | suffix | kind | otype | size | hash | n_files | n_observations | version | is_latest | is_locked | created_at | branch_id | space_id | storage_id | run_id | schema_id | created_by_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||||||
| 7 | XPpcBJ7ChYBYcurQ0001 | my_analysis/dataset_subsetted.parquet | None | .parquet | dataset | DataFrame | 3852 | 7WGuLVamVyBMhPb2qRE_tA | None | 2 | None | True | False | 2025-11-05 21:33:24.145000+00:00 | 1 | 1 | 1 | 5 | None | 1 |
See the state of the database:
ln.view()
Show code cell output
Artifact
| uid | key | description | suffix | kind | otype | size | hash | n_files | n_observations | version | is_latest | is_locked | created_at | branch_id | space_id | storage_id | run_id | schema_id | created_by_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||||||
| 7 | XPpcBJ7ChYBYcurQ0001 | my_analysis/dataset_subsetted.parquet | None | .parquet | dataset | DataFrame | 3852 | 7WGuLVamVyBMhPb2qRE_tA | None | 2.0 | None | True | False | 2025-11-05 21:33:24.145000+00:00 | 1 | 1 | 1 | 5 | None | 1 |
| 6 | XPpcBJ7ChYBYcurQ0000 | my_analysis/dataset_subsetted.parquet | None | .parquet | dataset | DataFrame | 3238 | UM8d9C-x_2fbc_46BScp8A | None | 2.0 | None | False | False | 2025-11-05 21:33:23.681000+00:00 | 1 | 1 | 1 | 4 | None | 1 |
| 5 | 8Evkq51yeMs59ehI0000 | my_analysis/dataset.parquet | None | .parquet | dataset | DataFrame | 9868 | wvfEBPwHL3XHiAb-o8fU6Q | None | 3.0 | None | True | False | 2025-11-05 21:33:23.636000+00:00 | 1 | 1 | 1 | 1 | None | 1 |
| 1 | ZNpsjozpAihYHBUv0000 | my_file.fcs | None | .fcs | None | None | 19330507 | rCPvmZB19xs4zHZ7p_-Wrg | None | NaN | None | True | False | 2025-11-05 21:33:15.431000+00:00 | 1 | 1 | 1 | 1 | None | 1 |
Feature
| uid | name | dtype | is_type | unit | description | array_rank | array_size | array_shape | proxy_dtype | synonyms | is_locked | created_at | branch_id | space_id | created_by_id | run_id | type_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||||
| 2 | rfawEgXITG7Y | experiment | cat[Record[Experiment]] | None | None | None | 0 | 0 | None | None | None | False | 2025-11-05 21:33:19.571000+00:00 | 1 | 1 | 1 | 1 | None |
| 1 | oZanpf4iDBvn | s3_folder | str | None | None | None | 0 | 0 | None | None | None | False | 2025-11-05 21:33:19.563000+00:00 | 1 | 1 | 1 | 1 | None |
FeatureValue
| value | hash | is_locked | created_at | branch_id | space_id | created_by_id | run_id | feature_id | |
|---|---|---|---|---|---|---|---|---|---|
| id | |||||||||
| 1 | s3://my-bucket/my-folder | E-3iWq1AziFBjh_cbyr5ZA | False | 2025-11-05 21:33:21.981000+00:00 | 1 | 1 | 1 | None | 1 |
Project
| uid | name | description | is_type | abbr | url | start_date | end_date | is_locked | created_at | branch_id | space_id | created_by_id | run_id | type_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | |||||||||||||||
| 1 | v8wpORq3rUbx | My project | None | False | None | None | None | None | False | 2025-11-05 21:33:12.288000+00:00 | 1 | 1 | 1 | None | None |
Record
| uid | name | is_type | description | reference | reference_type | is_locked | created_at | branch_id | space_id | created_by_id | type_id | schema_id | run_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||
| 2 | vRnr9CiGgvYvRulP | Experiment1 | False | None | None | None | False | 2025-11-05 21:33:19.556000+00:00 | 1 | 1 | 1 | 1.0 | None | 1 |
| 1 | AlxqRCB6Hjq58tV4 | Experiment | True | None | None | None | False | 2025-11-05 21:33:19.551000+00:00 | 1 | 1 | 1 | NaN | None | 1 |
Run
| uid | name | started_at | finished_at | params | reference | reference_type | is_locked | created_at | branch_id | space_id | transform_id | report_id | _logfile_id | environment_id | created_by_id | initiated_by_run_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | |||||||||||||||||
| 5 | dPdmxb0XgHfkBR0F | None | 2025-11-05 21:33:24.123504+00:00 | 2025-11-05 21:33:24.152096+00:00 | {'input_artifact_key': 'my_analysis/dataset.pa... | None | None | False | 2025-11-05 21:33:24.124000+00:00 | 1 | 1 | 4 | NaN | None | NaN | 1 | 1.0 |
| 4 | TZ78Fuha16Et2dVF | None | 2025-11-05 21:33:23.662631+00:00 | 2025-11-05 21:33:23.688634+00:00 | {'input_artifact_key': 'my_analysis/dataset.pa... | None | None | False | 2025-11-05 21:33:23.663000+00:00 | 1 | 1 | 4 | NaN | None | NaN | 1 | 1.0 |
| 3 | cMLJQuzQGVF7ujZ0 | None | 2025-11-05 21:33:21.964465+00:00 | 2025-11-05 21:33:23.052524+00:00 | {'example_param': 42} | None | None | False | 2025-11-05 21:33:21.965000+00:00 | 1 | 1 | 3 | 4.0 | None | 2.0 | 1 | NaN |
| 2 | HbyLpugWPxxF8c1n | None | 2025-11-05 21:33:17.903804+00:00 | 2025-11-05 21:33:19.027712+00:00 | {'input_dir': './mydataset', 'learning_rate': ... | None | None | False | 2025-11-05 21:33:17.904000+00:00 | 1 | 1 | 2 | 3.0 | None | 2.0 | 1 | NaN |
| 1 | GH1KOCjpxPWDC7G2 | None | 2025-11-05 21:33:13.185094+00:00 | NaT | None | None | None | False | 2025-11-05 21:33:13.185000+00:00 | 1 | 1 | 1 | NaN | None | NaN | 1 | NaN |
Storage
| uid | root | description | type | region | instance_uid | is_locked | created_at | branch_id | space_id | created_by_id | run_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||
| 1 | wOBrh4mOTrHA | /home/runner/work/lamindb/lamindb/docs/test-track | None | local | None | 73KPGC58ahU9 | False | 2025-11-05 21:33:09.257000+00:00 | 1 | 1 | 1 | None |
Transform
| uid | key | description | type | source_code | hash | reference | reference_type | version | is_latest | is_locked | created_at | branch_id | space_id | created_by_id | _template_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||
| 4 | fxXqxn3AbO8W0000 | track.ipynb/subset_dataframe.py | None | function | @ln.tracked()\ndef subset_dataframe(\n inpu... | CUqkJpolJY1Q1tqyCoWIWg | None | None | None | True | False | 2025-11-05 21:33:23.658000+00:00 | 1 | 1 | 1 | None |
| 3 | qfXlgJu7IUvm0000 | run_track_with_features_and_params.py | None | script | import argparse\nimport lamindb as ln\n\n\nif ... | 9MjLyvM1QzE2nPIPDRzBwg | None | None | None | True | False | 2025-11-05 21:33:21.962000+00:00 | 1 | 1 | 1 | None |
| 2 | jTSyIY1c5q580000 | run_track_with_params.py | None | script | import argparse\nimport lamindb as ln\n\nif __... | 5RBz7zJICeKE1OSmg7gEdQ | None | None | None | True | False | 2025-11-05 21:33:17.901000+00:00 | 1 | 1 | 1 | None |
| 1 | NsUIB9UUK0LO0000 | track.ipynb | Track notebooks, scripts & functions | notebook | None | None | None | None | None | True | False | 2025-11-05 21:33:13.177000+00:00 | 1 | 1 | 1 | None |
In a script¶
import argparse
import lamindb as ln


@ln.tracked()
def subset_dataframe(
    artifact: ln.Artifact,
    subset_rows: int = 2,
    subset_cols: int = 2,
    run: ln.Run | None = None,
) -> ln.Artifact:
    dataset = artifact.load(is_run_input=run)
    new_data = dataset.iloc[:subset_rows, :subset_cols]
    new_key = artifact.key.replace(".parquet", "_subsetted.parquet")
    return ln.Artifact.from_dataframe(new_data, key=new_key, run=run).save()


if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("--subset", action="store_true")
    args = p.parse_args()
    params = {"is_subset": args.subset}
    ln.track(params=params)
    if args.subset:
        df = ln.examples.datasets.mini_immuno.get_dataset1(otype="DataFrame")
        artifact = ln.Artifact.from_dataframe(
            df, key="my_analysis/dataset.parquet"
        ).save()
        subsetted_artifact = subset_dataframe(artifact)
    ln.finish()
!python scripts/run_workflow.py --subset
Show code cell output
→ connected lamindb: testuser1/test-track
→ created Transform('Z0Yx1gPOS3xm0000', key='run_workflow.py'), started new Run('aq3c1QYU63k2q1NL') at 2025-11-05 21:33:26 UTC
→ params: is_subset=True
• recommendation: to identify the script across renames, pass the uid: ln.track("Z0Yx1gPOS3xm", params={...})
→ writing the in-memory object into cache
→ returning artifact with same hash: Artifact(uid='8Evkq51yeMs59ehI0000', version=None, is_latest=True, key='my_analysis/dataset.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=9868, hash='wvfEBPwHL3XHiAb-o8fU6Q', n_files=None, n_observations=3, branch_id=1, space_id=1, storage_id=1, run_id=1, schema_id=None, created_by_id=1, created_at=2025-11-05 21:33:23 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()
! cannot infer feature type of: None, returning '?
! skipping param run because dtype not JSON serializable
→ writing the in-memory object into cache
→ returning artifact with same hash: Artifact(uid='XPpcBJ7ChYBYcurQ0001', version=None, is_latest=True, key='my_analysis/dataset_subsetted.parquet', description=None, suffix='.parquet', kind='dataset', otype='DataFrame', size=3852, hash='7WGuLVamVyBMhPb2qRE_tA', n_files=None, n_observations=2, branch_id=1, space_id=1, storage_id=1, run_id=5, schema_id=None, created_by_id=1, created_at=2025-11-05 21:33:24 UTC, is_locked=False); to track this artifact as an input, use: ln.Artifact.get()
ln.view()
Show code cell output
Artifact
| uid | key | description | suffix | kind | otype | size | hash | n_files | n_observations | version | is_latest | is_locked | created_at | branch_id | space_id | storage_id | run_id | schema_id | created_by_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||||||
| 7 | XPpcBJ7ChYBYcurQ0001 | my_analysis/dataset_subsetted.parquet | None | .parquet | dataset | DataFrame | 3852 | 7WGuLVamVyBMhPb2qRE_tA | None | 2.0 | None | True | False | 2025-11-05 21:33:24.145000+00:00 | 1 | 1 | 1 | 5 | None | 1 |
| 6 | XPpcBJ7ChYBYcurQ0000 | my_analysis/dataset_subsetted.parquet | None | .parquet | dataset | DataFrame | 3238 | UM8d9C-x_2fbc_46BScp8A | None | 2.0 | None | False | False | 2025-11-05 21:33:23.681000+00:00 | 1 | 1 | 1 | 4 | None | 1 |
| 5 | 8Evkq51yeMs59ehI0000 | my_analysis/dataset.parquet | None | .parquet | dataset | DataFrame | 9868 | wvfEBPwHL3XHiAb-o8fU6Q | None | 3.0 | None | True | False | 2025-11-05 21:33:23.636000+00:00 | 1 | 1 | 1 | 1 | None | 1 |
| 1 | ZNpsjozpAihYHBUv0000 | my_file.fcs | None | .fcs | None | None | 19330507 | rCPvmZB19xs4zHZ7p_-Wrg | None | NaN | None | True | False | 2025-11-05 21:33:15.431000+00:00 | 1 | 1 | 1 | 1 | None | 1 |
Feature
| uid | name | dtype | is_type | unit | description | array_rank | array_size | array_shape | proxy_dtype | synonyms | is_locked | created_at | branch_id | space_id | created_by_id | run_id | type_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||||||
| 2 | rfawEgXITG7Y | experiment | cat[Record[Experiment]] | None | None | None | 0 | 0 | None | None | None | False | 2025-11-05 21:33:19.571000+00:00 | 1 | 1 | 1 | 1 | None |
| 1 | oZanpf4iDBvn | s3_folder | str | None | None | None | 0 | 0 | None | None | None | False | 2025-11-05 21:33:19.563000+00:00 | 1 | 1 | 1 | 1 | None |
FeatureValue
| value | hash | is_locked | created_at | branch_id | space_id | created_by_id | run_id | feature_id | |
|---|---|---|---|---|---|---|---|---|---|
| id | |||||||||
| 1 | s3://my-bucket/my-folder | E-3iWq1AziFBjh_cbyr5ZA | False | 2025-11-05 21:33:21.981000+00:00 | 1 | 1 | 1 | None | 1 |
Project
| uid | name | description | is_type | abbr | url | start_date | end_date | is_locked | created_at | branch_id | space_id | created_by_id | run_id | type_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | |||||||||||||||
| 1 | v8wpORq3rUbx | My project | None | False | None | None | None | None | False | 2025-11-05 21:33:12.288000+00:00 | 1 | 1 | 1 | None | None |
Record
| uid | name | is_type | description | reference | reference_type | is_locked | created_at | branch_id | space_id | created_by_id | type_id | schema_id | run_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | ||||||||||||||
| 2 | vRnr9CiGgvYvRulP | Experiment1 | False | None | None | None | False | 2025-11-05 21:33:19.556000+00:00 | 1 | 1 | 1 | 1.0 | None | 1 |
| 1 | AlxqRCB6Hjq58tV4 | Experiment | True | None | None | None | False | 2025-11-05 21:33:19.551000+00:00 | 1 | 1 | 1 | NaN | None | 1 |
Run
| uid | name | started_at | finished_at | params | reference | reference_type | is_locked | created_at | branch_id | space_id | transform_id | report_id | _logfile_id | environment_id | created_by_id | initiated_by_run_id | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | |||||||||||||||||
| 7 | Ex21tzOjjMkYeMhl | None | 2025-11-05 21:33:27.764323+00:00 | 2025-11-05 21:33:27.784058+00:00 | {'artifact': 'Artifact[8Evkq51yeMs59ehI0000]',... | None | None | False | 2025-11-05 21:33:27.765000+00:00 | 1 | 1 | 6 | NaN | None | NaN | 1 | 6.0 |
| 6 | aq3c1QYU63k2q1NL | None | 2025-11-05 21:33:26.742823+00:00 | 2025-11-05 21:33:27.785663+00:00 | {'is_subset': True} | None | None | False | 2025-11-05 21:33:26.743000+00:00 | 1 | 1 | 5 | 8.0 | None | 2.0 | 1 | NaN |
| 5 | dPdmxb0XgHfkBR0F | None | 2025-11-05 21:33:24.123504+00:00 | 2025-11-05 21:33:24.152096+00:00 | {'input_artifact_key': 'my_analysis/dataset.pa... | None | None | False | 2025-11-05 21:33:24.124000+00:00 | 1 | 1 | 4 | NaN | None | NaN | 1 | 1.0 |
| 4 | TZ78Fuha16Et2dVF | None | 2025-11-05 21:33:23.662631+00:00 | 2025-11-05 21:33:23.688634+00:00 | {'input_artifact_key': 'my_analysis/dataset.pa... | None | None | False | 2025-11-05 21:33:23.663000+00:00 | 1 | 1 | 4 | NaN | None | NaN | 1 | 1.0 |
| 3 | cMLJQuzQGVF7ujZ0 | None | 2025-11-05 21:33:21.964465+00:00 | 2025-11-05 21:33:23.052524+00:00 | {'example_param': 42} | None | None | False | 2025-11-05 21:33:21.965000+00:00 | 1 | 1 | 3 | 4.0 | None | 2.0 | 1 | NaN |
| 2 | HbyLpugWPxxF8c1n | None | 2025-11-05 21:33:17.903804+00:00 | 2025-11-05 21:33:19.027712+00:00 | {'input_dir': './mydataset', 'learning_rate': ... | None | None | False | 2025-11-05 21:33:17.904000+00:00 | 1 | 1 | 2 | 3.0 | None | 2.0 | 1 | NaN |
| 1 | GH1KOCjpxPWDC7G2 | None | 2025-11-05 21:33:13.185094+00:00 | NaT | None | None | None | False | 2025-11-05 21:33:13.185000+00:00 | 1 | 1 | 1 | NaN | None | NaN | 1 | NaN |
Storage
| id | uid | root | description | type | region | instance_uid | is_locked | created_at | branch_id | space_id | created_by_id | run_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | wOBrh4mOTrHA | /home/runner/work/lamindb/lamindb/docs/test-track | None | local | None | 73KPGC58ahU9 | False | 2025-11-05 21:33:09.257000+00:00 | 1 | 1 | 1 | None |
Transform
| id | uid | key | description | type | source_code | hash | reference | reference_type | version | is_latest | is_locked | created_at | branch_id | space_id | created_by_id | _template_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | k2S5AI75Bdr40000 | run_workflow.py/subset_dataframe.py | None | function | @ln.tracked()\ndef subset_dataframe(\n arti... | 9NYMDP5l5Iuu9F8VrO3vWQ | None | None | None | True | False | 2025-11-05 21:33:27.762000+00:00 | 1 | 1 | 1 | None |
| 5 | Z0Yx1gPOS3xm0000 | run_workflow.py | None | script | import argparse\nimport lamindb as ln\n\n\n@ln... | fwij4oyLV27mmm9f2GVY_A | None | None | None | True | False | 2025-11-05 21:33:26.740000+00:00 | 1 | 1 | 1 | None |
| 4 | fxXqxn3AbO8W0000 | track.ipynb/subset_dataframe.py | None | function | @ln.tracked()\ndef subset_dataframe(\n inpu... | CUqkJpolJY1Q1tqyCoWIWg | None | None | None | True | False | 2025-11-05 21:33:23.658000+00:00 | 1 | 1 | 1 | None |
| 3 | qfXlgJu7IUvm0000 | run_track_with_features_and_params.py | None | script | import argparse\nimport lamindb as ln\n\n\nif ... | 9MjLyvM1QzE2nPIPDRzBwg | None | None | None | True | False | 2025-11-05 21:33:21.962000+00:00 | 1 | 1 | 1 | None |
| 2 | jTSyIY1c5q580000 | run_track_with_params.py | None | script | import argparse\nimport lamindb as ln\n\nif __... | 5RBz7zJICeKE1OSmg7gEdQ | None | None | None | True | False | 2025-11-05 21:33:17.901000+00:00 | 1 | 1 | 1 | None |
| 1 | NsUIB9UUK0LO0000 | track.ipynb | Track notebooks, scripts & functions | notebook | None | None | None | None | None | True | False | 2025-11-05 21:33:13.177000+00:00 | 1 | 1 | 1 | None |
Manage notebook templates¶
A notebook acts as a template when you load it with lamin load. Say you run:
lamin load https://lamin.ai/account/instance/transform/Akd7gx7Y9oVO0000
When you run the returned notebook, a new version is created automatically, and you can browse versions via the version dropdown in the UI.
Additionally, you can:
- label the transform using Record, e.g., transform.records.add(template_label)
- tag it with an indicative version string, e.g., transform.version = "T1"; transform.save()
Saving a notebook as an artifact
Sometimes you might want to save a notebook as an artifact. This is how you can do it:
lamin save template1.ipynb --key templates/template1.ipynb --description "Template for analysis type 1" --registry artifact
A few checks at the end of this notebook:
assert run.params == {
"input_dir": "./mydataset",
"learning_rate": 0.01,
"preprocess_params": {"downsample": True, "normalization": "the_good_one"},
}, run.params
assert my_project.artifacts.exists()
assert my_project.transforms.exists()
assert my_project.runs.exists()