Track notebooks & scripts

In addition to tracking Python scripts, LaminDB tracks interactive analyses performed in notebooks.

Calling track() in a notebook or script automatically registers input and output data and associates them with the run.


Provenance tracking of notebooks & scripts is analogous to tracking pipelines & UI data manipulation; see Project flow.


Install the lamindb Python package:

pip install 'lamindb[jupyter]'
!lamin init --storage ./test-track
💡 connected lamindb: testuser1/test-track
import lamindb as ln

ln.settings.verbosity = "hint"
💡 connected lamindb: testuser1/test-track

Initiate tracking

Call track() to auto-generate the IDs that identify this notebook in the data lineage. Copy the generated settings into your cell, above the track() call.

ln.settings.transform.stem_uid = "9priar0hoE5u"
ln.settings.transform.version = "1"
ln.track()
💡 notebook imports: lamindb==0.74.1
💡 saved: Transform(uid='9priar0hoE5u5zKv', version='1', name='Track notebooks & scripts', key='track', type='notebook', created_by_id=1, updated_at='2024-07-06 13:06:23 UTC')
💡 saved: Run(uid='aCfUcGDU4pGjwBIJCrCG', transform_id=1, created_by_id=1)
💡 tracked pip freeze > /home/runner/.cache/lamindb/run_env_pip_aCfUcGDU4pGjwBIJCrCG.txt
Run(uid='aCfUcGDU4pGjwBIJCrCG', started_at='2024-07-06 13:06:23 UTC', is_consecutive=True, transform_id=1, created_by_id=1)
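
The log above suggests how the configured stem_uid relates to the saved Transform record: the logged uid starts with the stem_uid and carries four extra characters. The following is a minimal sketch that only checks this relationship on the values shown in this page. The helper `split_uid` is hypothetical and not part of LaminDB, and what the suffix encodes is not stated here.

```python
# Values copied from the settings cell and the log output on this page.
stem_uid = "9priar0hoE5u"
transform_uid = "9priar0hoE5u5zKv"


def split_uid(uid: str, stem: str) -> tuple:
    """Split a full uid into (stem, suffix); hypothetical helper, not a LaminDB API."""
    if not uid.startswith(stem):
        raise ValueError(f"uid {uid!r} does not start with stem {stem!r}")
    return stem, uid[len(stem):]


stem, suffix = split_uid(transform_uid, stem_uid)
print(len(stem), suffix)  # 12 5zKv
```

Keeping the same stem_uid in the notebook is what lets later runs resolve to the same family of Transform records, as far as the log output indicates.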

LaminDB now automatically tracks all input and output data.

Save run reports and source artifact

If you want to save a notebook including its run report & source artifact, run:


See how a transform with execution reports looks in LaminHub:

Query for a notebook or script

In the API, filter the Transform registry to obtain a transform record:

import lamindb as ln

transform = ln.Transform.filter(name="Track notebooks & scripts").one()
# Your notebook is linked to its source code (stripped of its output cells) and execution report (with the notebook's output cells)
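
The query above ends in .one(), which reads as "give me the single matching record". As a rough illustration of that pattern, here is a plain-Python sketch over an in-memory list. It assumes that .one() expects exactly one match and fails otherwise, which is worth confirming against the LaminDB API reference. The `one` helper, the dictionaries, and the second record are hypothetical stand-ins, not LaminDB objects.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def one(records: Iterable[T], predicate: Callable[[T], bool]) -> T:
    """Return the single record matching predicate; raise if zero or several match."""
    matches = [r for r in records if predicate(r)]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one match, found {len(matches)}")
    return matches[0]


# Toy stand-in for a registry of transforms (the second entry is made up).
transforms = [
    {"name": "Track notebooks & scripts", "type": "notebook"},
    {"name": "Another analysis", "type": "script"},
]

record = one(transforms, lambda t: t["name"] == "Track notebooks & scripts")
print(record["type"])  # notebook
```

Filtering on a field that identifies the notebook uniquely keeps this kind of lookup from failing on ambiguous matches.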

On LaminHub, use the search or filter in the Transform view.

Sync script transforms with GitHub

To sync with your git commit, add the following line to your script:

ln.settings.sync_git_repo = <YOUR-GIT-REPO-URL>
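
To see which commit such a sync would refer to, you can look up the current commit of your local checkout yourself. The sketch below uses only the standard git command line and is independent of LaminDB. `current_commit` is a hypothetical helper name, and how LaminDB resolves the commit internally is not shown here.

```python
import subprocess
from typing import Optional


def current_commit(repo_dir: str = ".") -> Optional[str]:
    """Return the HEAD commit hash of repo_dir, or None if it cannot be determined."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, the directory does not exist,
        # or the directory is not a git repository with a commit
        return None
    return result.stdout.strip()


print(current_commit())
```

Committing your script before running it means the hash printed here corresponds to the code that actually ran.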

A tracked Python script typically looks like this:


import lamindb as ln

# initiate tracking
ln.settings.transform.stem_uid = "9priar0hoE5u"
ln.settings.transform.version = "1"
ln.settings.sync_git_repo = ""
run = ln.track()
# you may tag your transform so that it's easier to find
ulabel = ln.ULabel.filter(name="guide").one()

# load input artifacts
artifact = ln.Artifact.filter(...).one()
output_data = ...

# save output artifacts
output_artifact = ln.Artifact(output_data, ...).save()

# save the script as transform.source_code

See how a tracked and git-synced script looks in LaminHub:

# clean up test instance
!lamin delete --force test-track
!rm -r test-track
💡 deleting instance testuser1/test-track
rm: cannot remove 'test-track': No such file or directory