Snakemake

Warning

This notebook demonstrates Python scripting that you could run before and after a Snakemake run. Typically, you would instead use lamindb directly within your Snakemake workflow.

Snakemake is a popular workflow manager in bioinformatics. This guide is based on the example of the rna-seq-star-deseq2 pipeline.

First we clone the Snakemake pipeline with git. Because the test datasets come with the repo and, for simplicity, we want to avoid moving them into another directory, we initialize a LaminDB instance in the same directory.

# pip install lamindb snakemake
!git clone https://github.com/snakemake-workflows/rna-seq-star-deseq2 --single-branch --branch v3.1.0
!lamin init --storage ./rna-seq-star-deseq2
Cloning into 'rna-seq-star-deseq2'...
remote: Enumerating objects: 874, done.
remote: Counting objects: 100% (322/322), done.
remote: Compressing objects: 100% (144/144), done.
remote: Total 874 (delta 241), reused 178 (delta 178), pack-reused 552 (from 1)
Receiving objects: 100% (874/874), 16.97 MiB | 50.37 MiB/s, done.
Resolving deltas: 100% (465/465), done.
Note: switching to '5fbe51c94aa1f5c8c8cbdca379eb00436d6491bf'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false
! using anonymous user (to identify, call: lamin login)
 initialized lamindb: anonymous/rna-seq-star-deseq2
import lamindb as ln
import subprocess
from pathlib import Path
 connected lamindb: anonymous/rna-seq-star-deseq2

Registering inputs

root_dir = "rna-seq-star-deseq2"
sample_sheet = ln.Artifact(f"{root_dir}/.test/config_basic/samples.tsv").save()
input_fastqs = ln.Artifact.from_dir(f"{root_dir}/.test/ngs-test-data/reads/")
ln.save(input_fastqs)
! no run & transform got linked, call `ln.track()` & re-run
! there are multiple artifact uids with the same hashes, dropping 2 duplicates out of 10 artifacts:
    DMYyy2vT6QGIrPoH0000
    sgRF6GknTZJpSGRo0000
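
The final warning above reports that several files in the directory share a hash and were dropped as duplicates. To see which files on disk have identical content, a rough stdlib-only sketch is shown below. It is not LaminDB's implementation: the helper name `find_duplicate_files` is hypothetical and the choice of SHA-256 is an assumption made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(directory):
    """Group files under `directory` by the SHA-256 digest of their content.

    Hypothetical helper for illustration; returns only the groups that
    contain more than one file.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path)
    # Keep only digests shared by at least two files
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_duplicate_files(f"{root_dir}/.test/ngs-test-data/reads/")` would list the groups of files with identical content in the test data.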

Track a Snakemake run

Track the Snakemake workflow & run:

transform = ln.Transform(
    key="snakemake-rna-seq-star-deseq2",
    version="3.1.0",  # matches the cloned tag v3.1.0
    kind="pipeline",  # the `type` argument was renamed to `kind`
    reference="https://github.com/snakemake-workflows/rna-seq-star-deseq2",
)
ln.track(transform)
 created Transform('bT0qjor0geNx0000', key='snakemake-rna-seq-star-deseq2'), started new Run('FhqEJbqFCPAIaodY') at 2026-05-06 03:27:46 UTC

If we call cache() on the input artifacts, they are downloaded into a cache and tracked as run inputs. In this test case, however, no download happens because the files are already available locally.

input_sample_sheet_path = sample_sheet.cache()
input_paths = [input_fastq.cache() for input_fastq in input_fastqs]

Let’s run the pipeline.

To make this robust in CI, we target outputs that don’t depend on live Ensembl biomaRt lookups (which can be intermittently unavailable).

subprocess.run(
    [
        "snakemake",
        "--directory",
        "rna-seq-star-deseq2/.test",
        "--snakefile",
        "rna-seq-star-deseq2/workflow/Snakefile",
        "--configfile",
        "rna-seq-star-deseq2/.test/config_basic/config.yaml",
        "--use-conda",
        "--show-failed-logs",
        "--cores",
        "2",
        "--conda-frontend",
        "conda",
        "--conda-cleanup-pkgs",
        "cache",
        "results/counts/all.tsv",
        "results/qc/multiqc_report.html",
    ],
    check=True,
)
CompletedProcess(args=['snakemake', '--directory', 'rna-seq-star-deseq2/.test', '--snakefile', 'rna-seq-star-deseq2/workflow/Snakefile', '--configfile', 'rna-seq-star-deseq2/.test/config_basic/config.yaml', '--use-conda', '--show-failed-logs', '--cores', '2', '--conda-frontend', 'conda', '--conda-cleanup-pkgs', 'cache', 'results/counts/all.tsv', 'results/qc/multiqc_report.html'], returncode=0)
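
The call above hard-codes the repository directory in three places. A minimal sketch of how the same argument list could be assembled from `root_dir` is shown below; `build_snakemake_command` is a hypothetical helper, not part of Snakemake or lamindb, and it uses only the flags that appear in the call above. It builds the list without executing anything.

```python
def build_snakemake_command(root_dir, targets, cores=2):
    """Assemble the argument list for `subprocess.run` (does not execute it).

    Hypothetical helper for illustration. `targets` are output paths
    relative to the `--directory` working directory.
    """
    return [
        "snakemake",
        "--directory", f"{root_dir}/.test",
        "--snakefile", f"{root_dir}/workflow/Snakefile",
        "--configfile", f"{root_dir}/.test/config_basic/config.yaml",
        "--use-conda",
        "--show-failed-logs",
        "--cores", str(cores),
        "--conda-frontend", "conda",
        "--conda-cleanup-pkgs", "cache",
        *targets,
    ]
```

Passing the result to `subprocess.run(..., check=True)` with the two targets used above should reproduce the original call.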

Registering outputs

Quality control.

multiqc_file = ln.Artifact(f"{root_dir}/.test/results/qc/multiqc_report.html").save()

How would I register all QC files?

multiqc_results = ln.Artifact.from_dir(f"{root_dir}/.test/results/qc/multiqc_report_data/")
ln.save(multiqc_results)

Count matrix.

count_matrix_path = Path(root_dir) / ".test/results/counts/all.tsv"
if not count_matrix_path.exists():
    raise FileNotFoundError(
        f"Expected output not found: {count_matrix_path}. "
        "Inspect Snakemake logs under rna-seq-star-deseq2/.test/logs/"
    )

count_matrix = ln.Artifact(count_matrix_path).save()
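
The existence check above covers a single file. If several targets were requested, the same idea can be applied to all of them before registering anything. The sketch below is one possible way to do that with the standard library only; `missing_outputs` is a hypothetical helper name.

```python
from pathlib import Path


def missing_outputs(workdir, relative_targets):
    """Return the targets that do not exist under `workdir`.

    Hypothetical helper for illustration; paths are interpreted
    relative to the Snakemake working directory.
    """
    workdir = Path(workdir)
    return [t for t in relative_targets if not (workdir / t).exists()]
```

For example, `missing_outputs(f"{root_dir}/.test", ["results/counts/all.tsv", "results/qc/multiqc_report.html"])` returns an empty list when both outputs were produced.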

Visualize

View data lineage:

count_matrix.view_lineage()
! calling anonymously, will miss private instances
[Figure: data lineage graph of the count matrix]

Appendix

Linking biological entities

To make the count matrix queryable by biological entities (genes, experimental metadata, etc.), we can now proceed with the Bulk RNA-seq guide.

Linking a Snakemake run ID

Snakemake does not expose an easily accessible ID for a run. We therefore extract one from the name of the most recent log file, which has the form <timestamp>.<id>.snakemake.log.

from datetime import datetime
from pathlib import Path

PATH_TO_DOT_SNAKEMAKE_LOG = Path("rna-seq-star-deseq2/.test/.snakemake/log")

# Collect the bare file names of all Snakemake logs
log_file_names = [lf.name for lf in PATH_TO_DOT_SNAKEMAKE_LOG.glob("*.snakemake.log")]

# Pick the most recent log by the timestamp encoded in its file name
latest_log_name = max(
    log_file_names,
    key=lambda name: datetime.strptime(name.split(".")[0], "%Y-%m-%dT%H%M%S"),
)
# The second dot-separated field of the file name is the ID
snakemake_id = latest_log_name.split(".")[1]
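
The code above relies on log file names that start with a timestamp followed by an ID, separated by dots. As a small illustration of that parsing step in isolation, here is a sketch with made-up file names; `parse_snakemake_log_name` is a hypothetical helper and the example names are invented, not taken from a real run.

```python
from datetime import datetime


def parse_snakemake_log_name(filename):
    """Split '<timestamp>.<id>.snakemake.log' into (datetime, id)."""
    timestamp_str, run_id = filename.split(".")[:2]
    return datetime.strptime(timestamp_str, "%Y-%m-%dT%H%M%S"), run_id


# Hypothetical file names, made up for illustration
names = [
    "2024-01-01T101500.111111.snakemake.log",
    "2024-01-02T093000.222222.snakemake.log",
]

# Select the name with the latest timestamp, then read off its ID
latest = max(names, key=lambda n: parse_snakemake_log_name(n)[0])
latest_id = parse_snakemake_log_name(latest)[1]
```

Comparing parsed `datetime` objects rather than raw strings keeps the selection correct even if the timestamp format were to change in a way that breaks lexicographic ordering.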

Let us add the information about the session ID to our run record:

run = ln.context.run  # let's grab the global run record
run.reference = snakemake_id
run.reference_type = "snakemake_id"
run.save()
Run(uid='FhqEJbqFCPAIaodY', name=None, description=None, entrypoint=None, started_at=2026-05-06 03:27:46 UTC, finished_at=None, params=None, reference='219178', reference_type='snakemake_id', cli_args=None, branch_id=1, created_on_id=1, space_id=1, transform_id=1, report_id=None, environment_id=None, plan_id=None, created_by_id=1, initiated_by_run_id=None, created_at=2026-05-06 03:27:46 UTC, is_locked=False)