Redun

Here, we’ll see how to track redun workflow runs with LaminDB.

Note

This use case is based on github.com/ricomnl/bioinformatics-pipeline-tutorial.

# pip install lamindb redun git+https://github.com/laminlabs/redun-lamin-fasta
!lamin init --storage ./test-redun-lamin
 initialized lamindb: testuser1/test-redun-lamin

Amend the workflow

import lamindb as ln
import json
 connected lamindb: testuser1/test-redun-lamin

Let’s amend a redun workflow.py to register input & output artifacts in LaminDB:

  • To track the workflow run in LaminDB, add (see on GitHub):

    ln.track(params=params)
    
  • To register the output file via LaminDB, add (see on GitHub):

    ln.Artifact(output_path, description="results").save()
    
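A minimal sketch of how these two calls can be placed, one at the start of the workflow and one after the final output exists. The helper names `start_tracking` and `register_output`, and the choice to pass the `lamindb` module in as an argument, are assumptions for illustration and not the layout of the actual workflow.py; only the two `ln` calls come from the steps above.

```python
from typing import Any


def start_tracking(ln: Any, params: dict) -> None:
    """Start a LaminDB run at the beginning of the workflow.

    `ln` is the imported lamindb module (`import lamindb as ln`).
    """
    ln.track(params=params)


def register_output(ln: Any, output_path: str) -> Any:
    """Save the final output file as an artifact and return the saved record."""
    return ln.Artifact(output_path, description="results").save()


# Hypothetical usage inside a workflow script:
#   import lamindb as ln
#   start_tracking(ln, {"input_dir": "./fasta", "amino_acid": "C"})
#   ... run the pipeline tasks ...
#   register_output(ln, "data/results.tgz")
```

Passing `ln` explicitly keeps the helpers importable without a connected instance, which may be convenient when testing the workflow logic on its own.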

Run redun

Let’s see what the input files are:

!ls ./fasta
KLF4.fasta  MYC.fasta  PO5F1.fasta  SOX2.fasta

And call the workflow:

!redun run workflow.py main --input-dir ./fasta --tag run=test-run > run_logs.txt 2>&1
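If the workflow is launched from a Python script rather than a notebook shell command, a sketch like the following writes both output streams to a single log file through one shared file handle. The helper name `run_and_log` is hypothetical; the redun command in the usage comment is the one from this guide.

```python
import subprocess


def run_and_log(cmd: list[str], log_path: str) -> int:
    """Run `cmd`, send stdout and stderr to one log file, return the exit code."""
    with open(log_path, "w") as log:
        # stderr=subprocess.STDOUT routes stderr into the same handle as stdout,
        # so the two streams do not overwrite each other in the file
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Hypothetical usage:
#   run_and_log(
#       ["redun", "run", "workflow.py", "main",
#        "--input-dir", "./fasta", "--tag", "run=test-run"],
#       "run_logs.txt",
#   )
```

Opening the log file once and pointing both streams at it avoids the truncated or overlapping lines that can appear when each stream opens the same file separately.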

Inspect the logs:

!cat run_logs.txt
 connected lamindb: testuser1/test-redun-lamin
 created Transform('73Wt9j2dxj1H0000'), started new Run('JgTS0g0x...') at 2025-04-15 16:33:24 UTC
→ params: input_dir=./fasta, amino_acid=C, enzyme_regex=[KR], missed_cleavages=0, min_length=4, max_length=75, executor=Executor.default
! folder is outside existing storage location, will copy files from ./fasta to /home/runner/work/redun-lamin/redun-lamin/docs/test-[redun] Run    Job finished Run('JgTS0g0x') after 4s at 2025-04-15 16:33:28 UTC
File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/results.tgz, hash=352c96ea)
000.fasta, hash=699b7006), enzyme_regex='[KR]', missed_cleavages=0, min_length=4, max_length=75) on default
[redun] Run    Job 1fde0a36:  bioinformatics_pipeline_tutorial.lib.digest_protein_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/4OcSccGHlzNrNkOs0000.fasta, hash=95b849e0), enzyme_regex='[KR]', missed_cleavages=0, min_length=4, max_length=75) on default
[redun] Run    Job 01f01797:  bioinformatics_pipeline_tutorial.lib.digest_protein_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/CbSa0ol62HD6vxW70000.fasta, hash=acc3b019), enzyme_regex='[KR]', missed_cleavages=0, min_length=4, max_length=75) on default
[redun] Run    Job 0825f16c:  bioinformatics_pipeline_tutorial.lib.digest_protein_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/r81MZjUjMV8r4ito0000.fasta, hash=c4cb1963), enzyme_regex='[KR]', missed_cleavages=0, min_length=4, max_length=75) on default
[redun] Run    Job a0bfc460:  bioinformatics_pipeline_tutorial.lib.count_amino_acids_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/w5pWBrVFspTPQ0rE0000.fasta, hash=699b7006), input_peptides=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/w5pWBrVFspTPQ0rE0000.peptides.txt, hash=b3fc99ff), amino_acid='C') on default
[redun] Run    Job a7361857:  bioinformatics_pipeline_tutorial.lib.count_amino_acids_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/4OcSccGHlzNrNkOs0000.fasta, hash=95b849e0), input_peptides=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/4OcSccGHlzNrNkOs0000.peptides.txt, hash=f4b029ad), amino_acid='C') on default
[redun] Run    Job cb1510ab:  bioinformatics_pipeline_tutorial.lib.count_amino_acids_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/CbSa0ol62HD6vxW70000.fasta, hash=acc3b019), input_peptides=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/CbSa0ol62HD6vxW70000.peptides.txt, hash=7d01428b), amino_acid='C') on default
[redun] Run    Job 4088c31f:  bioinformatics_pipeline_tutorial.lib.count_amino_acids_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/r81MZjUjMV8r4ito0000.fasta, hash=c4cb1963), input_peptides=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/r81MZjUjMV8r4ito0000.peptides.txt, hash=581b5516), amino_acid='C') on default
[redun] Run    Job 9c3fa074:  bioinformatics_pipeline_tutorial.lib.plot_count_task(input_count=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/w5pWBrVFspTPQ0rE0000.count.tsv, hash=19799282)) on default
[redun] Run    Job f0d63475:  bioinformatics_pipeline_tutorial.lib.plot_count_task(input_count=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/4OcSccGHlzNrNkOs0000.count.tsv, hash=a0b6d945)) on default
[redun] Run    Job aebbf865:  bioinformatics_pipeline_tutorial.lib.plot_count_task(input_count=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/CbSa0ol62HD6vxW70000.count.tsv, hash=02f7d3dd)) on default
[redun] Run    Job f350da58:  bioinformatics_pipeline_tutorial.lib.plot_count_task(input_count=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/r81MZjUjMV8r4ito0000.count.tsv, hash=fd974743)) on default
[redun] Run    Job 922d1071:  bioinformatics_pipeline_tutorial.lib.get_report_task(input_counts=[File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/w5pWBrVFspTPQ0rE0000.count.tsv, hash=19799282), File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-l...) on default
[redun] Run    Job 9776eba7:  bioinformatics_pipeline_tutorial.lib.archive_results_task(inputs_plots=[File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/w5pWBrVFspTPQ0rE0000.plot.png, hash=5459e184), File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-la..., input_report=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/protein_report.tsv, hash=f9de8f1f)) on default
[redun] Run    Job 1cde791a:  redun_lamin_fasta.finish(results_archive=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/results.tgz, hash=352c96ea)) on default
[redun] 
[redun] | JOB STATUS 2025/04/15 16:33:29
[redun] | TASK                                                        PENDING RUNNING  FAILED  CACHED    DONE   TOTAL
[redun] | 
[redun] | ALL                                                               0       0       0       0      16      16
[redun] | bioinformatics_pipeline_tutorial.lib.archive_results_task         0       0       0       0       1       1
[redun] | bioinformatics_pipeline_tutorial.lib.count_amino_acids_task       0       0       0       0       4       4
[redun] | bioinformatics_pipeline_tutorial.lib.digest_protein_task          0       0       0       0       4       4
[redun] | bioinformatics_pipeline_tutorial.lib.get_report_task              0       0       0       0       1       1
[redun] | bioinformatics_pipeline_tutorial.lib.plot_count_task              0       0       0       0       4       4
[redun] | redun_lamin_fasta.finish                                          0       0       0       0       1       1
[redun] | redun_lamin_fasta.main                                            0       0       0       0       1       1
[redun] 
[redun] 
[redun] Execution duration: 5.64 seconds
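Reading the job status table by eye works for a single run; for scripted checks, the table can also be parsed. The sketch below assumes the row layout shown in the log above (a `[redun] |` prefix, a task name without spaces, then six integer columns); the function name `parse_job_status` is hypothetical, and redun itself makes no promise that this text layout is stable.

```python
import re

STATUS_COLUMNS = ("PENDING", "RUNNING", "FAILED", "CACHED", "DONE", "TOTAL")

# Matches rows such as "[redun] | ALL    0    0    0    0    16    16"
_ROW = re.compile(
    r"^\[redun\] \| (\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)


def parse_job_status(log_text: str) -> dict[str, dict[str, int]]:
    """Return {task_name: {column: count}} for each status row in the log.

    If the table is printed more than once, later rows overwrite earlier ones,
    so the result reflects the last status table in the text.
    """
    status: dict[str, dict[str, int]] = {}
    for line in log_text.splitlines():
        match = _ROW.match(line)
        if match:
            task, *counts = match.groups()
            status[task] = dict(zip(STATUS_COLUMNS, map(int, counts)))
    return status


# Hypothetical usage:
#   with open("run_logs.txt") as fh:
#       status = parse_job_status(fh.read())
#   assert status["ALL"]["FAILED"] == 0
```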

View data lineage:

artifact = ln.Artifact.get(key="data/results.tgz")
artifact.view_lineage()
_images/1e3e3a2608757b9800ed6e2d0285bfff4c793b16af368eacf39a85addd53e977.svg

Track the redun execution id

To make runs queryable in LaminDB by their redun execution ID, export the ID from redun and store it on the run record:

# export the run information from redun
!redun log --exec --exec-tag run=test-run --format json --no-pager > redun_exec.json
# load the redun execution id from the JSON and store it in the LaminDB run record
with open("redun_exec.json") as file:
    redun_exec = json.loads(file.readline())
artifact.run.reference = redun_exec["id"]
artifact.run.reference_type = "redun_id"
artifact.run.save()
Run(uid='JgTS0g0x1MpChdqcuW8X', started_at=2025-04-15 16:33:24 UTC, finished_at=2025-04-15 16:33:28 UTC, reference='913e800e-49dc-4f83-83ac-14b03bec5254', reference_type='redun_id', space_id=1, transform_id=1, report_id=7, environment_id=6, created_by_id=1, created_at=2025-04-15 16:33:24 UTC)
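The same steps can be wrapped in small helpers, sketched below. `read_redun_execution_id` mirrors the code above: it reads the first line of the export and returns its `"id"` field. `find_run_by_redun_id` shows how the stored reference could later be looked up; it assumes lamindb's registry query interface (`ln.Run.filter(...)` followed by `.one_or_none()`) and takes the `lamindb` module as an argument. Both function names are hypothetical.

```python
import json
from typing import Any


def read_redun_execution_id(path: str) -> str:
    """Return the "id" field of the first JSON record in a redun log export."""
    with open(path) as file:
        record = json.loads(file.readline())
    return record["id"]


def find_run_by_redun_id(ln: Any, redun_id: str) -> Any:
    """Look up the LaminDB run whose reference holds the given redun ID.

    `ln` is the imported lamindb module; the filter/one_or_none calls are an
    assumption about its query API, not something shown in this guide.
    """
    return ln.Run.filter(
        reference=redun_id, reference_type="redun_id"
    ).one_or_none()


# Hypothetical usage:
#   redun_id = read_redun_execution_id("redun_exec.json")
#   run = find_run_by_redun_id(ln, redun_id)
```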

Track the redun run report

Attach a run report:

report = ln.Artifact(
    "run_logs.txt",
    description=f"Redun run report of {redun_exec['id']}",
    run=False,
    visibility=0,
).save()
artifact.run.report = report
artifact.run.save()
Run(uid='JgTS0g0x1MpChdqcuW8X', started_at=2025-04-15 16:33:24 UTC, finished_at=2025-04-15 16:33:28 UTC, reference='913e800e-49dc-4f83-83ac-14b03bec5254', reference_type='redun_id', space_id=1, transform_id=1, report_id=8, environment_id=6, created_by_id=1, created_at=2025-04-15 16:33:24 UTC)

View transforms and runs in LaminHub

hub

View the database content

ln.view()
Artifact
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux _branch_code
id
5 LUoqBphMkMUrIZIC0000 data/results.tgz None .tgz None None 83702 EtTQJPB8es0-uA1tfEosPA None None md5 False False 1 1 None None True 1.0 2025-04-15 16:33:28.262000+00:00 1 None 1
4 r81MZjUjMV8r4ito0000 fasta/KLF4.fasta None .fasta None None 609 LyuoYkWs4SgYcH7P7JLJtA None None md5 True False 1 1 None None True NaN 2025-04-15 16:33:25.477000+00:00 1 None 1
3 CbSa0ol62HD6vxW70000 fasta/MYC.fasta None .fasta None None 536 WGbEtzPw-3bQEGcngO_pHQ None None md5 True False 1 1 None None True NaN 2025-04-15 16:33:25.476000+00:00 1 None 1
2 4OcSccGHlzNrNkOs0000 fasta/SOX2.fasta None .fasta None None 414 C5q_yaFXGk4SAEpfdqBwnQ None None md5 True False 1 1 None None True NaN 2025-04-15 16:33:25.475000+00:00 1 None 1
1 w5pWBrVFspTPQ0rE0000 fasta/PO5F1.fasta None .fasta None None 477 -7iJgveFO9ia0wE1bqVu6g None None md5 True False 1 1 None None True NaN 2025-04-15 16:33:25.474000+00:00 1 None 1
Param
name dtype is_type _expect_many space_id type_id run_id created_at created_by_id _aux _branch_code
id
1 input_dir str None False 1 None None 2025-04-15 16:33:24.119000+00:00 1 None 1
2 amino_acid str None False 1 None None 2025-04-15 16:33:24.119000+00:00 1 None 1
3 enzyme_regex str None False 1 None None 2025-04-15 16:33:24.119000+00:00 1 None 1
4 missed_cleavages int None False 1 None None 2025-04-15 16:33:24.119000+00:00 1 None 1
5 min_length int None False 1 None None 2025-04-15 16:33:24.119000+00:00 1 None 1
6 max_length int None False 1 None None 2025-04-15 16:33:24.119000+00:00 1 None 1
7 executor str None False 1 None None 2025-04-15 16:33:24.119000+00:00 1 None 1
ParamValue
value hash space_id param_id created_at created_by_id _aux _branch_code
id
1 ./fasta None 1 1 2025-04-15 16:33:24.151000+00:00 1 None 1
2 C None 1 2 2025-04-15 16:33:24.151000+00:00 1 None 1
3 [KR] None 1 3 2025-04-15 16:33:24.151000+00:00 1 None 1
4 0 None 1 4 2025-04-15 16:33:24.151000+00:00 1 None 1
5 4 None 1 5 2025-04-15 16:33:24.151000+00:00 1 None 1
6 75 None 1 6 2025-04-15 16:33:24.151000+00:00 1 None 1
7 Executor.default None 1 7 2025-04-15 16:33:24.151000+00:00 1 None 1
Run
uid name started_at finished_at reference reference_type _is_consecutive _status_code space_id transform_id report_id _logfile_id environment_id initiated_by_run_id created_at created_by_id _aux _branch_code
id
1 JgTS0g0x1MpChdqcuW8X None 2025-04-15 16:33:24.131276+00:00 2025-04-15 16:33:28.320463+00:00 913e800e-49dc-4f83-83ac-14b03bec5254 redun_id True 0 1 1 8 None 6 None 2025-04-15 16:33:24.132000+00:00 1 None 1
Storage
uid root description type region instance_uid space_id run_id created_at created_by_id _aux _branch_code
id
1 cW9hJxrGsmcm /home/runner/work/redun-lamin/redun-lamin/docs... None local None iQlBPgD8uaqR 1 None 2025-04-15 16:33:06.067000+00:00 1 None 1
Transform
uid key description type source_code hash reference reference_type space_id _template_id version is_latest created_at created_by_id _aux _branch_code
id
1 73Wt9j2dxj1H0000 workflow.py workflow.py script """workflow.py."""\n\n# This code is a copy fr... mHyxrE622q2fluxICu4XDw None None 1 None None True 2025-04-15 16:33:24.129000+00:00 1 None 1

Delete the test instance:

!rm -rf test-redun-lamin
!lamin delete --force test-redun-lamin
 deleting instance testuser1/test-redun-lamin