Redun

Here, we’ll see how to track redun workflow runs with LaminDB.

Note

This use case is based on github.com/ricomnl/bioinformatics-pipeline-tutorial.

# pip install lamindb redun git+https://github.com/laminlabs/redun-lamin-fasta
!lamin init --storage ./test-redun-lamin
 initialized lamindb: testuser1/test-redun-lamin

Amend the workflow

import lamindb as ln
import json
 connected lamindb: testuser1/test-redun-lamin

Let’s amend a redun workflow.py to register input & output artifacts in LaminDB:

  • To track the workflow run in LaminDB, add (see on GitHub):

    ln.track(params=params)
    
  • To register the output file via LaminDB, add (see on GitHub):

    ln.Artifact(output_path, description="results").save()
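
The two additions can be combined into one small wrapper. The following is a minimal sketch, not the code of the actual workflow.py: `run_tracked` and `produce_output` are hypothetical names introduced here, and the closing `ln.finish()` call is an assumption about how the run gets marked as finished.

```python
from typing import Callable


def run_tracked(params: dict, produce_output: Callable[..., str]) -> str:
    """Track one workflow run in LaminDB and register its output file.

    `run_tracked` and `produce_output` are hypothetical names for this sketch.
    """
    # Imported lazily: lamindb needs a connected instance (see `lamin init` above).
    import lamindb as ln

    ln.track(params=params)  # start tracking the run with its parameters
    output_path = produce_output(**params)  # the actual pipeline work
    ln.Artifact(output_path, description="results").save()  # register the output
    ln.finish()  # assumption: explicitly mark the run as finished
    return output_path
```

In a redun workflow, the same calls would typically sit inside the tasks themselves, with `ln.track` at the start of the main task and the `ln.Artifact(...).save()` call in the final task.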
    

Run redun

Let’s see what the input files are:

!ls ./fasta
KLF4.fasta  MYC.fasta  PO5F1.fasta  SOX2.fasta

And call the workflow:

!redun run workflow.py main --input-dir ./fasta --tag run=test-run > run_logs.txt 2>&1
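
If you launch redun from Python rather than from a shell, stdout and stderr can be captured in one log file through a single file handle, which keeps the lines of the two streams from overwriting each other. This is a minimal sketch: `run_and_log` is a hypothetical helper, and the demo uses a stand-in command instead of redun so that it runs anywhere.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_log(cmd: list, log_path: str) -> int:
    """Run `cmd`, write stdout and stderr to one log file, return the exit code."""
    with open(log_path, "w") as log:
        # stderr=subprocess.STDOUT routes both streams through the same handle
        proc = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return proc.returncode


# Stand-in command that writes one line to each stream
demo_cmd = [
    sys.executable,
    "-c",
    "import sys; print('out'); print('err', file=sys.stderr)",
]
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_log = Path(tmp_dir) / "demo_run_logs.txt"
    exit_code = run_and_log(demo_cmd, str(demo_log))
    log_text = demo_log.read_text()
```

For the workflow above, the call would be `run_and_log(["redun", "run", "workflow.py", "main", "--input-dir", "./fasta", "--tag", "run=test-run"], "run_logs.txt")`.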

Inspect the logs:

!cat run_logs.txt
 connected lamindb: testuser1/test-redun-lamin
 created Transform('XcS9ynv44R2v0000'), started new Run('xlz3Qitb...') at 2025-05-08 07:31:51 UTC
→ params: input_dir=./fasta, amino_acid=C, enzyme_regex=[KR], missed_cleavages=0, min_length=4, max_length=75, executor=Executor.default
 recommendation: to identify the script across renames, pass the uid: ln.track("XcS9ynv44R2v", params={...})
! folder i[redun] Run    Job 2d026057:  bioinformatics_pipeline_tutorial.lib.digest_protein_task(input_fasta=File(path=/home/runner/work/redun-lamin/ finished Run('xlz3Qitb') after 4s at 2025-05-08 07:31:56 UTC
File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/results.tgz, hash=e4be664a)
  Job 45ec13f4:  bioinformatics_pipeline_tutorial.lib.digest_protein_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/KxVSxtLYQNNjFyU70000.fasta, hash=85e4f9af), enzyme_regex='[KR]', missed_cleavages=0, min_length=4, max_length=75) on default
[redun] Run    Job 2ce8ebab:  bioinformatics_pipeline_tutorial.lib.digest_protein_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/KXFsyjDIgyPQ2dBz0000.fasta, hash=efb00fd0), enzyme_regex='[KR]', missed_cleavages=0, min_length=4, max_length=75) on default
[redun] Run    Job 4268354c:  bioinformatics_pipeline_tutorial.lib.digest_protein_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/gfLTVI8HTCR2v6y50000.fasta, hash=8812418e), enzyme_regex='[KR]', missed_cleavages=0, min_length=4, max_length=75) on default
[redun] Run    Job ea40081c:  bioinformatics_pipeline_tutorial.lib.count_amino_acids_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/rPWKPJRUDswI6kFZ0000.fasta, hash=565414c8), input_peptides=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/rPWKPJRUDswI6kFZ0000.peptides.txt, hash=38e1c38e), amino_acid='C') on default
[redun] Run    Job 82b5db60:  bioinformatics_pipeline_tutorial.lib.count_amino_acids_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/KxVSxtLYQNNjFyU70000.fasta, hash=85e4f9af), input_peptides=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/KxVSxtLYQNNjFyU70000.peptides.txt, hash=33035484), amino_acid='C') on default
[redun] Run    Job a6c129c2:  bioinformatics_pipeline_tutorial.lib.count_amino_acids_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/KXFsyjDIgyPQ2dBz0000.fasta, hash=efb00fd0), input_peptides=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/KXFsyjDIgyPQ2dBz0000.peptides.txt, hash=9b722d12), amino_acid='C') on default
[redun] Run    Job 0327d20f:  bioinformatics_pipeline_tutorial.lib.count_amino_acids_task(input_fasta=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb/gfLTVI8HTCR2v6y50000.fasta, hash=8812418e), input_peptides=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/gfLTVI8HTCR2v6y50000.peptides.txt, hash=5e6fbca0), amino_acid='C') on default
[redun] Run    Job 0c88a411:  bioinformatics_pipeline_tutorial.lib.plot_count_task(input_count=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/rPWKPJRUDswI6kFZ0000.count.tsv, hash=7f78b6ab)) on default
[redun] Run    Job 4b046ec4:  bioinformatics_pipeline_tutorial.lib.plot_count_task(input_count=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/KxVSxtLYQNNjFyU70000.count.tsv, hash=2ec3e07c)) on default
[redun] Run    Job 1a88bff5:  bioinformatics_pipeline_tutorial.lib.plot_count_task(input_count=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/KXFsyjDIgyPQ2dBz0000.count.tsv, hash=0c2c7886)) on default
[redun] Run    Job 857cff6c:  bioinformatics_pipeline_tutorial.lib.plot_count_task(input_count=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/gfLTVI8HTCR2v6y50000.count.tsv, hash=e926a6af)) on default
[redun] Run    Job 551d6c3e:  bioinformatics_pipeline_tutorial.lib.get_report_task(input_counts=[File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/rPWKPJRUDswI6kFZ0000.count.tsv, hash=7f78b6ab), File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-l...) on default
[redun] Run    Job 030894cf:  bioinformatics_pipeline_tutorial.lib.archive_results_task(inputs_plots=[File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/rPWKPJRUDswI6kFZ0000.plot.png, hash=0d2b9ceb), File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-la..., input_report=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/protein_report.tsv, hash=2b4d660c)) on default
[redun] Run    Job ceee79a1:  redun_lamin_fasta.finish(results_archive=File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/results.tgz, hash=e4be664a)) on default
[redun] 
[redun] | JOB STATUS 2025/05/08 07:31:57
[redun] | TASK                                                        PENDING RUNNING  FAILED  CACHED    DONE   TOTAL
[redun] | 
[redun] | ALL                                                               0       0       0       0      16      16
[redun] | bioinformatics_pipeline_tutorial.lib.archive_results_task         0       0       0       0       1       1
[redun] | bioinformatics_pipeline_tutorial.lib.count_amino_acids_task       0       0       0       0       4       4
[redun] | bioinformatics_pipeline_tutorial.lib.digest_protein_task          0       0       0       0       4       4
[redun] | bioinformatics_pipeline_tutorial.lib.get_report_task              0       0       0       0       1       1
[redun] | bioinformatics_pipeline_tutorial.lib.plot_count_task              0       0       0       0       4       4
[redun] | redun_lamin_fasta.finish                                          0       0       0       0       1       1
[redun] | redun_lamin_fasta.main                                            0       0       0       0       1       1
[redun] 
[redun] 
[redun] Execution duration: 5.83 seconds
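
To check the outcome programmatically instead of reading the log by eye, the job status table can be parsed. This is a sketch that assumes the table layout shown above (six numeric columns after the task name); `parse_job_status` is a hypothetical helper, and the layout may differ between redun versions.

```python
import re

STATUS_COLUMNS = ("PENDING", "RUNNING", "FAILED", "CACHED", "DONE", "TOTAL")
# A data row: "[redun] | <task name> <six integers>"
ROW = re.compile(r"^\[redun\] \| (\S+)" + r"\s+(\d+)" * 6 + r"\s*$")


def parse_job_status(log_text: str) -> dict:
    """Map each task name in the status table to its per-column counts.

    If the log holds several status tables, the last one wins.
    """
    status = {}
    for line in log_text.splitlines():
        match = ROW.match(line)
        if match:
            counts = [int(value) for value in match.groups()[1:]]
            status[match.group(1)] = dict(zip(STATUS_COLUMNS, counts))
    return status


# Excerpt of the log above
sample = (
    "[redun] | JOB STATUS 2025/05/08 07:31:57\n"
    "[redun] | TASK          PENDING RUNNING  FAILED  CACHED    DONE   TOTAL\n"
    "[redun] | \n"
    "[redun] | ALL                 0       0       0       0      16      16\n"
    "[redun] | redun_lamin_fasta.main   0       0       0       0       1       1\n"
)
status = parse_job_status(sample)
```

With the full log, `parse_job_status(open("run_logs.txt").read())["ALL"]["FAILED"]` gives the number of failed jobs.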

View data lineage:

artifact = ln.Artifact.get(key="data/results.tgz")
artifact.view_lineage()
[Figure: data lineage graph for data/results.tgz]

Track the redun execution id

To make the redun execution ID queryable in LaminDB, export it from redun and store it on the LaminDB run record:

# export the run information from redun
!redun log --exec --exec-tag run=test-run --format json --no-pager > redun_exec.json
# load the redun execution id from the JSON and store it in the LaminDB run record
with open("redun_exec.json") as file:
    redun_exec = json.loads(file.readline())
artifact.run.reference = redun_exec["id"]
artifact.run.reference_type = "redun_id"
artifact.run.save()
Run(uid='xlz3Qitb0PFTYcrImOOJ', started_at=2025-05-08 07:31:51 UTC, finished_at=2025-05-08 07:31:56 UTC, reference='bf00aa36-19ed-459e-86f7-bb6af6f1f3b4', reference_type='redun_id', space_id=1, transform_id=1, report_id=7, environment_id=6, created_by_id=1, created_at=2025-05-08 07:31:51 UTC)
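
The snippet above reads only the first line of the export. A slightly more defensive variant skips blank lines and fails with a clear message when no record is found. This is a sketch under the same assumption as the original code, namely that the export contains one JSON object per line; `first_execution_id` is a hypothetical helper, and the demo record is a minimal stand-in that keeps only the `id` field.

```python
import json
import tempfile
from pathlib import Path


def first_execution_id(path: str) -> str:
    """Return the 'id' of the first JSON record in a JSON-lines file."""
    with open(path) as file:
        for line in file:
            line = line.strip()
            if line:  # skip blank lines
                return json.loads(line)["id"]
    raise ValueError(f"no execution records found in {path}")


# Stand-in export: a real redun record carries more fields than 'id'
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_export = Path(tmp_dir) / "redun_exec.json"
    demo_export.write_text('\n{"id": "bf00aa36-19ed-459e-86f7-bb6af6f1f3b4"}\n')
    execution_id = first_execution_id(str(demo_export))
```

The returned value can then be assigned to `artifact.run.reference` as shown above.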

Track the redun run report

Attach a run report:

report = ln.Artifact(
    "run_logs.txt",
    description=f"Redun run report of {redun_exec['id']}",
    run=False,
    visibility=0,
).save()
artifact.run.report = report
artifact.run.save()
Run(uid='xlz3Qitb0PFTYcrImOOJ', started_at=2025-05-08 07:31:51 UTC, finished_at=2025-05-08 07:31:56 UTC, reference='bf00aa36-19ed-459e-86f7-bb6af6f1f3b4', reference_type='redun_id', space_id=1, transform_id=1, report_id=8, environment_id=6, created_by_id=1, created_at=2025-05-08 07:31:51 UTC)

View transforms and runs in LaminHub

[Screenshot: transforms and runs in LaminHub]

View the database content

ln.view()
Artifact
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux _branch_code
id
5 W2b2a2sqqeE2wGZq0000 data/results.tgz None .tgz None None 83528 OGNUiUCkc1mbaseT3Wm4Eg None None md5 False False 1 1 None None True 1.0 2025-05-08 07:31:55.960000+00:00 1 None 1
3 KXFsyjDIgyPQ2dBz0000 fasta/KLF4.fasta None .fasta None None 609 LyuoYkWs4SgYcH7P7JLJtA None None md5 True False 1 1 None None True NaN 2025-05-08 07:31:53.090000+00:00 1 None 1
4 gfLTVI8HTCR2v6y50000 fasta/PO5F1.fasta None .fasta None None 477 -7iJgveFO9ia0wE1bqVu6g None None md5 True False 1 1 None None True NaN 2025-05-08 07:31:53.090000+00:00 1 None 1
2 KxVSxtLYQNNjFyU70000 fasta/SOX2.fasta None .fasta None None 414 C5q_yaFXGk4SAEpfdqBwnQ None None md5 True False 1 1 None None True NaN 2025-05-08 07:31:53.089000+00:00 1 None 1
1 rPWKPJRUDswI6kFZ0000 fasta/MYC.fasta None .fasta None None 536 WGbEtzPw-3bQEGcngO_pHQ None None md5 True False 1 1 None None True NaN 2025-05-08 07:31:53.088000+00:00 1 None 1
Param
name dtype is_type _expect_many space_id type_id run_id created_at created_by_id _aux _branch_code
id
1 input_dir str None False 1 None None 2025-05-08 07:31:51.637000+00:00 1 None 1
2 amino_acid str None False 1 None None 2025-05-08 07:31:51.637000+00:00 1 None 1
3 enzyme_regex str None False 1 None None 2025-05-08 07:31:51.637000+00:00 1 None 1
4 missed_cleavages int None False 1 None None 2025-05-08 07:31:51.637000+00:00 1 None 1
5 min_length int None False 1 None None 2025-05-08 07:31:51.637000+00:00 1 None 1
6 max_length int None False 1 None None 2025-05-08 07:31:51.637000+00:00 1 None 1
7 executor str None False 1 None None 2025-05-08 07:31:51.637000+00:00 1 None 1
ParamValue
value hash space_id param_id created_at created_by_id _aux _branch_code
id
1 ./fasta None 1 1 2025-05-08 07:31:51.663000+00:00 1 None 1
2 C None 1 2 2025-05-08 07:31:51.665000+00:00 1 None 1
3 [KR] None 1 3 2025-05-08 07:31:51.667000+00:00 1 None 1
4 0 None 1 4 2025-05-08 07:31:51.669000+00:00 1 None 1
5 4 None 1 5 2025-05-08 07:31:51.671000+00:00 1 None 1
6 75 None 1 6 2025-05-08 07:31:51.673000+00:00 1 None 1
7 Executor.default None 1 7 2025-05-08 07:31:51.675000+00:00 1 None 1
Run
uid name started_at finished_at reference reference_type _is_consecutive _status_code space_id transform_id report_id _logfile_id environment_id initiated_by_run_id created_at created_by_id _aux _branch_code
id
1 xlz3Qitb0PFTYcrImOOJ None 2025-05-08 07:31:51.654361+00:00 2025-05-08 07:31:56.009635+00:00 bf00aa36-19ed-459e-86f7-bb6af6f1f3b4 redun_id True 0 1 1 8 None 6 None 2025-05-08 07:31:51.655000+00:00 1 None 1
Storage
uid root description type region instance_uid space_id run_id created_at created_by_id _aux _branch_code
id
1 RiUc3vC34g62 /home/runner/work/redun-lamin/redun-lamin/docs... None local None iQlBPgD8uaqR 1 None 2025-05-08 07:31:34.670000+00:00 1 None 1
Transform
uid key description type source_code hash reference reference_type space_id _template_id version is_latest created_at created_by_id _aux _branch_code
id
1 XcS9ynv44R2v0000 workflow.py workflow.py script """workflow.py."""\n\n# This code is a copy fr... mHyxrE622q2fluxICu4XDw None None 1 None None True 2025-05-08 07:31:51.651000+00:00 1 None 1

Delete the test instance:

!rm -rf test-redun-lamin
!lamin delete --force test-redun-lamin
 deleting instance testuser1/test-redun-lamin