Redun

Here, we’ll see how to track redun workflow runs with LaminDB.

Note

This use case is based on github.com/ricomnl/bioinformatics-pipeline-tutorial.

!lamin init --storage ./test-redun-lamin --schema bionty
 connected lamindb: testuser1/test-redun-lamin

Amend the workflow

import lamindb as ln
import json
 connected lamindb: testuser1/test-redun-lamin

Let’s amend a redun workflow.py to register input & output artifacts in LaminDB:

  • To track the workflow run in LaminDB, add (see on GitHub):

    ln.track(params=params)
    
  • To register the output file via LaminDB, add (see on GitHub):

    ln.Artifact(output_path, description="results").save()
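
A minimal sketch of how these two calls might sit together in the final step of a workflow. The helper name `register_results` and the `tracker` argument are hypothetical and do not appear in workflow.py. In a real run, `tracker` would be the imported `lamindb` module (`ln`). It is passed in here only so the sketch can be exercised with a stand-in object when no LaminDB instance is connected.

```python
from types import SimpleNamespace


def register_results(tracker, output_path, params):
    """Track the run, then register the output file as an artifact.

    `tracker` is expected to be the `lamindb` module (imported as `ln`).
    Passing it as an argument is an assumption of this sketch.
    """
    # Start tracking; the params are stored with the run record.
    tracker.track(params=params)
    # Register the output file and persist the record.
    return tracker.Artifact(output_path, description="results").save()


# Dry run with a stand-in that only records the calls it receives.
calls = []


class _FakeArtifact:
    def __init__(self, path, description=None):
        self.path, self.description = path, description

    def save(self):
        calls.append(("save", self.path, self.description))
        return self


fake_ln = SimpleNamespace(
    track=lambda params=None: calls.append(("track", params)),
    Artifact=_FakeArtifact,
)
artifact = register_results(fake_ln, "data/results.tgz", {"input_dir": "./fasta"})
```

Calling `ln.track` before creating the artifact matters: the artifact is linked to whichever run is being tracked at the time it is saved.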
    

Run redun

Let’s see what the input files are:

!ls ./fasta
KLF4.fasta  MYC.fasta  PO5F1.fasta  SOX2.fasta

And call the workflow:

!redun run workflow.py main --input-dir ./fasta --tag run=test-run  1> redun_stdout.txt 2>redun_stderr.txt
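
Outside a notebook, the same call could be issued from Python. The sketch below reuses the arguments from the command above; the function names `build_redun_command` and `run_redun` are hypothetical helpers, not part of redun or LaminDB.

```python
import subprocess


def build_redun_command(input_dir, tag, script="workflow.py", task="main"):
    """Assemble the argument list for the `redun run` call shown above."""
    return ["redun", "run", script, task, "--input-dir", input_dir, "--tag", tag]


def run_redun(input_dir, tag,
              stdout_path="redun_stdout.txt", stderr_path="redun_stderr.txt"):
    """Run the workflow, writing stdout and stderr to separate files."""
    cmd = build_redun_command(input_dir, tag)
    with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
        # check=False: return the exit code instead of raising,
        # so both log files can still be inspected after a failure.
        return subprocess.run(cmd, stdout=out, stderr=err, check=False).returncode


print(build_redun_command("./fasta", "run=test-run"))
```

Passing the command as a list avoids shell quoting issues. `run_redun` is only defined here, not called, because it requires redun and the workflow script to be present.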

Inspect the output:

!cat redun_stdout.txt
 connected lamindb: testuser1/test-redun-lamin
 running outside of synched git repo, cloning https://github.com/laminlabs/redun-lamin into /home/runner/.cache/lamindb/redun-lamin
 created Transform('taasWKaw'), started new Run('aIpjba9d') at 2024-12-20 15:03:51 UTC
→ params: input_dir='./fasta' executor='Executor.default' amino_acid='C' enzyme_regex='[KR]' max_length='75' min_length='4' missed_cleavages='0'
! folder is outside existing storage location, will copy files from ./fasta to /home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/fasta
downloading... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
 finished Run('aIpjba9d') after 0d 0h 0m 8s at 2024-12-20 15:03:59 UTC
File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/results.tgz, hash=364ca84d)

And the last line of the stderr log:

!tail -1 redun_stderr.txt
[redun] Execution duration: 11.15 seconds
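
If the duration is needed as a number, for instance to compare it with the run time LaminDB reports, it could be parsed from that line. This is a sketch that assumes the log line keeps the exact format shown above, which may differ between redun versions; `parse_execution_duration` is a hypothetical helper.

```python
import re

# Matches lines such as "[redun] Execution duration: 11.15 seconds".
DURATION_PATTERN = re.compile(r"\[redun\] Execution duration: ([0-9.]+) seconds")


def parse_execution_duration(log_text):
    """Return the reported duration in seconds, or None if the line is absent."""
    match = DURATION_PATTERN.search(log_text)
    return float(match.group(1)) if match else None


print(parse_execution_duration("[redun] Execution duration: 11.15 seconds"))  # 11.15
```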

View data lineage:

artifact = ln.Artifact.filter(description="results", suffix=".tgz").one()
artifact.view_lineage()
[Figure: data lineage graph of the results artifact]

Track the redun execution id

To make the redun execution ID queryable in LaminDB, export it from redun and store it on the run record:

# export the run information from redun
!redun log --exec --exec-tag run=test-run --format json --no-pager > redun_exec.json
# load the redun execution id from the JSON and store it in the LaminDB run record
with open("redun_exec.json", "r") as file:
    redun_exec = json.loads(file.readline())
artifact.run.reference = redun_exec["id"]
artifact.run.reference_type = "redun_id"
artifact.run.save()
Run(uid='aIpjba9damBrWNzOKMoq', started_at=2024-12-20 15:03:51 UTC, finished_at=2024-12-20 15:03:59 UTC, is_consecutive=True, reference='f9f901c3-fdb5-4670-9666-578b02c220a9', reference_type='redun_id', transform_id=1, environment_id=5, created_by_id=1, created_at=2024-12-20 15:03:51 UTC)
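
The loading step above reads only the first line of the export. A slightly more defensive variant is sketched below. It assumes, as the `readline()` call implies, that the export holds one JSON object per line and that each object has an `id` key; `read_redun_execution_id` is a hypothetical helper.

```python
import json


def read_redun_execution_id(path):
    """Return the `id` of the first execution record in a JSON-lines export."""
    with open(path, "r") as file:
        for line in file:
            line = line.strip()
            if line:  # skip blank lines before the first record
                return json.loads(line)["id"]
    # An empty export usually means no execution matched the tag filter.
    raise ValueError(f"no execution record found in {path}")
```

With this helper, the assignment becomes `artifact.run.reference = read_redun_execution_id("redun_exec.json")`, and an export without any matching execution fails with a clear message instead of a JSON decoding error.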

Track the redun run report

Attach a run report:

report = ln.Artifact(
    "redun_stderr.txt",
    description=f"Redun run report of {redun_exec['id']}",
    run=False,
    visibility=0,
).save()
artifact.run.report = report
artifact.run.save()
Run(uid='aIpjba9damBrWNzOKMoq', started_at=2024-12-20 15:03:51 UTC, finished_at=2024-12-20 15:03:59 UTC, is_consecutive=True, reference='f9f901c3-fdb5-4670-9666-578b02c220a9', reference_type='redun_id', transform_id=1, report_id=7, environment_id=5, created_by_id=1, created_at=2024-12-20 15:03:51 UTC)

View transforms and runs in LaminHub

[Screenshot: transforms and runs in LaminHub]

View the database content

ln.view()
****************
* module: core *
****************
Artifact
uid key description suffix type size hash n_objects n_observations _hash_type _accessor visibility _key_is_virtual storage_id transform_id version is_latest run_id created_at created_by_id
id
6 MB5sJFAS1Y0JJR4I0000 data/results.tgz results .tgz None 83555 fw6hYw5G2RrD_updzhn1ZA None None md5 None 1 False 1 1.0 None True 1.0 2024-12-20 15:04:01.863639+00:00 1
4 D7Dkt9lYIKYLtq830000 fasta/PO5F1.fasta None .fasta None 477 -7iJgveFO9ia0wE1bqVu6g None None md5 None 1 True 1 NaN None True NaN 2024-12-20 15:03:52.215146+00:00 1
3 Gj1ow5VEqaJArr380000 fasta/SOX2.fasta None .fasta None 414 C5q_yaFXGk4SAEpfdqBwnQ None None md5 None 1 True 1 NaN None True NaN 2024-12-20 15:03:52.214469+00:00 1
2 B6bhvTHZwltWZSBg0000 fasta/KLF4.fasta None .fasta None 609 LyuoYkWs4SgYcH7P7JLJtA None None md5 None 1 True 1 NaN None True NaN 2024-12-20 15:03:52.213619+00:00 1
1 QF4pLRnpbdLMaI6l0000 fasta/MYC.fasta None .fasta None 536 WGbEtzPw-3bQEGcngO_pHQ None None md5 None 1 True 1 NaN None True NaN 2024-12-20 15:03:52.212164+00:00 1
ParamValue
value param_id created_at created_by_id
id
1 ./fasta 1 2024-12-20 15:03:51.585943+00:00 1
2 Executor.default 2 2024-12-20 15:03:51.585987+00:00 1
3 C 3 2024-12-20 15:03:51.586018+00:00 1
4 [KR] 4 2024-12-20 15:03:51.586046+00:00 1
5 75 5 2024-12-20 15:03:51.586080+00:00 1
6 4 6 2024-12-20 15:03:51.586113+00:00 1
7 0 7 2024-12-20 15:03:51.586144+00:00 1
Run
uid started_at finished_at is_consecutive reference reference_type transform_id report_id environment_id parent_id created_at created_by_id
id
1 aIpjba9damBrWNzOKMoq 2024-12-20 15:03:51.560191+00:00 2024-12-20 15:03:59.616101+00:00 True f9f901c3-fdb5-4670-9666-578b02c220a9 redun_id 1 7 5 None 2024-12-20 15:03:51.560264+00:00 1
Storage
uid root description type region instance_uid run_id created_at created_by_id
id
1 eIMKjUlVmNf5 /home/runner/work/redun-lamin/redun-lamin/docs... None local None iQlBPgD8uaqR None 2024-12-20 15:03:38.828144+00:00 1
Transform
uid name key description type source_code hash reference reference_type _source_code_artifact_id version is_latest created_at created_by_id
id
1 taasWKawCiNA0000 workflow.py workflow.py None script """workflow.py"""\n\n# This code is a copy fro... B36u9mvhSeZwmt4wniwBNg https://github.com/laminlabs/redun-lamin/blob/... url None 0.1.0 True 2024-12-20 15:03:51.557163+00:00 1
ULabel
uid name description reference reference_type run_id created_at created_by_id
id
1 R7FcEjMW redun None None None 1 2024-12-20 15:03:52.176877+00:00 1
User
uid handle name created_at
id
1 DzTjkKse testuser1 Test User1 2024-12-20 15:03:38.824310+00:00
******************
* module: bionty *
******************
Organism
uid name ontology_id scientific_name synonyms description source_id run_id created_at created_by_id
id
1 1dpCL6Td human NCBITaxon:9606 homo_sapiens None None 1 1 2024-12-20 15:03:55.864044+00:00 1
Protein
uid name uniprotkb_id synonyms description length gene_symbol ensembl_gene_ids source_id organism_id run_id created_at created_by_id
id
4 3qNrC4hwnDC9 PO5F1_HUMAN POU domain, class 5, transcription... Q01860 Octamer-binding protein 3|Oct-3|Octamer-bindin... class 5, transcription factor 1 360 POU5F1 ENST00000259915.13 [Q01860-1];ENST00000376243.... 22 1 1 2024-12-20 15:03:59.591628+00:00 1
3 38rbzWPtKmb2 SOX2_HUMAN Transcription factor SOX-2 P48431 317 SOX2 ENST00000325404.3; 22 1 1 2024-12-20 15:03:58.396878+00:00 1
2 6ThKerPbf6DR KLF4_HUMAN Krueppel-like factor 4 O43474 Epithelial zinc finger protein EZF|Gut-enriche... 513 KLF4 ENST00000374672.5 [O43474-1]; 22 1 1 2024-12-20 15:03:57.065757+00:00 1
1 36jnmKHdiT9m MYC_HUMAN Myc proto-oncogene protein P01106 Class E basic helix-loop-helix protein 39|bHLH... 454 MYC ENST00000377970.6 [P01106-1];ENST00000524013.2... 22 1 1 2024-12-20 15:03:55.874460+00:00 1
Source
uid entity organism name in_db currently_used description url md5 source_website dataframe_artifact_id version run_id created_at created_by_id
id
101 5JnV BioSample all ncbi False True NCBI BioSample attributes s3://bionty-assets/df_all__ncbi__2023-09__BioS... 918db9bd1734b97c596c67d9654a4126 https://www.ncbi.nlm.nih.gov/biosample/docs/at... None 2023-09 None 2024-12-20 15:03:38.960159+00:00 1
100 MJRq bionty.Ethnicity human hancestro False True Human Ancestry Ontology https://github.com/EBISPOT/hancestro/raw/3.0/h... 76dd9efda9c2abd4bc32fc57c0b755dd https://github.com/EBISPOT/hancestro None 3.0 None 2024-12-20 15:03:38.960094+00:00 1
99 6vJm bionty.DevelopmentalStage mouse mmusdv False False Mouse Developmental Stages http://aber-owl.net/media/ontologies/MMUSDV/9/... 5bef72395d853c7f65450e6c2a1fc653 https://github.com/obophenotype/developmental-... None 2020-03-10 None 2024-12-20 15:03:38.960029+00:00 1
98 10va bionty.DevelopmentalStage mouse mmusdv False True Mouse Developmental Stages https://github.com/obophenotype/developmental-... https://github.com/obophenotype/developmental-... None 2024-05-28 None 2024-12-20 15:03:38.959964+00:00 1
97 7Zm9 bionty.DevelopmentalStage human hsapdv False False Human Developmental Stages http://aber-owl.net/media/ontologies/HSAPDV/11... 52181d59df84578ed69214a5cb614036 https://github.com/obophenotype/developmental-... None 2020-03-10 None 2024-12-20 15:03:38.959899+00:00 1
96 1GbF bionty.DevelopmentalStage human hsapdv False True Human Developmental Stages https://github.com/obophenotype/developmental-... https://github.com/obophenotype/developmental-... None 2024-05-28 None 2024-12-20 15:03:38.959834+00:00 1
95 1atB Drug all chebi False False Chemical Entities of Biological Interest s3://bionty-assets/df_all__chebi__2024-07-27__... https://www.ebi.ac.uk/chebi/ None 2024-07-27 None 2024-12-20 15:03:38.959768+00:00 1

Delete the test instance. Remove the storage directory first, because lamin delete raises InstanceNotEmpty while the storage location still contains objects:

!rm -rf ./test-redun-lamin
!lamin delete --force test-redun-lamin