Redun
Here, we’ll see how to track redun workflow runs with LaminDB.
Note
This use case is based on github.com/ricomnl/bioinformatics-pipeline-tutorial.
!lamin init --storage ./test-redun-lamin --schema bionty
→ connected lamindb: testuser1/test-redun-lamin
Amend the workflow
import lamindb as ln
import json
→ connected lamindb: testuser1/test-redun-lamin
Let’s amend a redun workflow.py to register input and output artifacts in LaminDB:
To track the workflow run in LaminDB, add (see on GitHub):
ln.track(params=params)
To register the output file via LaminDB, add (see on GitHub):
ln.Artifact(output_path, description="results").save()
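Put together, the two additions might sit in a workflow script roughly as follows. This is a minimal sketch, not the actual workflow.py: collect_params, tracked_main and run_pipeline are hypothetical names, the parameter names and default values are copied from the run log shown later in this guide (their Python types are an assumption), and lamindb is imported lazily so that the parameter helper can be exercised without a connected instance.

```python
def collect_params(input_dir, amino_acid="C", enzyme_regex="[KR]",
                   missed_cleavages=0, min_length=4, max_length=75):
    """Gather the workflow arguments into one dict for ln.track(params=...)."""
    return {
        "input_dir": input_dir,
        "amino_acid": amino_acid,
        "enzyme_regex": enzyme_regex,
        "missed_cleavages": missed_cleavages,
        "min_length": min_length,
        "max_length": max_length,
    }


def tracked_main(input_dir, output_path, run_pipeline):
    """Track a run, execute the pipeline, and register its output file.

    `run_pipeline` is a hypothetical callable standing in for the redun tasks;
    it is assumed to write the results archive to `output_path`.
    """
    import lamindb as ln  # requires a connected LaminDB instance

    params = collect_params(input_dir)
    ln.track(params=params)  # the call added to the workflow in this guide
    run_pipeline(params, output_path)
    # register the output file, as in the line added above
    ln.Artifact(output_path, description="results").save()
```

Keeping the parameters in a single dict means the same values are passed to the pipeline and recorded on the run.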
Run redun
Let’s see what the input files are:
!ls ./fasta
KLF4.fasta MYC.fasta PO5F1.fasta SOX2.fasta
And call the workflow:
!redun run workflow.py main --input-dir ./fasta --tag run=test-run 1> redun_stdout.txt 2>redun_stderr.txt
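Outside a notebook, the same call with separate stdout and stderr files could be made from Python. The sketch below uses only the standard library; build_redun_command and run_redun are hypothetical helper names, and the arguments are the ones from the command above.

```python
import subprocess


def build_redun_command(input_dir, tag):
    """Assemble the argument list for the `redun run` call used in this guide."""
    return ["redun", "run", "workflow.py", "main",
            "--input-dir", input_dir, "--tag", tag]


def run_redun(input_dir="./fasta", tag="run=test-run",
              stdout_path="redun_stdout.txt", stderr_path="redun_stderr.txt"):
    """Run redun and write its stdout and stderr to separate files."""
    with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
        completed = subprocess.run(build_redun_command(input_dir, tag),
                                   stdout=out, stderr=err)
    # a non-zero return code signals that the redun command failed
    return completed.returncode
```

Calling run_redun() requires redun to be installed and workflow.py to be in the working directory.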
Inspect the output:
!cat redun_stdout.txt
→ connected lamindb: testuser1/test-redun-lamin
→ running outside of synched git repo, cloning https://github.com/laminlabs/redun-lamin into /home/runner/.cache/lamindb/redun-lamin
→ created Transform('taasWKaw'), started new Run('aIpjba9d') at 2024-12-20 15:03:51 UTC
→ params: input_dir='./fasta' executor='Executor.default' amino_acid='C' enzyme_regex='[KR]' max_length='75' min_length='4' missed_cleavages='0'
! folder is outside existing storage location, will copy files from ./fasta to /home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/fasta
downloading... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
→ finished Run('aIpjba9d') after 0d 0h 0m 8s at 2024-12-20 15:03:59 UTC
File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/results.tgz, hash=364ca84d)
And the last line of the stderr log:
!tail -1 redun_stderr.txt
[redun] Execution duration: 11.15 seconds
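If the duration is needed as a number, the last line of the stderr log can be parsed in Python. This is a small sketch with hypothetical helper names; the pattern assumes the line format shown above, and unlike tail -1 it skips trailing blank lines.

```python
import re
from pathlib import Path

# matches e.g. "[redun] Execution duration: 11.15 seconds"
DURATION_PATTERN = re.compile(r"Execution duration: (\d+(?:\.\d+)?) seconds")


def last_line(path):
    """Return the last non-empty line of a text file, or "" if there is none."""
    lines = [line for line in Path(path).read_text().splitlines() if line.strip()]
    return lines[-1] if lines else ""


def parse_duration(line):
    """Return the duration in seconds as a float, or None if the line does not match."""
    match = DURATION_PATTERN.search(line)
    return float(match.group(1)) if match else None
```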
View data lineage:
artifact = ln.Artifact.filter(description="results", suffix=".tgz").one()
artifact.view_lineage()
Track the redun execution id
To make the redun execution ID queryable in LaminDB, export it from redun and store it on the run record:
# export the run information from redun
!redun log --exec --exec-tag run=test-run --format json --no-pager > redun_exec.json
# load the redun execution id from the JSON and store it in the LaminDB run record
with open("redun_exec.json", "r") as file:
    redun_exec = json.loads(file.readline())
artifact.run.reference = redun_exec["id"]
artifact.run.reference_type = "redun_id"
artifact.run.save()
Run(uid='aIpjba9damBrWNzOKMoq', started_at=2024-12-20 15:03:51 UTC, finished_at=2024-12-20 15:03:59 UTC, is_consecutive=True, reference='f9f901c3-fdb5-4670-9666-578b02c220a9', reference_type='redun_id', transform_id=1, environment_id=5, created_by_id=1, created_at=2024-12-20 15:03:51 UTC)
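The loading step above reads only the first line of the export and parses it as one JSON object. The following sketch isolates that step with an in-memory stand-in for redun_exec.json. The helper name first_record_id is hypothetical, the second sample record is made up, and only the "id" key and its value are taken from this guide; the real export contains more fields.

```python
import io
import json


def first_record_id(stream):
    """Parse the first line of a JSON-lines stream and return its "id" field."""
    first_line = stream.readline()
    if not first_line.strip():
        raise ValueError("export is empty")
    return json.loads(first_line)["id"]


# Illustrative stand-in for redun_exec.json (one JSON object per line).
sample = io.StringIO(
    '{"id": "f9f901c3-fdb5-4670-9666-578b02c220a9"}\n'
    '{"id": "hypothetical-second-record"}\n'
)
exec_id = first_record_id(sample)
```

Failing early on an empty export avoids a less readable JSON decoding error when the tag filter matches no execution.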
Track the redun run report
Attach a run report:
report = ln.Artifact(
    "redun_stderr.txt",
    description=f"Redun run report of {redun_exec['id']}",
    run=False,
    visibility=0,
).save()
artifact.run.report = report
artifact.run.save()
Run(uid='aIpjba9damBrWNzOKMoq', started_at=2024-12-20 15:03:51 UTC, finished_at=2024-12-20 15:03:59 UTC, is_consecutive=True, reference='f9f901c3-fdb5-4670-9666-578b02c220a9', reference_type='redun_id', transform_id=1, report_id=7, environment_id=5, created_by_id=1, created_at=2024-12-20 15:03:51 UTC)
View transforms and runs in LaminHub
View the database content
ln.view()
****************
* module: core *
****************
Artifact
id | uid | key | description | suffix | type | size | hash | n_objects | n_observations | _hash_type | _accessor | visibility | _key_is_virtual | storage_id | transform_id | version | is_latest | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6 | MB5sJFAS1Y0JJR4I0000 | data/results.tgz | results | .tgz | None | 83555 | fw6hYw5G2RrD_updzhn1ZA | None | None | md5 | None | 1 | False | 1 | 1.0 | None | True | 1.0 | 2024-12-20 15:04:01.863639+00:00 | 1 |
4 | D7Dkt9lYIKYLtq830000 | fasta/PO5F1.fasta | None | .fasta | None | 477 | -7iJgveFO9ia0wE1bqVu6g | None | None | md5 | None | 1 | True | 1 | NaN | None | True | NaN | 2024-12-20 15:03:52.215146+00:00 | 1 |
3 | Gj1ow5VEqaJArr380000 | fasta/SOX2.fasta | None | .fasta | None | 414 | C5q_yaFXGk4SAEpfdqBwnQ | None | None | md5 | None | 1 | True | 1 | NaN | None | True | NaN | 2024-12-20 15:03:52.214469+00:00 | 1 |
2 | B6bhvTHZwltWZSBg0000 | fasta/KLF4.fasta | None | .fasta | None | 609 | LyuoYkWs4SgYcH7P7JLJtA | None | None | md5 | None | 1 | True | 1 | NaN | None | True | NaN | 2024-12-20 15:03:52.213619+00:00 | 1 |
1 | QF4pLRnpbdLMaI6l0000 | fasta/MYC.fasta | None | .fasta | None | 536 | WGbEtzPw-3bQEGcngO_pHQ | None | None | md5 | None | 1 | True | 1 | NaN | None | True | NaN | 2024-12-20 15:03:52.212164+00:00 | 1 |
ParamValue
id | value | param_id | created_at | created_by_id |
---|---|---|---|---|
1 | ./fasta | 1 | 2024-12-20 15:03:51.585943+00:00 | 1 |
2 | Executor.default | 2 | 2024-12-20 15:03:51.585987+00:00 | 1 |
3 | C | 3 | 2024-12-20 15:03:51.586018+00:00 | 1 |
4 | [KR] | 4 | 2024-12-20 15:03:51.586046+00:00 | 1 |
5 | 75 | 5 | 2024-12-20 15:03:51.586080+00:00 | 1 |
6 | 4 | 6 | 2024-12-20 15:03:51.586113+00:00 | 1 |
7 | 0 | 7 | 2024-12-20 15:03:51.586144+00:00 | 1 |
Run
id | uid | started_at | finished_at | is_consecutive | reference | reference_type | transform_id | report_id | environment_id | parent_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | aIpjba9damBrWNzOKMoq | 2024-12-20 15:03:51.560191+00:00 | 2024-12-20 15:03:59.616101+00:00 | True | f9f901c3-fdb5-4670-9666-578b02c220a9 | redun_id | 1 | 7 | 5 | None | 2024-12-20 15:03:51.560264+00:00 | 1 |
Storage
id | uid | root | description | type | region | instance_uid | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|
1 | eIMKjUlVmNf5 | /home/runner/work/redun-lamin/redun-lamin/docs... | None | local | None | iQlBPgD8uaqR | None | 2024-12-20 15:03:38.828144+00:00 | 1 |
Transform
id | uid | name | key | description | type | source_code | hash | reference | reference_type | _source_code_artifact_id | version | is_latest | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | taasWKawCiNA0000 | workflow.py | workflow.py | None | script | """workflow.py"""\n\n# This code is a copy fro... | B36u9mvhSeZwmt4wniwBNg | https://github.com/laminlabs/redun-lamin/blob/... | url | None | 0.1.0 | True | 2024-12-20 15:03:51.557163+00:00 | 1 |
ULabel
id | uid | name | description | reference | reference_type | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|
1 | R7FcEjMW | redun | None | None | None | 1 | 2024-12-20 15:03:52.176877+00:00 | 1 |
User
id | uid | handle | name | created_at |
---|---|---|---|---|
1 | DzTjkKse | testuser1 | Test User1 | 2024-12-20 15:03:38.824310+00:00 |
******************
* module: bionty *
******************
Organism
id | uid | name | ontology_id | scientific_name | synonyms | description | source_id | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|
1 | 1dpCL6Td | human | NCBITaxon:9606 | homo_sapiens | None | None | 1 | 1 | 2024-12-20 15:03:55.864044+00:00 | 1 |
Protein
id | uid | name | uniprotkb_id | synonyms | description | length | gene_symbol | ensembl_gene_ids | source_id | organism_id | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4 | 3qNrC4hwnDC9 | PO5F1_HUMAN POU domain, class 5, transcription... | Q01860 | Octamer-binding protein 3|Oct-3|Octamer-bindin... | class 5, transcription factor 1 | 360 | POU5F1 | ENST00000259915.13 [Q01860-1];ENST00000376243.... | 22 | 1 | 1 | 2024-12-20 15:03:59.591628+00:00 | 1 |
3 | 38rbzWPtKmb2 | SOX2_HUMAN Transcription factor SOX-2 | P48431 | | | 317 | SOX2 | ENST00000325404.3; | 22 | 1 | 1 | 2024-12-20 15:03:58.396878+00:00 | 1 |
2 | 6ThKerPbf6DR | KLF4_HUMAN Krueppel-like factor 4 | O43474 | Epithelial zinc finger protein EZF|Gut-enriche... | | 513 | KLF4 | ENST00000374672.5 [O43474-1]; | 22 | 1 | 1 | 2024-12-20 15:03:57.065757+00:00 | 1 |
1 | 36jnmKHdiT9m | MYC_HUMAN Myc proto-oncogene protein | P01106 | Class E basic helix-loop-helix protein 39|bHLH... | | 454 | MYC | ENST00000377970.6 [P01106-1];ENST00000524013.2... | 22 | 1 | 1 | 2024-12-20 15:03:55.874460+00:00 | 1 |
Source
id | uid | entity | organism | name | in_db | currently_used | description | url | md5 | source_website | dataframe_artifact_id | version | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
101 | 5JnV | BioSample | all | ncbi | False | True | NCBI BioSample attributes | s3://bionty-assets/df_all__ncbi__2023-09__BioS... | 918db9bd1734b97c596c67d9654a4126 | https://www.ncbi.nlm.nih.gov/biosample/docs/at... | None | 2023-09 | None | 2024-12-20 15:03:38.960159+00:00 | 1 |
100 | MJRq | bionty.Ethnicity | human | hancestro | False | True | Human Ancestry Ontology | https://github.com/EBISPOT/hancestro/raw/3.0/h... | 76dd9efda9c2abd4bc32fc57c0b755dd | https://github.com/EBISPOT/hancestro | None | 3.0 | None | 2024-12-20 15:03:38.960094+00:00 | 1 |
99 | 6vJm | bionty.DevelopmentalStage | mouse | mmusdv | False | False | Mouse Developmental Stages | http://aber-owl.net/media/ontologies/MMUSDV/9/... | 5bef72395d853c7f65450e6c2a1fc653 | https://github.com/obophenotype/developmental-... | None | 2020-03-10 | None | 2024-12-20 15:03:38.960029+00:00 | 1 |
98 | 10va | bionty.DevelopmentalStage | mouse | mmusdv | False | True | Mouse Developmental Stages | https://github.com/obophenotype/developmental-... | | https://github.com/obophenotype/developmental-... | None | 2024-05-28 | None | 2024-12-20 15:03:38.959964+00:00 | 1 |
97 | 7Zm9 | bionty.DevelopmentalStage | human | hsapdv | False | False | Human Developmental Stages | http://aber-owl.net/media/ontologies/HSAPDV/11... | 52181d59df84578ed69214a5cb614036 | https://github.com/obophenotype/developmental-... | None | 2020-03-10 | None | 2024-12-20 15:03:38.959899+00:00 | 1 |
96 | 1GbF | bionty.DevelopmentalStage | human | hsapdv | False | True | Human Developmental Stages | https://github.com/obophenotype/developmental-... | | https://github.com/obophenotype/developmental-... | None | 2024-05-28 | None | 2024-12-20 15:03:38.959834+00:00 | 1 |
95 | 1atB | Drug | all | chebi | False | False | Chemical Entities of Biological Interest | s3://bionty-assets/df_all__chebi__2024-07-27__... | | https://www.ebi.ac.uk/chebi/ | None | 2024-07-27 | None | 2024-12-20 15:03:38.959768+00:00 | 1 |
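Delete the test instance. Remove the storage directory first, using the same path that was passed to lamin init via --storage, then delete the instance:
!rm -rf ./test-redun-lamin
!lamin delete --force test-redun-lamin
If the storage directory still contains objects, lamin delete fails with InstanceNotEmpty: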
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.10.15/x64/bin/lamin", line 8, in <module>
sys.exit(main())
File "/opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/rich_click/rich_command.py", line 367, in __call__
return super().__call__(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/rich_click/rich_command.py", line 152, in main
rv = self.invoke(ctx)
File "/opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/lamin_cli/__main__.py", line 209, in delete
return delete(instance, force=force)
File "/opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/lamindb_setup/_delete.py", line 102, in delete
n_objects = check_storage_is_empty(
File "/opt/hostedtoolcache/Python/3.10.15/x64/lib/python3.10/site-packages/lamindb_setup/core/upath.py", line 836, in check_storage_is_empty
raise InstanceNotEmpty(message)
lamindb_setup.core.upath.InstanceNotEmpty: Storage '/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/.lamindb' contains 6 objects - delete them prior to deleting the instance
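Before deleting the instance, it can help to check whether files are still left in the storage directory. The sketch below is a plain file count with a hypothetical helper name. It is not how lamindb_setup counts objects, so its number may differ from the one in the error message.

```python
from pathlib import Path


def count_files(root):
    """Count regular files below `root`; returns 0 if `root` does not exist."""
    root = Path(root)
    if not root.exists():
        return 0
    return sum(1 for path in root.rglob("*") if path.is_file())
```

For the instance in this guide, count_files("./test-redun-lamin") should return 0 before lamin delete is called.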