Redun
Here, we’ll see how to track redun workflow runs with LaminDB.
Note
This use case is based on github.com/ricomnl/bioinformatics-pipeline-tutorial.
!lamin init --storage ./test-redun-lamin --schema bionty
💡 connected lamindb: testuser1/test-redun-lamin
Amend the workflow
import lamindb as ln
import json
💡 connected lamindb: testuser1/test-redun-lamin
Let’s amend a redun workflow.py to register input & output artifacts in LaminDB:
To track the workflow run in LaminDB, add (see on GitHub):
ln.track(params=params)
To register the output file via LaminDB, add (see on GitHub):
ln.Artifact(output_path, description="results").save()
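The two additions above could sit together roughly as follows. This is only a sketch: the helper name `track_and_register` is hypothetical, and passing the `ln` module in as an argument is an assumption made here so the snippet can be exercised without a configured LaminDB instance. The calls themselves are the ones shown above.

```python
from typing import Any


def track_and_register(ln: Any, params: dict, output_path: str) -> Any:
    """Start run tracking, then register the output file as an artifact.

    `ln` is expected to be the imported `lamindb` module; it is passed in
    (a hypothetical design choice) rather than imported here.
    """
    # Record the run and its parameters, as in the snippet above.
    ln.track(params=params)
    # ... the workflow would compute and write `output_path` here ...
    # Register the output file, as in the snippet above.
    return ln.Artifact(output_path, description="results").save()
```

Inside a workflow script this would be called as `track_and_register(ln, params, output_path)` after `import lamindb as ln`.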
Run redun
Let’s see what the input files are:
!ls ./fasta
KLF4.fasta MYC.fasta PO5F1.fasta SOX2.fasta
And call the workflow:
!redun run workflow.py main --input-dir ./fasta --tag run=test-run 1> redun_stdout.txt 2>redun_stderr.txt
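Outside a notebook, the same call can be made from plain Python. The following is a minimal sketch that reuses the command, flags, and output file names from the cell above; the function names `build_redun_command` and `run_redun` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_redun_command(input_dir: str, run_tag: str) -> list[str]:
    """Assemble the `redun run` invocation shown above as an argument list."""
    return [
        "redun", "run", "workflow.py", "main",
        "--input-dir", input_dir,
        "--tag", f"run={run_tag}",
    ]


def run_redun(input_dir: str, run_tag: str) -> int:
    """Run the workflow, redirecting stdout and stderr to the files used above."""
    with Path("redun_stdout.txt").open("w") as out, Path("redun_stderr.txt").open("w") as err:
        completed = subprocess.run(
            build_redun_command(input_dir, run_tag), stdout=out, stderr=err
        )
    # A non-zero return code signals that the workflow failed.
    return completed.returncode
```

Calling `run_redun("./fasta", "test-run")` requires redun to be installed and `workflow.py` to be in the working directory.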
Inspect the output:
!cat redun_stdout.txt
💡 connected lamindb: testuser1/test-redun-lamin
💡 running outside of synched git repo, cloning https://github.com/laminlabs/redun-lamin into /home/runner/.cache/lamindb/redun-lamin
💡 saved: Transform(uid='taasWKawCiNA6zf0', version='0.1.0', name='workflow.py', key='workflow.py', type='script', reference='https://github.com/laminlabs/redun-lamin/blob/9b3b4b43b3c99bbbacd3b269f2fc67dc81d69746/docs/workflow.py', reference_type='url', created_by_id=1, updated_at='2024-07-26 14:36:42 UTC')
💡 saved: Run(uid='Mq8iwFKNePPKRfLPFtxk', transform_id=1, created_by_id=1)
❗ this creates one artifact per file in the directory - you might simply call ln.Artifact(dir) to get one artifact for the entire directory
❗ folder is outside existing storage location, will copy files from ./fasta to /home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/fasta
File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/results.tgz, hash=a413c0f6)
And the last line of the stderr log, where redun reports the execution duration:
!tail -1 redun_stderr.txt
[redun] Execution duration: 12.83 seconds
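If the duration is needed as a number, for example to store it alongside the run, it can be pulled out of that line with a regular expression. This is a sketch that assumes the line format shown above, which may differ between redun versions; `parse_execution_duration` is a hypothetical helper.

```python
import re
from typing import Optional

# Matches the summary line redun printed in the output above.
DURATION_PATTERN = re.compile(r"Execution duration: ([0-9.]+) seconds")


def parse_execution_duration(line: str) -> Optional[float]:
    """Return the duration in seconds, or None if the line has another format."""
    match = DURATION_PATTERN.search(line)
    return float(match.group(1)) if match else None
```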
View data lineage:
artifact = ln.Artifact.filter(description="results", suffix=".tgz").one()
artifact.view_lineage()
Track the redun execution id
To make the redun execution ID queryable in LaminDB, export it from redun and store it on the run record:
# export the run information from redun
!redun log --exec --exec-tag run=test-run --format json --no-pager > redun_exec.json
# load the redun execution id from the JSON and store it in the LaminDB run record
with open("redun_exec.json", "r") as file:
    redun_exec = json.loads(file.readline())
artifact.run.reference = redun_exec["id"]
artifact.run.reference_type = "redun_id"
artifact.run.save()
Run(uid='Mq8iwFKNePPKRfLPFtxk', started_at='2024-07-26 14:36:42 UTC', finished_at='2024-07-26 14:36:53 UTC', is_consecutive=True, reference='caa56866-913d-4db1-9b62-e438f339a734', reference_type='redun_id', transform_id=1, created_by_id=1, environment_id=6)
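The loading step above reads the first line of the export and takes its `id` field. A slightly more defensive variant is sketched below; it assumes, as the code above does, that the first line of `redun_exec.json` is a JSON object with an `id` key. The function name `read_redun_execution_id` is hypothetical.

```python
import json
from pathlib import Path


def read_redun_execution_id(path: str) -> str:
    """Return the `id` field of the first JSON record in the exported log."""
    with Path(path).open("r") as file:
        first_line = file.readline()
    # An empty export means no execution matched the tag filter.
    if not first_line.strip():
        raise ValueError(f"{path} is empty; no execution matched the tag")
    record = json.loads(first_line)
    return record["id"]
```

The result can then be assigned to `artifact.run.reference` exactly as above.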
Track the redun run report
Attach a run report:
report = ln.Artifact(
    "redun_stderr.txt",
    description=f"Redun run report of {redun_exec['id']}",
    run=False,
    visibility=0,
).save()
artifact.run.report = report
artifact.run.save()
Run(uid='Mq8iwFKNePPKRfLPFtxk', started_at='2024-07-26 14:36:42 UTC', finished_at='2024-07-26 14:36:53 UTC', is_consecutive=True, reference='caa56866-913d-4db1-9b62-e438f339a734', reference_type='redun_id', transform_id=1, created_by_id=1, report_id=8, environment_id=6)
View transforms and runs in LaminHub
View the database content
ln.view()
****************
* module: core *
****************
Artifact
id | uid | version | description | key | suffix | type | accessor | size | hash | hash_type | n_objects | n_observations | visibility | key_is_virtual | storage_id | transform_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
7 | eU0RWORwwc0Lbof0sCGW | None | results | data/results.tgz | .tgz | dataset | None | 83892 | vLIgl4MqH9AlJgwmF58hew | md5 | None | None | 1 | False | 1 | 1.0 | 1.0 | 1 | 2024-07-26 14:36:55.051269+00:00 |
4 | Eq71OWB33IGtAEgfmJl7 | None | None | fasta/SOX2.fasta | .fasta | dataset | None | 414 | C5q_yaFXGk4SAEpfdqBwnQ | md5 | None | None | 1 | True | 1 | NaN | NaN | 1 | 2024-07-26 14:36:43.399238+00:00 |
3 | Rw3ujzMGrSBIz1YBbUdl | None | None | fasta/MYC.fasta | .fasta | dataset | None | 536 | WGbEtzPw-3bQEGcngO_pHQ | md5 | None | None | 1 | True | 1 | NaN | NaN | 1 | 2024-07-26 14:36:43.398658+00:00 |
2 | wnDG5zgjYc8gBrVNdI4D | None | None | fasta/KLF4.fasta | .fasta | dataset | None | 609 | LyuoYkWs4SgYcH7P7JLJtA | md5 | None | None | 1 | True | 1 | NaN | NaN | 1 | 2024-07-26 14:36:43.397900+00:00 |
1 | Vt5zocplJcXDknyDW7PH | None | None | fasta/PO5F1.fasta | .fasta | dataset | None | 477 | -7iJgveFO9ia0wE1bqVu6g | md5 | None | None | 1 | True | 1 | NaN | NaN | 1 | 2024-07-26 14:36:43.396614+00:00 |
Run
id | uid | started_at | finished_at | is_consecutive | reference | reference_type | transform_id | report_id | environment_id | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|
1 | Mq8iwFKNePPKRfLPFtxk | 2024-07-26 14:36:42.828558+00:00 | 2024-07-26 14:36:53.052234+00:00 | True | caa56866-913d-4db1-9b62-e438f339a734 | redun_id | 1 | 8 | 6 | 1 |
Storage
id | uid | root | description | type | region | instance_uid | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|
1 | iifPKVuGs2qC | /home/runner/work/redun-lamin/redun-lamin/docs... | None | local | None | None | None | 1 | 2024-07-26 14:36:31.180860+00:00 |
Transform
id | uid | version | name | key | description | type | reference | reference_type | latest_report_id | source_code_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | taasWKawCiNA6zf0 | 0.1.0 | workflow.py | workflow.py | None | script | https://github.com/laminlabs/redun-lamin/blob/... | url | None | 5 | 1 | 2024-07-26 14:36:53.054365+00:00 |
ULabel
id | uid | name | description | reference | reference_type | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|
1 | JL67MXW2 | redun | None | None | None | 1 | 1 | 2024-07-26 14:36:43.376696+00:00 |
User
id | uid | handle | name | updated_at |
---|---|---|---|---|
1 | DzTjkKse | testuser1 | Test User1 | 2024-07-26 14:36:31.177137+00:00 |
******************
* module: bionty *
******************
Organism
id | uid | name | ontology_id | scientific_name | synonyms | description | source_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|
1 | 1dpCL6Td | human | NCBITaxon:9606 | homo_sapiens | None | None | 1 | 1 | 1 | 2024-07-26 14:36:44.471538+00:00 |
Protein
id | uid | name | uniprotkb_id | synonyms | description | length | gene_symbol | ensembl_gene_ids | organism_id | source_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4 | 38rbzWPtKmb2 | SOX2_HUMAN Transcription factor SOX-2 | P48431 | | | 317 | SOX2 | ENST00000325404.3; | 1 | 22 | 1 | 1 | 2024-07-26 14:36:53.030986+00:00 |
3 | 36jnmKHdiT9m | MYC_HUMAN Myc proto-oncogene protein | P01106 | Class E basic helix-loop-helix protein 39|bHLH... | | 454 | MYC | ENST00000377970.6 [P01106-1];ENST00000524013.2... | 1 | 22 | 1 | 1 | 2024-07-26 14:36:50.826416+00:00 |
2 | 6ThKerPbf6DR | KLF4_HUMAN Krueppel-like factor 4 | O43474 | Epithelial zinc finger protein EZF|Gut-enriche... | | 513 | KLF4 | ENST00000374672.5 [O43474-1]; | 1 | 22 | 1 | 1 | 2024-07-26 14:36:48.967931+00:00 |
1 | 3qNrC4hwnDC9 | PO5F1_HUMAN POU domain, class 5, transcription... | Q01860 | Octamer-binding protein 3|Oct-3|Octamer-bindin... | class 5, transcription factor 1 | 360 | POU5F1 | ENST00000259915.13 [Q01860-1];ENST00000376243.... | 1 | 22 | 1 | 1 | 2024-07-26 14:36:46.949710+00:00 |
Source
id | uid | entity | organism | source | version | in_db | currently_used | source_name | url | md5 | source_website | df_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
75 | 3pvh | BioSample | all | ncbi | 2023-09 | False | True | NCBI BioSample attributes | s3://bionty-assets/df_all__ncbi__2023-09__BioS... | 918db9bd1734b97c596c67d9654a4126 | https://www.ncbi.nlm.nih.gov/biosample/docs/at... | None | None | 1 | 2024-07-26 14:36:31.300251+00:00 |
74 | 5kwU | Ethnicity | human | hancestro | 3.0 | False | True | Human Ancestry Ontology | https://github.com/EBISPOT/hancestro/raw/3.0/h... | 76dd9efda9c2abd4bc32fc57c0b755dd | https://github.com/EBISPOT/hancestro | None | None | 1 | 2024-07-26 14:36:31.300076+00:00 |
73 | 4hcb | DevelopmentalStage | mouse | mmusdv | 2020-03-10 | False | True | Mouse Developmental Stages | http://aber-owl.net/media/ontologies/MMUSDV/9/... | 5bef72395d853c7f65450e6c2a1fc653 | https://github.com/obophenotype/developmental-... | None | None | 1 | 2024-07-26 14:36:31.299900+00:00 |
72 | 238S | DevelopmentalStage | human | hsapdv | 2020-03-10 | False | True | Human Developmental Stages | http://aber-owl.net/media/ontologies/HSAPDV/11... | 52181d59df84578ed69214a5cb614036 | https://github.com/obophenotype/developmental-... | None | None | 1 | 2024-07-26 14:36:31.299725+00:00 |
71 | 1auD | Drug | all | dron | 2023-03-10 | False | False | Drug Ontology | https://data.bioontology.org/ontologies/DRON/s... | 75e86011158fae76bb46d96662a33ba3 | https://bioportal.bioontology.org/ontologies/DRON | None | None | 1 | 2024-07-26 14:36:31.299547+00:00 |
70 | 4uDt | Drug | all | dron | 2024-03-02 | False | True | Drug Ontology | https://data.bioontology.org/ontologies/DRON/s... | 84138459de4f65034e979f4e46783747 | https://bioportal.bioontology.org/ontologies/DRON | None | None | 1 | 2024-07-26 14:36:31.299369+00:00 |
69 | 5e83 | BFXPipeline | all | lamin | 1.0.0 | False | True | Bioinformatics Pipeline | s3://bionty-assets/bfxpipelines.json | a7eff57a256994692fba46e0199ffc94 | https://lamin.ai | None | None | 1 | 2024-07-26 14:36:31.299187+00:00 |
Delete the test instance:
!rm -rf ./test-redun-lamin
!lamin delete --force test-redun-lamin