Redun¶
Here, we’ll see how to track redun workflow runs with LaminDB.
Note
This use case is based on github.com/ricomnl/bioinformatics-pipeline-tutorial.
!lamin init --storage ./test-redun-lamin --schema bionty
→ connected lamindb: testuser1/test-redun-lamin
Amend the workflow¶
import lamindb as ln
import json
→ connected lamindb: testuser1/test-redun-lamin
Let’s amend a redun workflow.py
to register input & output artifacts in LaminDB:
To track the workflow run in LaminDB, add (see on GitHub):
ln.track(params=params)
To register the output file via LaminDB, add (see on GitHub):
ln.Artifact(output_path, description="results").save()
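To show where the two calls sit relative to each other, here is a minimal sketch of a tracked entry point. The helper names `collect_params` and `run_tracked`, the parameter names and their default values are illustrative assumptions, not the actual contents of workflow.py. The closing `ln.finish()` call is likewise an assumption about how the run gets marked as finished.

```python
def collect_params(input_dir, amino_acid="C", enzyme_regex="[KR]",
                   max_length=75, min_length=4, missed_cleavages=0):
    """Gather the workflow parameters into one dict (hypothetical helper)."""
    return {
        "input_dir": input_dir,
        "amino_acid": amino_acid,
        "enzyme_regex": enzyme_regex,
        "max_length": max_length,
        "min_length": min_length,
        "missed_cleavages": missed_cleavages,
    }


def run_tracked(input_dir, output_path):
    """Track a run, register its output file, and close the run (sketch)."""
    import lamindb as ln  # imported lazily; needs a connected LaminDB instance

    params = collect_params(input_dir)
    ln.track(params=params)  # start tracking the run with its parameters

    # ... the redun tasks that write `output_path` would run here ...

    ln.Artifact(output_path, description="results").save()  # register the output
    ln.finish()  # assumption: marks the tracked run as finished
```

Collecting the parameters in a plain dict before calling `ln.track` keeps the tracked values and the values passed to the pipeline in one place.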
Run redun¶
Let’s see what the input files are:
!ls ./fasta
KLF4.fasta MYC.fasta PO5F1.fasta SOX2.fasta
And call the workflow:
!redun run workflow.py main --input-dir ./fasta --tag run=test-run 1> redun_stdout.txt 2>redun_stderr.txt
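The same call can be issued from Python rather than through a shell magic. The following is a minimal sketch using only the standard library. The helper names `build_redun_command` and `run_redun` are hypothetical, and the sketch assumes that `redun` is on the `PATH` and that workflow.py is in the working directory.

```python
import subprocess
from pathlib import Path


def build_redun_command(input_dir, tag, workflow="workflow.py", task="main"):
    """Assemble the argument list for the `redun run` call shown above."""
    return ["redun", "run", workflow, task, "--input-dir", input_dir, "--tag", tag]


def run_redun(input_dir, tag,
              stdout_path="redun_stdout.txt", stderr_path="redun_stderr.txt"):
    """Run the workflow and redirect stdout and stderr to separate files."""
    command = build_redun_command(input_dir, tag)
    with Path(stdout_path).open("w") as out, Path(stderr_path).open("w") as err:
        result = subprocess.run(command, stdout=out, stderr=err)
    return result.returncode  # 0 indicates that redun exited without error
```

Calling `run_redun("./fasta", "run=test-run")` would then write the same two log files as the shell command.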
Inspect the output:
!cat redun_stdout.txt
→ connected lamindb: testuser1/test-redun-lamin
→ running outside of synched git repo, cloning https://github.com/laminlabs/redun-lamin into /home/runner/.cache/lamindb/redun-lamin
→ created Transform('taasWKaw'), started new Run('B5bXqvlp') at 2024-11-21 05:38:13 UTC
→ params: input_dir='./fasta' executor='Executor.default' amino_acid='C' enzyme_regex='[KR]' max_length='75' min_length='4' missed_cleavages='0'
! folder is outside existing storage location, will copy files from ./fasta to /home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/fasta
downloading... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
→ finished Run('B5bXqvlp') after 0d 0h 0m 6s at 2024-11-21 05:38:20 UTC
File(path=/home/runner/work/redun-lamin/redun-lamin/docs/test-redun-lamin/data/results.tgz, hash=87d2c9d7)
And the last line of the stderr log:
!tail -1 redun_stderr.txt
[redun] Execution duration: 9.24 seconds
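If the duration is needed as a number, for instance to store it next to the run, it can be parsed from that last line. This is a small sketch that assumes the line keeps the format shown above. The helper names `last_line` and `parse_execution_duration` are hypothetical, and the format may differ between redun versions.

```python
import re
from typing import Optional

# Matches e.g. "[redun] Execution duration: 9.24 seconds"
DURATION_PATTERN = re.compile(r"Execution duration: (\d+(?:\.\d+)?) seconds")


def last_line(path: str) -> str:
    """Return the last line of a text file, or an empty string if it is empty."""
    with open(path) as handle:
        lines = handle.read().splitlines()
    return lines[-1] if lines else ""


def parse_execution_duration(line: str) -> Optional[float]:
    """Extract the duration in seconds, or None if the line does not match."""
    match = DURATION_PATTERN.search(line)
    return float(match.group(1)) if match else None
```

`parse_execution_duration(last_line("redun_stderr.txt"))` combines the two helpers.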
View data lineage:
artifact = ln.Artifact.filter(description="results", suffix=".tgz").one()
artifact.view_lineage()
Track the redun execution id¶
To query LaminDB by redun execution ID, export the execution record from redun and store its ID on the LaminDB run record:
# export the run information from redun
!redun log --exec --exec-tag run=test-run --format json --no-pager > redun_exec.json
# load the redun execution id from the JSON and store it in the LaminDB run record
with open("redun_exec.json", "r") as file:
    redun_exec = json.loads(file.readline())
artifact.run.reference = redun_exec["id"]
artifact.run.reference_type = "redun_id"
artifact.run.save()
Run(uid='B5bXqvlpwmHKWf43dx8F', started_at=2024-11-21 05:38:13 UTC, finished_at=2024-11-21 05:38:20 UTC, is_consecutive=True, reference='79f1262c-3572-43ab-88e7-73c955b64207', reference_type='redun_id', transform_id=1, environment_id=5, created_by_id=1, created_at=2024-11-21 05:38:13 UTC)
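The two steps above can be wrapped into small helpers. This is a sketch under the assumption that the export holds one JSON record per line with an `id` field, as the code above implies. The function names are hypothetical. The lookup relies on the `reference` and `reference_type` fields set above and needs a connected LaminDB instance.

```python
import json


def read_redun_execution_id(path):
    """Return the "id" field of the first JSON record in the export file."""
    with open(path, "r") as file:
        first_line = file.readline()
    if not first_line.strip():
        raise ValueError(f"{path} is empty, so there is no execution record to read")
    return json.loads(first_line)["id"]


def find_run_by_redun_id(redun_id):
    """Look up the LaminDB run whose reference holds the redun execution ID."""
    import lamindb as ln  # imported lazily; needs a connected LaminDB instance

    return ln.Run.filter(reference=redun_id, reference_type="redun_id").one()
```

Raising on an empty export file gives a clearer error than the `JSONDecodeError` that `json.loads("")` would produce.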
Track the redun run report¶
Attach a run report:
report = ln.Artifact(
    "redun_stderr.txt",
    description=f"Redun run report of {redun_exec['id']}",
    run=False,
    visibility=0,
).save()
artifact.run.report = report
artifact.run.save()
Run(uid='B5bXqvlpwmHKWf43dx8F', started_at=2024-11-21 05:38:13 UTC, finished_at=2024-11-21 05:38:20 UTC, is_consecutive=True, reference='79f1262c-3572-43ab-88e7-73c955b64207', reference_type='redun_id', transform_id=1, report_id=7, environment_id=5, created_by_id=1, created_at=2024-11-21 05:38:13 UTC)
View transforms and runs in LaminHub¶
View the database content¶
ln.view()
****************
* module: core *
****************
Artifact
id | uid | version | is_latest | description | key | suffix | type | size | hash | n_objects | n_observations | _hash_type | _accessor | visibility | _key_is_virtual | storage_id | transform_id | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6 | lGX0JdqwmtvDs4vK0000 | None | True | results | data/results.tgz | .tgz | None | 83820 | AS7kma56Et_If0h7cy1BLg | None | None | md5 | None | 1 | False | 1 | 1.0 | 1.0 | 2024-11-21 05:38:22.176916+00:00 | 1 |
4 | Sz4CrH9YtvrNUu6c0000 | None | True | None | fasta/KLF4.fasta | .fasta | None | 609 | LyuoYkWs4SgYcH7P7JLJtA | None | None | md5 | None | 1 | True | 1 | NaN | NaN | 2024-11-21 05:38:14.244412+00:00 | 1 |
3 | PyzvgSJD1yAU73RS0000 | None | True | None | fasta/PO5F1.fasta | .fasta | None | 477 | -7iJgveFO9ia0wE1bqVu6g | None | None | md5 | None | 1 | True | 1 | NaN | NaN | 2024-11-21 05:38:14.243942+00:00 | 1 |
2 | 2B4vGQeL0HydWW350000 | None | True | None | fasta/SOX2.fasta | .fasta | None | 414 | C5q_yaFXGk4SAEpfdqBwnQ | None | None | md5 | None | 1 | True | 1 | NaN | NaN | 2024-11-21 05:38:14.243279+00:00 | 1 |
1 | GRTAkaIePM43a4xe0000 | None | True | None | fasta/MYC.fasta | .fasta | None | 536 | WGbEtzPw-3bQEGcngO_pHQ | None | None | md5 | None | 1 | True | 1 | NaN | NaN | 2024-11-21 05:38:14.242099+00:00 | 1 |
! No records found
! No records found
! No records found
Run
id | uid | started_at | finished_at | is_consecutive | reference | reference_type | transform_id | report_id | environment_id | parent_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | B5bXqvlpwmHKWf43dx8F | 2024-11-21 05:38:13.525734+00:00 | 2024-11-21 05:38:20.052577+00:00 | True | 79f1262c-3572-43ab-88e7-73c955b64207 | redun_id | 1 | 7 | 5 | None | 2024-11-21 05:38:13.525793+00:00 | 1 |
Storage
id | uid | root | description | type | region | instance_uid | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|
1 | JFEZ4jhg6iHq | /home/runner/work/redun-lamin/redun-lamin/docs... | None | local | None | iQlBPgD8uaqR | None | 2024-11-21 05:37:57.045059+00:00 | 1 |
Transform
id | uid | version | is_latest | name | key | description | type | source_code | hash | reference | reference_type | _source_code_artifact_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | taasWKawCiNA0000 | 0.1.0 | True | workflow.py | workflow.py | None | script | """workflow.py"""\n\n# This code is a copy fro... | B36u9mvhSeZwmt4wniwBNg | https://github.com/laminlabs/redun-lamin/blob/... | url | None | 2024-11-21 05:38:13.519883+00:00 | 1 |
ULabel
id | uid | name | description | reference | reference_type | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|
1 | y8sdzOhx | redun | None | None | None | 1 | 2024-11-21 05:38:14.208410+00:00 | 1 |
User
id | uid | handle | name | created_at |
---|---|---|---|---|
1 | DzTjkKse | testuser1 | Test User1 | 2024-11-21 05:37:57.038946+00:00 |
******************
* module: bionty *
******************
Delete the test instance:
!rm -rf ./test-redun-lamin
!lamin delete --force test-redun-lamin
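`lamin delete` refuses to delete an instance whose storage location still contains objects, so the storage folder has to be removed first. A minimal sketch of that step in Python follows. The helper name `remove_storage_dir` is hypothetical, and the sketch assumes the storage folder is the `./test-redun-lamin` directory passed to `lamin init` at the top.

```python
import shutil
from pathlib import Path


def remove_storage_dir(storage_dir):
    """Delete the local storage folder; return True if it existed."""
    path = Path(storage_dir).resolve()  # resolve relative to the working directory
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True
```

Resolving the path against the current working directory avoids hard-coding a machine-specific absolute path.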