Nextflow

Nextflow is a workflow management system for running scientific workflows scalably, portably, and reproducibly across platforms.

Here, we’ll run a demo of the microscopy pipeline mcmicro to correct uneven illumination.

Note

Typically, you run a Nextflow workflow from the command line or from Seqera Platform and then register its input and output data with a script. Seqera Platform supports post-run scripts, which can automate this registration step.

Let’s load an instance that already has example data.

!lamin load nextflow-mcmicro
💡 connected lamindb: testuser1/nextflow-mcmicro
import lamindb as ln
💡 connected lamindb: testuser1/nextflow-mcmicro

Run and register Nextflow workflow

!nextflow run https://github.com/labsyspharm/mcmicro --in exemplar-001 --start-at illumination --stop-at registration
N E X T F L O W  ~  version 24.04.2
Launching `https://github.com/labsyspharm/mcmicro` [adoring_thompson] DSL2 - revision: a095a0516f [master]
[2b/fd9d72] Submitted process > illumination (2)
[ac/8c6b1b] Submitted process > illumination (3)
[fe/0fc152] Submitted process > illumination (1)
[92/2fa313] Submitted process > registration:ashlar (1)
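
The command-line invocation above could also be launched from Python, so that the pipeline run and the registration step execute back to back. The following is a minimal sketch, not part of the original guide: the helper names `build_nextflow_command` and `run_and_register` are hypothetical, while the flags `--in`, `--start-at`, and `--stop-at` and the script name `register_mcmicro_run.py` are taken from this page.

```python
import subprocess
import sys
from typing import List, Optional

MCMICRO_URL = "https://github.com/labsyspharm/mcmicro"


def build_nextflow_command(
    input_dir: str,
    start_at: Optional[str] = None,
    stop_at: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for `nextflow run` on mcmicro (hypothetical helper)."""
    cmd = ["nextflow", "run", MCMICRO_URL, "--in", input_dir]
    if start_at is not None:
        cmd += ["--start-at", start_at]
    if stop_at is not None:
        cmd += ["--stop-at", stop_at]
    return cmd


def run_and_register(
    input_dir: str,
    start_at: Optional[str] = None,
    stop_at: Optional[str] = None,
    register_script: str = "register_mcmicro_run.py",
) -> None:
    """Run the pipeline, then the registration script; raise if either step fails."""
    # check=True turns a non-zero exit code into CalledProcessError,
    # so registration is skipped when the pipeline itself fails.
    subprocess.run(build_nextflow_command(input_dir, start_at, stop_at), check=True)
    subprocess.run([sys.executable, register_script], check=True)
```

Calling `run_and_register("exemplar-001", "illumination", "registration")` would reproduce the two shell steps of this guide, assuming `nextflow` is on the `PATH` and the registration script sits in the working directory.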

Now we register our Nextflow run by running our registration script.

!python register_mcmicro_run.py
💡 connected lamindb: testuser1/nextflow-mcmicro
💡 saved: Transform(uid='vOKPaJJlpElAttki', version='1.0.0', name='mcmicro', type='pipeline', reference='https://github.com/labsyspharm/mcmicro', created_by_id=1, updated_at='2024-06-19 23:17:08 UTC')
💡 saved: Run(uid='2TaiOucQRwJLtVJm6eXZ', transform_id=2, created_by_id=1)
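
The contents of `register_mcmicro_run.py` are not shown on this page. As a rough sketch of what such a script might do, the example below creates a pipeline `Transform` with the same fields that appear in the log above, starts a tracked run, and saves the registration output as an artifact. It assumes the lamindb API of mid-2024 (`ln.Transform(name=..., type="pipeline")`, `ln.track(transform=...)`, `ln.Artifact(...).save()`); argument names may differ in other releases. The helper names `find_registration_outputs` and `register_run` are hypothetical.

```python
from pathlib import Path
from typing import List


def find_registration_outputs(run_dir: str) -> List[Path]:
    """Return the .tif files directly under <run_dir>/registration, sorted by path."""
    return sorted((Path(run_dir) / "registration").glob("*.tif"))


def register_run(run_dir: str = "exemplar-001") -> None:
    """Record the pipeline, a run, and its output files in LaminDB.

    Assumption: lamindb API as of mid-2024; not the actual registration script.
    """
    # Imported lazily because lamindb needs a connected instance at import time.
    import lamindb as ln

    transform = ln.Transform(
        name="mcmicro",
        version="1.0.0",
        type="pipeline",
        reference="https://github.com/labsyspharm/mcmicro",
    )
    # Creates and saves a Run for this transform; artifacts saved
    # afterwards are linked to that run.
    ln.track(transform=transform)

    for path in find_registration_outputs(run_dir):
        ln.Artifact(path, description="mcmicro").save()
```

The description `"mcmicro"` matches the filter used in the lineage query of this guide; any other searchable label would work as long as the query uses the same one.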

Data lineage

View data lineage:

output = ln.Artifact.filter(description__icontains="mcmicro").one()
output.view_lineage()
[Figure: data lineage graph of the mcmicro output artifact]

View the database content:

ln.view()
Artifact
uid version description key suffix type accessor size hash hash_type n_objects n_observations visibility key_is_virtual storage_id transform_id run_id created_by_id updated_at
id
11 4bYYvK8SHlFdk1yWBHAo None mcmicro exemplar-001/registration/exemplar-001.ome.tif .tif dataset None 175490712 1WJhHAbkkfvPGQc0yg9gB4 sha1-fl None None 1 False 1 2 2 1 2024-06-19 23:17:09.207047+00:00
10 VDagwmBowSoSkMs66u9a None None exemplar-001/illumination/exemplar-001-cycle-0... .tif dataset None 22119019 fUw4NVqV-Zy_OdxiMetlfg md5 None None 1 False 1 1 1 1 2024-06-19 23:15:22.887046+00:00
9 EPWd15kCrRhYwPK6AowJ None None exemplar-001/illumination/exemplar-001-cycle-0... .tif dataset None 22119019 Yw4DJkg2QQ7ez4j2_qWN_Q md5 None None 1 False 1 1 1 1 2024-06-19 23:15:22.886435+00:00
8 oFeLyC03vKZk1H1zOnFr None None exemplar-001/illumination/exemplar-001-cycle-0... .tif dataset None 22119019 _PVbSfSL5apmaZkcpL2mbw md5 None None 1 False 1 1 1 1 2024-06-19 23:15:22.885818+00:00
7 iiJ2VtgmlXmCMG51vM9w None None exemplar-001/illumination/exemplar-001-cycle-0... .tif dataset None 22119019 H_ya8KoVaaeu_Ve_N14TAg md5 None None 1 False 1 1 1 1 2024-06-19 23:15:22.885168+00:00
6 vlmhDuV5qEkKElT4YgpR None None exemplar-001/illumination/exemplar-001-cycle-0... .tif dataset None 22119019 idW8uRMTLfXNJHnboZy8GQ md5 None None 1 False 1 1 1 1 2024-06-19 23:15:22.884541+00:00
5 QNpffXJvyxqJ9JWa7Mx4 None None exemplar-001/illumination/exemplar-001-cycle-0... .tif dataset None 22119019 qpmIHKbuxwe2sE_rdcPqfA md5 None None 1 False 1 1 1 1 2024-06-19 23:15:22.883906+00:00
Run
uid started_at finished_at is_consecutive reference reference_type transform_id report_id environment_id created_by_id
id
1 AK1n8B2FyPFXAyCZx040 2024-06-19 23:15:07.406072+00:00 None True None None 1 None None 1
2 2TaiOucQRwJLtVJm6eXZ 2024-06-19 23:17:08.361188+00:00 None None nextflow\n0705397f-4d7e-4602-a40c-323e19ba1407 nextflow_id 2 None None 1
Storage
uid root description type region instance_uid run_id created_by_id updated_at
id
1 QceDDAYInFm5 /home/runner/work/nextflow-lamin-usecases/next... None local None 7XJiuVOUySVN None 1 2024-06-19 23:15:05.927617+00:00
Transform
uid version name key description type reference reference_type latest_report_id source_code_id created_by_id updated_at
id
2 vOKPaJJlpElAttki 1.0.0 mcmicro None None pipeline https://github.com/labsyspharm/mcmicro None None None 1 2024-06-19 23:17:08.357651+00:00
1 OgKsvBBjq3UMm8xI None Download None None pipeline None None None None 1 2024-06-19 23:15:07.399021+00:00
User
uid handle name updated_at
id
1 DzTjkKse testuser1 Test User1 2024-06-19 23:15:05.921734+00:00

Clean up the test instance:

!lamin delete --force nextflow-mcmicro
💡 deleting instance testuser1/nextflow-mcmicro
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.10.14/x64/bin/lamin", line 8, in <module>
    sys.exit(main())
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/rich_click/rich_command.py", line 367, in __call__
    return super().__call__(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/rich_click/rich_command.py", line 152, in main
    rv = self.invoke(ctx)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/lamin_cli/__main__.py", line 103, in delete
    return delete(instance, force=force)
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/lamindb_setup/_delete.py", line 136, in delete
    isettings.storage.root.rmdir()
  File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/pathlib.py", line 1215, in rmdir
    self._accessor.rmdir(self)
OSError: [Errno 39] Directory not empty: '/home/runner/work/nextflow-lamin-usecases/nextflow-lamin-usecases/docs'
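
The traceback above ends in `pathlib.Path.rmdir()`, which only removes empty directories and raises `OSError` otherwise. A plausible reading, though the page does not state it, is that the storage root (the `docs` directory) still held files that do not belong to the instance. The sketch below reproduces that behaviour in a temporary directory; the helper names `try_rmdir` and `demo` are hypothetical and the file name is illustrative.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def try_rmdir(directory: Path) -> bool:
    """Attempt Path.rmdir(); return True on success, False if it raises OSError."""
    try:
        directory.rmdir()
        return True
    except OSError:
        return False


def demo() -> Tuple[bool, bool]:
    """Show that rmdir fails on a non-empty directory and succeeds once it is empty."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "storage"
        root.mkdir()
        leftover = root / "leftover.txt"
        leftover.write_text("file not managed by the instance")

        failed_when_full = not try_rmdir(root)  # directory still has a file
        leftover.unlink()
        succeeded_when_empty = try_rmdir(root)  # now empty, removal works
    return failed_when_full, succeeded_when_empty
```

Under this reading, the failure occurs only in the final directory-removal step of the delete command.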