nf-lamin plugin reference¶
Configuration¶
Basic Configuration¶
Add the following block to your nextflow.config:
plugins {
    id 'nf-lamin'
}
lamin {
    instance = "<your-lamin-org>/<your-lamin-instance>"
    api_key = secrets.LAMIN_API_KEY
    project_uids = ['proj123456789012'] // optional
    ulabel_uids = ['ulab123456789012'] // optional
}
Settings:
lamin.instance: (Required) The LaminDB instance to connect to, in the format organization/instance.
lamin.api_key: (Required) Your Lamin Hub API key. It is strongly recommended to set this using nextflow secrets.
lamin.project_uids: (Optional) List of project UIDs to link to all artifacts, runs, and transforms.
lamin.ulabel_uids: (Optional) List of ulabel UIDs to link to all artifacts, runs, and transforms.
Alternatively, you can use environment variables, though this is less secure:
export LAMIN_CURRENT_INSTANCE="laminlabs/lamindata"
export LAMIN_API_KEY="your-lamin-api-key"
export LAMIN_CURRENT_PROJECT="proj123456789012" # Project UID
Advanced Configuration¶
The plugin offers advanced settings for custom deployments or for tuning its behavior.
lamin {
    // ... basic settings ...
    // Root-level UIDs apply to all artifacts, runs, and transforms
    project_uids = ['proj123456789012']
    ulabel_uids = ['ulab123456789012']
    // The environment name in LaminDB (e.g. "prod" or "staging")
    env = "prod"
    // Enable dry-run mode to test configuration without creating records
    dry_run = false
    // API connection settings
    api {
        // The Supabase API URL for the LaminDB instance (if env is set to "custom")
        supabase_api_url = "https://your-supabase-api-url.supabase.co"
        // The Supabase anon key for the LaminDB instance (if env is set to "custom")
        supabase_anon_key = secrets.SUPABASE_ANON_KEY
        // The number of retries for API requests
        max_retries = 3
        // The delay between retries in milliseconds
        retry_delay = 100
    }
    // Run-specific metadata linking
    run {
        project_uids = ['proj-run-specific']
        ulabel_uids = ['ulab-run-specific']
    }
    // Transform-specific metadata linking
    transform {
        project_uids = ['proj-transform-specific']
        ulabel_uids = ['ulab-transform-specific']
    }
    // Manually specify a transform UID if known (advanced users only)
    transform_uid = "your-transform-uid"
    // Manually specify a run UID if known (advanced users only)
    run_uid = "your-run-uid"
}
Artifact Tracking Configuration¶
Control which files are tracked as artifacts and attach metadata using pattern-based rules. You can configure tracking globally, or separately for inputs and outputs.
Basic Artifact Tracking¶
lamin {
    // ... instance and api_key ...
    // Control output artifact tracking
    output_artifacts {
        enabled = true
        include_pattern = '.*\\.(fastq|bam|vcf)\\.gz$' // Only track compressed files
        exclude_pattern = '.*\\.tmp$' // Exclude temporary files
    }
    // Control input artifact tracking
    input_artifacts {
        enabled = true
        exclude_pattern = '.*\\.log$' // Don't track log files as inputs
    }
}
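To make the interplay of include_pattern and exclude_pattern concrete, here is a minimal Java sketch (Java because the patterns are Java regular expressions). The class ArtifactFilter and its method isTracked are hypothetical names, not part of nf-lamin. The sketch assumes that a pattern must match the whole path and that a file is tracked only when it passes the include pattern and misses the exclude pattern; the plugin's actual logic may differ.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of nf-lamin: mimics include/exclude filtering.
class ArtifactFilter {
    private final Pattern include; // null means "no include filter"
    private final Pattern exclude; // null means "no exclude filter"

    ArtifactFilter(String includeRegex, String excludeRegex) {
        this.include = includeRegex == null ? null : Pattern.compile(includeRegex);
        this.exclude = excludeRegex == null ? null : Pattern.compile(excludeRegex);
    }

    // Assumption: a path is tracked when it matches the include pattern (if set)
    // and does not match the exclude pattern (if set).
    boolean isTracked(String path) {
        if (include != null && !include.matcher(path).matches()) {
            return false;
        }
        if (exclude != null && exclude.matcher(path).matches()) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Same regexes as the output_artifacts block above. A Java string literal
        // needs the same doubled backslashes as a Groovy string.
        ArtifactFilter outputs = new ArtifactFilter(".*\\.(fastq|bam|vcf)\\.gz$", ".*\\.tmp$");
        String[] paths = {
            "results/s1.fastq.gz", "results/s1.bam", "results/calls.vcf.gz", "work/s1.tmp"
        };
        for (String path : paths) {
            System.out.println(path + " -> tracked: " + outputs.isTracked(path));
        }
    }
}
```

Under these assumptions, results/s1.bam is not tracked, because the include pattern requires a .gz suffix after the file type.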
Advanced Artifact Tracking with Rules¶
Use rules to apply different configurations based on file patterns:
lamin {
    // Root-level UIDs apply to all artifacts, runs, and transforms
    project_uids = ['global-project']
    ulabel_uids = ['project-wide-label']
    // Global artifact settings
    artifacts {
        enabled = true
    }
    // Input-specific configuration
    input_artifacts {
        enabled = true
        rules {
            reference_data {
                pattern = '.*reference.*\\.(fasta|gtf)$'
                ulabel_uids = ['reference-data-label']
                kind = 'reference'
                order = 1 // Higher priority
            }
            sample_data {
                pattern = '.*\\.fastq\\.gz$'
                ulabel_uids = ['raw-reads-label']
                project_uids = ['sequencing-project']
                kind = 'dataset'
                order = 2
            }
        }
    }
    // Output-specific configuration
    output_artifacts {
        enabled = true
        exclude_pattern = '.*\\.(log|tmp)$' // Exclude logs and temp files
        rules {
            // Exclude work-in-progress outputs
            exclude_intermediate {
                type = 'exclude'
                pattern = '.*intermediate.*'
                order = 1
            }
            // Track final BAM files
            aligned_reads {
                pattern = '.*\\.bam$'
                ulabel_uids = ['aligned-reads-label']
                kind = 'aligned_data'
                order = 2
            }
            // Track variant calls (evaluated after the two rules above)
            variants {
                pattern = '.*\\.vcf\\.gz$'
                ulabel_uids = ['variants-label']
                project_uids = ['variant-calling-project']
                kind = 'variants'
                order = 3
            }
            // Disable tracking for specific file types
            disable_fastqc {
                enabled = false
                pattern = '.*_fastqc\\.(html|zip)$'
            }
        }
    }
}
Configuration Options¶
Global Options (apply to artifacts, input_artifacts, output_artifacts):
enabled (Boolean, default: true) - Enable or disable artifact tracking
include_pattern (String) - Java regex pattern. Files must match this pattern to be tracked
exclude_pattern (String) - Java regex pattern. Files matching this pattern will not be tracked
ulabel_uids (List or String) - ULabel UIDs to attach to all matching artifacts
project_uids (List or String) - Project UIDs to attach to all matching artifacts
kind (String) - Artifact kind (e.g., ‘dataset’, ‘model’, ‘reference’, ‘report’)
rules (Map) - Named rules for path-specific configurations
Rule Options (apply to individual rules within rules):
enabled (Boolean, default: true) - Enable or disable this rule
pattern (String, required) - Java regex pattern to match file paths
type (String, default: 'include') - Rule type: 'include' to track matching files, 'exclude' to skip them
direction (String, default: inherited) - Apply rule to 'input', 'output', or 'both'
order (Integer, default: 100) - Rule evaluation priority (lower numbers = higher priority)
ulabel_uids (List or String) - ULabel UIDs to attach to matching artifacts
project_uids (List or String) - Project UIDs to attach to matching artifacts
kind (String) - Override artifact kind for matching files
Rule Evaluation¶
Global patterns are checked first (include_pattern and exclude_pattern)
Rules are evaluated in order of priority (order field, lower numbers first)
All matching rules are processed - each can modify the tracking decision and add metadata
Later include rules can override earlier exclude rules, and vice versa
If no rules match, the file is tracked using global settings
Metadata merging: ULabels and Projects from global config and all matching rules are combined (duplicates removed)
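The evaluation order above can be sketched in Java as follows. This is an illustrative model, not the plugin's implementation: RuleEvaluator, Rule, and Decision are hypothetical names, the global include_pattern/exclude_pattern check is assumed to have passed already, and enabled, direction, project UIDs, and kind are left out to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical model of the evaluation order described above; not the plugin's code.
class RuleEvaluator {
    static final class Rule {
        final Pattern pattern;
        final boolean include;         // true for type 'include', false for 'exclude'
        final int order;               // lower numbers are evaluated first
        final List<String> ulabelUids;

        Rule(String regex, boolean include, int order, List<String> ulabelUids) {
            this.pattern = Pattern.compile(regex);
            this.include = include;
            this.order = order;
            this.ulabelUids = ulabelUids;
        }
    }

    static final class Decision {
        final boolean tracked;
        final Set<String> ulabelUids;

        Decision(boolean tracked, Set<String> ulabelUids) {
            this.tracked = tracked;
            this.ulabelUids = ulabelUids;
        }
    }

    static Decision evaluate(String path, List<String> globalUlabels, List<Rule> rules) {
        // If no rule matches, the file stays tracked with only the global labels.
        boolean tracked = true;
        Set<String> labels = new LinkedHashSet<>(globalUlabels); // a set drops duplicates

        List<Rule> sorted = new ArrayList<>(rules);
        sorted.sort(Comparator.comparingInt((Rule r) -> r.order));

        for (Rule rule : sorted) {
            if (!rule.pattern.matcher(path).matches()) {
                continue;
            }
            tracked = rule.include;          // a later match overrides an earlier one
            labels.addAll(rule.ulabelUids);  // metadata from every matching rule is merged
        }
        return new Decision(tracked, labels);
    }

    public static void main(String[] args) {
        List<Rule> rules = List.of(
            new Rule(".*intermediate.*", false, 1, List.of()),
            new Rule(".*\\.bam$", true, 2, List.of("aligned-reads-label")));
        Decision d = evaluate("out/intermediate/s1.bam", List.of("project-wide-label"), rules);
        System.out.println(d.tracked + " " + d.ulabelUids);
    }
}
```

Under this reading, a BAM file inside an intermediate directory ends up tracked, because the include rule with order 2 runs after the exclude rule with order 1 and overrides it. If that is not what you want, give the exclude rule the larger order value.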
Pattern Syntax¶
Patterns use Java regular expressions. Common patterns:
.*\\.fastq$ - Match files ending with .fastq
.*\\.fastq\\.gz$ - Match compressed FASTQ files
.*/output/.* - Match files in any output directory
.*_(R1|R2)_.* - Match paired-end read files
^(?!.*temp).*$ - Match files NOT containing “temp”
Important: Backslashes must be escaped in Groovy strings: \\. instead of \.
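The patterns above can be checked outside Nextflow with plain Java, since they are ordinary java.util.regex expressions. The following sketch uses a hypothetical PatternCheck class and made-up sample paths, and assumes that a pattern has to match the whole path; whether the plugin applies full or partial matching is not stated here.

```java
import java.util.regex.Pattern;

// Hypothetical checker, not part of nf-lamin.
class PatternCheck {
    // Assumption: the pattern has to match the whole path (Matcher.matches()).
    static boolean matches(String regex, String path) {
        return Pattern.compile(regex).matcher(path).matches();
    }

    public static void main(String[] args) {
        // In Java source, as in a Groovy string, "\\." yields the regex \. (a literal dot).
        System.out.println(matches(".*\\.fastq$", "data/s1.fastq"));          // true
        System.out.println(matches(".*\\.fastq$", "data/s1.fastq.gz"));       // false
        System.out.println(matches(".*\\.fastq\\.gz$", "data/s1.fastq.gz"));  // true
        System.out.println(matches(".*/output/.*", "run1/output/s1.bam"));    // true
        System.out.println(matches(".*_(R1|R2)_.*", "s1_R1_001.fastq.gz"));   // true
        System.out.println(matches("^(?!.*temp).*$", "work/temp/s1.txt"));    // false
        System.out.println(matches("^(?!.*temp).*$", "results/s1.txt"));      // true
        // An unescaped dot matches any character, so this also accepts "s1_fastq".
        System.out.println(matches(".*.fastq$", "data/s1_fastq"));            // true
    }
}
```

The last line shows why the escaping matters: without the backslash, the dot stops being a literal dot and the pattern accepts more paths than intended.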
Examples¶
Disable all artifact tracking:
lamin {
    output_artifacts {
        enabled = false
    }
    input_artifacts {
        enabled = false
    }
}
Track only specific file types:
lamin {
    output_artifacts {
        enabled = true
        include_pattern = '.*\\.(bam|vcf\\.gz|h5ad)$'
    }
}
Exclude temporary and intermediate files:
lamin {
    output_artifacts {
        enabled = true
        exclude_pattern = '.*\\.(tmp|temp|intermediate).*'
    }
}
Different labels for different file types:
lamin {
    output_artifacts {
        rules {
            raw_data {
                pattern = '.*\\.fastq\\.gz$'
                ulabel_uids = ['raw-sequencing-data']
                kind = 'raw_reads'
            }
            processed_data {
                pattern = '.*\\.h5ad$'
                ulabel_uids = ['processed-expression-matrix']
                kind = 'expression_matrix'
            }
        }
    }
}
You can also set the advanced settings (env, dry_run, the api connection values, and the transform and run UID overrides) using environment variables:
export LAMIN_ENV="prod"
export LAMIN_DRY_RUN="false"
export LAMIN_CURRENT_PROJECT="proj123456789012" # Used for project_uids
export SUPABASE_API_URL="https://your-supabase-api-url.supabase.co"
export SUPABASE_ANON_KEY="your-supabase-anon-key"
export LAMIN_MAX_RETRIES=3
export LAMIN_RETRY_DELAY=100
export LAMIN_TRANSFORM_UID="your-transform-uid"
export LAMIN_RUN_UID="your-run-uid"
Advanced settings explained:
project_uids & ulabel_uids: Root-level UIDs that apply to all artifacts, runs, and transforms. Can be combined with object-specific UIDs in run and transform sections.
run & transform: Object-specific metadata linking. UIDs specified here are merged with root-level UIDs.
env: Environment selector for LaminDB instance (e.g., “prod”, “staging”, or “custom”)
dry_run: When true, the plugin validates configuration and connects to LaminDB but does not create or modify any records (transforms, runs, or artifacts). Useful for testing your setup without affecting the database.
api: Advanced API connection settings including Supabase connection details and retry behavior
    supabase_api_url & supabase_anon_key: Custom Supabase connection details (only needed if env = "custom")
    max_retries & retry_delay: Control retry behavior for API requests
transform_uid & run_uid: Manually override transform/run UIDs (advanced usage only)
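To picture what max_retries and retry_delay control, here is a minimal Java sketch of a retry loop. It rests on assumptions: RetrySketch and withRetries are hypothetical names, max_retries is read as the number of extra attempts after the first call, and the delay is taken as fixed between attempts. The plugin may count attempts or space them differently.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of nf-lamin: one initial call plus up to
// maxRetries further attempts, with a fixed pause between attempts.
class RetrySketch {
    static <T> T withRetries(Supplier<T> request, int maxRetries, long retryDelayMillis) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return request.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again if attempts remain
            }
            if (attempt < maxRetries) {
                sleep(retryDelayMillis);
            }
        }
        throw last; // every attempt failed; surface the last error
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated request that fails twice before succeeding.
        String result = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
            return "ok";
        }, 3, 100L);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

With the values from the configuration above (max_retries = 3, retry_delay = 100), this reading gives at most four calls and at most 300 ms of waiting per request.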
Functions¶
getRunUid()¶
Returns the UID of the current Lamin run.
Returns: String - The run UID, or null if the plugin hasn’t initialized the run yet.
Example:
include { getRunUid } from 'plugin/nf-lamin'
workflow {
    def runUid = getRunUid()
    log.info "Current run: ${runUid}"
}
getTransformUid()¶
Returns the UID of the current Lamin transform.
Returns: String - The transform UID, or null if the plugin hasn’t initialized the transform yet.
Example:
include { getTransformUid } from 'plugin/nf-lamin'
workflow {
    def transformUid = getTransformUid()
    log.info "Current transform: ${transformUid}"
}
getInstanceSlug()¶
Returns the currently configured LaminDB instance identifier.
Returns: String - The instance slug in the format “owner/name” (e.g., “laminlabs/lamindata”), or null if not available.
Example:
include { getInstanceSlug } from 'plugin/nf-lamin'
workflow {
    def instance = getInstanceSlug()
    log.info "Connected to LaminDB instance: ${instance}"
}
Lamin URIs¶
The plugin provides native support for lamin:// URIs, allowing you to reference LaminDB artifacts directly in your Nextflow workflows using Nextflow’s standard file() function.
lamin://<owner>/<instance>/artifact/<uid>[/<subpath>]
Components:
owner - The LaminDB instance owner (organization or user)
instance - The LaminDB instance name
uid - The artifact UID (16 or 20 characters)
    16-character base UIDs fetch the most recently updated version
    20-character full UIDs fetch that specific version
subpath - (Optional) Path within the artifact for directories or archives
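The URI layout above can be taken apart with a single regular expression. The Java sketch below is illustrative only: LaminUri, parse, slug, and pinsVersion are hypothetical names and not the plugin's API. It does not reject UIDs of other lengths; it only reports whether the UID has the 20-character form that pins a version.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the URI layout shown above; not the plugin's own code.
class LaminUri {
    private static final Pattern FORMAT =
        Pattern.compile("^lamin://([^/]+)/([^/]+)/artifact/([^/]+)(?:/(.+))?$");

    final String owner;
    final String instance;
    final String uid;
    final String subpath; // null when the URI has no sub-path

    private LaminUri(String owner, String instance, String uid, String subpath) {
        this.owner = owner;
        this.instance = instance;
        this.uid = uid;
        this.subpath = subpath;
    }

    static LaminUri parse(String uri) {
        Matcher m = FORMAT.matcher(uri);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a lamin:// artifact URI: " + uri);
        }
        return new LaminUri(m.group(1), m.group(2), m.group(3), m.group(4));
    }

    // Instance slug in the "owner/name" form.
    String slug() {
        return owner + "/" + instance;
    }

    // 20-character UIDs name one specific version; 16-character UIDs do not.
    boolean pinsVersion() {
        return uid.length() == 20;
    }

    public static void main(String[] args) {
        LaminUri u = parse("lamin://myorg/myinstance/artifact/abcd1234efgh5678/config/settings.yaml");
        System.out.println(u.slug() + " " + u.uid + " " + u.subpath + " " + u.pinsVersion());
    }
}
```

Everything after the UID is kept as one sub-path string, so nested paths such as config/settings.yaml survive intact.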
Basic Usage¶
Use lamin:// URIs with the file() function:
workflow {
    // Reference a LaminDB artifact directly by URI
    def input_file = file('lamin://laminlabs/lamindata/artifact/PnNjE93TdZGJ')
    log.info "Using artifact: ${input_file}"
    Channel.of(input_file)
        | myProcess
}
With Sub-paths¶
For artifacts that are directories or archives, reference specific files within them:
workflow {
    // Reference a specific file within an artifact directory
    def config_file = file('lamin://myorg/myinstance/artifact/abcd1234efgh5678/config/settings.yaml')
    Channel.of(config_file)
        | processConfig
}
As Workflow Parameters¶
Use lamin:// URIs as workflow parameters:
params.input = 'lamin://laminlabs/lamindata/artifact/PnNjE93TdZGJ'
workflow {
    Channel.fromPath(params.input)
        | myProcess
}
Or pass them on the command line:
nextflow run my-pipeline.nf --input 'lamin://laminlabs/lamindata/artifact/PnNjE93TdZGJ'
Requirements¶
The nf-lamin plugin must be configured with a valid API key
The workflow must have started (the plugin initializes on workflow start)
For private cloud storage, ensure your AWS/GCS credentials are configured in nextflow.config
Limitations¶
lamin:// paths are read-only - you cannot write to them
The plugin currently uses cloud credentials from your nextflow.config (automatic credential federation from Lamin Hub is planned for a future release)