lamindb.integrations.lightning¶
PyTorch Lightning integration for LaminDB.
The public API has two layers:
- `Checkpoint` is the concrete LaminDB implementation that persists checkpoint, config, and `hparams.yaml` files as `Artifact` objects and annotates them with `Feature` objects.
- `ArtifactPublishingModelCheckpoint` is the generic extension layer adding checkpoint artifact lifecycle hooks without implementing Lamin persistence details.
External integrations can either subclass Checkpoint directly or attach
an ArtifactObserver to react to saved and removed artifacts.
Here is a guide: lightning.
Main API¶
- class lamindb.integrations.lightning.Checkpoint(dirpath=None, *, features=None, monitor=None, verbose=False, save_last=None, save_top_k=1, save_weights_only=False, mode='min', auto_insert_metric_name=True, every_n_train_steps=None, train_time_interval=None, every_n_epochs=None, save_on_train_epoch_end=None, enable_version_counter=True, run_uid_is_version=True, artifact_observers=None)¶
A `ModelCheckpoint` that annotates PyTorch Lightning checkpoints.

Extends Lightning's `ModelCheckpoint` with artifact creation & feature annotation. Each checkpoint is a separate artifact whose key is derived from either the explicit `dirpath` or the trainer's logger configuration.

When `dirpath` is omitted (recommended), Lightning decides where to store checkpoints locally (typically `lightning_logs/version_N/checkpoints/`) and the artifact key is derived from the logger's `save_dir`, `name`, and `version`. When `dirpath` is provided, it is used directly as the key prefix.

All artifacts are scoped under a single base prefix. Checkpoints (and `hparams.yaml`) live under `{base}/checkpoints/`; other artifacts (e.g. `config.yaml`) live directly under `{base}/`.

Base prefix derivation (highest priority first):
- `dirpath` provided → `{dirpath}` (logger is ignored for key purposes)
- `dirpath` omitted, logger present → `{save_dir_basename}/{name}/{version}`
- `dirpath` omitted, no logger → empty
When `run_uid_is_version` is `True` (the default) and a Lamin run context is active, the run UID is incorporated into the base prefix:
- Cases 1 and 3: the run UID is appended as an extra path segment (e.g. `my/dir/{run_uid}`, or just `{run_uid}`).
- Case 2: the logger's auto-incremented `version` is replaced by the run UID (`{save_dir_basename}/{name}/{run_uid}`).

Resulting key layout (with run UID active):

```
{base}/checkpoints/epoch=0-step=100.ckpt
{base}/checkpoints/hparams.yaml
{base}/config.yaml
```
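The derivation rules above can be condensed into a plain-Python sketch. Note that `derive_base_prefix` is a hypothetical helper written for illustration, not part of the lamindb API, and the `logger` argument here is a plain dict standing in for a Lightning logger:

```python
import os

def derive_base_prefix(dirpath=None, logger=None, run_uid=None):
    """Sketch of the base-prefix rules: dirpath > logger > empty,
    optionally scoped by the active Lamin run UID."""
    if dirpath is not None:
        # dirpath provided: the logger is ignored for key purposes;
        # the run UID is appended as an extra path segment.
        base = dirpath.rstrip("/")
        return f"{base}/{run_uid}" if run_uid else base
    if logger is not None:
        # Logger present: the run UID replaces the auto-incremented version.
        save_dir_basename = os.path.basename(logger["save_dir"].rstrip("/"))
        version = run_uid if run_uid else logger["version"]
        return f"{save_dir_basename}/{logger['name']}/{version}"
    # No dirpath, no logger: the prefix is the run UID alone, or empty.
    return run_uid or ""

# Checkpoints then live under f"{base}/checkpoints/", configs under f"{base}/".
print(derive_base_prefix(dirpath="my/dir", run_uid="a1B2c3"))  # my/dir/a1B2c3
print(derive_base_prefix(logger={"save_dir": "logs", "name": "lightning_logs", "version": "version_0"}))
```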
If available in the database through `save_lightning_features()`, the following `lamindb.lightning` features are automatically tracked:
- Artifact-level: `is_best_model`, `is_last_model`, `score`, `model_rank`, `save_weights_only`, `monitor`, `mode`
- Run-level: `logger_name`, `logger_version`, `max_epochs`, `max_steps`, `precision`, `accumulate_grad_batches`, `gradient_clip_val`, `monitor`, `mode`

Additionally, model hyperparameters (from `pl_module.hparams`) and datamodule hyperparameters (from `trainer.datamodule.hparams`) are captured if corresponding features exist.

This is the concrete LaminDB implementation built on top of `ArtifactPublishingModelCheckpoint`. Use it when you want LaminDB to be the persistence layer. For secondary systems such as ClearML, prefer attaching an `ArtifactObserver` or subclassing `Checkpoint` and reacting in `on_artifact_saved()`.

- Parameters:
  - dirpath (`_PATH | None`, default: `None`) – Directory for checkpoints. When provided, also used as the artifact key prefix. When omitted (recommended), Lightning picks the local directory and the key prefix is derived from the logger.
  - features (`dict[Literal['run', 'artifact'], dict[str, Any]] | None`, default: `None`) – Features to annotate runs and artifacts. Use the "run" key for run-level features (static metadata). Use the "artifact" key for artifact-level features (values can be static, or `None` for auto-population from trainer metrics/attributes).
  - monitor (`str | None`, default: `None`) – Quantity to monitor for saving the best checkpoint.
  - verbose (`bool`, default: `False`) – Verbosity mode.
  - save_last (`bool | None`, default: `None`) – Save a copy of the last checkpoint.
  - save_top_k (`int`, default: `1`) – Number of best checkpoints to keep.
  - save_weights_only (`bool`, default: `False`) – Save only model weights (not optimizer state).
  - mode (`Literal['min', 'max']`, default: `'min'`) – One of "min" or "max" for monitor comparison.
  - auto_insert_metric_name (`bool`, default: `True`) – Include the metric name in the checkpoint filename.
  - every_n_train_steps (`int | None`, default: `None`) – Checkpoint every N training steps.
  - train_time_interval (`timedelta | None`, default: `None`) – Checkpoint at time intervals.
  - every_n_epochs (`int | None`, default: `None`) – Checkpoint every N epochs.
  - save_on_train_epoch_end (`bool | None`, default: `None`) – Run checkpointing at the end of the training epoch.
  - enable_version_counter (`bool`, default: `True`) – Append a version to the filename to avoid collisions.
  - run_uid_is_version (`bool`, default: `True`) – When `True` (the default) and a Lamin run context is active, incorporate the run UID into the base prefix. For the logger case, the logger's auto-incremented version is replaced; for the dirpath and no-logger cases, the run UID is appended as an extra path segment. Prevents cross-run key collisions.
  - artifact_observers (`list[ArtifactObserver] | None`, default: `None`) – Optional observer objects notified when checkpoint, config, or hparams artifacts are saved or when checkpoint files are removed locally. Observers follow `ArtifactObserver` and receive `ArtifactSavedEvent` and `ArtifactRemovedEvent`.
Examples
Let Lightning decide where to store checkpoints (recommended):

```python
import lightning as pl
from lightning.pytorch.loggers import CSVLogger

import lamindb as ln
from lamindb.integrations import lightning as ll

ll.save_lightning_features()
callback = ll.Checkpoint(monitor="val_loss", save_top_k=3)
logger = CSVLogger(save_dir="logs")
trainer = pl.Trainer(callbacks=[callback], logger=logger)
trainer.fit(model, dataloader)

# Query checkpoints — key prefix is derived from the logger
# e.g. "logs/lightning_logs/version_0/checkpoints/"
ln.Artifact.filter(key__startswith=callback.checkpoint_key_prefix)
```
Explicit `dirpath` for full control over the artifact key prefix:

```python
callback = ll.Checkpoint(
    dirpath="deployments/my_model/",
    monitor="val_loss",
    save_top_k=3,
)
trainer = pl.Trainer(callbacks=[callback])
trainer.fit(model, dataloader)

# Query checkpoints
ln.Artifact.filter(key__startswith=callback.checkpoint_key_prefix)
```
Using the CLI:

```yaml
# config.yaml
trainer:
  callbacks:
    - class_path: lamindb.integrations.lightning.Checkpoint
      init_args:
        monitor: val_loss
        save_top_k: 3

# Run with:
# python main.py fit --config config.yaml
```
For more, see the guide: lightning.
- property base_prefix: str¶
The base artifact key prefix for all artifacts from this callback.
Checkpoints live under `{base_prefix}/checkpoints/` and configs directly under `{base_prefix}/`.

Available after `setup()` has been called.
- property checkpoint_key_prefix: str¶
The artifact key prefix used for checkpoint artifacts.
Available after `setup()` has been called, for example once `trainer.fit()` has started.
- setup(trainer, pl_module, stage)¶
Validate user features and detect available auto-features.
- Return type:
None
- resolve_artifact_storage_uri(artifact)¶
Resolve the physical artifact location for downstream registries.
This is the stable abstraction external packages should use instead of reconstructing storage locations from Lamin internals.
- Return type:
str
- resolve_artifact_key(trainer, filepath, kind)¶
Return the Lamin artifact key for a checkpoint-related file.
- Return type:
str
- save_checkpoint_artifact(trainer, filepath, *, feature_values=None)¶
Save a checkpoint artifact to Lamin and emit the corresponding event.
This is the main persistence hook used by `_save_checkpoint()`. It is a useful override point for subclasses that want to augment Lamin persistence while keeping the generic lifecycle behavior from the base class.
- Return type:
- save_config_artifact(trainer, config_path)¶
Save a Lightning CLI config artifact and emit the corresponding event.
Config artifacts are routed through the same lifecycle surface as checkpoints so observers and subclasses see a unified event stream.
- Return type:
- lamindb.integrations.lightning.save_lightning_features()¶
Save features to auto-track lightning parameters & metrics.
Creates the following features under the `lamindb.lightning` feature type if they do not already exist:

Artifact-level features:
- `is_best_model` (bool): Whether this checkpoint is the best model.
- `is_last_model` (bool): Whether this checkpoint is the most recently saved model.
- `score` (float): The monitored metric score.
- `model_rank` (int): Rank among all checkpoints (0 = best).
- `save_weights_only` (bool): Whether this checkpoint only stores model weights.
- `monitor` (str): Metric name this checkpoint uses for comparison.
- `mode` (str): Optimization mode (`min` or `max`) used for checkpoint ranking.

Run-level features:
- `logger_name` (str): Name from the first Lightning logger.
- `logger_version` (str): Version from the first Lightning logger.
- `max_epochs` (int): Maximum number of epochs.
- `max_steps` (int): Maximum number of training steps.
- `precision` (str): Training precision (e.g., "32", "16-mixed", "bf16").
- `accumulate_grad_batches` (int): Number of batches to accumulate gradients over.
- `gradient_clip_val` (float): Gradient clipping value.
- `monitor` (str): Metric name being monitored.
- `mode` (str): Optimization mode ("min" or "max").
- Parameters:
None.
- Return type:
None
Example
Save the features to the database:
```python
from lamindb.integrations import lightning as ll

ll.save_lightning_features()
```
Auxiliary classes¶
- class lamindb.integrations.lightning.ArtifactPublishingModelCheckpoint(*args, artifact_observers=None, **kwargs)¶
ModelCheckpoint with observable artifact lifecycle hooks.
- This layer captures artifact kinds, observer registration, saved/removed events, latest-artifact tracking, and key compatibility hooks. Concrete subclasses remain responsible for how artifacts are persisted.

Subclasses are expected to implement:
- `resolve_artifact_key()` to map local files to logical artifact keys
- `resolve_artifact_storage_uri()` to expose a stable backend URI
- `save_checkpoint_artifact()`, `save_config_artifact()`, and `save_hparams_artifact()` to persist files

`SaveConfigCallback` only depends on this base class, which means a custom checkpoint callback can participate in config saving without inheriting from Lamin's concrete `Checkpoint`.
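The lifecycle surface described above can be sketched in plain Python. This is a simplified stand-in written for illustration, not the real class; in particular, the observer method name `on_artifact_saved` is assumed to mirror the subclass hook of the same name:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class SavedEvent:
    """Simplified stand-in for ArtifactSavedEvent."""
    kind: str
    key: str
    artifact: Any

class ArtifactPublisher:
    """Sketch of the artifact-publishing lifecycle: observers are notified
    on save, and the latest artifact is tracked per kind."""

    def __init__(self):
        self._observers = []
        self._last_by_kind = {}
        self.last_event = None

    def add_artifact_observer(self, observer):
        self._observers.append(observer)

    def remove_artifact_observer(self, observer):
        self._observers.remove(observer)

    def save_artifact(self, kind, key, artifact):
        # A concrete subclass would persist the file here; this sketch
        # only records the result and notifies registered observers.
        event = SavedEvent(kind=kind, key=key, artifact=artifact)
        self._last_by_kind[kind] = artifact
        self.last_event = event
        for observer in list(self._observers):
            observer.on_artifact_saved(event)
        return artifact

    def get_last_artifact(self, kind):
        return self._last_by_kind.get(kind)

class RecordingObserver:
    def __init__(self):
        self.seen = []

    def on_artifact_saved(self, event):
        self.seen.append(event.key)

publisher = ArtifactPublisher()
observer = RecordingObserver()
publisher.add_artifact_observer(observer)
publisher.save_artifact("checkpoint", "run/checkpoints/epoch=0.ckpt", object())
print(observer.seen)  # ['run/checkpoints/epoch=0.ckpt']
```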
- property last_checkpoint_artifact: Any | None¶
The most recently saved checkpoint artifact handle.
- property last_config_artifact: Any | None¶
The most recently saved config artifact handle.
- property last_hparams_artifact: Any | None¶
The most recently saved hparams artifact handle.
- property last_artifact_event: ArtifactSavedEvent | ArtifactRemovedEvent | None¶
The last artifact lifecycle event emitted by this callback.
- get_last_artifact(kind)¶
Return the most recently saved artifact for a given artifact kind.
- Return type:
`Any | None`
- add_artifact_observer(observer)¶
Register an observer notified about artifact lifecycle events.
- Return type:
None
- remove_artifact_observer(observer)¶
Unregister a previously added artifact observer.
- Return type:
None
- resolve_artifact_storage_uri(artifact)¶
Resolve the physical location for a persisted artifact.
- Return type:
str
- resolve_artifact_key(trainer, filepath, kind)¶
Return the logical artifact key for a checkpoint-related file.
- Return type:
str
- on_artifact_saved(event)¶
Hook for subclasses after an artifact has been saved.
- Return type:
None
- on_artifact_removed(event)¶
Hook for subclasses after a checkpoint file has been removed.
- Return type:
None
- save_checkpoint_artifact(trainer, filepath, *, feature_values=None)¶
Persist a checkpoint artifact and emit the corresponding event.
- Return type:
Any
- save_config_artifact(trainer, config_path)¶
Persist a config artifact and emit the corresponding event.
- Return type:
Any
- save_hparams_artifact(trainer, hparams_path)¶
Persist an hparams artifact and emit the corresponding event.
- Return type:
`Any | None`
- class lamindb.integrations.lightning.SaveConfigCallback(*args, **kwargs)¶
SaveConfigCallback that also saves config to the instance.
Use with LightningCLI to save the resolved configuration file alongside checkpoints.
The local config file is saved under `{save_dir}/{name}/{version}/` derived from the first logger, avoiding Lightning's `trainer.log_dir`, which hardcodes an `isinstance` check for `TensorBoardLogger`/`CSVLogger` and silently changes the directory for other loggers.

This callback looks for any `ArtifactPublishingModelCheckpoint`, not just Lamin's concrete `Checkpoint`. That keeps the config-save path aligned with custom subclasses built on the generic artifact-publishing base.

Config artifacts are stored directly under the base prefix of the active `Checkpoint` callback. The base prefix follows the same derivation rules as for checkpoints (dirpath > logger > empty), so configs are always co-located with their checkpoints:
- `Checkpoint.dirpath` set → `{dirpath}/config.yaml` (`{dirpath}/{run_uid}/config.yaml` with run-UID scoping)
- Logger present, no `dirpath` → `{save_dir_basename}/{name}/{version}/config.yaml`
- Neither → `config.yaml` (or `{run_uid}/config.yaml` with run-UID scoping)
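The three cases above reduce to appending `config.yaml` to whatever base prefix is in effect. A tiny sketch, using a hypothetical helper name that is not part of the lamindb API:

```python
def config_artifact_key(base_prefix: str) -> str:
    """Sketch: config.yaml sits directly under the base prefix,
    or at the top level when the prefix is empty."""
    return f"{base_prefix}/config.yaml" if base_prefix else "config.yaml"

print(config_artifact_key("deployments/my_model"))  # deployments/my_model/config.yaml
print(config_artifact_key(""))                      # config.yaml
```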
Example:
```python
from lightning.pytorch.cli import LightningCLI

from lamindb.integrations import lightning as ll

cli = LightningCLI(
    MyModel,
    MyDataModule,
    save_config_callback=ll.SaveConfigCallback,
)
```
- setup(trainer, pl_module, stage)¶
Save resolved configuration file alongside checkpoints.
- Return type:
None
- class lamindb.integrations.lightning.ArtifactSavedEvent(kind, key, local_path, trainer, artifact, storage_uri)¶
Metadata emitted after a checkpoint-related artifact has been persisted.
`artifact` is intentionally typed generically so downstream integrations can expose their own persisted object while still using the common lifecycle API. `storage_uri` is the stable hand-off value for registries such as ClearML.
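To illustrate the event's shape, here is a self-contained sketch that mirrors the constructor signature above with a plain dataclass; the real class lives in `lamindb.integrations.lightning`, and `register_in_external_system` is a made-up stand-in for a downstream registry such as ClearML:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class ArtifactSavedEvent:
    """Mirrors the documented fields:
    (kind, key, local_path, trainer, artifact, storage_uri)."""
    kind: str
    key: str
    local_path: str
    trainer: Any
    artifact: Any
    storage_uri: str

def register_in_external_system(event: ArtifactSavedEvent) -> str:
    # A downstream registry only needs the stable key and storage URI,
    # never Lamin internals.
    return f"registered {event.key} at {event.storage_uri}"

event = ArtifactSavedEvent(
    kind="checkpoint",
    key="logs/exp/version_0/checkpoints/epoch=0-step=100.ckpt",
    local_path="/tmp/epoch=0-step=100.ckpt",
    trainer=None,       # a real lightning.Trainer in practice
    artifact=object(),  # the persisted handle; intentionally typed generically
    storage_uri="s3://bucket/.lamindb/abc123.ckpt",
)
print(register_in_external_system(event))
```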
- class lamindb.integrations.lightning.ArtifactRemovedEvent(kind, key, local_path, trainer, artifact=None, storage_uri=None)¶
Metadata emitted after a local checkpoint file has been removed.
Removal currently applies to checkpoint files. Config and hparams artifacts are save-only in the current Lightning integration.
- artifact: `Any | None` = None¶
- storage_uri: `str | None` = None¶