Features
LaminDB
Access data & metadata across storage (files, arrays) & database (SQL) backends.
Manage Feature, FeatureSet, ULabel
Plug-in custom schemas & manage schema migrations
Use array formats in memory & storage: DataFrame, AnnData, MuData, SOMA, … backed by parquet, zarr, TileDB, HDF5, h5ad, DuckDB, …
Create iterable collections of artifacts: Artifact, Collection
Use PyTorch data loaders: mapped()
Version artifacts, collections & transforms: IsVersioned
Track data lineage across notebooks, pipelines & UI: track(), Transform & Run.
Execution reports, source code and Python environments for notebooks & scripts
Integrate with workflow managers: redun, nextflow, snakemake
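The lineage idea above (transforms are executed as runs, and runs link input artifacts to output artifacts) can be illustrated with a minimal pure-Python sketch. This is not LaminDB's API: the `Run` dataclass, the `upstream` helper, and the file names are hypothetical and only show how recorded inputs and outputs let you trace where a dataset came from.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """One execution of a transform (a notebook, script, or pipeline). Hypothetical."""
    transform: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def upstream(artifact, runs):
    """Return every artifact that `artifact` was derived from, directly or indirectly."""
    # Map each output artifact to the run that produced it.
    produced_by = {out: run for run in runs for out in run.outputs}
    seen, stack = set(), [artifact]
    while stack:
        run = produced_by.get(stack.pop())
        if run is None:
            continue  # a source artifact with no recorded producer
        for parent in run.inputs:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Illustrative runs; the transform and file names are made up.
runs = [
    Run("ingest.py", inputs=[], outputs=["raw.parquet"]),
    Run("qc.ipynb", inputs=["raw.parquet"], outputs=["clean.parquet"]),
    Run("train.py", inputs=["clean.parquet"], outputs=["model.pt"]),
]
lineage = upstream("model.pt", runs)
```

Here `lineage` contains both `clean.parquet` and `raw.parquet`, because the model depends on the raw data through the intermediate QC step.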
Manage registries for experimental metadata & in-house ontologies, import public ontologies.
Use >20 public ontologies with plug-in bionty: Gene, Protein, CellMarker, ExperimentalFactor, CellType, CellLine, Tissue, …
Safeguards against typos & duplications
Ontology versioning
Validate, standardize & annotate based on registries: validate & standardize.
Use a high-level curation flow: Curate
Inspect validation failures: inspect
Annotate with features & labels: FeatureManager
Save data & metadata with ACID guarantees: save
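The validate, standardize, and inspect steps listed above can be illustrated without LaminDB itself. What follows is a minimal pure-Python sketch, assuming a hypothetical in-memory registry of cell type names with synonyms. The names `REGISTRY`, `SYNONYMS`, `validate`, `standardize`, and `inspect_values` are illustrative and do not reproduce LaminDB's actual signatures or return types.

```python
# Hypothetical in-memory registry: canonical name -> known synonyms.
# Illustrative only; this is not LaminDB's API.
REGISTRY = {
    "T cell": {"T-cell", "T lymphocyte"},
    "B cell": {"B-cell", "B lymphocyte"},
}

# Reverse lookup from synonym to canonical name.
SYNONYMS = {syn: name for name, syns in REGISTRY.items() for syn in syns}


def validate(values):
    """Return one boolean per value: True if it is a canonical registry name."""
    return [value in REGISTRY for value in values]


def standardize(values):
    """Map known synonyms to canonical names; leave other values unchanged."""
    return [SYNONYMS.get(value, value) for value in values]


def inspect_values(values):
    """Split values into validated names, fixable synonyms, and unknown terms."""
    report = {"validated": [], "synonyms": [], "unknown": []}
    for value in values:
        if value in REGISTRY:
            report["validated"].append(value)
        elif value in SYNONYMS:
            report["synonyms"].append(value)
        else:
            report["unknown"].append(value)
    return report


raw = ["T cell", "B-cell", "T cel"]
flags = validate(raw)        # which values are already canonical
cleaned = standardize(raw)   # synonyms replaced by canonical names
report = inspect_values(raw) # why each value passed or failed
```

In this sketch, "B-cell" fails validation but is fixed by standardization, whereas the typo "T cel" stays flagged as unknown, which is the kind of entry a registry is meant to keep out.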
Organize and share data across a mesh of LaminDB instances.
Create & load instances like git repos: lamin init & lamin load
Zero-copy transfer data across instances
Integrate with analytics tools.
Vitessce: save_vitessce_config
Zero lock-in, scalable, auditable, access management, and more.
Zero lock-in: LaminDB runs on generic backends server-side and is not a client for “Lamin Cloud”
Flexible storage backends (local, S3, GCP, anything fsspec supports)
Two SQL backends for managing metadata: SQLite & Postgres
Scalable: registries support hundreds of millions of entries
Auditable: data & metadata records are hashed, timestamped, and attributed to users (full audit log to come)
Access management:
High-level access management through Lamin’s collaborator roles
Fine-grained access management via storage & SQL roles
Secure: embedded in your infrastructure (Lamin has no access to your data & metadata)
Tested & typed (up to Django Model fields)
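To make the "Auditable" point above concrete, here is a minimal sketch of a record that is hashed, timestamped, and attributed to a user. It uses only the Python standard library. The choice of SHA-256, the field names, and the helpers `make_record` and `is_unmodified` are assumptions for illustration; the source does not state which hash algorithm or record layout LaminDB uses.

```python
import hashlib
from datetime import datetime, timezone


def make_record(content: bytes, user: str) -> dict:
    """Build a record holding a content hash, a UTC timestamp, and the author."""
    return {
        "hash": hashlib.sha256(content).hexdigest(),  # assumed algorithm
        "created_at": datetime.now(timezone.utc).isoformat(),
        "created_by": user,
    }


def is_unmodified(content: bytes, record: dict) -> bool:
    """Check that content still matches the hash stored when the record was made."""
    return hashlib.sha256(content).hexdigest() == record["hash"]


original = b"gene,count\nTP53,12\n"
tampered = b"gene,count\nTP53,13\n"
record = make_record(original, user="alice")
```

Comparing a stored hash against freshly computed content is what lets an auditor detect that a file changed after it was registered, and the timestamp and user fields say when and by whom it was registered.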
LaminHub
See pricing.
Secure & intuitive access management.
LaminHub provides a layer for AWS & GCP that makes access management more secure & intuitive.
Rather than configuring storage & database permissions directly on AWS or GCP, LaminHub allows you to manage collaborators for databases & storage locations in the same way you manage access to repositories on GitHub. However, in contrast to a typical SaaS product like GitHub, LaminHub leaves you in full control of your data with direct API access to databases & storage locations on AWS or GCP.
How does it work?
Based on an identity provider (Google, GitHub, SSO, OIDC) and a role-based permission system, LaminDB users automatically receive federated access tokens for data on AWS or GCP. These tokens are short-lived, which limits how long a leaked token can be misused.
LaminHub’s permission system makes it easy to minimize attack surfaces by implementing the principle of least privilege.
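The combination of short-lived tokens and role-based permissions described above can be sketched in a few lines. This is a conceptual illustration only, not how LaminHub issues tokens: real federated tokens come from AWS or GCP, and the role names, the 15-minute lifetime, and the helpers `issue_token` and `is_allowed` are hypothetical.

```python
import time
from dataclasses import dataclass

# Hypothetical role-to-permission mapping; the role names are illustrative.
ROLE_PERMISSIONS = {
    "read": {"read"},
    "write": {"read", "write"},
    "admin": {"read", "write", "manage"},
}


@dataclass(frozen=True)
class Token:
    user: str
    role: str
    expires_at: float  # Unix timestamp from which the token is rejected


def issue_token(user, role, ttl_seconds=900, now=None):
    """Issue a token that expires `ttl_seconds` after `now`."""
    now = time.time() if now is None else now
    return Token(user, role, now + ttl_seconds)


def is_allowed(token, action, now=None):
    """Allow an action only if the token is unexpired and the role grants it."""
    now = time.time() if now is None else now
    if now >= token.expires_at:
        return False  # expired tokens grant nothing
    return action in ROLE_PERMISSIONS.get(token.role, set())


# Fixed timestamps keep the example deterministic.
token = issue_token("alice", "read", ttl_seconds=900, now=1000.0)
can_read = is_allowed(token, "read", now=1500.0)
can_write = is_allowed(token, "write", now=1500.0)
can_read_later = is_allowed(token, "read", now=2000.0)
```

A collaborator with the read role can read while the token is valid, can never write, and loses all access once the token expires, which is the principle of least privilege applied both to scope and to time.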
A UI to work with LaminDB instances.
Explore in the hub UI or run lamin load owner/instance via the CLI:
lamin.ai/laminlabs/arrayloader-benchmarks - Work with ML models & benchmarks
lamin.ai/laminlabs/cellxgene - An instance with the CELLxGENE data (guide)
lamin.ai/laminlabs/lamindata - A generic demo instance with various data types
See validated datasets in context of ontologies & experimental metadata.
![](https://lamin-site-assets.s3.amazonaws.com/.lamindb/DjVOPEBiAcGlt3Gq3APh.png)
Query & search.
![](https://lamin-site-assets.s3.amazonaws.com/.lamindb/L188T2JjzZHWHfv2sZGu.png)
See scripts, notebooks & pipelines with their inputs & outputs.
![](https://lamin-site-assets.s3.amazonaws.com/.lamindb/RGXj5wcAf7EAc6J8dJfH.png)
Track pipelines, notebooks & UI transforms in one registry.
![](https://lamin-site-assets.s3.amazonaws.com/.lamindb/IpV8Kiq4xUbgXhzlUYT7.png)