#### Design & architecture

LaminDB is a distributed system like git that can be run or hosted
anywhere. It only needs a SQLite or Postgres database and a storage
location (file system, S3, GCP, HuggingFace, ...).

You can create a new local instance:

-[ Shell ]-

 lamin init --storage ./mydir

-[ Py ]-

 import lamindb as ln
 ln.setup.init(storage="./mydir")

-[ R ]-

 library(laminr)
 lamin_init(storage="./mydir")

Or you can let collaborators connect to a cloud-hosted instance:

-[ Shell ]-

 lamin connect account/instance

-[ Py ]-

 import lamindb as ln
 ln.connect("account/instance")

-[ R ]-

 library(laminr)
 ln <- import_module("lamindb")
 ln$connect("account/instance")

To learn more about how to create & host LaminDB instances, see
Install & setup. LaminDB instances work standalone but can optionally
be managed by LaminHub. For an architecture diagram of LaminHub, reach
out!

### Database schema & API

LaminDB provides a SQL schema for common metadata entities:
"Artifact", "Collection", "Transform", "Feature", "Record", etc.; see
the API reference or the source code.

The core metadata schema is extendable through modules, e.g., with
basic biological ("Gene", "Protein", "CellLine", etc.) & operational
entities ("Biosample", "Techsample", "Treatment", etc.).

Data models are defined in Python using the Django ORM, which
translates them to SQL tables. Django is one of the most used &
most starred projects on GitHub (~1M dependents, ~73k stars) and has
been robustly maintained for 15 years. While the SQLAlchemy ORM has
some advantages, Django is the most popular choice for building
metadata management systems in the life sciences.
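
As a rough illustration of what "models defined in Python, translated
to SQL tables" means, the sketch below derives a CREATE TABLE statement
from a Python dataclass using only the standard library. It is neither
Django nor LaminDB's actual schema: the "Gene" class, its fields, and
the "create_table_sql" helper are hypothetical stand-ins, and a real ORM
additionally handles migrations, relations, and queries.

```python
import sqlite3
from dataclasses import dataclass, fields

# Map Python annotations to SQLite column types (toy subset).
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


@dataclass
class Gene:  # hypothetical model, not the bionty Gene registry
    id: int
    symbol: str
    description: str


def create_table_sql(model) -> str:
    """Build a CREATE TABLE statement from a dataclass definition."""
    columns = []
    for f in fields(model):
        column = f"{f.name} {SQL_TYPES[f.type]}"
        if f.name == "id":
            column += " PRIMARY KEY"
        columns.append(column)
    return f"CREATE TABLE {model.__name__.lower()} ({', '.join(columns)})"


ddl = create_table_sql(Gene)

# Apply the generated statement to an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
conn.execute(
    "INSERT INTO gene (symbol, description) VALUES (?, ?)",
    ("TP53", "tumor protein p53"),
)
row = conn.execute("SELECT id, symbol FROM gene").fetchone()
print(ddl)
print(row)
```

The point of the sketch is only the direction of the mapping: the
Python class is the single source of truth, and the SQL table is
generated from it.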

On top of the metadata schema, LaminDB provides a Python API that
models datasets as artifacts and abstracts storage & database access,
data transformations, and ontologies.
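
To make the pairing of a metadata database with a storage location more
concrete, here is a minimal standard-library sketch of the general
pattern. It does not use LaminDB and does not reflect its
implementation: the "artifact" table layout, the "save_artifact"
helper, and the file names are assumptions chosen for illustration.

```python
import hashlib
import shutil
import sqlite3
import tempfile
from pathlib import Path


def save_artifact(conn, storage: Path, source: Path, key: str) -> str:
    """Copy a file into storage and record its metadata; return its hash."""
    content_hash = hashlib.sha256(source.read_bytes()).hexdigest()
    target = storage / key
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(source, target)
    conn.execute(
        "INSERT INTO artifact (key, size, hash) VALUES (?, ?, ?)",
        (key, target.stat().st_size, content_hash),
    )
    conn.commit()
    return content_hash


workdir = Path(tempfile.mkdtemp())
storage = workdir / "storage"  # stands in for a local directory or a bucket
storage.mkdir()
conn = sqlite3.connect(workdir / "metadata.db")  # stands in for SQLite/Postgres
conn.execute(
    "CREATE TABLE artifact ("
    "id INTEGER PRIMARY KEY, key TEXT UNIQUE, size INTEGER, hash TEXT)"
)

# Create a small example file and register it under a storage key.
source = workdir / "measurements.csv"
source.write_text("sample,value\na,1\nb,2\n")
save_artifact(conn, storage, source, key="raw/measurements.csv")

# Query metadata in the database, then load the data from storage.
key, size = conn.execute(
    "SELECT key, size FROM artifact WHERE key LIKE 'raw/%'"
).fetchone()
print(key, size, (storage / key).read_text().splitlines()[0])
```

The database answers "which datasets exist and what do we know about
them", while the storage location holds the bytes; an API layered on
top hides that split from the user.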

### Modules

LaminDB can be extended with modules building on the Django ecosystem.
Examples are:

* bionty: Basic biological ontologies, with easy import from >20
  public ontologies

* pertdb: Registries for perturbations (compounds, biologics, genetic
  interventions, etc.)

If you'd like to create your own module:

1. Create a git repository with registries similar to pertdb

2. Create & deploy migrations via "lamin migrate create" and "lamin
 migrate deploy"

For more information, see Install & setup.

### Repositories

LaminDB and its plugins consist of open-source Python libraries &
publicly hosted metadata assets:

* lamindb: Core library.

* bionty: Basic biological ontologies, with easy import from >20
  public ontologies

* pertdb: Registries for perturbations (compounds, biologics, genetic
  interventions, etc.)

Tightly integrated dependencies are available as git submodules here,
for instance:

* lamindb-setup: Setup & configure LaminDB.

* lamin-cli: CLI for "lamindb" and "lamindb-setup".

Use cases / domain-specific repos:

* lamin-usecases: Use cases as visible on the docs.

* redun-lamin: Track redun workflow runs with LaminDB.

* lamin-mlops: MLOps use cases (MNIST, W&B, MLflow, Croissant).

* cellxgene-lamin: CELLxGENE data and curation.

* lamin-spatial: Spatial data (RxRx, Vitessce).

* snakemake-lamin: Track Snakemake runs with LaminDB.

* nf-lamin: Nextflow integration with LaminDB.

For a comprehensive list of open-sourced software, browse our GitHub
account, for instance:

* readfcs: FCS artifact reader.

There is a public repository for LaminHub:

* laminhub-public: File issues and follow releases of LaminHub;
  contains no source code.