Python: lamindb

A data framework for biology.

Installation:

pip install lamindb

If you just want to read data from a LaminDB instance, use DB:

import lamindb as ln

db = ln.DB("laminlabs/cellxgene")
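Once you have a DB object, a typical next step is to list what the instance contains. The following is a minimal sketch, not the definitive query API: the helper name list_artifacts is hypothetical, and it assumes that registries such as Artifact are exposed as attributes of the DB object and offer a to_dataframe() method. Check both against your installed version.

```python
def list_artifacts(instance: str = "laminlabs/cellxgene"):
    """Return a dataframe listing the artifacts of a read-only instance."""
    # Deferred import: LaminDB may auto-connect when it is imported, so the
    # import happens only when the helper is actually called.
    import lamindb as ln

    db = ln.DB(instance)  # database object for queries, as in the snippet above

    # Assumption: the registry is reachable as db.Artifact and supports
    # .to_dataframe(); neither is confirmed by this page.
    return db.Artifact.to_dataframe()
```

Calling list_artifacts() needs network access to the public instance; pass another "account/name" slug to query a different instance.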

To write data, connect to a writable instance:

lamin login
lamin connect account/name

You can create an instance at lamin.ai and invite collaborators. If you prefer to work with a local SQLite instance, run:

lamin init --storage ./quickstart-data --modules bionty

LaminDB then auto-connects upon import, and you can create & save objects like this:

import lamindb as ln
# → connected lamindb: account/instance

ln.Artifact("my_dataset.parquet", key="datasets/my_dataset.parquet").save()
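The one-liner above assumes the file already exists. As a self-contained sketch, the helpers below first write a small CSV with the standard library and then register it with the same Artifact(...).save() call. The helper names, the CSV content and the key are made up for illustration, and the sketch assumes that save() returns the saved record.

```python
import csv
from pathlib import Path


def write_example_csv(path: Path) -> Path:
    """Write a tiny two-row table so there is a file to register."""
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["sample", "value"])
        writer.writerows([["s1", 1.0], ["s2", 2.5]])
    return path


def save_as_artifact(path: Path, key: str):
    """Register an existing file as an artifact under the given key."""
    # Deferred import: LaminDB may auto-connect when it is imported.
    import lamindb as ln

    # Same call as in the quickstart line above, with a CSV instead of parquet.
    # Assumption: .save() returns the saved artifact.
    return ln.Artifact(str(path), key=key).save()
```

With a connected, writable instance, save_as_artifact(write_example_csv(Path("my_dataset.csv")), key="datasets/my_dataset.csv") would store the file under that key.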

Lineage

Track inputs, outputs & environment of a notebook or script run.

track([transform, project, space, branch, ...])

Track a run of your notebook or script.

finish([ignore_non_consecutive])

Finish the run and write a run report.
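In a script or notebook, track() and finish() usually bracket the actual work. The sketch below shows that ordering; the wrapper run_tracked and its step argument are hypothetical, both functions are called without arguments, and a connected, writable instance is assumed.

```python
def run_tracked(step):
    """Call step() between ln.track() and ln.finish() and return its result."""
    # Deferred import: LaminDB may auto-connect when it is imported.
    import lamindb as ln

    ln.track()       # start tracking this run
    result = step()  # the actual work: read inputs, compute, save outputs
    ln.finish()      # finish the run and write the run report
    return result
```

If step() raises, finish() is not reached in this sketch, so only a run that got through its work is finished here.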

Decorate a function with @tracked() to track inputs, outputs & environment of function executions.

tracked([uid])

Track function runs.
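A sketch of the decorator form follows. The factory make_tracked_step and the function double are hypothetical names. Depending on the installed version, a global run started with track() may need to exist before a decorated function is called; that prerequisite is an assumption to verify, not something this page states.

```python
def make_tracked_step():
    """Build a small function whose executions are tracked."""
    # Deferred import: LaminDB may auto-connect when it is imported.
    import lamindb as ln

    @ln.tracked()  # tracks inputs, outputs & environment of each execution
    def double(values: list[int]) -> list[int]:
        return [2 * v for v in values]

    return double
```

In a regular script you would import lamindb at the top and apply @ln.tracked() directly to module-level functions; the factory only keeps this sketch importable without a configured instance.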

Artifacts & storage locations

Files, folders & arrays and their storage locations.

Artifact()

Datasets & models stored as files, folders, or arrays.

Storage()

Storage locations of artifacts such as local directories or S3 buckets.

Transforms & runs

Data transformations and their executions.

Transform()

Data transformations such as scripts, notebooks, functions, or pipelines.

Run()

Runs of transforms such as the execution of a script.

Records, labels, features & schemas

Create labels and manage flexible records, e.g., for samples or donors.

Record()

Flexible metadata records.

ULabel()

Universal labels.

Define features & schemas to validate artifacts & records.

Feature()

Dimensions of measurement such as dataframe columns or dictionary keys.

Schema()

Schemas of datasets such as column sets of dataframes.
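To make the relation between features and schemas concrete, here is a minimal sketch that saves two features and groups them into a schema. The feature names, the schema name and the helper define_schema are invented for illustration, and the keyword arguments (name, dtype, features) are assumptions about the constructors that should be checked against the Feature and Schema reference pages.

```python
def define_schema():
    """Save two features and a schema that groups them."""
    # Deferred import: LaminDB may auto-connect when it is imported.
    import lamindb as ln

    # Assumption: Feature takes name and dtype, and .save() returns the record.
    features = [
        ln.Feature(name="sample_id", dtype=str).save(),
        ln.Feature(name="temperature", dtype=float).save(),
    ]

    # Assumption: Schema takes a name and a list of saved features.
    return ln.Schema(name="example_schema", features=features).save()
```

A schema defined this way describes a dataframe with a sample_id column and a temperature column, which is what validation of artifacts & records would then check.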

Project management

User()

Users.

Collection()

Versioned collections of artifacts.

Project()

Projects to label artifacts, transforms, records, and runs.

Space()

Workspaces with managed access for specific users or teams.

Branch()

Branches for change management with archive and trash states.

Reference()

References such as internal studies, papers, documents, or URLs.

Basic utilities

Connecting, viewing database content, accessing settings & run context.

DB(instance)

Query any registry of any instance.

connect([instance])

Connect the global default instance.

view(*[, limit, modules, registries, df])

View metadata.

save(records[, ignore_conflicts, batch_size])

Bulk save records.
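When many records are created at once, passing them to save() in one call avoids a separate save per record. A minimal sketch, assuming ULabel accepts a name keyword; the helper bulk_save_labels and the label names used with it are hypothetical.

```python
def bulk_save_labels(names):
    """Create one ULabel per name and save them in a single bulk call."""
    # Deferred import: LaminDB may auto-connect when it is imported.
    import lamindb as ln

    # Build the records in memory first; nothing is written yet.
    labels = [ln.ULabel(name=name) for name in names]

    # One bulk operation instead of calling .save() on every record.
    ln.save(labels)
    return labels
```

For example, bulk_save_labels(["control", "treated"]) would create and save two labels on the connected instance.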

UPath(*args[, protocol])

Paths: low-level key-value access to files.

settings

Global live settings (Settings).

context

Global run context (Context).

Curators and integrations

curators

Curators.

integrations

Integrations.

Examples, errors & setup

examples

Examples.

errors

Errors.

setup

Setup & configure LaminDB.

Developer API

base

Base library.

core

Core library.

models

Models library.