Lamin Docs

Query & search

This guide walks through different ways of querying & searching registries. To understand the underlying cross-linking of objects in the SQL database, see Organize datasets.

If you already have a set of artifacts and you’d like to stream their content, see Stream datasets from storage.

# initialize a test database to run examples
!lamin init --storage ./test-registries --modules bionty
→ initialized lamindb: testuser1/test-registries

Let’s start by creating a few example datasets:

import lamindb as ln

ln.Artifact(ln.examples.datasets.file_fastq(), key="raw/my_fastq.fastq.gz").save()
ln.Artifact(ln.examples.datasets.file_jpg_paradisi05(), key="my_image.jpg").save()
ln.Artifact.from_dataframe(ln.examples.datasets.df_iris(), key="iris.parquet").save()
ln.examples.datasets.mini_immuno.save_mini_immuno_datasets()
→ connected lamindb: testuser1/test-registries
! no run & transform got linked, call `ln.track()` & re-run
! no run & transform got linked, call `ln.track()` & re-run
! no run & transform got linked, call `ln.track()` & re-run
! no run & transform got linked, call `ln.track()` & re-run
→ loading artifact into memory for validation
! no run & transform got linked, call `ln.track()` & re-run
→ loading artifact into memory for validation

Get an overview

The easiest way to get an overview of all artifacts is to call to_dataframe(), which returns the most recently created artifacts in the Artifact registry.

ln.Artifact.to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad None .h5ad dataset AnnData 26896 RKJjWbINYNIwYU8BxCejMw None 3.0 ... True False 2026-05-26 09:53:29.287000+00:00 1 1 1 1 None 3.0 1
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3.0 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3.0 1
3 xtoeLCrISnlv0qnN0000 iris.parquet None .parquet dataset DataFrame 5202 kTxrzohIAV896P7ceYwYJA None 150.0 ... True False 2026-05-26 09:53:19.308000+00:00 1 1 1 1 None NaN 1
2 Dp5dO9exiCAQhfxa0000 my_image.jpg None .jpg None None 29358 r4tnqmKI_SjrkdLzpuWp4g None NaN ... True False 2026-05-26 09:53:19.015000+00:00 1 1 1 1 None NaN 1
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz None .fastq.gz None None 20 hi7ZmAzz8sfMd3vIQr-57Q None NaN ... True False 2026-05-26 09:53:18.691000+00:00 1 1 1 1 None NaN 1

5 rows × 21 columns

To add feature annotations as columns, pass include="features".

ln.Artifact.to_dataframe(include="features")
→ queried for all categorical features of dtypes Record or ULabel and non-categorical features: (7) ['perturbation', 'sample_note', 'temperature', 'experiment', 'date_of_study', 'study_note', 'study_metadata']
/home/runner/work/lamindb/lamindb/lamindb/models/sqlrecord.py:772: FutureWarning: The default `to_dataframe(limit=...)` will change from 100 to 20 in lamindb 2.6.0. Pass `limit=100` to keep the current behavior or `limit=20` to adopt the future default now.
  return cls.filter().to_dataframe(
uid key perturbation temperature experiment date_of_study study_note study_metadata
id
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad {IFNG, DMSO} 22.6 Experiment 2 2025-02-13 NaN {'detail1': '456', 'detail2': 2}
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad {IFNG, DMSO} 21.6 Experiment 1 2024-12-01 We had a great time performing this study and ... {'detail1': '123', 'detail2': 1}
3 xtoeLCrISnlv0qnN0000 iris.parquet NaN NaN NaN NaT NaN NaN
2 Dp5dO9exiCAQhfxa0000 my_image.jpg NaN NaN NaN NaT NaN NaN
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz NaN NaN NaN NaT NaN NaN

You can also include fields from related registries by passing their double-underscore paths.

ln.Artifact.to_dataframe(
    include=[
        "created_by__name",
        "records__name",
        "cell_types__name",
        "schemas__itype",
    ]
)
/home/runner/work/lamindb/lamindb/lamindb/models/sqlrecord.py:772: FutureWarning: The default `to_dataframe(limit=...)` will change from 100 to 20 in lamindb 2.6.0. Pass `limit=100` to keep the current behavior or `limit=20` to adopt the future default now.
  return cls.filter().to_dataframe(
uid key created_by__name records__name cell_types__name schemas__itype
id
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad Test User1 {IFNG, DMSO, Experiment 2} {T cell, B cell} {bionty.Gene.ensembl_gene_id, Feature}
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad Test User1 {IFNG, DMSO, Experiment 1} {CD8-positive, alpha-beta T cell, T cell, B cell} {bionty.Gene.ensembl_gene_id, Feature}
3 xtoeLCrISnlv0qnN0000 iris.parquet Test User1 {None} {None} {None}
2 Dp5dO9exiCAQhfxa0000 my_image.jpg Test User1 {None} {None} {None}
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz Test User1 {None} {None} {None}

You can also get an overview of the entire database.

ln.view()
****************
* module: core *
****************
Artifact
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad None .h5ad dataset AnnData 26896 RKJjWbINYNIwYU8BxCejMw None 3.0 ... True False 2026-05-26 09:53:29.287000+00:00 1 1 1 1 None 3.0 1
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3.0 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3.0 1
3 xtoeLCrISnlv0qnN0000 iris.parquet None .parquet dataset DataFrame 5202 kTxrzohIAV896P7ceYwYJA None 150.0 ... True False 2026-05-26 09:53:19.308000+00:00 1 1 1 1 None NaN 1
2 Dp5dO9exiCAQhfxa0000 my_image.jpg None .jpg None None 29358 r4tnqmKI_SjrkdLzpuWp4g None NaN ... True False 2026-05-26 09:53:19.015000+00:00 1 1 1 1 None NaN 1
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz None .fastq.gz None None 20 hi7ZmAzz8sfMd3vIQr-57Q None NaN ... True False 2026-05-26 09:53:18.691000+00:00 1 1 1 1 None NaN 1

5 rows × 21 columns

! truncated query result to limit=7 Feature objects
Feature
uid name _dtype_str unit description array_rank array_size array_shape synonyms default_value ... coerce is_locked is_type created_at branch_id created_on_id space_id created_by_id run_id type_id
id
9 4Sbk4ftywwJv study_metadata dict None None 0 0 None None None ... None False False 2026-05-26 09:53:20.794000+00:00 1 1 1 1 None None
8 TzC4jDn72QXU study_note str None None 0 0 None None None ... None False False 2026-05-26 09:53:20.787000+00:00 1 1 1 1 None None
7 JDoXntkcqUhC date_of_study date None None 0 0 None None None ... True False False 2026-05-26 09:53:20.781000+00:00 1 1 1 1 None None
6 6mcdD8qAS0Rj experiment cat[Record] None None 0 0 None None None ... None False False 2026-05-26 09:53:20.775000+00:00 1 1 1 1 None None
5 qtfTjQQxoXTU temperature float None None 0 0 None None None ... None False False 2026-05-26 09:53:20.769000+00:00 1 1 1 1 None None
4 INMg1bYOyVBe cell_type_by_model cat[bionty.CellType] None None 0 0 None None None ... None False False 2026-05-26 09:53:20.763000+00:00 1 1 1 1 None None
3 Y5VrrRf32eXh cell_type_by_expert cat[bionty.CellType] None None 0 0 None None None ... None False False 2026-05-26 09:53:20.757000+00:00 1 1 1 1 None None

7 rows × 21 columns

JsonValue
value hash is_locked created_at branch_id created_on_id space_id created_by_id run_id feature_id
id
7 {'detail1': '456', 'detail2': 2} QAU2Is6uXBBgz8zC_p-rAQ False 2026-05-26 09:53:29.347000+00:00 1 1 1 1 None 9
6 2025-02-13 SGTsR3XvXFi5jZ8UjC6YaQ False 2026-05-26 09:53:29.345000+00:00 1 1 1 1 None 7
5 22.6 54rmFUZH0WdllA5alp-64g False 2026-05-26 09:53:29.338000+00:00 1 1 1 1 None 5
4 {'detail1': '123', 'detail2': 1} nJ33A6k51yp-1ZlqFabWdw False 2026-05-26 09:53:26.179000+00:00 1 1 1 1 None 9
3 We had a great time performing this study and ... ixx1CqAyBO8WO7lLdLpqTg False 2026-05-26 09:53:26.177000+00:00 1 1 1 1 None 8
2 2024-12-01 gNXeOkGaab5bqWC7D--aHQ False 2026-05-26 09:53:26.175000+00:00 1 1 1 1 None 7
1 21.6 XftFE5byhwPHY-11WjfNAw False 2026-05-26 09:53:26.169000+00:00 1 1 1 1 None 5
Record
uid name description reference reference_type extra_data is_locked is_type created_at branch_id created_on_id space_id created_by_id type_id schema_id run_id
id
4 KmuqKZoMG0s3n5I3 Experiment 2 None None None None False False 2026-05-26 09:53:19.635000+00:00 1 1 1 1 None None None
3 DMcy2evBj8gC2xVN Experiment 1 None None None None False False 2026-05-26 09:53:19.635000+00:00 1 1 1 1 None None None
2 ZjHHdnlSMQ26Fham IFNG None None None None False False 2026-05-26 09:53:19.624000+00:00 1 1 1 1 None None None
1 7gksxA8Zucj5PoKz DMSO None None None None False False 2026-05-26 09:53:19.624000+00:00 1 1 1 1 None None None
Schema
uid name description n_members coerce flexible itype otype hash minimal_set ... maximal_set is_locked is_type created_at branch_id created_on_id space_id created_by_id run_id type_id
id
7 jdB1Q7KOp8gW7OPv None None 3.0 None False bionty.Gene.ensembl_gene_id None fSbuKqXueizoVnttx06vsw True ... False False False 2026-05-26 09:53:29.308000+00:00 1 1 1 1 None None
6 T4SiT46dbzIpjGq8 None None 2.0 None False Feature None 39VJ-Elna58POh_On9sv-g True ... False False False 2026-05-26 09:53:29.301000+00:00 1 1 1 1 None None
5 eGiBVcfvKsVQpyqM None None 3.0 None False bionty.Gene.ensembl_gene_id None P5KzXILi0TzYDHB82Pvt-w True ... False False False 2026-05-26 09:53:26.133000+00:00 1 1 1 1 None None
4 R0qyxtZYmPxc5U5e None None 4.0 None False Feature None 4NBvP9aRjhwVdfZjX4_few True ... False False False 2026-05-26 09:53:26.125000+00:00 1 1 1 1 None None
3 0000000000000002 anndata_ensembl_gene_ids_and_valid_features_in... None NaN None True None AnnData aqGWHvyY49W_PHELUMiBMw True ... False False False 2026-05-26 09:53:20.822000+00:00 1 1 1 1 None None
2 0000000000000001 valid_ensembl_gene_ids None NaN None True bionty.Gene.ensembl_gene_id None 1gocc_TJ1RU2bMwDRK-WUA True ... False False False 2026-05-26 09:53:20.815000+00:00 1 1 1 1 None None
1 0000000000000000 valid_features None NaN None True Feature None kMi7B_N88uu-YnbTLDU-DA True ... False False False 2026-05-26 09:53:20.807000+00:00 1 1 1 1 None None

7 rows × 21 columns

Storage
uid root description type region instance_uid is_locked created_at branch_id created_on_id space_id created_by_id run_id
id
1 ugc66VmkJv4R /home/runner/work/lamindb/lamindb/docs/test-re... None local None hlGq1WkbeSSf False 2026-05-26 09:53:15.672000+00:00 1 1 1 1 None
******************
* module: bionty *
******************
! truncated query result to limit=7 CellType objects
CellType
uid name ontology_id abbr synonyms description is_locked created_at branch_id created_on_id space_id created_by_id run_id source_id
id
16 2OTzqBTMlYe5n3 mature T cell CL:0002419 None CD3e-positive T cell|mature T-cell A T Cell That Expresses A T Cell Receptor Comp... False 2026-05-26 09:53:23.088000+00:00 1 1 1 1 None 26
15 4BEwsp1Qruxeii mature alpha-beta T cell CL:0000791 None mature alpha-beta T lymphocyte|mature alpha-be... A Alpha-Beta T Cell That Has A Mature Phenotype. False 2026-05-26 09:53:23.088000+00:00 1 1 1 1 None 26
14 6By01L04BqiLTW alpha-beta T cell CL:0000789 None alpha-beta T-cell|alpha-beta T lymphocyte|alph... A T Cell That Expresses An Alpha-Beta T Cell R... False 2026-05-26 09:53:23.088000+00:00 1 1 1 1 None 26
13 6IC9NGJEv2Y4TD CD8-positive, alpha-beta T cell CL:0000625 None CD8-positive, alpha-beta T-cell|CD8-positive, ... A T Cell Expressing An Alpha-Beta T Cell Recep... False 2026-05-26 09:53:22.541000+00:00 1 1 1 1 None 26
12 u3sr1GdfF3aIV9 nucleate cell CL:0002242 None None A Cell Containing At Least One Nucleus. False 2026-05-26 09:53:20.731000+00:00 1 1 1 1 None 26
11 4Ilrnj9ULJe69Z hematopoietic cell CL:0000988 None haemopoietic cell|hemopoietic cell|haematopoie... A Cell Of A Hematopoietic Lineage. False 2026-05-26 09:53:20.731000+00:00 1 1 1 1 None 26
10 7GpphKmr4cyIoB lymphocyte of B lineage CL:0000945 None None A Lymphocyte Of B Lineage With The Commitment ... False 2026-05-26 09:53:20.731000+00:00 1 1 1 1 None 26
Gene
uid abbr synonyms description symbol stable_id ensembl_gene_id ncbi_gene_ids biotype is_locked created_at branch_id created_on_id space_id created_by_id run_id source_id organism_id
id
4 iFxDa8hoEWuWi9 None CADPR1 CD38 molecule CD38 None ENSG00000004468 952 protein_coding False 2026-05-26 09:53:29.283000+00:00 1 1 1 1 None 7 1
3 3bhNYquOnA4sdo None CD14 molecule CD14 None ENSG00000170458 929 protein_coding False 2026-05-26 09:53:26.102000+00:00 1 1 1 1 None 7 1
2 1j4At3x7akJU8n None T4|LEU-3 CD4 molecule CD4 None ENSG00000010610 920 protein_coding False 2026-05-26 09:53:26.102000+00:00 1 1 1 1 None 7 1
1 6Aqvc8ckDYeNrD None CD8|CD8ALPHA|P32 CD8 subunit alpha CD8A None ENSG00000153563 925 protein_coding False 2026-05-26 09:53:26.102000+00:00 1 1 1 1 None 7 1
Organism
uid name ontology_id abbr synonyms description scientific_name is_locked created_at branch_id created_on_id space_id created_by_id run_id source_id
id
1 1dpCL6TduFJ3AP human NCBITaxon:9606 None None None Homo sapiens False 2026-05-26 09:53:22.094000+00:00 1 1 1 1 None 1
! truncated query result to limit=7 Source objects
Source
uid entity organism name version in_db currently_used description url md5 source_website is_locked created_at branch_id created_on_id space_id created_by_id run_id dataframe_artifact_id
id
43 5JnVODh4 BioSample all ncbi 2023-09 False True NCBI BioSample attributes s3://bionty-assets/df_all__ncbi__2023-09__BioS... None https://www.ncbi.nlm.nih.gov/biosample/docs/at... False 2026-05-26 09:53:16.149000+00:00 1 1 1 1 None None
42 7au3ZQrD bionty.Ethnicity human hancestro 2025-10-14 False True Human Ancestry Ontology http://purl.obolibrary.org/obo/hancestro/relea... None https://github.com/EBISPOT/hancestro False 2026-05-26 09:53:16.149000+00:00 1 1 1 1 None None
41 6na9vRls bionty.DevelopmentalStage mouse mmusdv 2025-01-23 False True Mouse Developmental Stages https://github.com/obophenotype/developmental-... None https://github.com/obophenotype/developmental-... False 2026-05-26 09:53:16.149000+00:00 1 1 1 1 None None
40 7JO1x6p1 bionty.DevelopmentalStage human hsapdv 2025-01-23 False True Human Developmental Stages https://github.com/obophenotype/developmental-... None https://github.com/obophenotype/developmental-... False 2026-05-26 09:53:16.149000+00:00 1 1 1 1 None None
39 1atB0WnU Drug all chebi 2024-07-27 False False Chemical Entities of Biological Interest s3://bionty-assets/df_all__chebi__2024-07-27__... None https://www.ebi.ac.uk/chebi/ False 2026-05-26 09:53:16.149000+00:00 1 1 1 1 None None
38 ugaIoIlj Drug all dron 2024-08-05 False True Drug Ontology http://purl.obolibrary.org/obo/dron/releases/2... None https://bioportal.bioontology.org/ontologies/DRON False 2026-05-26 09:53:16.149000+00:00 1 1 1 1 None None
37 3rm9aOzL BFXPipeline all lamin 1.0.0 False True Bioinformatics Pipeline s3://bionty-assets/df_all__lamin__1.0.0__BFXpi... None https://lamin.ai False 2026-05-26 09:53:16.149000+00:00 1 1 1 1 None None

Auto-complete objects

For registries with fewer than 100k objects, auto-completing on a Lookup object is the most convenient way of finding a record.

records = ln.Record.lookup()

With auto-complete, we find a record:

experiment_1 = records.experiment_1
experiment_1
Record(uid='DMcy2evBj8gC2xVN', is_type=False, name='Experiment 1', description=None, reference=None, reference_type=None, extra_data=None, branch_id=1, created_on_id=1, space_id=1, created_by_id=1, type_id=None, schema_id=None, run_id=None, created_at=2026-05-26 09:53:19 UTC, is_locked=False)
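
To make the naming concrete: the lookup above exposes the record named "Experiment 1" as the attribute experiment_1. The following is a minimal pure-Python sketch of that idea, not LaminDB's implementation. The helper names to_attribute_name and build_lookup are hypothetical, and the normalization rule (lowercase, non-alphanumeric runs become underscores) is an assumption that may differ from what the library does.

```python
import re
from types import SimpleNamespace


def to_attribute_name(name: str) -> str:
    """Turn a display name into a lowercase Python identifier (assumed rule)."""
    attr = re.sub(r"\W+", "_", name.strip().lower()).strip("_")
    # identifiers cannot start with a digit, so prefix those with an underscore
    return f"_{attr}" if attr[:1].isdigit() else attr


def build_lookup(names: list[str]) -> SimpleNamespace:
    """Expose each name as an attribute so that an editor can tab-complete it."""
    return SimpleNamespace(**{to_attribute_name(n): n for n in names})


# hypothetical stand-in for ln.Record.lookup(), using the record names from above
records = build_lookup(["Experiment 1", "Experiment 2", "IFNG", "DMSO"])
print(records.experiment_1)
```

Because every record becomes an attribute, this approach only stays practical for registries of moderate size, which is presumably why the guide recommends it below 100k objects.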

This works for any BaseSQLRecord class, including those of plugins such as bionty.

import bionty as bt

cell_types = bt.CellType.lookup()

Get one object

get() raises an error if no object or more than one object matches the query.

ln.Record.get(experiment_1.uid)  # by uid
ln.Record.get(name="Experiment 1")  # by field
Record(uid='DMcy2evBj8gC2xVN', is_type=False, name='Experiment 1', description=None, reference=None, reference_type=None, extra_data=None, branch_id=1, created_on_id=1, space_id=1, created_by_id=1, type_id=None, schema_id=None, run_id=None, created_at=2026-05-26 09:53:19 UTC, is_locked=False)

Query objects by fields

Use filter() to query all artifacts by the suffix field:

qs = ln.Artifact.filter(suffix=".h5ad")
qs
<ArtifactQuerySet [Artifact(uid='TzrkrqBbXDiqXA7g0000', key='examples/dataset1.h5ad', description=None, suffix='.h5ad', kind='dataset', otype='AnnData', size=31672, hash='FB3CeMjmg1ivN6HDy6wsSg', n_files=None, n_observations=3, branch_id=1, created_on_id=1, space_id=1, storage_id=1, run_id=None, schema_id=3, created_by_id=1, created_at=2026-05-26 09:53:26 UTC, is_locked=False, version_tag=None, is_latest=True), Artifact(uid='urVJ9qxb4sGuyp920000', key='examples/dataset2.h5ad', description=None, suffix='.h5ad', kind='dataset', otype='AnnData', size=26896, hash='RKJjWbINYNIwYU8BxCejMw', n_files=None, n_observations=3, branch_id=1, created_on_id=1, space_id=1, storage_id=1, run_id=None, schema_id=3, created_by_id=1, created_at=2026-05-26 09:53:29 UTC, is_locked=False, version_tag=None, is_latest=True)]>

This returns a QuerySet, which lazily references the set of BaseSQLRecord objects that matches the filter statement. You can iteratively filter a queryset:

qs = qs.filter(records__name="Experiment 1")
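
The lazy, chainable behavior described above can be illustrated with a toy class. This is only a sketch in plain Python under stated assumptions: LazyQuery is a hypothetical stand-in that filters in-memory dictionaries by equality, whereas a real QuerySet translates its filters into SQL.

```python
from dataclasses import dataclass
from typing import Callable, Iterator


@dataclass(frozen=True)
class LazyQuery:
    """Toy stand-in for a queryset: stores predicates, evaluates only on iteration."""

    rows: tuple[dict, ...]
    predicates: tuple[Callable[[dict], bool], ...] = ()

    def filter(self, **conditions) -> "LazyQuery":
        # chaining returns a new query object; no row is inspected yet
        def predicate(row: dict) -> bool:
            return all(row.get(k) == v for k, v in conditions.items())

        return LazyQuery(self.rows, self.predicates + (predicate,))

    def __iter__(self) -> Iterator[dict]:
        # evaluation happens here, when results are actually requested
        for row in self.rows:
            if all(p(row) for p in self.predicates):
                yield row


rows = (
    {"key": "examples/dataset1.h5ad", "suffix": ".h5ad", "experiment": "Experiment 1"},
    {"key": "examples/dataset2.h5ad", "suffix": ".h5ad", "experiment": "Experiment 2"},
    {"key": "iris.parquet", "suffix": ".parquet", "experiment": None},
)
qs = LazyQuery(rows).filter(suffix=".h5ad")
qs = qs.filter(experiment="Experiment 1")  # still nothing evaluated
print([row["key"] for row in qs])  # iteration triggers evaluation
```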

To access the results encoded in a queryset, call:

  • to_dataframe(): A pandas DataFrame with each record in a row.

  • one(): Exactly one record. Raises an error if there is none or more than one. Equivalent to the .get() method shown above.

  • one_or_none(): Either one record or None if there is no query result.

Alternatively,

  • use the QuerySet as an iterator

  • get individual objects via qs[0], qs[1]

For example:

qs.to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3 1

1 rows × 21 columns
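
The difference between one() and one_or_none() can be sketched in plain Python. This is an illustration of the semantics listed above, not the library's code: the functions operate on an ordinary sequence, LookupError is a stand-in for whatever exception the library raises, and it is an assumption that one_or_none() also raises when several records match.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def one(results: Sequence[T]) -> T:
    """Return the single result; raise if there is none or more than one."""
    if len(results) != 1:
        raise LookupError(f"expected exactly one result, found {len(results)}")
    return results[0]


def one_or_none(results: Sequence[T]) -> Optional[T]:
    """Return the single result, or None if the query result is empty."""
    if len(results) == 0:
        return None
    # assumption: several matches are still an error
    return one(results)


print(one(["examples/dataset1.h5ad"]))
print(one_or_none([]))
```

Use the one_or_none() pattern when an empty result is an expected outcome you want to branch on, and one() when an empty result indicates a bug.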

The SQLRecord classes in LaminDB are Django Models and any Django query works.

Query objects by features

The Artifact, Record, and Run registries can be queried by features, via an implicit lookup in the Feature registry:

ln.Artifact.filter(perturbation="DMSO").to_dataframe(include="features")
→ queried for all categorical features of dtypes Record or ULabel and non-categorical features: (7) ['perturbation', 'sample_note', 'temperature', 'experiment', 'date_of_study', 'study_note', 'study_metadata']
/tmp/ipykernel_3332/2020331585.py:1: FutureWarning: The default `to_dataframe(limit=...)` will change from 100 to 20 in lamindb 2.6.0. Pass `limit=100` to keep the current behavior or `limit=20` to adopt the future default now.
  ln.Artifact.filter(perturbation="DMSO").to_dataframe(include="features")
uid key perturbation temperature experiment date_of_study study_note study_metadata
id
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad {IFNG, DMSO} 22.6 Experiment 2 2025-02-13 NaN {'detail1': '456', 'detail2': 2}
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad {IFNG, DMSO} 21.6 Experiment 1 2024-12-01 We had a great time performing this study and ... {'detail1': '123', 'detail2': 1}

You can also retrieve the Feature object first and build the filter as an expression:

perturbation = ln.Feature.get(name="perturbation")  # can optionally pass a feature type to disambiguate
ln.Artifact.filter(perturbation == "DMSO")  # note this is now an expression using the == syntax
<ArtifactQuerySet [Artifact(uid='TzrkrqBbXDiqXA7g0000', key='examples/dataset1.h5ad', description=None, suffix='.h5ad', kind='dataset', otype='AnnData', size=31672, hash='FB3CeMjmg1ivN6HDy6wsSg', n_files=None, n_observations=3, branch_id=1, created_on_id=1, space_id=1, storage_id=1, run_id=None, schema_id=3, created_by_id=1, created_at=2026-05-26 09:53:26 UTC, is_locked=False, version_tag=None, is_latest=True), Artifact(uid='urVJ9qxb4sGuyp920000', key='examples/dataset2.h5ad', description=None, suffix='.h5ad', kind='dataset', otype='AnnData', size=26896, hash='RKJjWbINYNIwYU8BxCejMw', n_files=None, n_observations=3, branch_id=1, created_on_id=1, space_id=1, storage_id=1, run_id=None, schema_id=3, created_by_id=1, created_at=2026-05-26 09:53:29 UTC, is_locked=False, version_tag=None, is_latest=True)]>

Just like for fields holding dictionary values, you can query for dictionary keys in features whose dtype is dict:

ln.Artifact.filter(study_metadata__detail1="123").to_dataframe(include="features")
→ queried for all categorical features of dtypes Record or ULabel and non-categorical features: (7) ['perturbation', 'sample_note', 'temperature', 'experiment', 'date_of_study', 'study_note', 'study_metadata']
uid key perturbation temperature experiment date_of_study study_note study_metadata
id
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad {IFNG, DMSO} 21.6 Experiment 1 2024-12-01 We had a great time performing this study and ... {'detail1': '123', 'detail2': 1}

You can query for whether a dataset is annotated by a feature:

ln.Artifact.filter(perturbation__isnull=False).to_dataframe(include="features")
→ queried for all categorical features of dtypes Record or ULabel and non-categorical features: (7) ['perturbation', 'sample_note', 'temperature', 'experiment', 'date_of_study', 'study_note', 'study_metadata']
/tmp/ipykernel_3332/3455849368.py:1: FutureWarning: The default `to_dataframe(limit=...)` will change from 100 to 20 in lamindb 2.6.0. Pass `limit=100` to keep the current behavior or `limit=20` to adopt the future default now.
  ln.Artifact.filter(perturbation__isnull=False).to_dataframe(include="features")
uid key perturbation temperature experiment date_of_study study_note study_metadata
id
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad {IFNG, DMSO} 22.6 Experiment 2 2025-02-13 NaN {'detail1': '456', 'detail2': 2}
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad {IFNG, DMSO} 21.6 Experiment 1 2024-12-01 We had a great time performing this study and ... {'detail1': '123', 'detail2': 1}

Query runs by parameters

For an example of querying runs by parameters, see Track parameters & features.

Search for objects

You can search every registry via search(). For example, to search the Artifact registry:

ln.Artifact.search("iris").to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
3 xtoeLCrISnlv0qnN0000 iris.parquet None .parquet dataset DataFrame 5202 kTxrzohIAV896P7ceYwYJA None 150 ... True False 2026-05-26 09:53:19.308000+00:00 1 1 1 1 None None 1

1 rows × 21 columns

Here is more background on search and examples for searching the entire cell type ontology: How does search work?

Query related registries

Django has a double-underscore syntax to filter based on related tables.

This syntax enables you to traverse several layers of relations and leverage different comparators.

ln.Artifact.filter(created_by__handle__startswith="testuse").to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad None .h5ad dataset AnnData 26896 RKJjWbINYNIwYU8BxCejMw None 3.0 ... True False 2026-05-26 09:53:29.287000+00:00 1 1 1 1 None 3.0 1
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3.0 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3.0 1
3 xtoeLCrISnlv0qnN0000 iris.parquet None .parquet dataset DataFrame 5202 kTxrzohIAV896P7ceYwYJA None 150.0 ... True False 2026-05-26 09:53:19.308000+00:00 1 1 1 1 None NaN 1
2 Dp5dO9exiCAQhfxa0000 my_image.jpg None .jpg None None 29358 r4tnqmKI_SjrkdLzpuWp4g None NaN ... True False 2026-05-26 09:53:19.015000+00:00 1 1 1 1 None NaN 1
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz None .fastq.gz None None 20 hi7ZmAzz8sfMd3vIQr-57Q None NaN ... True False 2026-05-26 09:53:18.691000+00:00 1 1 1 1 None NaN 1

5 rows × 21 columns

The filter selects all artifacts based on the users who ran the generating notebook. Under the hood, in the SQL database, it’s joining the artifact table with the user table.
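
The join mentioned above can be reproduced on a miniature scale with Python's built-in sqlite3 module. This is a hedged sketch: the two tables and their columns are simplified assumptions rather than LaminDB's actual database layout, and the SQL is hand-written rather than what Django generates.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    -- simplified, hypothetical tables
    CREATE TABLE user (id INTEGER PRIMARY KEY, handle TEXT);
    CREATE TABLE artifact (
        id INTEGER PRIMARY KEY,
        key TEXT,
        created_by_id INTEGER REFERENCES user (id)
    );
    INSERT INTO user VALUES (1, 'testuser1'), (2, 'someone_else');
    INSERT INTO artifact VALUES
        (1, 'raw/my_fastq.fastq.gz', 1),
        (2, 'my_image.jpg', 1),
        (3, 'other.csv', 2);
    """
)

# roughly what created_by__handle__startswith="testuse" expresses:
# follow the foreign key to the user table, then match on a prefix
rows = con.execute(
    """
    SELECT artifact.key
    FROM artifact
    JOIN user ON artifact.created_by_id = user.id
    WHERE user.handle LIKE 'testuse%'
    ORDER BY artifact.id
    """
).fetchall()
print([key for (key,) in rows])
```

Each double-underscore segment that names a relation corresponds to one join, and the final segment (here startswith) selects the comparison applied to the last field.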

Another typical example is querying all datasets that measure a particular feature, for instance, all datasets that measure "CD8A":

cd8a = bt.Gene.get(symbol="CD8A")
# query for all feature sets that contain CD8A
schemas_with_cd8a = ln.Schema.filter(genes=cd8a)
# get all artifacts
ln.Artifact.filter(schemas__in=schemas_with_cd8a).to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad None .h5ad dataset AnnData 26896 RKJjWbINYNIwYU8BxCejMw None 3 ... True False 2026-05-26 09:53:29.287000+00:00 1 1 1 1 None 3 1
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3 1

2 rows × 21 columns

Instead of splitting this across three queries, the double-underscore syntax allows you to define a path for one query.

ln.Artifact.filter(schemas__genes__symbol="CD8A").to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad None .h5ad dataset AnnData 26896 RKJjWbINYNIwYU8BxCejMw None 3 ... True False 2026-05-26 09:53:29.287000+00:00 1 1 1 1 None 3 1
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3 1

2 rows × 21 columns
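
To see why the stepwise version and the single-path version return the same artifacts, here is a small sqlite3 sketch of a traversal across two many-to-many relations. The table names (artifact_feature_set, feature_set_gene) and the sample rows are assumptions made for illustration and do not mirror LaminDB's real link tables.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    -- hypothetical tables: artifacts link to feature sets, feature sets link to genes
    CREATE TABLE artifact (id INTEGER PRIMARY KEY, key TEXT);
    CREATE TABLE gene (id INTEGER PRIMARY KEY, symbol TEXT);
    CREATE TABLE artifact_feature_set (artifact_id INTEGER, feature_set_id INTEGER);
    CREATE TABLE feature_set_gene (feature_set_id INTEGER, gene_id INTEGER);
    INSERT INTO artifact VALUES
        (3, 'other.parquet'), (4, 'examples/dataset1.h5ad'), (5, 'examples/dataset2.h5ad');
    INSERT INTO gene VALUES (1, 'CD8A'), (2, 'CD38');
    INSERT INTO feature_set_gene VALUES (10, 1), (11, 1), (11, 2), (12, 2);
    INSERT INTO artifact_feature_set VALUES (4, 10), (5, 11), (3, 12);
    """
)

# stepwise: resolve the gene, then the feature sets, then the artifacts
stepwise = con.execute(
    """
    SELECT key FROM artifact WHERE id IN (
        SELECT artifact_id FROM artifact_feature_set WHERE feature_set_id IN (
            SELECT feature_set_id FROM feature_set_gene WHERE gene_id IN (
                SELECT id FROM gene WHERE symbol = 'CD8A'
            )
        )
    )
    ORDER BY key
    """
).fetchall()

# single path: one query that joins along the same relations
single_path = con.execute(
    """
    SELECT DISTINCT artifact.key
    FROM artifact
    JOIN artifact_feature_set AS afs ON afs.artifact_id = artifact.id
    JOIN feature_set_gene AS fsg ON fsg.feature_set_id = afs.feature_set_id
    JOIN gene ON gene.id = fsg.gene_id
    WHERE gene.symbol = 'CD8A'
    ORDER BY artifact.key
    """
).fetchall()
print(stepwise == single_path)
```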

Filter operators

You can qualify the type of comparison in a query by using a comparator.

Below is a list of the most important ones; Django supports about two dozen field comparators of the form field__comparator=value.
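
As a rough mental model of the field__comparator=value convention, the following toy matcher splits each keyword at the double underscore and applies the named comparison to an in-memory row. It is a sketch only: COMPARATORS and matches are hypothetical names, the matcher supports a handful of comparators and no relation traversal, and Django itself compiles such conditions into SQL instead.

```python
from typing import Any, Callable

# a few comparators, named after their Django counterparts
COMPARATORS: dict[str, Callable[[Any, Any], bool]] = {
    "exact": lambda value, ref: value == ref,
    "gt": lambda value, ref: value is not None and value > ref,
    "lt": lambda value, ref: value is not None and value < ref,
    "in": lambda value, ref: value in ref,
    "startswith": lambda value, ref: isinstance(value, str) and value.startswith(ref),
    "isnull": lambda value, ref: (value is None) == ref,
}


def matches(row: dict, **conditions: Any) -> bool:
    """Check a row against field__comparator=value conditions (all must hold)."""
    for lookup, ref in conditions.items():
        field, _, comparator = lookup.partition("__")
        # a bare field name means an exact comparison
        if not COMPARATORS[comparator or "exact"](row.get(field), ref):
            return False
    return True


rows = [
    {"key": "my_image.jpg", "suffix": ".jpg", "size": 29358},
    {"key": "raw/my_fastq.fastq.gz", "suffix": ".fastq.gz", "size": 20},
    {"key": "examples/dataset1.h5ad", "suffix": ".h5ad", "size": 31672},
]
selected = [r["key"] for r in rows if matches(r, suffix__in=[".jpg", ".fastq.gz"], size__gt=1e4)]
print(selected)
```

Passing several conditions at once combines them with a logical and, which mirrors how multiple keyword arguments to filter() behave.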

and

ln.Artifact.filter(suffix=".h5ad", records=experiment_1).to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3 1

1 rows × 21 columns

less than / greater than

Or subset to artifacts larger than 10 kB by appending the __gt comparator to the size field.

ln.Artifact.filter(records=experiment_1, size__gt=1e4).to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3 1

1 rows × 21 columns

in

ln.Artifact.filter(suffix__in=[".jpg", ".fastq.gz"]).to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
2 Dp5dO9exiCAQhfxa0000 my_image.jpg None .jpg None None 29358 r4tnqmKI_SjrkdLzpuWp4g None None ... True False 2026-05-26 09:53:19.015000+00:00 1 1 1 1 None None 1
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz None .fastq.gz None None 20 hi7ZmAzz8sfMd3vIQr-57Q None None ... True False 2026-05-26 09:53:18.691000+00:00 1 1 1 1 None None 1

2 rows × 21 columns

order by

ln.Artifact.filter().order_by("created_at").to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz None .fastq.gz None None 20 hi7ZmAzz8sfMd3vIQr-57Q None NaN ... True False 2026-05-26 09:53:18.691000+00:00 1 1 1 1 None NaN 1
2 Dp5dO9exiCAQhfxa0000 my_image.jpg None .jpg None None 29358 r4tnqmKI_SjrkdLzpuWp4g None NaN ... True False 2026-05-26 09:53:19.015000+00:00 1 1 1 1 None NaN 1
3 xtoeLCrISnlv0qnN0000 iris.parquet None .parquet dataset DataFrame 5202 kTxrzohIAV896P7ceYwYJA None 150.0 ... True False 2026-05-26 09:53:19.308000+00:00 1 1 1 1 None NaN 1
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3.0 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3.0 1
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad None .h5ad dataset AnnData 26896 RKJjWbINYNIwYU8BxCejMw None 3.0 ... True False 2026-05-26 09:53:29.287000+00:00 1 1 1 1 None 3.0 1

5 rows × 21 columns

# reverse ordering
ln.Artifact.filter().order_by("-created_at").to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad None .h5ad dataset AnnData 26896 RKJjWbINYNIwYU8BxCejMw None 3.0 ... True False 2026-05-26 09:53:29.287000+00:00 1 1 1 1 None 3.0 1
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3.0 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3.0 1
3 xtoeLCrISnlv0qnN0000 iris.parquet None .parquet dataset DataFrame 5202 kTxrzohIAV896P7ceYwYJA None 150.0 ... True False 2026-05-26 09:53:19.308000+00:00 1 1 1 1 None NaN 1
2 Dp5dO9exiCAQhfxa0000 my_image.jpg None .jpg None None 29358 r4tnqmKI_SjrkdLzpuWp4g None NaN ... True False 2026-05-26 09:53:19.015000+00:00 1 1 1 1 None NaN 1
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz None .fastq.gz None None 20 hi7ZmAzz8sfMd3vIQr-57Q None NaN ... True False 2026-05-26 09:53:18.691000+00:00 1 1 1 1 None NaN 1

5 rows × 21 columns

ln.Artifact.filter().order_by("key").to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3.0 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3.0 1
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad None .h5ad dataset AnnData 26896 RKJjWbINYNIwYU8BxCejMw None 3.0 ... True False 2026-05-26 09:53:29.287000+00:00 1 1 1 1 None 3.0 1
3 xtoeLCrISnlv0qnN0000 iris.parquet None .parquet dataset DataFrame 5202 kTxrzohIAV896P7ceYwYJA None 150.0 ... True False 2026-05-26 09:53:19.308000+00:00 1 1 1 1 None NaN 1
2 Dp5dO9exiCAQhfxa0000 my_image.jpg None .jpg None None 29358 r4tnqmKI_SjrkdLzpuWp4g None NaN ... True False 2026-05-26 09:53:19.015000+00:00 1 1 1 1 None NaN 1
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz None .fastq.gz None None 20 hi7ZmAzz8sfMd3vIQr-57Q None NaN ... True False 2026-05-26 09:53:18.691000+00:00 1 1 1 1 None NaN 1

5 rows × 21 columns

# reverse ordering
ln.Artifact.filter().order_by("-key").to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz None .fastq.gz None None 20 hi7ZmAzz8sfMd3vIQr-57Q None NaN ... True False 2026-05-26 09:53:18.691000+00:00 1 1 1 1 None NaN 1
2 Dp5dO9exiCAQhfxa0000 my_image.jpg None .jpg None None 29358 r4tnqmKI_SjrkdLzpuWp4g None NaN ... True False 2026-05-26 09:53:19.015000+00:00 1 1 1 1 None NaN 1
3 xtoeLCrISnlv0qnN0000 iris.parquet None .parquet dataset DataFrame 5202 kTxrzohIAV896P7ceYwYJA None 150.0 ... True False 2026-05-26 09:53:19.308000+00:00 1 1 1 1 None NaN 1
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad None .h5ad dataset AnnData 26896 RKJjWbINYNIwYU8BxCejMw None 3.0 ... True False 2026-05-26 09:53:29.287000+00:00 1 1 1 1 None 3.0 1
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3.0 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3.0 1

5 rows × 21 columns
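
The leading minus sign in order_by("-key") reverses the sort direction. The following pure-Python sketch reproduces that convention on the ids and keys from the output above. The order_by function here is a hypothetical stand-in written for illustration, not the LaminDB method.

```python
# Ids and keys copied from the output above.
artifacts = [
    {"id": 1, "key": "raw/my_fastq.fastq.gz"},
    {"id": 2, "key": "my_image.jpg"},
    {"id": 3, "key": "iris.parquet"},
    {"id": 4, "key": "examples/dataset1.h5ad"},
    {"id": 5, "key": "examples/dataset2.h5ad"},
]


def order_by(rows, field):
    """Sort rows by a field name; a leading "-" means descending order."""
    descending = field.startswith("-")
    name = field[1:] if descending else field
    return sorted(rows, key=lambda row: row[name], reverse=descending)


ascending_ids = [row["id"] for row in order_by(artifacts, "key")]
descending_ids = [row["id"] for row in order_by(artifacts, "-key")]
print(ascending_ids)   # [4, 5, 3, 2, 1]
print(descending_ids)  # [1, 2, 3, 5, 4]
```

The two id sequences match the row order of the order_by("key") and order_by("-key") outputs above.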

contains

ln.Transform.filter(description__contains="search").to_dataframe().head(5)
uid id key description kind source_code hash reference reference_type version_tag is_latest is_locked created_at branch_id created_on_id space_id environment_id plan_id created_by_id

And case-insensitive:

ln.Transform.filter(description__icontains="Search").to_dataframe().head(5)
uid id key description kind source_code hash reference reference_type version_tag is_latest is_locked created_at branch_id created_on_id space_id environment_id plan_id created_by_id

startswith

ln.Transform.filter(description__startswith="Query").to_dataframe()
uid id key description kind source_code hash reference reference_type version_tag is_latest is_locked created_at branch_id created_on_id space_id environment_id plan_id created_by_id
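
The three queries above return empty tables in this example instance, so the difference between the string lookups is not visible in the output. The sketch below models their intended semantics in plain Python. The descriptions are hypothetical, invented only for illustration, and the three helper functions are stand-ins rather than LaminDB code. Actual case sensitivity can also depend on the database backend: Django's documentation notes that on SQLite, contains behaves like icontains.

```python
# Hypothetical descriptions, invented for illustration only.
descriptions = [
    "Query and search registries",
    "search for artifacts",
    "Track notebooks",
]


def contains(text, fragment):
    """Case-sensitive substring match."""
    return fragment in text


def icontains(text, fragment):
    """Case-insensitive substring match."""
    return fragment.casefold() in text.casefold()


def startswith(text, prefix):
    """Case-sensitive prefix match."""
    return text.startswith(prefix)


case_sensitive = [d for d in descriptions if contains(d, "Search")]
case_insensitive = [d for d in descriptions if icontains(d, "Search")]
with_prefix = [d for d in descriptions if startswith(d, "Query")]

print(case_sensitive)    # []
print(case_insensitive)  # ['Query and search registries', 'search for artifacts']
print(with_prefix)       # ['Query and search registries']
```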

or

ln.Artifact.filter(ln.Q(suffix=".jpg") | ln.Q(suffix=".fastq.gz")).to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
2 Dp5dO9exiCAQhfxa0000 my_image.jpg None .jpg None None 29358 r4tnqmKI_SjrkdLzpuWp4g None None ... True False 2026-05-26 09:53:19.015000+00:00 1 1 1 1 None None 1
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz None .fastq.gz None None 20 hi7ZmAzz8sfMd3vIQr-57Q None None ... True False 2026-05-26 09:53:18.691000+00:00 1 1 1 1 None None 1

2 rows × 21 columns

negate / unequal

ln.Artifact.filter(~ln.Q(suffix=".jpg")).to_dataframe()
uid key description suffix kind otype size hash n_files n_observations ... is_latest is_locked created_at branch_id created_on_id space_id storage_id run_id schema_id created_by_id
id
5 urVJ9qxb4sGuyp920000 examples/dataset2.h5ad None .h5ad dataset AnnData 26896 RKJjWbINYNIwYU8BxCejMw None 3.0 ... True False 2026-05-26 09:53:29.287000+00:00 1 1 1 1 None 3.0 1
4 TzrkrqBbXDiqXA7g0000 examples/dataset1.h5ad None .h5ad dataset AnnData 31672 FB3CeMjmg1ivN6HDy6wsSg None 3.0 ... True False 2026-05-26 09:53:26.105000+00:00 1 1 1 1 None 3.0 1
3 xtoeLCrISnlv0qnN0000 iris.parquet None .parquet dataset DataFrame 5202 kTxrzohIAV896P7ceYwYJA None 150.0 ... True False 2026-05-26 09:53:19.308000+00:00 1 1 1 1 None NaN 1
1 hz2AhgywWxzsQHwD0000 raw/my_fastq.fastq.gz None .fastq.gz None None 20 hi7ZmAzz8sfMd3vIQr-57Q None NaN ... True False 2026-05-26 09:53:18.691000+00:00 1 1 1 1 None NaN 1

4 rows × 21 columns
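
The | and ~ operators used with ln.Q above compose conditions before the query runs. As a rough mental model, the sketch below builds a small predicate class that supports the same operators in plain Python. Cond, suffix_is, and select are hypothetical names introduced for this illustration and are not part of LaminDB. The rows reuse the ids and suffixes of the artifacts on this page.

```python
class Cond:
    """Hypothetical stand-in for a Q object: wraps a test on one row."""

    def __init__(self, test):
        self.test = test

    def __or__(self, other):
        return Cond(lambda row: self.test(row) or other.test(row))

    def __and__(self, other):
        return Cond(lambda row: self.test(row) and other.test(row))

    def __invert__(self):
        return Cond(lambda row: not self.test(row))


def suffix_is(value):
    """Condition that holds when a row's suffix equals `value`."""
    return Cond(lambda row: row["suffix"] == value)


def select(rows, cond):
    """Return the ids of all rows that satisfy the condition."""
    return [row["id"] for row in rows if cond.test(row)]


# Ids and suffixes of the artifacts shown on this page.
artifacts = [
    {"id": 1, "suffix": ".fastq.gz"},
    {"id": 2, "suffix": ".jpg"},
    {"id": 3, "suffix": ".parquet"},
    {"id": 4, "suffix": ".h5ad"},
    {"id": 5, "suffix": ".h5ad"},
]

either = select(artifacts, suffix_is(".jpg") | suffix_is(".fastq.gz"))
not_jpg = select(artifacts, ~suffix_is(".jpg"))
neither = select(artifacts, ~suffix_is(".jpg") & ~suffix_is(".fastq.gz"))

print(either)   # [1, 2]
print(not_jpg)  # [1, 3, 4, 5]
print(neither)  # [3, 4, 5]
```

The or-combination selects the same two artifacts as the suffix__in query shown earlier, and the negation selects the same four artifacts as the ~ln.Q(suffix=".jpg") query, although this sketch returns them in list order rather than database order.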

© 2026 Lamin Labs · Hub · Docs · Security · Blog · Contact · About · Legal · Imprint · LinkedIn · X · GitHub