Organism

lamindb provides access to the following public Organism ontologies through bionty:

  1. Ensembl Species

  2. NCBI Taxonomy

Here we show how to access and search Organism ontologies to standardize new data.

import bionty as bt
import pandas as pd
 connected lamindb: testuser1/test-public-ontologies

PublicOntology objects

Let us create a public ontology accessor with the .public() method, which chooses a default public ontology source from Source. The result is a PublicOntology object, which you can think of as a public registry:

organisms = bt.Organism.public(organism="vertebrates")
organisms
PublicOntology
Entity: Organism
Organism: vertebrates
Source: ensembl, release-112
#terms: 324

As for registries, you can export the ontology as a DataFrame:

df = organisms.to_dataframe()
df.head()
scientific_name division ontology_id assembly assembly_accession genebuild variation microarray pan_compara peptide_compara genome_alignments other_alignments core_db species_id synonyms
name
spiny chromis Acanthochromis polyacanthus EnsemblVertebrates NCBITaxon:80966 ASM210954v1 GCA_002109545.1 2018-05-Ensembl/2020-03 N N N Y Y Y acanthochromis_polyacanthus_core_112_1 1 None
eurasian sparrowhawk Accipiter nisus EnsemblVertebrates NCBITaxon:211598 Accipiter_nisus_ver1.0 GCA_004320145.1 2019-07-Ensembl/2019-09 N N N N N Y accipiter_nisus_core_112_1 1 None
giant panda Ailuropoda melanoleuca EnsemblVertebrates NCBITaxon:9646 ASM200744v2 GCA_002007445.2 2020-05-Ensembl/2020-06 N N N Y Y Y ailuropoda_melanoleuca_core_112_2 1 None
yellow-billed parrot Amazona collaria EnsemblVertebrates NCBITaxon:241587 ASM394721v1 GCA_003947215.1 2019-07-Ensembl/2019-09 N N N N N Y amazona_collaria_core_112_1 1 None
midas cichlid Amphilophus citrinellus EnsemblVertebrates NCBITaxon:61819 Midas_v5 GCA_000751415.1 2018-05-Ensembl/2018-07 N N N Y Y Y amphilophus_citrinellus_core_112_5 1 None

Unlike registries, you can also export it as a Pronto object via public.ontology.

Look up terms

As for registries, terms can be looked up with auto-complete:

lookup = organisms.lookup()

The . accessor exposes normalized keys (lower case, containing only alphanumeric characters and underscores):

lookup.giant_panda
Organism(name='giant panda', scientific_name='Ailuropoda melanoleuca', division='EnsemblVertebrates', ontology_id='NCBITaxon:9646', assembly='ASM200744v2', assembly_accession='GCA_002007445.2', genebuild='2020-05-Ensembl/2020-06', variation='N', microarray='N', pan_compara='N', peptide_compara='Y', genome_alignments='Y', other_alignments='Y', core_db='ailuropoda_melanoleuca_core_112_2', species_id=1, synonyms=None)
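To make the normalization rule concrete, here is a minimal sketch in plain Python. It is an illustration of the rule as stated above, not bionty's actual implementation; `normalize_key` and `lookup_sketch` are hypothetical names introduced only for this example.

```python
import re
from types import SimpleNamespace


def normalize_key(term: str) -> str:
    """Lower-case a term and collapse runs of other characters into underscores."""
    # Assumption: any character that is not a letter or digit becomes "_".
    return re.sub(r"[^0-9a-zA-Z]+", "_", term).strip("_").lower()


# Two names taken from the table shown earlier on this page.
names = ["giant panda", "spiny chromis"]

# Attribute-style access by normalized key, returning the original string.
lookup_sketch = SimpleNamespace(**{normalize_key(n): n for n in names})

print(normalize_key("giant panda"))
print(lookup_sketch.spiny_chromis)
```

Normalized keys are convenient for auto-completion, but they lose information, such as the space in "giant panda", which is why the original strings remain useful as dictionary keys.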

To look up the exact original strings, convert the lookup object to a dict and use the [] accessor:

lookup_dict = lookup.dict()
lookup_dict["giant panda"]
Organism(name='giant panda', scientific_name='Ailuropoda melanoleuca', division='EnsemblVertebrates', ontology_id='NCBITaxon:9646', assembly='ASM200744v2', assembly_accession='GCA_002007445.2', genebuild='2020-05-Ensembl/2020-06', variation='N', microarray='N', pan_compara='N', peptide_compara='Y', genome_alignments='Y', other_alignments='Y', core_db='ailuropoda_melanoleuca_core_112_2', species_id=1, synonyms=None)

By default, the name field is used to generate lookup keys. You can specify another field to look up:

lookup = organisms.lookup(organisms.scientific_name)
lookup.ailuropoda_melanoleuca
Organism(name='giant panda', scientific_name='Ailuropoda melanoleuca', division='EnsemblVertebrates', ontology_id='NCBITaxon:9646', assembly='ASM200744v2', assembly_accession='GCA_002007445.2', genebuild='2020-05-Ensembl/2020-06', variation='N', microarray='N', pan_compara='N', peptide_compara='Y', genome_alignments='Y', other_alignments='Y', core_db='ailuropoda_melanoleuca_core_112_2', species_id=1, synonyms=None)

Search terms

Search behaves in the same way as it does for registries:

organisms.search("rabbit").head(3)
name scientific_name division assembly assembly_accession genebuild variation microarray pan_compara peptide_compara genome_alignments other_alignments core_db species_id synonyms
ontology_id
NCBITaxon:9986 rabbit Oryctolagus cuniculus EnsemblVertebrates OryCun2.0 GCA_000003625.1 2009-11-Ensembl/2019-05 N Y N Y Y Y oryctolagus_cuniculus_core_112_2 1 None

By default, search also covers synonyms and all other fields containing strings:

organisms.search("sapiens").head(3)
name scientific_name division assembly assembly_accession genebuild variation microarray pan_compara peptide_compara genome_alignments other_alignments core_db species_id synonyms
ontology_id
NCBITaxon:9606 human Homo sapiens EnsemblVertebrates GRCh38.p14 GCA_000001405.29 2014-01-Ensembl/2023-12 Y Y Y Y Y Y homo_sapiens_core_112_38 1 None

Search a specific field (by default, search covers all fields containing strings):

organisms.search(
    "oryctolagus_cuniculus",
    field=organisms.scientific_name,
).head()
name scientific_name division assembly assembly_accession genebuild variation microarray pan_compara peptide_compara genome_alignments other_alignments core_db species_id synonyms
ontology_id
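The result above is empty. One plausible reading is that the query string uses an underscore, while the stored scientific name ("Oryctolagus cuniculus") uses a space, so restricting the search to that field leaves nothing to match. The sketch below models this with a plain case-insensitive substring match on a single record copied from the rabbit row. The `matches` helper is hypothetical and is much simpler than bionty's real search, whose matching and ranking rules are not reproduced here.

```python
# One record, with values copied from the rabbit row shown earlier.
rabbit = {
    "name": "rabbit",
    "scientific_name": "Oryctolagus cuniculus",
    "core_db": "oryctolagus_cuniculus_core_112_2",
}


def matches(record, query, field=None):
    """Case-insensitive substring match on one field, or on all string fields."""
    fields = [field] if field is not None else list(record)
    q = query.lower()
    return any(
        isinstance(record[f], str) and q in record[f].lower() for f in fields
    )


# Across all string fields, the underscore form matches via core_db.
print(matches(rabbit, "oryctolagus_cuniculus"))
# Restricted to scientific_name, the underscore form does not match.
print(matches(rabbit, "oryctolagus_cuniculus", field="scientific_name"))
# The space-separated form matches the stored scientific name.
print(matches(rabbit, "oryctolagus cuniculus", field="scientific_name"))
```

In this toy model, the practical point is to query a field using the same spelling that the field stores.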

Standardize Organism identifiers

Let us generate a DataFrame that stores a number of Organism identifiers, some of which are corrupted:

df_orig = pd.DataFrame(
    index=[
        "spiny chromis",
        "silver-eye",
        "platyfish",
        "california sea lion",
        "This organism does not exist",
    ]
)
df_orig
spiny chromis
silver-eye
platyfish
california sea lion
This organism does not exist

We can check which of our values validate against the reference ontology:

validated = organisms.validate(df_orig.index, organisms.name)
df_orig.index[~validated]
! 1 unique term (20.00%) is not validated: 'This organism does not exist'
Index(['This organism does not exist'], dtype='object')
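Conceptually, validation is an exact membership test of each value against the reference names, returning one boolean per input. The sketch below reproduces that logic in plain Python. `reference_names` is a hypothetical four-name subset of the ontology, assumed from the output above, which reports only one unvalidated term. It is not how bionty implements validate.

```python
# Hypothetical subset of reference names (assumed valid per the output above).
reference_names = {
    "spiny chromis",
    "silver-eye",
    "platyfish",
    "california sea lion",
}

queries = [
    "spiny chromis",
    "silver-eye",
    "platyfish",
    "california sea lion",
    "This organism does not exist",
]

# One boolean per query, in input order (a mask over the inputs).
validated = [q in reference_names for q in queries]

# Invert the mask to recover the values that failed validation.
not_validated = [q for q, ok in zip(queries, validated) if not ok]

# Share of unique terms that failed, as a percentage.
share = 100 * len(set(not_validated)) / len(set(queries))

print(validated)
print(f"{share:.2f}% not validated: {not_validated}")
```

Because the mask keeps the input order, it can be used directly to index the original values, as done with `df_orig.index[~validated]` above.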

Ontology source versions

For any given entity, we can choose from a number of versions:

bt.Source.filter(entity="bionty.Organism").to_dataframe()
uid entity organism name in_db currently_used description url md5 source_website space_id dataframe_artifact_id version run_id created_at created_by_id _aux branch_id
id
1 33TUF039 bionty.Organism vertebrates ensembl False True Ensembl https://ftp.ensembl.org/pub/release-112/specie... None https://www.ensembl.org 1 None release-112 None 2025-09-18 06:40:29.074000+00:00 1 None 1
2 6bbVUTCS bionty.Organism bacteria ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacte... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
3 6s9nV6xh bionty.Organism fungi ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/fungi... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
4 2PmTrc8x bionty.Organism metazoa ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/metaz... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
5 7GPHh16S bionty.Organism plants ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/plant... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
6 4tsksCMX bionty.Organism all ncbitaxon False True NCBItaxon Ontology http://purl.obolibrary.org/obo/ncbitaxon/2023-... None https://github.com/obophenotype/ncbitaxon 1 None 2023-06-20 None 2025-09-18 06:40:29.074000+00:00 1 None 1
# only lists the sources that are currently used
bt.Source.filter(entity="bionty.Organism", currently_used=True).to_dataframe()
uid entity organism name in_db currently_used description url md5 source_website space_id dataframe_artifact_id version run_id created_at created_by_id _aux branch_id
id
1 33TUF039 bionty.Organism vertebrates ensembl False True Ensembl https://ftp.ensembl.org/pub/release-112/specie... None https://www.ensembl.org 1 None release-112 None 2025-09-18 06:40:29.074000+00:00 1 None 1
2 6bbVUTCS bionty.Organism bacteria ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacte... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
3 6s9nV6xh bionty.Organism fungi ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/fungi... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
4 2PmTrc8x bionty.Organism metazoa ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/metaz... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
5 7GPHh16S bionty.Organism plants ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/plant... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
6 4tsksCMX bionty.Organism all ncbitaxon False True NCBItaxon Ontology http://purl.obolibrary.org/obo/ncbitaxon/2023-... None https://github.com/obophenotype/ncbitaxon 1 None 2023-06-20 None 2025-09-18 06:40:29.074000+00:00 1 None 1

When creating a PublicOntology object, we can choose a source or version:

source = bt.Source.filter(
    name="ensembl", organism="vertebrates"
).first()
organisms = bt.Organism.public(source=source)
organisms
PublicOntology
Entity: Organism
Organism: vertebrates
Source: ensembl, release-112
#terms: 324
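The selection step above filters the source table by field values and takes the first match. The following sketch models that idea with a few rows copied from the table of Organism sources. `SourceRow` and `first_match` are hypothetical stand-ins for illustration, not part of bionty or lamindb.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceRow:
    entity: str
    organism: str
    name: str
    version: str
    currently_used: bool


# Three rows copied from the Organism source table shown earlier.
rows = [
    SourceRow("bionty.Organism", "vertebrates", "ensembl", "release-112", True),
    SourceRow("bionty.Organism", "bacteria", "ensembl", "release-57", True),
    SourceRow("bionty.Organism", "all", "ncbitaxon", "2023-06-20", True),
]


def first_match(rows, **criteria) -> Optional[SourceRow]:
    """Return the first row whose fields equal all given criteria, else None."""
    for row in rows:
        if all(getattr(row, key) == value for key, value in criteria.items()):
            return row
    return None


picked = first_match(rows, name="ensembl", organism="vertebrates")
print(picked.version)

# In this sketch, a filter that matches nothing yields None, so check before use.
missing = first_match(rows, name="ensembl", organism="archaea")
print(missing)
```

Adding a version criterion to the filter narrows the choice to one specific release when several versions of the same source are listed.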

The currently used ontologies can be displayed using:

bt.Source.filter(currently_used=True).to_dataframe()
uid entity organism name in_db currently_used description url md5 source_website space_id dataframe_artifact_id version run_id created_at created_by_id _aux branch_id
id
1 33TUF039 bionty.Organism vertebrates ensembl False True Ensembl https://ftp.ensembl.org/pub/release-112/specie... None https://www.ensembl.org 1 None release-112 None 2025-09-18 06:40:29.074000+00:00 1 None 1
2 6bbVUTCS bionty.Organism bacteria ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacte... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
3 6s9nV6xh bionty.Organism fungi ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/fungi... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
4 2PmTrc8x bionty.Organism metazoa ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/metaz... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
5 7GPHh16S bionty.Organism plants ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/plant... None https://www.ensembl.org 1 None release-57 None 2025-09-18 06:40:29.074000+00:00 1 None 1
6 4tsksCMX bionty.Organism all ncbitaxon False True NCBItaxon Ontology http://purl.obolibrary.org/obo/ncbitaxon/2023-... None https://github.com/obophenotype/ncbitaxon 1 None 2023-06-20 None 2025-09-18 06:40:29.074000+00:00 1 None 1
7 4UGNz3fr bionty.Gene human ensembl False True Ensembl s3://bionty-assets/df_human__ensembl__release-... None https://www.ensembl.org 1 None release-112 None 2025-09-18 06:40:29.074000+00:00 1 None 1
8 4r4fvV0S bionty.Gene mouse ensembl False True Ensembl s3://bionty-assets/df_mouse__ensembl__release-... None https://www.ensembl.org 1 None release-112 None 2025-09-18 06:40:29.074000+00:00 1 None 1
9 4RPA3Re0 bionty.Gene saccharomyces cerevisiae ensembl False True Ensembl s3://bionty-assets/df_saccharomyces cerevisiae... None https://www.ensembl.org 1 None release-112 None 2025-09-18 06:40:29.074000+00:00 1 None 1
10 3EYyGRYN bionty.Protein human uniprot False True Uniprot s3://bionty-assets/df_human__uniprot__2024-03_... None https://www.uniprot.org 1 None 2024-03 None 2025-09-18 06:40:29.074000+00:00 1 None 1
11 01RWXN2V bionty.Protein mouse uniprot False True Uniprot s3://bionty-assets/df_mouse__uniprot__2024-03_... None https://www.uniprot.org 1 None 2024-03 None 2025-09-18 06:40:29.074000+00:00 1 None 1
12 3kDh8qAX bionty.CellMarker human cellmarker False True CellMarker s3://bionty-assets/human_cellmarker_2.0_CellMa... None http://bio-bigdata.hrbmu.edu.cn/CellMarker 1 None 2.0 None 2025-09-18 06:40:29.074000+00:00 1 None 1
13 7bV5uJo3 bionty.CellMarker mouse cellmarker False True CellMarker s3://bionty-assets/mouse_cellmarker_2.0_CellMa... None http://bio-bigdata.hrbmu.edu.cn/CellMarker 1 None 2.0 None 2025-09-18 06:40:29.074000+00:00 1 None 1
14 6LyRtvz8 bionty.CellLine all clo False True Cell Line Ontology s3://bionty-assets/df_all__clo__2022-03-21__Ce... None https://bioportal.bioontology.org/ontologies/CLO 1 None 2022-03-21 None 2025-09-18 06:40:29.074000+00:00 1 None 1
16 3T9KZcjQ bionty.CellType all cl False True Cell Ontology http://purl.obolibrary.org/obo/cl/releases/202... None https://obophenotype.github.io/cell-ontology 1 None 2025-04-10 None 2025-09-18 06:40:29.074000+00:00 1 None 1
17 4U5uYTlc bionty.Tissue all uberon False True Uberon multi-species anatomy ontology http://purl.obolibrary.org/obo/uberon/releases... None http://obophenotype.github.io/uberon 1 None 2025-05-28 None 2025-09-18 06:40:29.074000+00:00 1 None 1
18 IGIkseWQ bionty.Disease all mondo False True Mondo Disease Ontology http://purl.obolibrary.org/obo/mondo/releases/... None https://mondo.monarchinitiative.org 1 None 2025-06-03 None 2025-09-18 06:40:29.074000+00:00 1 None 1
19 4kswnHVF bionty.Disease human doid False True Human Disease Ontology http://purl.obolibrary.org/obo/doid/releases/2... None https://disease-ontology.org 1 None 2024-05-29 None 2025-09-18 06:40:29.074000+00:00 1 None 1
21 2a1HvjdB bionty.ExperimentalFactor all efo False True The Experimental Factor Ontology http://www.ebi.ac.uk/efo/releases/v3.70.0/efo.owl None https://bioportal.bioontology.org/ontologies/EFO 1 None 3.70.0 None 2025-09-18 06:40:29.074000+00:00 1 None 1
22 6S4qkDx1 bionty.Phenotype all pato False True Phenotype And Trait Ontology http://purl.obolibrary.org/obo/pato/releases/2... None https://github.com/pato-ontology/pato 1 None 2024-03-28 None 2025-09-18 06:40:29.074000+00:00 1 None 1
23 48fBFLmn bionty.Phenotype human hp False True Human Phenotype Ontology https://github.com/obophenotype/human-phenotyp... None https://hpo.jax.org 1 None 2024-04-26 None 2025-09-18 06:40:29.074000+00:00 1 None 1
25 7Ent3V2y bionty.Pathway all go False True Gene Ontology http://purl.obolibrary.org/obo/go/releases/202... None http://geneontology.org 1 None 2024-06-17 None 2025-09-18 06:40:29.074000+00:00 1 None 1
27 3rm9aOzL BFXPipeline all lamin False True Bioinformatics Pipeline s3://bionty-assets/df_all__lamin__1.0.0__BFXpi... None https://lamin.ai 1 None 1.0.0 None 2025-09-18 06:40:29.074000+00:00 1 None 1
28 ugaIoIlj Drug all dron False True Drug Ontology http://purl.obolibrary.org/obo/dron/releases/2... None https://bioportal.bioontology.org/ontologies/DRON 1 None 2024-08-05 None 2025-09-18 06:40:29.074000+00:00 1 None 1
30 1GbFkOdz bionty.DevelopmentalStage human hsapdv False True Human Developmental Stages https://github.com/obophenotype/developmental-... None https://github.com/obophenotype/developmental-... 1 None 2024-05-28 None 2025-09-18 06:40:29.074000+00:00 1 None 1
31 10va5JSt bionty.DevelopmentalStage mouse mmusdv False True Mouse Developmental Stages https://github.com/obophenotype/developmental-... None https://github.com/obophenotype/developmental-... 1 None 2024-05-28 None 2025-09-18 06:40:29.074000+00:00 1 None 1
32 MJRqduf9 bionty.Ethnicity human hancestro False True Human Ancestry Ontology http://purl.obolibrary.org/obo/hancestro/relea... None https://github.com/EBISPOT/hancestro 1 None 3.0 None 2025-09-18 06:40:29.074000+00:00 1 None 1
33 5JnVODh4 BioSample all ncbi False True NCBI BioSample attributes s3://bionty-assets/df_all__ncbi__2023-09__BioS... None https://www.ncbi.nlm.nih.gov/biosample/docs/at... 1 None 2023-09 None 2025-09-18 06:40:29.074000+00:00 1 None 1