DevelopmentalStage
lamindb provides access to public DevelopmentalStage ontologies through bionty, including Human Developmental Stages (hsapdv) and Mouse Developmental Stages (mmusdv).
Here we show how to access and search DevelopmentalStage ontologies to standardize new data.
import bionty as bt
import pandas as pd
PublicOntology objects
Let us create a public ontology accessor with the .public method, which chooses a default public ontology source from Source.
It’s a PublicOntology object, which you can think of as a public registry:
developmentalstages = bt.DevelopmentalStage.public(organism="human")
developmentalstages
→ connected lamindb: testuser1/test-public-ontologies
PublicOntology
Entity: DevelopmentalStage
Organism: human
Source: hsapdv, 2024-05-28
#terms: 259
As for registries, you can export the ontology as a DataFrame:
df = developmentalstages.df()
df.head()
ontology_id | name | definition | synonyms | parents |
---|---|---|---|---|
HsapDv:0000000 | life cycle stage | A Spatiotemporal Region Encompassing Some Part... | developmental stage | [] |
HsapDv:0000001 | life cycle | Temporal Interval That Defines Human Life From... | None | [HsapDv:0000000] |
HsapDv:0000002 | embryonic stage | Prenatal Stage That Starts With Fertilization ... | None | [HsapDv:0000000, HsapDv:0000045, HsapDv:0000001] |
HsapDv:0000003 | Carnegie stage 01 | Embryonic Stage Defined By A Fertilized Oocyte... | CS01 | [HsapDv:0000000, HsapDv:0000002, HsapDv:000004... |
HsapDv:0000004 | cleavage stage | Early Stage Of Carnegie Stage 02 Consisting Of... | None | [] |
Unlike registries, you can also export it as a Pronto object via public.ontology.
Look up terms
As for registries, terms can be looked up with auto-complete:
lookup = developmentalstages.lookup()
The . accessor provides normalized terms (lowercase, containing only alphanumeric characters and underscores):
lookup.organogenesis_stage
DevelopmentalStage(ontology_id='HsapDv:0000015', name='organogenesis stage', definition='Embryonic Stage At Which The Ectoderm, Endoderm, And Mesoderm Develop Into The Internal Organs Of The Organism.', synonyms=None, parents=array(['HsapDv:0000000', 'HsapDv:0000002', 'HsapDv:0000045',
'HsapDv:0000001'], dtype=object))
To look up the exact original strings, convert the lookup object to a dict and use the [] accessor:
lookup_dict = lookup.dict()
lookup_dict["organogenesis stage"]
DevelopmentalStage(ontology_id='HsapDv:0000015', name='organogenesis stage', definition='Embryonic Stage At Which The Ectoderm, Endoderm, And Mesoderm Develop Into The Internal Organs Of The Organism.', synonyms=None, parents=array(['HsapDv:0000000', 'HsapDv:0000002', 'HsapDv:0000045',
'HsapDv:0000001'], dtype=object))
By default, the name field is used to generate lookup keys. You can specify another field to look up:
lookup = developmentalstages.lookup(developmentalstages.ontology_id)
lookup.hsapdv_0000015
DevelopmentalStage(ontology_id='HsapDv:0000015', name='organogenesis stage', definition='Embryonic Stage At Which The Ectoderm, Endoderm, And Mesoderm Develop Into The Internal Organs Of The Organism.', synonyms=None, parents=array(['HsapDv:0000000', 'HsapDv:0000002', 'HsapDv:0000045',
'HsapDv:0000001'], dtype=object))
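The relationship between the normalized . accessor, the exact-string dict, and the choice of lookup field can be sketched in plain Python. This is only an illustration and not bionty's implementation: `normalize_key` and `build_lookup` are hypothetical helpers, the normalization rule is an assumption inferred from the keys shown above, and the two records are copied from the outputs on this page.

```python
import re
from types import SimpleNamespace

# Two records copied from the outputs above (only the fields needed here).
RECORDS = [
    {"ontology_id": "HsapDv:0000015", "name": "organogenesis stage"},
    {"ontology_id": "HsapDv:0000002", "name": "embryonic stage"},
]


def normalize_key(value: str) -> str:
    """Hypothetical rule: lowercase, then turn runs of other characters into "_"."""
    return re.sub(r"[^0-9a-z]+", "_", value.lower()).strip("_")


def build_lookup(records, field="name"):
    """Return (namespace keyed by normalized strings, dict keyed by exact strings)."""
    exact = {rec[field]: rec for rec in records}
    normalized = SimpleNamespace(
        **{normalize_key(key): rec for key, rec in exact.items()}
    )
    return normalized, exact


by_name, by_name_exact = build_lookup(RECORDS)
by_id, _ = build_lookup(RECORDS, field="ontology_id")

# Attribute access uses the normalized key, dict access the original string.
print(by_name.organogenesis_stage["ontology_id"])
print(by_name_exact["organogenesis stage"]["ontology_id"])
# Choosing another field changes which values become keys.
print(by_id.hsapdv_0000015["name"])
```

Normalized keys are convenient for auto-complete, while the dict keeps the original strings for cases where the exact spelling matters.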
Search terms
Search behaves in the same way as it does for registries:
developmentalstages.search("organogenesis").head(3)
ontology_id | name | definition | synonyms | parents |
---|---|---|---|---|
HsapDv:0000015 | organogenesis stage | Embryonic Stage At Which The Ectoderm, Endoder... | None | [HsapDv:0000000, HsapDv:0000002, HsapDv:000004... |
HsapDv:0000016 | Carnegie stage 09 | Organogenesis Stage During Which Somites 1-3 A... | CS09 | [HsapDv:0000000, HsapDv:0000015, HsapDv:000000... |
HsapDv:0000017 | Carnegie stage 10 | Organogenesis Stage During Which Somites 4-12 ... | CS10 | [HsapDv:0000000, HsapDv:0000015, HsapDv:000000... |
By default, search also covers synonyms and all other fields containing strings:
developmentalstages.search("developmental stage").head(3)
ontology_id | name | definition | synonyms | parents |
---|---|---|---|---|
HsapDv:0000000 | life cycle stage | A Spatiotemporal Region Encompassing Some Part... | developmental stage | [] |
HsapDv:0000020 | Carnegie stage 13 | Organogenesis Developmental Stage During Which... | CS13 | [HsapDv:0000000, HsapDv:0000015, HsapDv:000000... |
HsapDv:0000031 | Carnegie stage 05a | Carnegie Developmental Stage 5 Defined By A So... | CS05a | [HsapDv:0000000, HsapDv:0000009, HsapDv:000000... |
To search a specific field, pass it as field (by default, all fields containing strings are searched):
developmentalstages.search(
    "Prenatal Stage That Starts With Fertilization",
    field=developmentalstages.definition,
).head()
ontology_id | name | definition | synonyms | parents |
---|---|---|---|---|
HsapDv:0000002 | embryonic stage | Prenatal Stage That Starts With Fertilization ... | None | [HsapDv:0000000, HsapDv:0000045, HsapDv:0000001] |
HsapDv:0000045 | prenatal stage | Prenatal Stage That Starts With Fertilization ... | None | [HsapDv:0000000, HsapDv:0000001] |
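To make the difference between searching all string fields and searching a single field concrete, here is a minimal stand-alone sketch in plain Python. It does not call bionty and does not reproduce its matching or ranking: `search_records` is a hypothetical helper that uses a case-insensitive substring test, and the three records are copied from the outputs above with definitions left truncated as displayed.

```python
# Records copied from the tables above; definitions stay truncated as displayed.
RECORDS = [
    {
        "ontology_id": "HsapDv:0000000",
        "name": "life cycle stage",
        "definition": "A Spatiotemporal Region Encompassing Some Part...",
        "synonyms": "developmental stage",
    },
    {
        "ontology_id": "HsapDv:0000002",
        "name": "embryonic stage",
        "definition": "Prenatal Stage That Starts With Fertilization ...",
        "synonyms": None,
    },
    {
        "ontology_id": "HsapDv:0000015",
        "name": "organogenesis stage",
        "definition": "Embryonic Stage At Which The Ectoderm, Endoder...",
        "synonyms": None,
    },
]


def search_records(records, query, field=None):
    """Hypothetical helper: return IDs of records whose text contains the query.

    With field=None every string-valued field is searched; otherwise only
    the named field is searched. Matching is a case-insensitive substring test.
    """
    needle = query.lower()
    hits = []
    for rec in records:
        fields = [field] if field is not None else list(rec)
        values = [rec[f] for f in fields if isinstance(rec[f], str)]
        if any(needle in value.lower() for value in values):
            hits.append(rec["ontology_id"])
    return hits


# Searching all string fields finds the term through its synonym.
all_fields = search_records(RECORDS, "developmental stage")
# Restricting the search to the name field no longer matches it.
name_only = search_records(RECORDS, "developmental stage", field="name")
# Restricting the search to the definition field.
definition_only = search_records(
    RECORDS, "prenatal stage that starts", field="definition"
)
print(all_fields, name_only, definition_only)
```

Restricting the field narrows the results, which helps when a query string would otherwise match unrelated text in definitions or synonyms.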
Standardize DevelopmentalStage identifiers
Let us generate a DataFrame that stores a number of DevelopmentalStage identifiers, some of which are corrupted:
df_orig = pd.DataFrame(
    index=[
        "blastula stage",
        "Carnegie stage 03",
        "neurula stage",
        "organogenesis stage",
        "This developmentalstage does not exist",
    ]
)
df_orig
blastula stage |
---|
Carnegie stage 03 |
neurula stage |
organogenesis stage |
This developmentalstage does not exist |
We can check whether any of our values are validated against the ontology reference:
validated = developmentalstages.validate(df_orig.index, developmentalstages.name)
df_orig.index[~validated]
! 1 unique term (20.00%) is not validated: 'This developmentalstage does not exist'
Index(['This developmentalstage does not exist'], dtype='object')
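The validation step can be pictured as a membership test that yields one boolean per input value. The sketch below is plain Python and only an illustration: it assumes exact, case-sensitive string matching, and it uses the four names that validated above as a stand-in for the full reference ontology.

```python
# Assumption: the four names that validated above stand in for the reference.
reference_names = {
    "blastula stage",
    "Carnegie stage 03",
    "neurula stage",
    "organogenesis stage",
}

values = [
    "blastula stage",
    "Carnegie stage 03",
    "neurula stage",
    "organogenesis stage",
    "This developmentalstage does not exist",
]

# One boolean per input value, in input order (analogous to a boolean mask).
validated = [value in reference_names for value in values]

# Unique values that failed the membership test.
not_validated = sorted({v for v, ok in zip(values, validated) if not ok})

# Share of unique input values that did not validate, in percent.
share = 100 * len(not_validated) / len(set(values))

print(validated)
print(f"{len(not_validated)} unique term ({share:.2f}%) is not validated")
```

Inverting the mask, as `df_orig.index[~validated]` does above, selects exactly the values that still need attention.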
Ontology source versions
For any given entity, we can choose from a number of versions:
# list all sources registered for this entity
bt.Source.filter(entity="bionty.DevelopmentalStage").df()
# list only the sources that are currently used
bt.Source.filter(entity="bionty.DevelopmentalStage", currently_used=True).df()
id | uid | entity | organism | name | in_db | currently_used | description | url | md5 | source_website | space_id | dataframe_artifact_id | version | run_id | created_at | created_by_id | _aux | branch_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
30 | 1GbFkOdz | bionty.DevelopmentalStage | human | hsapdv | False | True | Human Developmental Stages | https://github.com/obophenotype/developmental-... | None | https://github.com/obophenotype/developmental-... | 1 | None | 2024-05-28 | None | 2025-07-14 06:41:44.843000+00:00 | 1 | None | 1 |
31 | 10va5JSt | bionty.DevelopmentalStage | mouse | mmusdv | False | True | Mouse Developmental Stages | https://github.com/obophenotype/developmental-... | None | https://github.com/obophenotype/developmental-... | 1 | None | 2024-05-28 | None | 2025-07-14 06:41:44.843000+00:00 | 1 | None | 1 |
When instantiating a Bionty object, we can choose a source or version:
source = bt.Source.filter(name="hsapdv", organism="human").first()
developmentalstages = bt.DevelopmentalStage.public(source=source)
developmentalstages
PublicOntology
Entity: DevelopmentalStage
Organism: human
Source: hsapdv, 2024-05-28
#terms: 259
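Conceptually, the filter-then-first pattern used to pick a source returns the first record whose fields match every given condition. The plain-Python sketch below illustrates that idea only: `first_match` is a hypothetical helper, not part of bionty or lamindb, and the two records contain a subset of the fields shown in the source table above.

```python
# A subset of the fields from the source table above.
SOURCES = [
    {"name": "hsapdv", "organism": "human", "version": "2024-05-28", "currently_used": True},
    {"name": "mmusdv", "organism": "mouse", "version": "2024-05-28", "currently_used": True},
]


def first_match(sources, **conditions):
    """Hypothetical helper: first record matching all conditions, else None."""
    for record in sources:
        if all(record.get(key) == value for key, value in conditions.items()):
            return record
    return None


human_source = first_match(SOURCES, name="hsapdv", organism="human")
missing_source = first_match(SOURCES, name="hsapdv", organism="mouse")

print(human_source["version"])
print(missing_source)
```

Returning None when nothing matches makes it easy to check the result before passing a source on, for example to a call like `bt.DevelopmentalStage.public(source=source)`.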
The currently used ontologies can be displayed using:
bt.Source.filter(currently_used=True).df()