Tissue¶
lamindb provides access to the following public Tissue ontology through bionty: Uberon.
Here we show how to access and search Tissue ontologies to standardize new data.
import bionty as bt
import pandas as pd
PublicOntology objects¶
Let us create a public ontology accessor with the .public method, which chooses a default public ontology source from Source.
It’s a PublicOntology object, which you can think about as a public registry:
tissues = bt.Tissue.public(organism="all")
tissues
→ connected lamindb: testuser1/test-public-ontologies
PublicOntology
Entity: Tissue
Organism: all
Source: uberon, 2024-08-07
#terms: 15631
As for registries, you can export the ontology as a DataFrame:
df = tissues.df()
df.head()
ontology_id | name | definition | synonyms | parents |
---|---|---|---|---|
UBERON:0000000 | processual entity | An Occurrent [Span:Occurrent] That Exists In T... | None | [] |
UBERON:0000002 | uterine cervix | Lower, Narrow Portion Of The Uterus Where It J... | caudal segment of uterus|canalis cervicis uter... | [UBERON:0001560, UBERON:0005156, UBERON:000099... |
UBERON:0000003 | naris | Orifice Of The Olfactory System. The Naris Is ... | None | [UBERON:0000161, UBERON:0005725] |
UBERON:0000004 | nose | The Olfactory Organ Of Vertebrates, Consisting... | nasal sac|nose|peripheral olfactory organ | [UBERON:0004121, UBERON:0000475, UBERON:0002268] |
UBERON:0000005 | chemosensory organ | None | chemosensory sensory organ | [UBERON:0000020, UBERON:0005726, UBERON:0001016] |
Unlike registries, you can also export it as a Pronto object via public.ontology.
Look up terms¶
As for registries, terms can be looked up with auto-complete:
lookup = tissues.lookup()
The . accessor provides normalized terms (lower case, containing only alphanumeric characters and underscores):
lookup.alveolus_of_lung
Tissue(ontology_id='UBERON:0002299', name='alveolus of lung', definition='Spherical Outcropping Of The Respiratory Bronchioles And Primary Site Of Gas Exchange With The Blood. Alveoli Are Particular To Mammalian Lungs. Different Structures Are Involved In Gas Exchange In Other Vertebrates[Wp].', synonyms='pulmonary alveolus|alveolus pulmonis|respiratory alveolus|lung alveolus', parents=array(['UBERON:0003215', 'UBERON:0004119', 'UBERON:0008874',
'UBERON:0002048', 'UBERON:0010369'], dtype=object))
To look up the exact original strings, convert the lookup object to a dict and use the [] accessor:
lookup_dict = lookup.dict()
lookup_dict["alveolus of lung"]
Tissue(ontology_id='UBERON:0002299', name='alveolus of lung', definition='Spherical Outcropping Of The Respiratory Bronchioles And Primary Site Of Gas Exchange With The Blood. Alveoli Are Particular To Mammalian Lungs. Different Structures Are Involved In Gas Exchange In Other Vertebrates[Wp].', synonyms='pulmonary alveolus|alveolus pulmonis|respiratory alveolus|lung alveolus', parents=array(['UBERON:0003215', 'UBERON:0004119', 'UBERON:0008874',
'UBERON:0002048', 'UBERON:0010369'], dtype=object))
By default, the name field is used to generate lookup keys. You can specify another field to look up:
lookup = tissues.lookup(tissues.ontology_id)
lookup.uberon_0000031
Tissue(ontology_id='UBERON:0000031', name='lamina propria of trachea', definition='A Lamina Propria That Is Part Of A Respiratory Airway.', synonyms='tracheal lamina propria|windpipe lamina propria|lamina propria mucosae of windpipe|lamina propria mucosa of trachea|lamina propria mucosa of windpipe|lamina propria of windpipe|trachea lamina propria|lamina propria mucosae of trachea|windpipe lamina propria mucosae|trachea lamina propria mucosae|trachea lamina propria mucosa|windpipe lamina propria mucosa', parents=array(['UBERON:0004779', 'UBERON:0001005', 'UBERON:0001004'], dtype=object))
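The key normalization described above can be illustrated with a minimal sketch. The helper `normalize_key` is hypothetical and is not bionty's implementation; it only assumes the stated rule (lower case, alphanumeric characters and underscores) and reproduces the two keys used in this section.

```python
import re


def normalize_key(term: str) -> str:
    """Hypothetical helper: lower-case a term and replace every run of
    non-alphanumeric characters with a single underscore."""
    key = re.sub(r"[^0-9a-z]+", "_", term.strip().lower())
    return key.strip("_")


print(normalize_key("alveolus of lung"))  # alveolus_of_lung
print(normalize_key("UBERON:0000031"))    # uberon_0000031
```

Because several original strings can collapse to the same normalized key under a rule like this, the dict-based lookup with the original strings is the safer choice when exact names matter.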
Search terms¶
Search behaves in the same way as it does for registries:
tissues.search("lung alveolus").head(3)
ontology_id | name | definition | synonyms | parents |
---|---|---|---|---|
UBERON:0002299 | alveolus of lung | Spherical Outcropping Of The Respiratory Bronc... | pulmonary alveolus|alveolus pulmonis|respirato... | [UBERON:0003215, UBERON:0004119, UBERON:000887... |
UBERON:0002172 | alveolar atrium | None | atrium of alveolus of lung|atrium of alveolus|... | [UBERON:0000064, UBERON:0004119, UBERON:000229... |
UBERON:0004861 | right lung alveolus | An Alveolus That Is Part Of A Right Lung [Auto... | alveolus of right lung | [UBERON:0002299, UBERON:0006526, UBERON:000216... |
By default, search also covers synonyms and all other fields containing strings:
tissues.search("nasal sac").head(3)
ontology_id | name | definition | synonyms | parents |
---|---|---|---|---|
UBERON:0000004 | nose | The Olfactory Organ Of Vertebrates, Consisting... | nasal sac|nose|peripheral olfactory organ | [UBERON:0004121, UBERON:0000475, UBERON:0002268] |
UBERON:4300152 | accessory nasal sac | Accessory Nasal Sacs Are Found In A Variety Of... | None | [UBERON:0000062] |
UBERON:2005085 | nasal artery | The Nasal Arteries Start At The Internal Carot... | NA | [UBERON:0001637] |
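To see why "nasal sac" finds the term named "nose", it helps to look at the pipe-separated synonyms field. The following is a deliberately simplified sketch: `simple_search` and `RECORDS` are hypothetical names, the matching is a plain case-insensitive substring test, and it makes no attempt to reproduce how the real search ranks its results.

```python
# Toy records copied from the tables in this section (definitions omitted).
RECORDS = [
    {"ontology_id": "UBERON:0000003", "name": "naris", "synonyms": None},
    {"ontology_id": "UBERON:0000004", "name": "nose",
     "synonyms": "nasal sac|nose|peripheral olfactory organ"},
    {"ontology_id": "UBERON:0000005", "name": "chemosensory organ",
     "synonyms": "chemosensory sensory organ"},
]


def simple_search(query, records, fields=("name", "synonyms")):
    """Return ontology IDs of records where any of the given string fields
    contains the query (case-insensitive substring match)."""
    needle = query.lower()
    hits = []
    for record in records:
        for field in fields:
            value = record.get(field)
            if value is None:
                continue  # missing values cannot match
            # Synonyms are stored as one "|"-separated string.
            if any(needle in part for part in value.lower().split("|")):
                hits.append(record["ontology_id"])
                break  # one matching field is enough for this record
    return hits


print(simple_search("nasal sac", RECORDS))                   # ['UBERON:0000004']
print(simple_search("nasal sac", RECORDS, fields=("name",)))  # []
```

Restricting the fields to the name alone returns nothing for this query, which is the effect of searching a single field instead of all string fields.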
Search a specific field (by default, all fields containing strings are searched):
tissues.search(
    "spherical outcropping of the respiratory",
    field=tissues.definition,
).head()
ontology_id | name | definition | synonyms | parents |
---|---|---|---|---|
UBERON:0002299 | alveolus of lung | Spherical Outcropping Of The Respiratory Bronc... | pulmonary alveolus|alveolus pulmonis|respirato... | [UBERON:0003215, UBERON:0004119, UBERON:000887... |
Standardize Tissue identifiers¶
Let us generate a DataFrame that stores a number of Tissue identifiers, some of which are corrupted:
df_orig = pd.DataFrame(
    index=[
        "UBERON:0000000",
        "UBERON:0000005",
        "UBERON:0000001",
        "UBERON:0000002",
        "This tissue does not exist",
    ]
)
df_orig
UBERON:0000000 |
---|
UBERON:0000005 |
UBERON:0000001 |
UBERON:0000002 |
This tissue does not exist |
We can check which of our values validate against the ontology reference. Here the values are ontology IDs but validation runs against the name field, so none of them validate:
validated = tissues.validate(df_orig.index, tissues.name)
df_orig.index[~validated]
! 5 unique terms (100.00%) are not validated: 'UBERON:0000000', 'UBERON:0000005', 'UBERON:0000001', 'UBERON:0000002', 'This tissue does not exist'
Index(['UBERON:0000000', 'UBERON:0000005', 'UBERON:0000001', 'UBERON:0000002',
'This tissue does not exist'],
dtype='object')
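The choice of field decides the outcome of validation. The sketch below is a toy model, not bionty's implementation: `reference` holds only the five terms from the DataFrame head shown earlier, and `validate_against` is a hypothetical helper that performs plain membership tests.

```python
# Toy reference: the five terms from the DataFrame head shown earlier.
reference = {
    "UBERON:0000000": "processual entity",
    "UBERON:0000002": "uterine cervix",
    "UBERON:0000003": "naris",
    "UBERON:0000004": "nose",
    "UBERON:0000005": "chemosensory organ",
}

values = [
    "UBERON:0000000",
    "UBERON:0000005",
    "UBERON:0000001",
    "UBERON:0000002",
    "This tissue does not exist",
]


def validate_against(values, allowed):
    """Return one boolean per value: True if the value is in `allowed`."""
    allowed = set(allowed)
    return [value in allowed for value in values]


# Validating ontology IDs against names: nothing matches.
by_name = validate_against(values, reference.values())
# Validating ontology IDs against ontology IDs: known IDs match.
by_id = validate_against(values, reference.keys())

print(by_name)  # [False, False, False, False, False]
print(by_id)    # [True, True, False, True, False]
```

In this toy table, only "UBERON:0000001" and the free-text entry fail when the field matches the kind of identifier being checked.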
Ontology source versions¶
For any given entity, we can choose from a number of versions:
bt.Source.filter(entity="bionty.Tissue").df()
id | uid | entity | organism | name | in_db | currently_used | description | url | md5 | source_website | dataframe_artifact_id | version | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
40 | MUtA | bionty.Tissue | all | uberon | False | True | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | | http://obophenotype.github.io/uberon | None | 2024-08-07 | None | 2024-12-20 15:03:38.346116+00:00 | 1 |
41 | 01Ng | bionty.Tissue | all | uberon | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | dc473b764acfe765be31250c853ef23d | http://obophenotype.github.io/uberon | None | 2024-05-13 | None | 2024-12-20 15:03:38.346141+00:00 | 1 |
42 | 6wOb | bionty.Tissue | all | uberon | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | | http://obophenotype.github.io/uberon | None | 2024-03-22 | None | 2024-12-20 15:03:38.346166+00:00 | 1 |
43 | Cwzj | bionty.Tissue | all | uberon | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | 2048667b5fdf93192384bdf53cafba18 | http://obophenotype.github.io/uberon | None | 2024-02-20 | None | 2024-12-20 15:03:38.346191+00:00 | 1 |
44 | 2PlK | bionty.Tissue | all | uberon | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | 9cff222ef07a1f622a6d12775ada650d | http://obophenotype.github.io/uberon | None | 2024-01-18 | None | 2024-12-20 15:03:38.346216+00:00 | 1 |
45 | svSf | bionty.Tissue | all | uberon | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | abcee3ede566d1311d758b853ccdf5aa | http://obophenotype.github.io/uberon | None | 2023-09-05 | None | 2024-12-20 15:03:38.346241+00:00 | 1 |
46 | 1tLk | bionty.Tissue | all | uberon | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | 5611dd1375d5a95ac7d7de8e25e6016f | http://obophenotype.github.io/uberon | None | 2023-04-19 | None | 2024-12-20 15:03:38.346266+00:00 | 1 |
47 | 6VAw | bionty.Tissue | all | uberon | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | 3f94e22fae4cdde88a555c5cd59c47da | http://obophenotype.github.io/uberon | None | 2023-02-14 | None | 2024-12-20 15:03:38.346291+00:00 | 1 |
48 | 7Iby | bionty.Tissue | all | uberon | False | False | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | c7c958a1ee48fdce146f2c1763eed27e | http://obophenotype.github.io/uberon | None | 2022-08-19 | None | 2024-12-20 15:03:38.346316+00:00 | 1 |
# only lists the sources that are currently used
bt.Source.filter(entity="bionty.Tissue", currently_used=True).df()
id | uid | entity | organism | name | in_db | currently_used | description | url | md5 | source_website | dataframe_artifact_id | version | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
40 | MUtA | bionty.Tissue | all | uberon | False | True | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | | http://obophenotype.github.io/uberon | None | 2024-08-07 | None | 2024-12-20 15:03:38.346116+00:00 | 1 |
When instantiating a Bionty object, we can choose a source or version:
source = bt.Source.filter(
    name="uberon", version="2023-04-19", organism="all"
).one()
tissues = bt.Tissue.public(source=source)
tissues
PublicOntology
Entity: Tissue
Organism: all
Source: uberon, 2023-04-19
#terms: 15499
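The filter-then-pick pattern used above can be sketched without a database. Everything here is illustrative: `SOURCES` copies three rows from the version table, and `filter_sources` and `one` are hypothetical stand-ins that assume `.one()` expects exactly one matching record.

```python
# Three of the uberon source records listed in the version table above.
SOURCES = [
    {"name": "uberon", "version": "2024-08-07", "currently_used": True},
    {"name": "uberon", "version": "2024-05-13", "currently_used": False},
    {"name": "uberon", "version": "2023-04-19", "currently_used": False},
]


def filter_sources(records, **conditions):
    """Keep the records whose fields equal all given conditions."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in conditions.items())
    ]


def one(records):
    """Return the single record, or raise if there is not exactly one."""
    if len(records) != 1:
        raise ValueError(f"expected exactly one record, found {len(records)}")
    return records[0]


pinned = one(filter_sources(SOURCES, name="uberon", version="2023-04-19"))
default = one(filter_sources(SOURCES, currently_used=True))

print(pinned["version"])   # 2023-04-19
print(default["version"])  # 2024-08-07
```

Filtering by name alone would match several versions, so a pick-exactly-one step like this would raise rather than silently choose one of them. Pinning the version keeps the selection unambiguous.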
The currently used ontologies can be displayed using:
bt.Source.filter(currently_used=True).df()
id | uid | entity | organism | name | in_db | currently_used | description | url | md5 | source_website | dataframe_artifact_id | version | run_id | created_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 33TU | bionty.Organism | vertebrates | ensembl | False | True | Ensembl | https://ftp.ensembl.org/pub/release-112/specie... | 0ec37e77f4bc2d0b0b47c6c62b9f122d | https://www.ensembl.org | None | release-112 | None | 2024-12-20 15:03:38.345036+00:00 | 1 |
6 | 6bbV | bionty.Organism | bacteria | ensembl | False | True | Ensembl | https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacte... | ee28510ed5586ea7ab4495717c96efc8 | https://www.ensembl.org | None | release-57 | None | 2024-12-20 15:03:38.345237+00:00 | 1 |
7 | 6s9n | bionty.Organism | fungi | ensembl | False | True | Ensembl | http://ftp.ensemblgenomes.org/pub/fungi/releas... | dbcde58f4396ab8b2480f7fe9f83df8a | https://www.ensembl.org | None | release-57 | None | 2024-12-20 15:03:38.345264+00:00 | 1 |
8 | 2PmT | bionty.Organism | metazoa | ensembl | False | True | Ensembl | http://ftp.ensemblgenomes.org/pub/metazoa/rele... | 424636a574fec078a61cbdddb05f9132 | https://www.ensembl.org | None | release-57 | None | 2024-12-20 15:03:38.345290+00:00 | 1 |
9 | 7GPH | bionty.Organism | plants | ensembl | False | True | Ensembl | https://ftp.ensemblgenomes.ebi.ac.uk/pub/plant... | eadaa1f3e527e4c3940c90c7fa5c8bf4 | https://www.ensembl.org | None | release-57 | None | 2024-12-20 15:03:38.345316+00:00 | 1 |
10 | 4tsk | bionty.Organism | all | ncbitaxon | False | True | NCBItaxon Ontology | s3://bionty-assets/df_all__ncbitaxon__2023-06-... | 00d97ba65627f1cd65636d2df22ea76c | https://github.com/obophenotype/ncbitaxon | None | 2023-06-20 | None | 2024-12-20 15:03:38.345342+00:00 | 1 |
11 | 4UGN | bionty.Gene | human | ensembl | False | True | Ensembl | s3://bionty-assets/df_human__ensembl__release-... | 4ccda4d88720a326737376c534e8446b | https://www.ensembl.org | None | release-112 | None | 2024-12-20 15:03:38.345367+00:00 | 1 |
15 | 4r4f | bionty.Gene | mouse | ensembl | False | True | Ensembl | s3://bionty-assets/df_mouse__ensembl__release-... | 519cf7b8acc3c948274f66f3155a3210 | https://www.ensembl.org | None | release-112 | None | 2024-12-20 15:03:38.345470+00:00 | 1 |
19 | 4RPA | bionty.Gene | saccharomyces cerevisiae | ensembl | False | True | Ensembl | s3://bionty-assets/df_saccharomyces cerevisiae... | 11775126b101233525a0a9e2dd64edae | https://www.ensembl.org | None | release-112 | None | 2024-12-20 15:03:38.345572+00:00 | 1 |
22 | 3EYy | bionty.Protein | human | uniprot | False | True | Uniprot | s3://bionty-assets/df_human__uniprot__2024-03_... | b5b9e7645065b4b3187114f07e3f402f | https://www.uniprot.org | None | 2024-03 | None | 2024-12-20 15:03:38.345647+00:00 | 1 |
25 | 01RW | bionty.Protein | mouse | uniprot | False | True | Uniprot | s3://bionty-assets/df_mouse__uniprot__2024-03_... | b1b6a196eb853088d36198d8e3749ec4 | https://www.uniprot.org | None | 2024-03 | None | 2024-12-20 15:03:38.345724+00:00 | 1 |
28 | 3kDh | bionty.CellMarker | human | cellmarker | False | True | CellMarker | s3://bionty-assets/human_cellmarker_2.0_CellMa... | d565d4a542a5c7e7a06255975358e4f4 | http://bio-bigdata.hrbmu.edu.cn/CellMarker | None | 2.0 | None | 2024-12-20 15:03:38.345800+00:00 | 1 |
29 | 7bV5 | bionty.CellMarker | mouse | cellmarker | False | True | CellMarker | s3://bionty-assets/mouse_cellmarker_2.0_CellMa... | 189586732c63be949e40dfa6a3636105 | http://bio-bigdata.hrbmu.edu.cn/CellMarker | None | 2.0 | None | 2024-12-20 15:03:38.345825+00:00 | 1 |
30 | 6LyR | bionty.CellLine | all | clo | False | True | Cell Line Ontology | https://data.bioontology.org/ontologies/CLO/su... | ea58a1010b7e745702a8397a526b3a33 | https://bioportal.bioontology.org/ontologies/CLO | None | 2022-03-21 | None | 2024-12-20 15:03:38.345850+00:00 | 1 |
32 | 1Lhf | bionty.CellType | all | cl | False | True | Cell Ontology | http://purl.obolibrary.org/obo/cl/releases/202... | 8a8638a9e79567935793e5007704c650 | https://obophenotype.github.io/cell-ontology | None | 2024-05-15 | None | 2024-12-20 15:03:38.345900+00:00 | 1 |
40 | MUtA | bionty.Tissue | all | uberon | False | True | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | | http://obophenotype.github.io/uberon | None | 2024-08-07 | None | 2024-12-20 15:03:38.346116+00:00 | 1 |
49 | 2L2r | bionty.Disease | all | mondo | False | True | Mondo Disease Ontology | http://purl.obolibrary.org/obo/mondo/releases/... | c47e8edb894c01f2511dfe0751fbc428 | https://mondo.monarchinitiative.org | None | 2024-06-04 | None | 2024-12-20 15:03:38.346342+00:00 | 1 |
57 | 4ksw | bionty.Disease | human | doid | False | True | Human Disease Ontology | http://purl.obolibrary.org/obo/doid/releases/2... | bbefd72247d638edfcd31ec699947407 | https://disease-ontology.org | None | 2024-05-29 | None | 2024-12-20 15:03:38.346538+00:00 | 1 |
65 | 2a1H | bionty.ExperimentalFactor | all | efo | False | True | The Experimental Factor Ontology | http://www.ebi.ac.uk/efo/releases/v3.70.0/efo.owl | | https://bioportal.bioontology.org/ontologies/EFO | None | 3.70.0 | None | 2024-12-20 15:03:38.348787+00:00 | 1 |
72 | 48fB | bionty.Phenotype | human | hp | False | True | Human Phenotype Ontology | https://github.com/obophenotype/human-phenotyp... | e0f2e534eb2ad44a4d45573ef27b508f | https://hpo.jax.org | None | 2024-04-26 | None | 2024-12-20 15:03:38.348965+00:00 | 1 |
77 | 4t7Q | bionty.Phenotype | mammalian | mp | False | True | Mammalian Phenotype Ontology | https://github.com/mgijax/mammalian-phenotype-... | 795d8378fe48ec13b41d01a86dd1c86c | https://github.com/mgijax/mammalian-phenotype-... | None | 2024-06-18 | None | 2024-12-20 15:03:38.349111+00:00 | 1 |
80 | sqPX | bionty.Phenotype | zebrafish | zp | False | True | Zebrafish Phenotype Ontology | https://github.com/obophenotype/zebrafish-phen... | 2231ebaa95becf8ff34a33c95a8d4350 | https://github.com/obophenotype/zebrafish-phen... | None | 2024-04-18 | None | 2024-12-20 15:03:38.349189+00:00 | 1 |
84 | 6S4q | bionty.Phenotype | all | pato | False | True | Phenotype And Trait Ontology | http://purl.obolibrary.org/obo/pato/releases/2... | 6b1eaacd3d453b34375ce2e31c16328a | https://github.com/pato-ontology/pato | None | 2024-03-28 | None | 2024-12-20 15:03:38.349288+00:00 | 1 |
86 | 7Ent | bionty.Pathway | all | go | False | True | Gene Ontology | https://data.bioontology.org/ontologies/GO/sub... | 7fa7ade5e3e26eab3959a7e4bc89ad4f | http://geneontology.org | None | 2024-06-17 | None | 2024-12-20 15:03:38.349338+00:00 | 1 |
91 | 3rm9 | BFXPipeline | all | lamin | False | True | Bioinformatics Pipeline | s3://bionty-assets/df_all__lamin__1.0.0__BFXpi... | | https://lamin.ai | None | 1.0.0 | None | 2024-12-20 15:03:38.349461+00:00 | 1 |
92 | ugaI | Drug | all | dron | False | True | Drug Ontology | https://data.bioontology.org/ontologies/DRON/s... | | https://bioportal.bioontology.org/ontologies/DRON | None | 2024-08-05 | None | 2024-12-20 15:03:38.349486+00:00 | 1 |
96 | 1GbF | bionty.DevelopmentalStage | human | hsapdv | False | True | Human Developmental Stages | https://github.com/obophenotype/developmental-... | | https://github.com/obophenotype/developmental-... | None | 2024-05-28 | None | 2024-12-20 15:03:38.349585+00:00 | 1 |
98 | 10va | bionty.DevelopmentalStage | mouse | mmusdv | False | True | Mouse Developmental Stages | https://github.com/obophenotype/developmental-... | | https://github.com/obophenotype/developmental-... | None | 2024-05-28 | None | 2024-12-20 15:03:38.349634+00:00 | 1 |
100 | MJRq | bionty.Ethnicity | human | hancestro | False | True | Human Ancestry Ontology | https://github.com/EBISPOT/hancestro/raw/3.0/h... | 76dd9efda9c2abd4bc32fc57c0b755dd | https://github.com/EBISPOT/hancestro | None | 3.0 | None | 2024-12-20 15:03:38.349683+00:00 | 1 |
101 | 5JnV | BioSample | all | ncbi | False | True | NCBI BioSample attributes | s3://bionty-assets/df_all__ncbi__2023-09__BioS... | 918db9bd1734b97c596c67d9654a4126 | https://www.ncbi.nlm.nih.gov/biosample/docs/at... | None | 2023-09 | None | 2024-12-20 15:03:38.349708+00:00 | 1 |