###### CellLine

lamindb provides access to the following public CellLine ontologies
through bionty:

1. Cellosaurus

2. Cell Line Ontology

Here we show how to access and search CellLine ontologies to
standardize new data.

 import bionty as bt
 import pandas as pd

##### PublicOntology objects

Let us create a public ontology accessor with the ".public()" method,
which chooses a default public ontology source from "Source". The result
is a PublicOntology object, which you can think of as a public registry:

 celllines = bt.CellLine.public(organism="all")
 celllines

As for registries, you can export the ontology as a "DataFrame":

 df = celllines.to_dataframe()
 df.head()

Unlike registries, you can also export it as a Pronto object via
"public.ontology".

##### Look up terms

As for registries, terms can be looked up with auto-complete:

 lookup = celllines.lookup()

The "." accessor provides normalized terms (lower case, containing only
alphanumeric characters and underscores):

 lookup.hek293

To look up the exact original strings, convert the lookup object to a
dict and use the "[]" accessor:

 lookup_dict = lookup.dict()
 lookup_dict["HEK293"]
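The difference between the two accessors can be illustrated with a small stand-alone sketch. It does not use bionty and is not its implementation: "normalize_key", "toy_lookup", and "toy_lookup_dict" are hypothetical names, the term list is a toy example, and the normalization rule (lower-casing and replacing every run of non-alphanumeric characters with one underscore) is an assumption based on the description above.

```python
import re
from types import SimpleNamespace


def normalize_key(term: str) -> str:
    """Lower-case a term and replace runs of other characters with "_"."""
    key = re.sub(r"[^0-9a-z]+", "_", term.lower())
    return key.strip("_")


# Toy term names; a real ontology holds many thousands of entries.
names = ["HEK293", "HeLa", "Hep-G2"]

# Attribute-style lookup, keyed by the normalized names (assumed rule).
toy_lookup = SimpleNamespace(**{normalize_key(n): n for n in names})

# Dict-style lookup, keyed by the exact original strings.
toy_lookup_dict = {n: n for n in names}

print(toy_lookup.hek293)
print(toy_lookup.hep_g2)
print(toy_lookup_dict["Hep-G2"])
```

Keys such as "Hep-G2" are not valid attribute names, which is why the dict form is needed to address terms by their original spelling.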

By default, the "name" field is used to generate lookup keys. You can
specify another field to look up:

 lookup = celllines.lookup(celllines.ontology_id)

 lookup.cvcl_0045

##### Search terms

Search behaves in the same way as it does for registries:

 celllines.search("hek293").head(3)

By default, search also covers synonyms and all other fields
containing strings:

 celllines.search("Human Embryonic Kidney 293").head(3)

To restrict the search to a specific field, pass it as "field":

 celllines.search(
     "suspension cell line",
     field=celllines.description,
 ).head()

##### Standardize CellLine identifiers

Let us generate a "DataFrame" that stores a number of CellLine
identifiers, some of which are corrupted:

 df_orig = pd.DataFrame(
     index=[
         "253D cell",
         "HEK293",
         "2C1H7 cell",
         "283TAg cell",
         "This cellline does not exist",
     ]
 )
 df_orig

We can check which of our values validate against the ontology
reference and list the ones that do not:

 validated = celllines.validate(df_orig.index, celllines.name)
 df_orig.index[~validated]
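The pattern above relies on a boolean mask with one entry per identifier, which "~" inverts to select the failures. A minimal sketch of that mechanic with pandas alone, assuming exact, case-sensitive matching against a hypothetical set of reference names ("reference_names" is made up for illustration and is not taken from the ontology):

```python
import pandas as pd

# Hypothetical reference names; the real reference is the ontology itself.
reference_names = {"HEK293", "HeLa"}

values = pd.Index(["HEK293", "hek293", "This cellline does not exist"])

# One boolean per value: True where the value matches a reference name exactly.
validated = values.isin(reference_names)

# Inverting the mask keeps only the values that failed validation.
unvalidated = values[~validated]
print(list(unvalidated))
```

In this sketch "hek293" fails because the comparison is exact, so values that differ from the reference only in case show up among the failures.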

##### Ontology source versions

For any given entity, we can choose from a number of versions:

 bt.Source.filter(entity="bionty.CellLine").to_dataframe()

 # only lists the sources that are currently used
 bt.Source.filter(entity="bionty.CellLine", currently_used=True).to_dataframe()

When creating a PublicOntology object, we can choose a source or version:

 source = bt.Source.filter(
     name="cellosaurus", organism="all"
 ).first()
 celllines = bt.CellLine.public(source=source)
 celllines

The currently used ontologies can be displayed using:

 bt.Source.filter(currently_used=True).to_dataframe()