bionty.Protein¶
- class bionty.Protein(name: str | None, uniprotkb_id: str | None, synonyms: str | None, length: int | None, gene_symbol: str | None, ensembl_gene_ids: str | None, organism: Organism | None, source: Source | None)¶
- Bases: BioRecord, TracksRun, TracksUpdates
- Proteins - UniProt.
- Notes
- For more info, see tutorials Manage biological ontologies and Protein.
- Bulk create records via from_values().
- Example:
    import bionty as bt
    record = bt.Protein.from_source(name="Synaptotagmin-15B", organism="human")
    record = bt.Protein.from_source(gene_symbol="SYT15B", organism="human")
 - Simple fields¶
 - uid: str¶
- A universal id (base62-encoded hash of defining fields). 
 - name: str | None¶
- Unique name of a protein. 
 - uniprotkb_id: str | None¶
- UniProt protein ID, 6 alphanumeric characters, possibly suffixed by 4 more. 
 - synonyms: str | None¶
- Bar-separated (|) synonyms that correspond to this protein. 
 - description: str | None¶
- Description of the protein. 
 - length: int | None¶
- Length of the protein sequence. 
 - gene_symbol: str | None¶
- The primary gene symbol that corresponds to this protein. 
 - ensembl_gene_ids: str | None¶
- Bar-separated (|) Ensembl Gene IDs that correspond to this protein. 
 - is_locked: bool¶
- Whether the record is locked for edits. 
 - created_at: datetime¶
- Time of creation of record. 
 - updated_at: datetime¶
- Time of last update to record. 
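The bar-separated fields (synonyms, ensembl_gene_ids) and the UniProt ID format described above can be handled with plain string operations. The following is a minimal sketch, not part of bionty: the helper names `split_bar_separated` and `looks_like_uniprot_accession` are hypothetical, the Ensembl IDs are made-up placeholders, and the regular expression is the accession pattern published in the UniProt help pages.

```python
import re

# Accession pattern as published in the UniProt help pages:
# 6 characters, or 10 characters for the extended format.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def split_bar_separated(value):
    """Split a bar-separated field value into a list (hypothetical helper)."""
    if not value:
        return []
    return [item.strip() for item in value.split("|") if item.strip()]


def looks_like_uniprot_accession(value):
    """Return True if the whole string matches the accession pattern (hypothetical helper)."""
    return UNIPROT_ACCESSION.fullmatch(value) is not None


# Placeholder values, for illustration only
print(split_bar_separated("ENSG00000000001|ENSG00000000002"))
print(looks_like_uniprot_accession("Q8N6N3"))
```

Such a check only tests the shape of an identifier; it does not tell you whether the accession exists in UniProt.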
 - Relational fields¶
 - branch: Branch¶
- Whether record is on a branch or in another “special state”. 
 - records: Record¶
- Records linked to the protein. 
 - Class methods¶
 - classmethod from_source(cls, *, name=None, uniprotkb_id=None, gene_symbol=None, organism=None, source=None, mute=False, **kwargs)¶
- Create a Protein record from source based on a single identifying field.
- Parameters:
- name (str | None, default: None) – Protein name (e.g. “Synaptotagmin-15B”)
- uniprotkb_id (str | None, default: None) – UniProt protein ID (e.g. “Q8N6N3”)
- gene_symbol (str | None, default: None) – Gene symbol (e.g. “SYT15B”)
- organism (str | Organism | None, default: None) – Organism name or Organism record
- source – Optional Source record to use
- mute (bool, default: False) – Whether to suppress logging
 
- Return type:
- Returns:
- A single Protein record, list of Protein records, or None if not found 
 - Example:
    import bionty as bt
    record = bt.Protein.from_source(name="Synaptotagmin-15B", organism="human")
    record = bt.Protein.from_source(uniprotkb_id="Q8N6N3")
    record = bt.Protein.from_source(gene_symbol="SYT15B", organism="human")
 - classmethod import_source(source=None, update_records=False, *, organism=None, ignore_conflicts=True)¶
- Bulk save records from a Bionty ontology.
- Use this method to initialize your registry with a public ontology.
- Parameters:
- source (Source | None, default: None) – Source record to import records from.
- update_records (bool, default: False) – If True, update existing records with the new source.
- If a record has the same metadata in the new source, link the record to the new source. 
- If a record has no artifacts associated, update its metadata and link it to the new source. 
- If a record has associated artifacts but a different name in the new source, create a new record with the new source. 
 
- organism (str | SQLRecord | None, default: None) – Organism name or record. Required for entities with a required organism foreign key when no source is passed.
- ignore_conflicts (bool, default: True) – Whether to ignore conflicts during bulk record creation.
 
 - Examples:
    import bionty as bt

    # import all records from a default source
    default_sources = bt.Source.filter(entity="bionty.CellType", currently_used=True).to_dataframe()
    bt.CellType.import_source()

    # import all records from a specific source
    source = bt.Source.get(entity="bionty.CellType", source="cl", version="2022-08-16")
    bt.CellType.import_source(source)
    bt.CellType.to_dataframe()  # all records from the source are now in the registry

    # update existing records with a new source (version update)
    source = bt.Source.get(entity="bionty.CellType", source="cl", version="2024-08-16")
    bt.CellType.import_source(source, update_records=True)
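The three update_records rules listed for import_source can be read as a small decision procedure. The sketch below is an illustration only, not bionty's implementation: `ExistingRecord` and `plan_update` are hypothetical names, and "same metadata" is simplified to "same name" as an assumption.

```python
from dataclasses import dataclass


@dataclass
class ExistingRecord:
    """Hypothetical stand-in for a registry record; not a bionty class."""
    name: str
    has_artifacts: bool


def plan_update(existing: ExistingRecord, new_name: str) -> str:
    """Return the action the documented rules suggest for one record."""
    if existing.name == new_name:
        # Same metadata in the new source: only link to the new source
        return "link"
    if not existing.has_artifacts:
        # No artifacts attached: safe to update metadata, then link
        return "update_and_link"
    # Artifacts attached and the name differs: keep the old record, create a new one
    return "create_new"


print(plan_update(ExistingRecord("T cell", has_artifacts=True), "T cell"))
print(plan_update(ExistingRecord("T-cell", has_artifacts=False), "T cell"))
print(plan_update(ExistingRecord("T-cell", has_artifacts=True), "T cell"))
```

Read this way, the rules make sure a record that artifacts already reference is never renamed in place.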
 - classmethod add_source(source, *, df=None, version=None, organism=None)¶
- Link a source record to the entity with a reference DataFrame.
- Creates or retrieves a Source record for the entity and optionally associates it with a DataFrame artifact containing the ontology data. If the source already exists with a DataFrame artifact, returns the existing source.
- Parameters:
- source (Source | PublicOntology | str) – Source specification. Can be:
- Source record: Existing bionty.Source instance
- PublicOntology: PublicOntology object with source metadata 
- str: Source name (e.g., “mondo”, “cl”, “go”) 
 
- df (DataFrame | None, default: None) – Optional DataFrame containing ontology data to store as Artifact. If None and source is a PublicOntology, uses the ontology’s DataFrame.
- version (str | None, default: None) – Source version string. Required when source is str and no existing source found. Examples: “2025-06-03”, “v1.0”, “release-112”
- organism (str | None, default: None) – Organism identifier. Required for organism-specific entities when source is str. Use “all” for cross-organism ontologies.
 
- Return type:
 - Examples
 - Add a source by name with version and organism:
    import bionty as bt
    source = bt.Disease.add_source("mondo", version="2025-06-03", organism="all")
 - Add a source to an entity with a custom DataFrame:
    import pandas as pd
    df = pd.DataFrame({"name": ["disease1"], "ontology_id": ["MONDO:123"]})
    source = bt.Source(
        entity="bionty.Disease",
        name="new mondo",
        version="99.999",
        organism="human",
    )
    source = bt.Disease.add_source(source=source, df=df)
 - Add from existing PublicOntology:
    pub_ont = bt.Disease.public()
    source = bt.Disease.add_source(pub_ont)
 - Add organism-specific source:
    source = bt.Gene.add_source("ensembl", version="release-112", organism="human")
 - classmethod public(organism=None, source=None)¶
- The corresponding bionty.base.PublicOntology object.
- Note that the source is auto-configured and tracked via bionty.Source.
- Parameters:
- Return type:
 - See also
 - Example:
    import bionty as bt

    # default source
    celltype_pub = bt.CellType.public()
    celltype_pub
    #> PublicOntology
    #> Entity: CellType
    #> Organism: all
    #> Source: cl, 2023-04-20
    #> #terms: 2698

    # default source of an organism
    gene_pub = bt.Gene.public(organism="mouse")
    gene_pub
    #> PublicOntology
    #> Entity: Gene
    #> Organism: mouse
    #> Source: ensembl, release-112
    #> #terms: 57510
 - classmethod filter(*queries, **expressions)¶
- Query records. - Parameters:
- queries – One or multiple Q objects.
- expressions – Fields and values passed as Django query expressions. 
 
- Return type:
- Returns:
- A QuerySet.
 - See also - Guide: Query & search registries 
- Django documentation: Queries 
 - Examples
    >>> import lamindb as ln
    >>> ln.ULabel(name="my label").save()
    >>> ln.ULabel.filter(name__startswith="my").to_dataframe()
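Keyword expressions such as name__startswith="my" follow Django's field__lookup convention. To make the convention concrete, here is a minimal pure-Python sketch that applies a few lookups to plain dictionaries. It is not how Django or LaminDB evaluate queries (those translate expressions to SQL), and the `matches` helper is hypothetical.

```python
def matches(record: dict, **expressions) -> bool:
    """Check a dict against Django-style keyword expressions (illustration only)."""
    for key, expected in expressions.items():
        # "name__startswith" -> field "name", lookup "startswith"
        field, _, lookup = key.partition("__")
        value = record.get(field)
        if lookup in ("", "exact"):
            ok = value == expected
        elif lookup == "startswith":
            ok = isinstance(value, str) and value.startswith(expected)
        elif lookup == "icontains":
            ok = isinstance(value, str) and expected.lower() in value.lower()
        else:
            raise ValueError(f"unsupported lookup in this sketch: {lookup}")
        if not ok:
            return False
    return True


labels = [{"name": "my label"}, {"name": "other label"}]
print([r for r in labels if matches(r, name__startswith="my")])
```

The sketch ignores relation traversal (for example created_by__name), which real query expressions also support.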
 - classmethod get(idlike=None, **expressions)¶
- Get a single record. - Parameters:
- idlike (int | str | None, default: None) – Either a uid stub, uid or an integer id.
- expressions – Fields and values passed as Django query expressions. 
 
- Raises:
- lamindb.errors.DoesNotExist – In case no matching record is found. 
- Return type:
 - See also - Guide: Query & search registries 
- Django documentation: Queries 
 - Examples
    import lamindb as ln
    ulabel = ln.ULabel.get("FvtpPJLJ")
    ulabel = ln.ULabel.get(name="my-label")
 - classmethod to_dataframe(include=None, features=False, limit=100)¶
- Convert to pd.DataFrame.
- By default, shows all direct fields, except updated_at.
- Use arguments include or features to include other data.
- Parameters:
- include (str | list[str] | None, default: None) – Related fields to include as columns. Takes strings of form "ulabels__name", "cell_types__name", etc. or a list of such strings.
- features (bool | list[str], default: False) – If a list of feature names, filters Feature down to these features. If True, prints all features with dtypes in the core schema module. If "queryset", infers the features used within the set of artifacts or records. Only available for Artifact and Record.
- limit (int, default: 100) – Maximum number of rows to display from a Pandas DataFrame. Defaults to 100 to reduce database load.
 
- Return type:
- DataFrame
 - Examples
 - Include the name of the creator in the DataFrame:
    >>> ln.ULabel.to_dataframe(include="created_by__name")
 - Include display of features for Artifact:
    >>> df = ln.Artifact.to_dataframe(features=True)
    >>> ln.view(df)  # visualize with type annotations
 - Only include select features:
    >>> df = ln.Artifact.to_dataframe(features=["cell_type_by_expert", "cell_type_by_model"])
 - classmethod search(string, *, field=None, limit=20, case_sensitive=False)¶
- Search. - Parameters:
- string (str) – The input string to match against the field ontology values.
- field (str | DeferredAttribute | None, default: None) – The field or fields to search. Search all string fields by default.
- limit (int | None, default: 20) – Maximum number of top results to return.
- case_sensitive (bool, default: False) – Whether the match is case sensitive.
 
- Return type:
- Returns:
- A sorted DataFrame of search results with a score in column score. If return_queryset is True, a QuerySet.
 - Examples
    >>> ulabels = ln.ULabel.from_values(["ULabel1", "ULabel2", "ULabel3"], field="name")
    >>> ln.save(ulabels)
    >>> ln.ULabel.search("ULabel2")
 - classmethod lookup(field=None, return_field=None)¶
- Return an auto-complete object for a field. - Parameters:
- field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to first string field.
- return_field (str | DeferredAttribute | None, default: None) – The field to return. If None, returns the whole record.
- keep – When multiple records are found for a lookup, how to return the records.
- "first": return the first record.
- "last": return the last record.
- False: return all records.
 
- Return type:
- NamedTuple
- Returns:
- A NamedTuple of lookup information of the field values with a dictionary converter.
 - See also
 - Examples
    >>> import bionty as bt
    >>> bt.settings.organism = "human"
    >>> bt.Gene.from_source(symbol="ADGB-DT").save()
    >>> lookup = bt.Gene.lookup()
    >>> lookup.adgb_dt
    >>> lookup_dict = lookup.dict()
    >>> lookup_dict['ADGB-DT']
    >>> lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
    >>> lookup_by_ensembl_id.ensg00000002745
    >>> lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
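The example above reaches the value 'ADGB-DT' through the attribute adgb_dt, which suggests that values are normalized into valid Python identifiers. The sketch below shows one way such an auto-complete object with a dictionary converter could be built from `collections.namedtuple`. It is an assumption-laden illustration, not the library's implementation: `to_attribute_name` and `build_lookup` are hypothetical helpers and the records are plain dictionaries.

```python
import re
from collections import namedtuple


def to_attribute_name(value: str) -> str:
    """Normalize a value into a lowercase identifier, e.g. 'ADGB-DT' -> 'adgb_dt'."""
    name = re.sub(r"\W+", "_", value).strip("_").lower()
    # Identifiers cannot be empty or start with a digit, so add a prefix in that case
    return name if name and not name[0].isdigit() else f"x_{name}"


def build_lookup(records, field):
    """Return (namedtuple for tab completion, dict keyed by the raw values)."""
    by_value = {record[field]: record for record in records}
    Lookup = namedtuple("Lookup", [to_attribute_name(v) for v in by_value])
    return Lookup(*by_value.values()), by_value


genes = [{"symbol": "ADGB-DT"}, {"symbol": "A1BG"}]
lookup, lookup_dict = build_lookup(genes, field="symbol")
print(lookup.adgb_dt)
print(lookup_dict["ADGB-DT"])
```

The sketch does not handle name collisions after normalization or values that are Python keywords; a real implementation would need a rule for both.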
 - classmethod using(instance)¶
- Use a non-default LaminDB instance. - Parameters:
- instance (str | None) – An instance identifier of form “account_handle/instance_name”.
- Return type:
 - Examples
    >>> ln.ULabel.using("account_handle/instance_name").search("ULabel7", field="name")
                  uid  score
    name
    ULabel7  g7Hk9b2v  100.0
    ULabel5  t4Jm6s0q   75.0
    ULabel6  r2Xw8p1z   75.0
 - classmethod inspect(values, field=None, *, mute=False, organism=None, source=None, from_source=True, strict_source=False)¶
- Inspect if values are mappable to a field. - Being mappable means that an exact match exists. - Parameters:
- values (list[str] | Series | array) – Values that will be checked against the field.
- field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontologies field names.
- mute (bool, default: False) – Whether to mute logging.
- organism (str | SQLRecord | None, default: None) – An Organism name or record.
- source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to inspect against.
- strict_source (bool, default: False) – Determines the validation behavior against records in the registry.
- If False, validation will include all records in the registry, ignoring the specified source.
- If True, validation will only include records in the registry that are linked to the specified source.
- Note: this parameter won’t affect validation against public sources.
 
- Return type:
- bionty.base.dev.InspectResult 
 - See also
 - Example:
    import bionty as bt

    # save some gene records
    bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

    # inspect gene symbols
    gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
    result = bt.Gene.inspect(gene_symbols, field=bt.Gene.symbol, organism="human")
    assert result.validated == ["A1CF", "A1BG"]
    assert result.non_validated == ["FANCD1", "FANCD20"]
 - classmethod validate(values, field=None, *, mute=False, organism=None, source=None, strict_source=False)¶
- Validate values against existing values of a string field.
- Note that this is strict validation: only exact matches are asserted.
- Parameters:
- values (list[str] | Series | array) – Values that will be validated against the field.
- field (str | DeferredAttribute | None, default: None) – The field of values. Examples are 'ontology_id' to map against the source ID or 'name' to map against the ontologies field names.
- mute (bool, default: False) – Whether to mute logging.
- organism (str | SQLRecord | None, default: None) – An Organism name or record.
- source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to validate against.
- strict_source (bool, default: False) – Determines the validation behavior against records in the registry.
- If False, validation will include all records in the registry, ignoring the specified source.
- If True, validation will only include records in the registry that are linked to the specified source.
- Note: this parameter won’t affect validation against public sources.
 
- Return type:
- ndarray
- Returns:
- A vector of booleans indicating if an element is validated. 
 - See also
 - Example:
    import bionty as bt

    bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()
    gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
    bt.Gene.validate(gene_symbols, field=bt.Gene.symbol, organism="human")
    #> array([ True, True, False, False])
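Both inspect() and validate() rest on exact matching: a value counts as validated only if it is identical to a value in the field. The sketch below reproduces that idea with a plain set lookup. It is an illustration under that assumption, not the library's code: `validate_sketch` is a hypothetical helper, and it returns a Python list where the documented method returns an ndarray.

```python
def validate_sketch(values, registry_values):
    """Return one boolean per input value: True if it matches a registry value exactly."""
    known = set(registry_values)
    return [value in known for value in values]


registry_symbols = ["A1CF", "A1BG", "BRCA2"]
gene_symbols = ["A1CF", "A1BG", "FANCD1", "FANCD20"]

flags = validate_sketch(gene_symbols, registry_symbols)
validated = [s for s, ok in zip(gene_symbols, flags) if ok]
non_validated = [s for s, ok in zip(gene_symbols, flags) if not ok]

print(flags)          # [True, True, False, False]
print(validated)      # ['A1CF', 'A1BG']
print(non_validated)  # ['FANCD1', 'FANCD20']
```

"FANCD1" fails here even though it is a synonym of BRCA2; mapping synonyms is the job of standardize(), not of validation.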
 - classmethod from_values(values, field=None, create=False, organism=None, source=None, mute=False)¶
- Bulk create validated records by parsing values for an identifier (such as a name or an id).
- Parameters:
- values (list[str] | Series | array) – A list of values for an identifier, e.g. ["name1", "name2"].
- field (str | DeferredAttribute | None, default: None) – A SQLRecord field to look up, e.g., bt.CellMarker.name.
- create (bool, default: False) – Whether to create records if they don’t exist.
- organism (SQLRecord | str | None, default: None) – A bionty.Organism name or record.
- source (SQLRecord | None, default: None) – A bionty.Source record to validate against when creating records.
- mute ( - bool, default:- False) – Whether to mute logging.
 
- Return type:
- Returns:
- A list of validated records. For bionty registries, also returns knowledge-coupled records. 
 - Notes
 - For more info, see tutorial: Manage biological ontologies.
 - Example:
    import bionty as bt
    import lamindb as ln

    # Bulk create from non-validated values logs warnings and returns an empty list
    ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"])
    assert len(ulabels) == 0

    # With create=True, records are created for values that do not exist yet
    ulabels = ln.ULabel.from_values(["benchmark", "prediction", "test"], create=True).save()
    assert len(ulabels) == 3

    # Bulk create records from public reference
    bt.CellType.from_values(["T cell", "B cell"]).save()
 - classmethod standardize(values, field=None, *, return_field=None, return_mapper=False, case_sensitive=False, mute=False, source_aware=True, keep='first', synonyms_field='synonyms', organism=None, source=None, strict_source=False)¶
- Maps input synonyms to standardized names. - Parameters:
- values (Iterable) – Identifiers that will be standardized.
- field (str | DeferredAttribute | None, default: None) – The field representing the standardized names.
- return_field (str | DeferredAttribute | None, default: None) – The field to return. Defaults to field.
- return_mapper (bool, default: False) – If True, returns {input_value: standardized_name}.
- case_sensitive (bool, default: False) – Whether the mapping is case sensitive.
- mute (bool, default: False) – Whether to mute logging.
- source_aware (bool, default: True) – Whether to standardize from public source. Defaults to True for BioRecord registries.
- keep (Literal['first', 'last', False], default: 'first') – When a synonym maps to multiple names, determines which duplicates to mark as pd.DataFrame.duplicated:
- "first": returns the first mapped standardized name
- "last": returns the last mapped standardized name
- False: returns all mapped standardized names.
- When keep is False, the returned list of standardized names will contain nested lists in case of duplicates.
- When a field is converted into return_field, keep marks which matches to keep when multiple return_field values map to the same field value. 
- synonyms_field (str, default: 'synonyms') – A field containing the concatenated synonyms.
- organism (str | SQLRecord | None, default: None) – An Organism name or record.
- source (SQLRecord | None, default: None) – A bionty.Source record that specifies the version to validate against.
- strict_source (bool, default: False) – Determines the validation behavior against records in the registry.
- If False, validation will include all records in the registry, ignoring the specified source.
- If True, validation will only include records in the registry that are linked to the specified source.
- Note: this parameter won’t affect validation against public sources.
 
- Return type:
- list[str] | dict[str, str]
- Returns:
- If return_mapper is False, a list of standardized names. Otherwise, a dictionary of mapped values with mappable synonyms as keys and standardized names as values.
 - See also - add_synonym()
- Add synonyms. 
- remove_synonym()
- Remove synonyms. 
 - Example:
    import bionty as bt

    # save some gene records
    bt.Gene.from_values(["A1CF", "A1BG", "BRCA2"], field="symbol", organism="human").save()

    # standardize gene synonyms
    gene_synonyms = ["A1CF", "A1BG", "FANCD1", "FANCD20"]
    bt.Gene.standardize(gene_synonyms)
    #> ['A1CF', 'A1BG', 'BRCA2', 'FANCD20']
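The interplay of keep, case_sensitive and return_mapper is easier to follow on a toy synonym table. The sketch below is a simplified model that assumes records are (name, bar-separated synonyms) pairs; `standardize_sketch` is a hypothetical function and makes no claim about how the library resolves synonyms internally.

```python
def standardize_sketch(values, records, keep="first", case_sensitive=False, return_mapper=False):
    """Map synonyms to names; `records` is a list of (name, bar-separated synonyms) pairs."""

    def norm(text):
        return text if case_sensitive else text.lower()

    # Build synonym -> [names] so that ambiguous synonyms stay visible
    synonym_to_names = {}
    for name, synonyms in records:
        for synonym in filter(None, (synonyms or "").split("|")):
            synonym_to_names.setdefault(norm(synonym), []).append(name)

    def pick(names):
        if keep == "first":
            return names[0]
        if keep == "last":
            return names[-1]
        # keep=False: return all names as a nested list when ambiguous
        return names if len(names) > 1 else names[0]

    mapper = {v: pick(synonym_to_names[norm(v)]) for v in values if norm(v) in synonym_to_names}
    if return_mapper:
        return mapper
    # Values without a known synonym are passed through unchanged
    return [mapper.get(v, v) for v in values]


records = [("A1CF", None), ("A1BG", None), ("BRCA2", "FANCD1")]
print(standardize_sketch(["A1CF", "A1BG", "FANCD1", "FANCD20"], records))
# ['A1CF', 'A1BG', 'BRCA2', 'FANCD20']
```

With an ambiguous table such as [("X", "s"), ("Y", "s")], keep="last" yields ["Y"] and keep=False yields [["X", "Y"]], which corresponds to the nested lists described for keep=False.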
 - Methods¶
 - save(*args, **kwargs)¶
- Save the record and its parents recursively.
 - Example:
    import bionty as bt
    record = bt.CellType.from_source(name="T cell")
    record.save()
- Return type:
 
 - restore()¶
- Restore from trash onto the main branch. - Return type:
- None
 
 - delete(permanent=None, **kwargs)¶
- Delete record. - Parameters:
- permanent (bool | None, default: None) – Whether to permanently delete the record (skips trash). If None, performs soft delete if the record is not already in the trash.
- Return type:
- None
 - Examples
 - For any SQLRecord object record, call:
    >>> record.delete()
 - view_parents(field=None, with_children=False, distance=5)¶
- View parents in an ontology. - Parameters:
- field (str | DeferredAttribute | None, default: None) – Field to display on graph
- with_children (bool, default: False) – Whether to also show children.
- distance (int, default: 5) – Maximum distance still shown.
 
 - Ontological hierarchies: ULabel (project & sub-project), CellType (cell type & subtype).
 - Examples
    >>> import bionty as bt
    >>> bt.Tissue.from_source(name="subsegmental bronchus").save()
    >>> record = bt.Tissue.get(name="respiratory tube")
    >>> record.view_parents()
    >>> record.view_parents(with_children=True)
 - view_children(field=None, distance=5)¶
- View children in an ontology. - Parameters:
- field (str | DeferredAttribute | None, default: None) – Field to display on graph
- distance (int, default: 5) – Maximum distance still shown.
 
 - Ontological hierarchies: ULabel (project & sub-project), CellType (cell type & subtype).
 - Examples
    >>> import bionty as bt
    >>> bt.Tissue.from_source(name="subsegmental bronchus").save()
    >>> record = bt.Tissue.get(name="respiratory tube")
    >>> record.view_children()
 - add_synonym(synonym, force=False, save=None)¶
- Add synonyms to a record. - Parameters:
- synonym (str | list[str] | Series | array) – The synonyms to add to the record.
- force (bool, default: False) – Whether to add synonyms even if they are already synonyms of other records.
- save (bool | None, default: None) – Whether to save the record to the database.
 
 - See also - remove_synonym()
- Remove synonyms. 
 - Example:
    import bionty as bt

    # save "T cell" record
    record = bt.CellType.from_source(name="T cell").save()
    record.synonyms
    #> "T-cell|T lymphocyte|T-lymphocyte"

    # add a synonym
    record.add_synonym("T cells")
    record.synonyms
    #> "T cells|T-cell|T-lymphocyte|T lymphocyte"
 - remove_synonym(synonym)¶
- Remove synonyms from a record. - Parameters:
- synonym (str | list[str] | Series | array) – The synonym values to remove.
 - See also - add_synonym()
- Add synonyms 
 - Example:
    import bionty as bt

    # save "T cell" record
    record = bt.CellType.from_source(name="T cell").save()
    record.synonyms
    #> "T-cell|T lymphocyte|T-lymphocyte"

    # remove a synonym
    record.remove_synonym("T-cell")
    record.synonyms
    #> "T lymphocyte|T-lymphocyte"
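Because synonyms are stored as a single bar-separated string, adding and removing a synonym boils down to editing that string. The sketch below shows the string handling only, under the assumption that order is preserved and duplicates are skipped; `add_synonym_sketch` and `remove_synonym_sketch` are hypothetical helpers, and the library may order the result differently (as the add_synonym() example output suggests).

```python
def _as_list(synonym):
    """Accept a single string or an iterable of strings."""
    return [synonym] if isinstance(synonym, str) else list(synonym)


def add_synonym_sketch(existing, synonym):
    """Append new synonyms to a bar-separated string, skipping duplicates."""
    current = [s for s in (existing or "").split("|") if s]
    for item in _as_list(synonym):
        if item not in current:
            current.append(item)
    return "|".join(current) or None


def remove_synonym_sketch(existing, synonym):
    """Drop the given synonyms from a bar-separated string."""
    drop = set(_as_list(synonym))
    kept = [s for s in (existing or "").split("|") if s and s not in drop]
    return "|".join(kept) or None


synonyms = "T-cell|T lymphocyte|T-lymphocyte"
print(add_synonym_sketch(synonyms, "T cells"))
print(remove_synonym_sketch(synonyms, "T-cell"))
```

Returning None for an empty result mirrors the str | None type of the synonyms field.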
 - set_abbr(value)¶
- Set value for abbr field and add to synonyms. - Parameters:
- value (str) – A value for an abbreviation.
 - See also
 - Example:
    import bionty as bt

    # save an experimental factor record
    scrna = bt.ExperimentalFactor.from_source(name="single-cell RNA sequencing").save()
    assert scrna.abbr is None
    assert scrna.synonyms == "single-cell RNA-seq|single-cell transcriptome sequencing|scRNA-seq|single cell RNA sequencing"

    # set abbreviation
    scrna.set_abbr("scRNA")
    assert scrna.abbr == "scRNA"

    # synonyms are updated
    assert scrna.synonyms == "scRNA|single-cell RNA-seq|single cell RNA sequencing|single-cell transcriptome sequencing|scRNA-seq"