lamindb.curators.core.DataFrameCatManager
- class lamindb.curators.core.DataFrameCatManager(df, columns_field=FieldAttr(Feature.name), categoricals=None, sources=None, index=None, slot=None, maximal_set=False)
- Bases: object
- Manage categoricals by updating registries.
- This class is accessible from within a DataFrameCurator via the .cat attribute.
- If you find non-validated values, you have two options:
  - new values found in the data can be registered via DataFrameCurator.cat.add_new_from()
  - non-validated values can be accessed via DataFrameCurator.cat.non_validated and addressed manually
 - Attributes
 - property non_validated: dict[str, list[str]]
- Return the non-validated features and labels. 
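
To make the shape of the returned dictionary concrete, here is a minimal standard-library sketch. It is not lamindb's implementation: the `find_non_validated` helper, the `registry` mapping, and the sample columns are hypothetical stand-ins for registry lookups, and the sketch assumes that fully validated columns are simply left out of the result.

```python
# Hypothetical model of the `non_validated` mapping; this is not lamindb code.

def find_non_validated(
    columns: dict[str, list[str]],
    registry: dict[str, set[str]],
) -> dict[str, list[str]]:
    """Map each categorical column to its values that are absent from the registry."""
    result: dict[str, list[str]] = {}
    for key, values in columns.items():
        known = registry.get(key, set())
        # dict.fromkeys drops duplicates while keeping first-seen order
        missing = [v for v in dict.fromkeys(values) if v not in known]
        if missing:  # assumption: fully validated columns are omitted
            result[key] = missing
    return result


columns = {
    "cell_type": ["T cell", "B cell", "my new cell type", "T cell"],
    "tissue": ["lung", "liver"],
}
registry = {
    "cell_type": {"T cell", "B cell"},
    "tissue": {"lung", "liver"},
}

non_validated = find_non_validated(columns, registry)
print(non_validated)  # {'cell_type': ['my new cell type']}
```

Each key is a column name and each value lists the distinct entries of that column that could not be matched, which is the information needed to decide between registering them and fixing them by hand.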
 - Methods
 - lookup(public=False)
- Look up categories.
- Parameters:
- public (bool, default: False) – If True, the lookup is performed on the public reference.
- Return type:
 
 - validate()
- Validate variables and categorical observations.
- Return type:
- bool
 
 - standardize(key)
- Replace synonyms with standardized values.
- Modifies the input dataset in place.
- Parameters:
- key (str) – The key referencing the column in the DataFrame to standardize.
- Return type:
- None
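
The in-place behavior and the None return value can be pictured with a small standard-library sketch. It is a conceptual model only, not lamindb's code: the `standardize` function below, the `synonyms` mapping, and the sample column are hypothetical, and the real method resolves synonyms through the registries rather than a plain dictionary.

```python
# Hypothetical model of in-place synonym standardization; this is not lamindb code.

def standardize(columns: dict[str, list[str]], key: str, synonyms: dict[str, str]) -> None:
    """Replace known synonyms in column `key` with their standardized names, in place."""
    values = columns[key]
    for i, value in enumerate(values):
        # Values without a synonym entry are left untouched
        values[i] = synonyms.get(value, value)


columns = {"cell_type": ["T-cell", "B cell", "T lymphocyte"]}
synonyms = {"T-cell": "T cell", "T lymphocyte": "T cell"}

result = standardize(columns, "cell_type", synonyms)
print(result)   # None
print(columns)  # {'cell_type': ['T cell', 'B cell', 'T cell']}
```

Because the data is mutated rather than copied, any object that holds a reference to the same dataset sees the standardized values afterwards.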
 
 - add_new_from(key, **kwargs)
- Add validated & new categories.
- Parameters:
- key (str) – The key referencing the slot in the DataFrame from which to draw terms.
- **kwargs – Additional keyword arguments to pass to create new records.
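
How the extra keyword arguments could flow into newly created records is sketched below with the standard library only. This is an assumption-laden illustration, not lamindb's implementation: the `add_new_from` function, the nested `registry` dictionary, and the `description` field are all hypothetical.

```python
# Hypothetical model of registering new categories; this is not lamindb code.

def add_new_from(
    columns: dict[str, list[str]],
    registry: dict[str, dict[str, dict]],
    key: str,
    **kwargs,
) -> None:
    """Register values of column `key` that the registry does not know yet.

    Extra keyword arguments are stored on every newly created record.
    """
    records = registry.setdefault(key, {})
    for value in dict.fromkeys(columns[key]):  # de-duplicate, keep order
        if value not in records:
            records[value] = {"name": value, **kwargs}


columns = {"cell_type": ["T cell", "my new cell type", "my new cell type"]}
registry = {"cell_type": {"T cell": {"name": "T cell"}}}

add_new_from(columns, registry, "cell_type", description="added during curation")
print(registry["cell_type"]["my new cell type"])
# {'name': 'my new cell type', 'description': 'added during curation'}
```

Existing records are left unchanged in this sketch; only values that were missing from the registry receive a new record carrying the extra fields.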