clarite.modify.drop_extra_categories

clarite.modify.drop_extra_categories(data: pandas.core.frame.DataFrame, skip: Union[str, List[str], NoneType] = None, only: Union[str, List[str], NoneType] = None)

Update categorical variable types to remove categories that don’t occur in the data.

Parameters:
data: pd.DataFrame or pd.Series

Data to be processed

skip: str, list or None (default is None)

Variable name or list of variable names that will not be checked

only: str, list or None (default is None)

Variable name or list of variable names to check; all other variables are left unchanged

Returns:
data: pd.DataFrame

DataFrame with categorical types updated as needed

Examples

>>> import clarite
>>> df = clarite.modify.drop_extra_categories(df, only=['SDDSRVYR'])
================================================================================
Running drop_extra_categories
--------------------------------------------------------------------------------
SDDSRVYR had categories with no occurrences: 3, 4
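
The effect described above can be approximated with plain pandas. The following is only a minimal sketch, not clarite's implementation: the helper name `drop_unused_sketch` and the sample data are hypothetical, and the assumption that `skip` and `only` cannot be combined is made for illustration only. It relies on the pandas method `Series.cat.remove_unused_categories()`.

```python
import pandas as pd


def drop_unused_sketch(data, skip=None, only=None):
    """Hypothetical stand-in that mimics the documented behavior with pandas."""
    # Assumption for this sketch: 'skip' and 'only' are mutually exclusive.
    if skip is not None and only is not None:
        raise ValueError("Specify either 'skip' or 'only', not both")

    # Accept a single variable name as well as a list of names.
    if isinstance(skip, str):
        skip = [skip]
    if isinstance(only, str):
        only = [only]

    # Decide which columns are checked.
    columns = list(data.columns)
    if only is not None:
        columns = [c for c in columns if c in only]
    elif skip is not None:
        columns = [c for c in columns if c not in skip]

    # Work on a copy so the caller's DataFrame is left untouched.
    result = data.copy()
    for col in columns:
        # Only categorical columns carry a category list to trim.
        if isinstance(result[col].dtype, pd.CategoricalDtype):
            result[col] = result[col].cat.remove_unused_categories()
    return result


# Made-up data: categories 3 and 4 are declared but never occur.
df = pd.DataFrame({
    "SDDSRVYR": pd.Categorical([1, 2, 2, 1], categories=[1, 2, 3, 4]),
    "OTHER": pd.Categorical(["a", "a", "b", "a"], categories=["a", "b", "c"]),
})

cleaned = drop_unused_sketch(df, only="SDDSRVYR")
print(list(cleaned["SDDSRVYR"].cat.categories))  # [1, 2]
print(list(cleaned["OTHER"].cat.categories))     # ['a', 'b', 'c']
```

Because `only="SDDSRVYR"` restricts the check to one variable, the unused category `'c'` in `OTHER` is kept; passing `skip="SDDSRVYR"` instead would trim `OTHER` and leave `SDDSRVYR` as it was.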