clarite.modify.rowfilter_incomplete_obs

clarite.modify.rowfilter_incomplete_obs(data, skip: Union[str, List[str], NoneType] = None, only: Union[str, List[str], NoneType] = None)

Remove rows (observations) that contain null values in any of the checked columns

Parameters:
data: pd.DataFrame

The DataFrame to be filtered

skip: str, list or None (default is None)

Column name or list of column names to exclude from the null check; all other columns are checked

only: str, list or None (default is None)

Column name or list of column names to check for null values; all other columns are ignored

Returns:
data: pd.DataFrame

The filtered DataFrame

Examples

>>> import clarite
>>> nhanes_filtered = clarite.modify.rowfilter_incomplete_obs(nhanes, only=[phenotype] + covariates)
================================================================================
Running rowfilter_incomplete_obs
--------------------------------------------------------------------------------
Removed 3,687 of 22,624 observations (16.30%) due to NA values in any of 8 variables
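
For readers who want to see the skip/only semantics in plain pandas, the following is a minimal sketch and not CLARITE's actual implementation. The function name drop_incomplete_rows, the toy DataFrame, and its column names are hypothetical. Rejecting a call that passes both skip and only is an assumption made for this sketch, not behaviour stated in this page.

```python
from typing import List, Optional, Union

import pandas as pd


def drop_incomplete_rows(
    data: pd.DataFrame,
    skip: Optional[Union[str, List[str]]] = None,
    only: Optional[Union[str, List[str]]] = None,
) -> pd.DataFrame:
    """Hypothetical stand-in that mimics the documented skip/only behaviour."""
    # Assumption for this sketch: the two selectors are mutually exclusive
    if skip is not None and only is not None:
        raise ValueError("Specify at most one of 'skip' and 'only'")

    # Accept a single column name as well as a list of names
    if isinstance(skip, str):
        skip = [skip]
    if isinstance(only, str):
        only = [only]

    # Work out which columns are checked for null values
    if only is not None:
        columns = list(only)
    elif skip is not None:
        columns = [c for c in data.columns if c not in skip]
    else:
        columns = list(data.columns)

    # Keep only rows with no null value in any checked column
    filtered = data.dropna(subset=columns)

    # Report a summary in the same spirit as the log line above
    removed = len(data) - len(filtered)
    share = removed / max(len(data), 1)
    print(
        f"Removed {removed:,} of {len(data):,} observations ({share:.2%}) "
        f"due to NA values in any of {len(columns)} variables"
    )
    return filtered


# Toy data with hypothetical column names
toy = pd.DataFrame(
    {
        "bmi": [22.5, None, 30.1, 27.8],
        "age": [34, 51, None, 46],
        "notes": [None, "a", "b", None],
    }
)

checked_only = drop_incomplete_rows(toy, only=["bmi", "age"])
skipped_notes = drop_incomplete_rows(toy, skip="notes")
all_columns = drop_incomplete_rows(toy)
```

In this toy data, checking only "bmi" and "age" and skipping "notes" select the same set of checked columns, so both calls keep the same rows. With neither argument, every column is checked.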