clarite.modify.make_binary
clarite.modify.make_binary(data: pandas.core.frame.DataFrame, skip: Union[str, List[str], None] = None, only: Union[str, List[str], None] = None)
Set variable types as Binary
Checks that each variable has at most 2 values and converts the type to pd.Categorical.
Note: When these variables are used in regression, they are ordered by value. For example, Sex (Male=1, Female=2) will encode “Male” as 0 and “Female” as 1 during the EWAS regression step.
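The value ordering described in the note can be illustrated with plain pandas. This is a minimal sketch that assumes the regression coding follows the sorted category order of a pandas categorical; the `sex` Series is made-up illustration data, and CLARITE's internal encoding is not shown here.

```python
import pandas as pd

# Hypothetical data: Sex coded as Male=1, Female=2
sex = pd.Series([1, 2, 2, 1, 2], name="Sex")

# Converting to a categorical sorts the categories by value
sex_cat = sex.astype("category")
categories = list(sex_cat.cat.categories)

# The integer codes follow that order: the lower value (1, "Male")
# becomes 0 and the higher value (2, "Female") becomes 1
codes = sex_cat.cat.codes.tolist()

print(categories)
print(codes)
```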
Parameters:
- data: pd.DataFrame or pd.Series
Data to be processed
- skip: str, list or None (default is None)
Variable name, or list of variable names, that should not be made binary
- only: str, list or None (default is None)
Variable name, or list of variable names, to make binary; all other variables are left unchanged
Returns:
- data: pd.DataFrame
DataFrame with the same data but validated and converted to binary types
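The behavior described above (select variables with skip or only, check that each has at most 2 values, convert to a categorical type) could be sketched roughly as follows. This is not CLARITE's implementation: `sketch_make_binary` is a hypothetical helper, and rejecting a call that passes both `skip` and `only`, ignoring missing values when counting, and raising `ValueError` on failure are all assumptions made for illustration.

```python
from typing import List, Optional, Union

import pandas as pd


def sketch_make_binary(
    data: pd.DataFrame,
    skip: Optional[Union[str, List[str]]] = None,
    only: Optional[Union[str, List[str]]] = None,
) -> pd.DataFrame:
    """Hypothetical stand-in for the documented behavior, not the library code."""
    # Accept a single variable name as well as a list of names
    if isinstance(skip, str):
        skip = [skip]
    if isinstance(only, str):
        only = [only]

    # Assumption: skip and only are treated as mutually exclusive
    if skip is not None and only is not None:
        raise ValueError("Specify at most one of 'skip' and 'only'")

    if only is not None:
        columns = list(only)
    elif skip is not None:
        columns = [c for c in data.columns if c not in skip]
    else:
        columns = list(data.columns)

    result = data.copy()
    for col in columns:
        # Assumption: missing values do not count as a distinct value
        n_unique = result[col].nunique(dropna=True)
        if n_unique > 2:
            raise ValueError(f"'{col}' has {n_unique} distinct values, expected at most 2")
        result[col] = result[col].astype("category")
    return result


# Made-up illustration data
df = pd.DataFrame({"female": [0, 1, 1, 0], "black": [1, 0, 0, 0], "age": [34, 51, 29, 62]})
out = sketch_make_binary(df, only=["female", "black"])
print(out.dtypes)
```

Restricting the call with only leaves `age` untouched; calling the sketch on every column would fail the check because `age` has more than 2 distinct values.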
Examples
>>> import clarite
>>> nhanes = clarite.modify.make_binary(nhanes, only=['female', 'black', 'mexican', 'other_hispanic'])
================================================================================
Running make_binary
--------------------------------------------------------------------------------
Set 4 of 970 variable(s) as binary, each with 22,624 observations