clarite.analyze.ewas
clarite.analyze.ewas(outcome: str, covariates: List[str], data: Any, regression_kind: Union[str, Type[clarite.modules.analyze.regression.base.Regression], None] = None, **kwargs)
Run an Environment-Wide Association Study (EWAS).
All variables in data other than the outcome (outcome) and the covariates are tested individually. The regression classes selected with regression_kind may each behave slightly differently. Results are sorted in order of increasing pvalue.
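As an illustration of which variables end up being tested, here is a minimal sketch in plain Python. It is not clarite's implementation: the helper name variables_to_test and the column names are hypothetical, and it only mirrors the rule stated above (every column except the outcome and the covariates is tested).

```python
from typing import List


def variables_to_test(columns: List[str], outcome: str, covariates: List[str]) -> List[str]:
    """Return every column that is neither the outcome nor a covariate.

    Hypothetical helper for illustration; not part of clarite.
    """
    excluded = {outcome, *covariates}
    return [name for name in columns if name not in excluded]


# Hypothetical column names, chosen only for this example
columns = ["logBMI", "age", "sex", "lead", "cotinine"]
tested = variables_to_test(columns, outcome="logBMI", covariates=["age", "sex"])
print(tested)  # ['lead', 'cotinine']
```

Under this reading, each name in tested would get its own regression against the outcome, adjusted for the covariates.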
Parameters: - outcome: string
The variable to be used as the outcome of the regressions.
- covariates: list (strings)
The variables to be used as covariates. Any variables in data that are neither the outcome nor listed as covariates are regressed.
- data: Any, usually pd.DataFrame
The data to be analyzed, including the outcome, covariates, and any variables to be regressed.
- regression_kind: str, subclass of Regression, or None
This can be ‘glm’, ‘weighted_glm’, or ‘r_survey’ for built-in Regression types, or a custom subclass of Regression. Defaults to None, which maintains the existing API: glm is used unless a SurveyDesignSpec exists, in which case weighted_glm is used.
- kwargs: Keyword arguments specific to the Regression being used
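The default behaviour of regression_kind can be sketched as a small decision rule. This is a hedged illustration, not clarite's code: the function resolve_regression_kind is hypothetical, and it assumes the survey design is passed as a keyword argument named survey_design_spec, which this page does not confirm.

```python
from typing import Any


def resolve_regression_kind(regression_kind: Any = None, **kwargs: Any) -> Any:
    """Pick the regression to run, following the documented default.

    Hypothetical helper; the 'survey_design_spec' keyword is an assumption.
    """
    if regression_kind is not None:
        # An explicit choice (string or Regression subclass) is used as given
        return regression_kind
    if kwargs.get("survey_design_spec") is not None:
        # A survey design is present, so fall back to the weighted GLM
        return "weighted_glm"
    # No explicit choice and no survey design
    return "glm"


print(resolve_regression_kind())                             # glm
print(resolve_regression_kind(survey_design_spec=object()))  # weighted_glm
print(resolve_regression_kind("r_survey"))                   # r_survey
```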
Returns: - df: pd.DataFrame
EWAS results DataFrame with at least these columns: [‘N’, ‘pvalue’, ‘error’, ‘warnings’]. Each row is indexed by the outcome and the variable being assessed.
Examples
>>> ewas_discovery = clarite.analyze.ewas("logBMI", covariates, nhanes_discovery)
Running EWAS on a continuous variable