clarite.analyze.ewas

clarite.analyze.ewas(phenotype: str, covariates: List[str], data: Any, regression_kind: Union[str, Type[clarite.internal.regression.base.Regression], None] = None, **kwargs)

Run an Environment-Wide Association Study

Parameters:

phenotype: string

The variable to be used as the output of the regressions

covariates: list of strings

The variables to be used as covariates. Every variable in the data that is neither the phenotype nor a listed covariate is regressed.

data: Any, usually pd.DataFrame

The data to be analyzed, including the phenotype, covariates, and any variables to be regressed.

regression_kind: str or subclass of Regression

This can be ‘glm’, ‘weighted_glm’, or ‘r_survey’ for the built-in Regression types, or a custom subclass of Regression.

None by default to maintain the existing API: ‘glm’ is used unless a SurveyDesignSpec exists, in which case ‘weighted_glm’ is used.

kwargs: Keyword arguments specific to the Regression being used

Returns:

df: pd.DataFrame: EWAS results DataFrame with at least these columns: [‘N’, ‘pvalue’, ‘error’, ‘warnings’], indexed by the phenotype/outcome and the variable being assessed in each row

Examples

>>> ewas_discovery = clarite.analyze.ewas("logBMI", covariates, nhanes_discovery)
Running EWAS on a continuous variable
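
The two rules described under Parameters (which variables get regressed, and which regression kind is used when regression_kind is None) can be sketched in plain Python. This is only an illustration of the documented behavior, not CLARITE's implementation: the helper names variables_to_regress and default_regression_kind, the has_survey_design flag, and the column names are all hypothetical.

```python
from typing import List, Optional


def variables_to_regress(columns: List[str], phenotype: str, covariates: List[str]) -> List[str]:
    """Hypothetical helper: keep every column that is neither the phenotype nor a covariate."""
    excluded = {phenotype, *covariates}
    return [name for name in columns if name not in excluded]


def default_regression_kind(regression_kind: Optional[str], has_survey_design: bool) -> str:
    """Hypothetical helper mirroring the documented default when regression_kind is None."""
    if regression_kind is not None:
        # An explicit choice is always respected
        return regression_kind
    # Documented fallback: 'glm', or 'weighted_glm' when a survey design is present
    return "weighted_glm" if has_survey_design else "glm"


# Made-up column names, for illustration only
columns = ["logBMI", "age", "sex", "lead", "cotinine"]
regressed = variables_to_regress(columns, "logBMI", ["age", "sex"])
print(regressed)  # ['lead', 'cotinine']
print(default_regression_kind(None, has_survey_design=False))  # glm
```

In this sketch each name in regressed would correspond to one row of the results DataFrame, alongside the phenotype it was tested against.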