clarite.describe.correlations
clarite.describe.correlations(data: pandas.core.frame.DataFrame, threshold: float = 0.75)
Return pairs of variables whose Pearson correlation has an absolute value above the threshold
Parameters:
- data: pd.DataFrame
  The DataFrame to be described
- threshold: float, between 0 and 1 (default 0.75)
  Pairs of variables are listed when the absolute value of their correlation is above this threshold
Returns:
- result: pd.DataFrame
  DataFrame listing pairs of correlated variables and their correlation value
Examples
>>> import clarite
>>> correlations = clarite.describe.correlations(df, threshold=0.9)
>>> correlations.head()
               var1      var2  correlation
0  supplement_count  DSDCOUNT     1.000000
1          DR1TM181  DR1TMFAT     0.997900
2          DR1TP182  DR1TPFAT     0.996172
3          DRD370FQ  DRD370UQ     0.987974
4          DR1TS160  DR1TSFAT     0.984733
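To make the documented behaviour concrete, the following is a minimal sketch of how a comparable result could be computed with plain pandas and NumPy. It is not clarite's implementation: the helper name `correlated_pairs`, the restriction to numeric columns, the strict `>` comparison, and the ordering by absolute correlation are all assumptions made for illustration. Only the output column names (`var1`, `var2`, `correlation`) and the use of the absolute Pearson correlation come from the description above.

```python
import numpy as np
import pandas as pd


def correlated_pairs(data: pd.DataFrame, threshold: float = 0.75) -> pd.DataFrame:
    """Hypothetical helper (not part of clarite) listing strongly correlated pairs."""
    # Pearson correlation matrix over the numeric columns only (assumption).
    corr = data.select_dtypes(include="number").corr(method="pearson")
    names = corr.columns.to_numpy()

    # Strict upper triangle: each unordered pair appears once, no self-pairs.
    rows, cols = np.triu_indices(len(names), k=1)
    pairs = pd.DataFrame(
        {
            "var1": names[rows],
            "var2": names[cols],
            "correlation": corr.to_numpy()[rows, cols],
        }
    )

    # Keep pairs whose absolute correlation exceeds the threshold.
    # NaN correlations compare as False and are therefore dropped.
    pairs = pairs[pairs["correlation"].abs() > threshold]

    # Assumption: report the strongest pairs first, by absolute correlation.
    return pairs.sort_values(
        "correlation", key=lambda s: s.abs(), ascending=False
    ).reset_index(drop=True)


# Small made-up frame to exercise the helper; the values are illustrative only.
demo = pd.DataFrame(
    {
        "a": [1, 2, 3, 4],
        "b": [2, 4, 6, 8],    # perfectly positively correlated with "a"
        "c": [4, 3, 2, 1],    # perfectly negatively correlated with "a"
        "d": [1, -1, 1, -1],  # only weakly correlated with the others
    }
)
demo_pairs = correlated_pairs(demo, threshold=0.9)
```

Because the filter uses the absolute value, strongly negative pairs such as `a` and `c` are reported alongside strongly positive ones, and the sign is preserved in the `correlation` column.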