clarite.describe.correlations

clarite.describe.correlations(data: pandas.core.frame.DataFrame, threshold: float = 0.75)

Return pairs of variables whose absolute Pearson correlation is above the threshold

Parameters:
data: pd.DataFrame

The DataFrame to be described

threshold: float, between 0 and 1

Only pairs of variables whose absolute correlation is above this threshold are listed in the result

Returns:
result: pd.DataFrame

DataFrame listing pairs of correlated variables and their correlation value

Examples

>>> import clarite
>>> correlations = clarite.describe.correlations(df, threshold=0.9)
>>> correlations.head()
               var1      var2  correlation
0  supplement_count  DSDCOUNT     1.000000
1          DR1TM181  DR1TMFAT     0.997900
2          DR1TP182  DR1TPFAT     0.996172
3          DRD370FQ  DRD370UQ     0.987974
4          DR1TS160  DR1TSFAT     0.984733
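
The filtering described above can be approximated with plain pandas. This is a minimal sketch, not clarite's actual implementation: the helper name `correlated_pairs`, the restriction to numeric columns, the strict `>` comparison, and the ordering by descending absolute correlation are all assumptions made for illustration. The `demo` frame is made-up data.

```python
import numpy as np
import pandas as pd


def correlated_pairs(data: pd.DataFrame, threshold: float = 0.75) -> pd.DataFrame:
    """Hypothetical stand-in that lists variable pairs with |Pearson r| > threshold."""
    # Assumption: only numeric columns take part in the correlation.
    numeric = data.select_dtypes(include="number")
    corr = numeric.corr(method="pearson")

    # Indices of the strict upper triangle: each pair once, no self-pairs.
    i, j = np.triu_indices(len(corr.columns), k=1)
    names = corr.columns.to_numpy()
    result = pd.DataFrame(
        {
            "var1": names[i],
            "var2": names[j],
            "correlation": corr.to_numpy()[i, j],
        }
    )

    # Compare the absolute value so strong negative correlations are kept too.
    # NaN correlations (e.g. constant columns) fail the comparison and drop out.
    result = result[result["correlation"].abs() > threshold]

    # Assumption: strongest pairs first, with a fresh 0..n-1 index.
    result = result.sort_values(
        "correlation", key=lambda s: s.abs(), ascending=False
    )
    return result.reset_index(drop=True)


# Made-up data: "b" is an exact multiple of "a"; "c" is negatively related to both.
demo = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5],
        "b": [2, 4, 6, 8, 10],
        "c": [5, 3, 4, 1, 2],
        "label": list("vwxyz"),  # non-numeric column, ignored by the sketch
    }
)
print(correlated_pairs(demo, threshold=0.9))
print(correlated_pairs(demo, threshold=0.75))
```

Because the comparison uses the absolute value, lowering the threshold to 0.75 also brings in the negatively correlated pairs involving `c`, while the returned `correlation` column keeps its sign.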