Plot

Functions that generate plots

clarite.plot.histogram(data, column: str, figsize: Tuple[int, int] = (12, 5), title: Optional[str] = None, figure: Optional[matplotlib.pyplot.figure] = None, **kwargs)

Plot a histogram of the values in the given column.

Parameters
data: pd.DataFrame

The DataFrame containing data to be plotted

column: string

The name of the column that will be plotted

figsize: tuple(int, int), default (12, 5)

The figure size of the resulting plot in inches

title: string or None, default None

The title used for the plot

figure: matplotlib Figure or None, default None

Pass in an existing figure to plot to that instead of creating a new one (ignoring figsize)

**kwargs:

Other keyword arguments to pass to the histplot or catplot function of Seaborn

Returns
None

Examples

>>> import clarite
>>> from scipy import stats
>>> title = f"Discovery: Skew of BMXBMI = {stats.skew(nhanes_discovery_cont['BMXBMI']):.6}"
>>> clarite.plot.histogram(nhanes_discovery_cont, column="BMXBMI", title=title, bins=100)
../_images/histogram.png
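
The title in the example above reports the sample skewness of the plotted column. As a rough illustration of what that number measures, the sketch below computes the biased Fisher-Pearson coefficient (third central moment divided by the second central moment raised to 1.5) in plain Python, which is the quantity scipy.stats.skew computes by default. The function name and the data values are made up for illustration and are not part of CLARITE.

```python
def sample_skewness(values):
    """Biased Fisher-Pearson skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    # Second and third central moments (divide by n, not n - 1).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up values: one symmetric sample and one with a long right tail.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [18.5, 21.0, 22.4, 24.9, 27.3, 45.0]

print(f"symmetric:    {sample_skewness(symmetric):.6}")
print(f"right-tailed: {sample_skewness(right_tailed):.6}")
```

A value near zero suggests a roughly symmetric distribution, while a positive value indicates a longer right tail, which is what the histogram would show visually.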
clarite.plot.distributions(data, filename: str, continuous_kind: str = 'count', nrows: int = 4, ncols: int = 3, quality: str = 'medium', variables: Optional[List[str]] = None, sort: bool = True)

Create a PDF containing histograms for each binary or categorical variable, and one of several types of plots for each continuous variable.

Parameters
data: pd.DataFrame

The DataFrame containing data to be plotted

filename: string or pathlib.Path

Name of the saved PDF file. The extension is added automatically if it is not included.

continuous_kind: string (default=‘count’)

What kind of plots to use for continuous data. Binary and categorical variables will always be shown with histograms. One of {‘count’, ‘box’, ‘violin’, ‘qq’}

nrows: int (default=4)

Number of rows per page

ncols: int (default=3)

Number of columns per page

quality: ‘low’, ‘medium’, or ‘high’ (default=‘medium’)

Adjusts the DPI of the plots (150, 300, or 1200, respectively)

variables: List[str] or None (default=None)

Which variables to plot. If None, all variables are plotted.

sort: Boolean (default=True)

Whether or not to sort variable names

Returns
None

Examples

>>> import clarite
>>> clarite.plot.distributions(df[['female', 'occupation', 'LBX074']], filename="test")
../_images/distributions_count.png
>>> clarite.plot.distributions(df[['female', 'occupation', 'LBX074']], filename="test", continuous_kind='box')
../_images/distributions_box.png
>>> clarite.plot.distributions(df[['female', 'occupation', 'LBX074']], filename="test", continuous_kind='violin')
../_images/distributions_violin.png
>>> clarite.plot.distributions(df[['female', 'occupation', 'LBX074']], filename="test", continuous_kind='qq')
../_images/distributions_qq.png
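
To make the interplay of filename, quality, nrows, and ncols more concrete, here is a minimal sketch of how the output path, DPI, and page count could be derived from those arguments. The helper plan_output is hypothetical and not part of CLARITE, and it assumes one plot per grid cell and that ".pdf" is simply appended when missing; the DPI mapping comes from the values documented above.

```python
import math
from pathlib import Path

# DPI values as documented for the quality parameter.
QUALITY_DPI = {"low": 150, "medium": 300, "high": 1200}


def plan_output(filename, n_variables, nrows=4, ncols=3, quality="medium"):
    """Hypothetical helper: resolve the output path, DPI, and page count."""
    path = Path(filename)
    # Assumption: append ".pdf" only when the name does not already end in it.
    if path.suffix.lower() != ".pdf":
        path = path.with_name(path.name + ".pdf")
    # Assumption: each page holds nrows * ncols plots, one per variable.
    pages = math.ceil(n_variables / (nrows * ncols))
    return path, QUALITY_DPI[quality], pages


path, dpi, pages = plan_output("test", n_variables=25)
print(path, dpi, pages)
```

With the default 4 x 3 grid, 25 variables would need three pages under these assumptions; a smaller grid gives larger individual plots at the cost of more pages.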
clarite.plot.manhattan(dfs: Dict[str, pandas.core.frame.DataFrame], categories: Optional[Dict[str, str]] = None, bonferroni: Optional[float] = 0.05, fdr: Optional[float] = None, num_labeled: int = 3, label_vars: Optional[List[str]] = None, figsize: Tuple[int, int] = (12, 6), dpi: int = 300, title: Optional[str] = None, figure: Optional[matplotlib.pyplot.figure] = None, colors: List[str] = ['#53868B', '#4D4D4D'], background_colors: List[str] = ['#EBEBEB', '#FFFFFF'], filename: Optional[str] = None, return_figure: bool = False)

Create a Manhattan-like plot for one or more sets of EWAS results

Parameters
dfs: dictionary (string: DataFrame)

Dictionary mapping dataset names to pandas DataFrames of EWAS results (requires certain columns)

categories: dictionary (string: string) or None

A dictionary mapping each variable name to a category name for optional grouping

bonferroni: float or None (default 0.05)

Show a cutoff line at the pvalue corresponding to a given Bonferroni-corrected pvalue (None for no line)

fdr: float or None (default None)

Show a cutoff line at the pvalue corresponding to a given FDR (None for no line)

num_labeled: int, default 3

Label the top <num_labeled> results with the variable name

label_vars: list of strings, default None

Label the named variables (or pass None to skip labeling this way)

figsize: tuple(int, int), default (12, 6)

The figure size of the resulting plot in inches

dpi: int, default 300

The figure dots-per-inch

title: string or None, default None

The title used for the plot

figure: matplotlib Figure or None, default None

Pass in an existing figure to plot to that instead of creating a new one (ignoring figsize and dpi)

colors: List(string, string), default [“#53868B”, “#4D4D4D”]

A list of colors to use for alternating categories (must be same length as ‘background_colors’)

background_colors: List(string, string), default [“#EBEBEB”, “#FFFFFF”]

A list of background colors to use for alternating categories (must be same length as ‘colors’)

filename: string or None, default None

If provided, the plot is saved to the specified file instead of being shown

return_figure: boolean, default False

If True, return the figure instead of showing or saving the plot. Useful for customizing the plot

Returns
figure: matplotlib Figure or None

If return_figure is True, returns a matplotlib Figure object. Otherwise returns None

Examples

>>> clarite.plot.manhattan({'discovery':disc_df, 'replication':repl_df}, categories=data_categories, title="EWAS Results")
../_images/manhattan.png
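
The bonferroni parameter places a cutoff line at the raw pvalue that corresponds to a Bonferroni-corrected significance level. A minimal sketch of that arithmetic follows, assuming the y-axis is -log10(pvalue), as is conventional for Manhattan plots, and that the number of tests equals the number of results in a dataset. The function name is hypothetical and this is not CLARITE's own code.

```python
import math


def bonferroni_line(alpha, n_tests):
    """Return the raw pvalue threshold and its height on a -log10 axis."""
    # Bonferroni: a raw pvalue is significant when p <= alpha / n_tests.
    threshold = alpha / n_tests
    return threshold, -math.log10(threshold)


# Made-up numbers: 500 tested variables at the default level of 0.05.
threshold, height = bonferroni_line(alpha=0.05, n_tests=500)
print(f"raw pvalue threshold: {threshold:.2e}")
print(f"line height on -log10 axis: {height:.2f}")
```

Because the threshold shrinks as the number of tests grows, the line sits higher on the plot for larger EWAS analyses.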
clarite.plot.manhattan_fdr(dfs: Dict[str, pandas.core.frame.DataFrame], categories: Optional[Dict[str, str]] = None, cutoff: Optional[float] = 0.05, num_labeled: int = 3, label_vars: Optional[List[str]] = None, figsize: Tuple[int, int] = (12, 6), dpi: int = 300, title: Optional[str] = None, figure: Optional[matplotlib.pyplot.figure] = None, colors: List[str] = ['#53868B', '#4D4D4D'], background_colors: List[str] = ['#EBEBEB', '#FFFFFF'], filename: Optional[str] = None, return_figure: bool = False)

Create a Manhattan-like plot for one or more sets of EWAS results using FDR significance

Parameters
dfs: dictionary (string: DataFrame)

Dictionary mapping dataset names to pandas DataFrames of EWAS results (requires certain columns)

categories: dictionary (string: string) or None

A dictionary mapping each variable name to a category name for optional grouping

cutoff: float or None (default 0.05)

The pvalue to draw the FDR significance line at (None for no line)

num_labeled: int, default 3

Label the top <num_labeled> results with the variable name

label_vars: list of strings, default None

Label the named variables (or pass None to skip labeling this way)

figsize: tuple(int, int), default (12, 6)

The figure size of the resulting plot in inches

dpi: int, default 300

The figure dots-per-inch

title: string or None, default None

The title used for the plot

figure: matplotlib Figure or None, default None

Pass in an existing figure to plot to that instead of creating a new one (ignoring figsize and dpi)

colors: List(string, string), default [“#53868B”, “#4D4D4D”]

A list of colors to use for alternating categories (must be same length as ‘background_colors’)

background_colors: List(string, string), default [“#EBEBEB”, “#FFFFFF”]

A list of background colors to use for alternating categories (must be same length as ‘colors’)

filename: string or None, default None

If provided, the plot is saved to the specified file instead of being shown

return_figure: boolean, default False

If True, return the figure instead of showing or saving the plot. Useful for customizing the plot

Returns
figure: matplotlib Figure or None

If return_figure is True, returns a matplotlib Figure object. Otherwise returns None

Examples

>>> clarite.plot.manhattan_fdr({'discovery': disc_df, 'replication': repl_df},
...                            categories=data_categories, title="EWAS Results")
../_images/manhattan_fdr.png
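
For readers unfamiliar with FDR-adjusted pvalues, the sketch below shows the Benjamini-Hochberg adjustment in plain Python. This section does not state which FDR procedure CLARITE uses, so treating it as Benjamini-Hochberg is an assumption; the function name and the pvalues are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted pvalues in the original order."""
    m = len(pvalues)
    # Indices of the pvalues sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest pvalue down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw pvalues for eight variables.
raw = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
adj = benjamini_hochberg(raw)
significant = [p for p in adj if p <= 0.05]
print(len(significant), "of", len(raw), "pass an FDR cutoff of 0.05")
```

With adjusted pvalues, the cutoff line can be drawn directly at the chosen cutoff rather than at a threshold that depends on the number of tests.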
clarite.plot.manhattan_bonferroni(dfs: Dict[str, pandas.core.frame.DataFrame], categories: Optional[Dict[str, str]] = None, cutoff: Optional[float] = 0.05, num_labeled: int = 3, label_vars: Optional[List[str]] = None, figsize: Tuple[int, int] = (12, 6), dpi: int = 300, title: Optional[str] = None, figure: Optional[matplotlib.pyplot.figure] = None, colors: List[str] = ['#53868B', '#4D4D4D'], background_colors: List[str] = ['#EBEBEB', '#FFFFFF'], filename: Optional[str] = None, return_figure: bool = False)

Create a Manhattan-like plot for one or more sets of EWAS results using Bonferroni significance

Parameters
dfs: dictionary (string: DataFrame)

Dictionary mapping dataset names to pandas DataFrames of EWAS results (requires certain columns)

categories: dictionary (string: string) or None

A dictionary mapping each variable name to a category name for optional grouping

cutoff: float or None (default 0.05)

The pvalue to draw the Bonferroni significance line at (None for no line)

num_labeled: int, default 3

Label the top <num_labeled> results with the variable name

label_vars: list of strings, default None

Label the named variables (or pass None to skip labeling this way)

figsize: tuple(int, int), default (12, 6)

The figure size of the resulting plot in inches

dpi: int, default 300

The figure dots-per-inch

title: string or None, default None

The title used for the plot

figure: matplotlib Figure or None, default None

Pass in an existing figure to plot to that instead of creating a new one (ignoring figsize and dpi)

colors: List(string, string), default [“#53868B”, “#4D4D4D”]

A list of colors to use for alternating categories (must be same length as ‘background_colors’)

background_colors: List(string, string), default [“#EBEBEB”, “#FFFFFF”]

A list of background colors to use for alternating categories (must be same length as ‘colors’)

filename: string or None, default None

If provided, the plot is saved to the specified file instead of being shown

return_figure: boolean, default False

If True, return the figure instead of showing or saving the plot. Useful for customizing the plot

Returns
figure: matplotlib Figure or None

If return_figure is True, returns a matplotlib Figure object. Otherwise returns None

Examples

>>> clarite.plot.manhattan_bonferroni({'discovery': disc_df, 'replication': repl_df},
...                                   categories=data_categories, title="EWAS Results")
../_images/manhattan_bonferroni.png
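
The colors and background_colors parameters of the Manhattan plot functions are described as alternating by category and must have the same length. One way to picture that behavior is the sketch below, which cycles through both lists in parallel. The helper and the category names are hypothetical, and the order in which CLARITE visits categories is an assumption.

```python
def assign_category_colors(categories, colors, background_colors):
    """Hypothetical helper: pair each category with alternating colors."""
    if len(colors) != len(background_colors):
        raise ValueError("colors and background_colors must be the same length")
    n = len(colors)
    # Cycle through both lists together so each category gets a matched pair.
    return {
        category: (colors[i % n], background_colors[i % n])
        for i, category in enumerate(categories)
    }


# Made-up category names, using the documented default color lists.
palette = assign_category_colors(
    ["dietary", "heavy metals", "smoking"],
    colors=["#53868B", "#4D4D4D"],
    background_colors=["#EBEBEB", "#FFFFFF"],
)
for category, (point_color, background) in palette.items():
    print(f"{category}: points {point_color} on {background}")
```

Passing longer lists would give more distinct category colors before the cycle repeats, as long as both lists grow together.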
clarite.plot.top_results(ewas_result: pandas.core.frame.DataFrame, pvalue_name: str = 'pvalue', cutoff: Optional[float] = 0.05, num_rows: int = 20, figsize: Optional[Tuple[int, int]] = None, dpi: int = 300, title: Optional[str] = None, figure: Optional[matplotlib.pyplot.figure] = None, filename: Optional[str] = None)

Create a dotplot for EWAS Results showing pvalues and beta coefficients

Parameters
ewas_result: DataFrame

EWAS Result to plot

pvalue_name: str (default ‘pvalue’)

Name of the pvalue column to plot: ‘pvalue’, ‘pvalue_fdr’, or ‘pvalue_bonferroni’

cutoff: float or None (default 0.05)

A vertical line is drawn in the pvalue column to show a significance cutoff

num_rows: int (default 20)

How many rows to show in the plot

figsize: tuple(int, int) or None, default None

The figure size of the resulting plot in inches

dpi: int, default 300

The figure dots-per-inch

title: string or None, default None

The title used for the plot

figure: matplotlib Figure or None, default None

Pass in an existing figure to plot to that instead of creating a new one (ignoring figsize and dpi)

filename: string or None, default None

If provided, a copy of the plot will be saved to the specified file instead of being shown

Returns
None

Examples

>>> clarite.plot.top_results(ewas_result)
../_images/top_results.png
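
As a rough picture of which rows such a plot would contain, the sketch below picks the num_rows results with the smallest pvalues and flags those at or below the cutoff. It works on plain dictionaries rather than a DataFrame; the helper name, the "Variable" and "Beta" keys, and the sort-then-truncate behavior are assumptions for illustration, not CLARITE's documented implementation.

```python
def select_top_results(rows, pvalue_name="pvalue", num_rows=20):
    """Hypothetical helper: keep the num_rows rows with the smallest pvalues."""
    return sorted(rows, key=lambda row: row[pvalue_name])[:num_rows]


# Made-up EWAS-style rows; the real result columns may differ.
ewas_rows = [
    {"Variable": "var_a", "Beta": 0.12, "pvalue": 0.20},
    {"Variable": "var_b", "Beta": -0.40, "pvalue": 0.0004},
    {"Variable": "var_c", "Beta": 0.05, "pvalue": 0.03},
    {"Variable": "var_d", "Beta": 0.31, "pvalue": 0.01},
]

top = select_top_results(ewas_rows, num_rows=3)
for row in top:
    # Mark rows that fall at or below the significance cutoff.
    flag = "*" if row["pvalue"] <= 0.05 else " "
    print(f'{row["Variable"]:<8}{row["Beta"]:>7.2f}{row["pvalue"]:>10.4f} {flag}')
```

Switching pvalue_name to an adjusted column would rank and flag rows by the corrected values instead of the raw ones.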