Title: | Visualise Correlations |
---|---|
Description: | An investigative tool designed to help users visualize correlations between variables in their datasets. This package aims to provide an easy and effective way to explore and visualize these correlations, making it easier to interpret and communicate results. |
Authors: | Alan Inglis |
Maintainer: | Alan Inglis <[email protected]> |
License: | GPL (>=2) |
Version: | 0.1.0 |
Built: | 2024-11-15 04:53:40 UTC |
Source: | https://github.com/alaninglis/corrviz |
This function creates an animated solar system plot of correlations between variables in a dataset.
animSolar( mat, sun = NULL, export = FALSE, num_frames = 100, path = NULL, gif_name = "solar_system.gif", fps = 60 )
mat |
A square correlation matrix to visualise. |
sun |
A character string specifying the column name in the dataset to be treated as the 'sun' in the solar system plot. |
export |
A logical value specifying whether to export the animation as a GIF file, default is FALSE. |
num_frames |
An integer specifying the number of frames in the animation, default is 100. |
path |
A character string specifying the directory path where the GIF file will be saved, default is NULL. |
gif_name |
A character string specifying the name of the GIF file. Must be in the format "myFile.gif"; default is "solar_system.gif". |
fps |
An integer specifying the frames per second for the animation. Default is 60; only used when exporting a GIF. |
In a solar system correlation plot, the dependent variable of interest is positioned at the center, represented as the sun. The explanatory variables are depicted as planets orbiting around the sun, with their distance from the sun corresponding to the absolute value of their correlation with the dependent variable. Therefore, the greater the distance of a planet from the sun, the weaker the correlation between the explanatory variable and the dependent variable.
The num_frames argument sets the number of frames in the animation. A low value produces the plot more quickly, but the "planets" will jump as the frames transition. A low value of num_frames also affects the orbit of the animation when export = FALSE. This differs from the fps argument, which sets the number of frames played per second when exporting a GIF.
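To make the difference between num_frames and fps concrete, here is a small base R sketch. It rests on two assumptions that the documentation does not state: that the planets complete one full orbit over the animation, and that an exported GIF plays each frame once at fps frames per second. The variable names step_deg and duration_sec are illustrative only.

```r
# Illustrative arithmetic only; not corrviz internals.
num_frames <- 25   # frames drawn in the animation
fps <- 60          # playback rate of an exported GIF

# Assumption: one full orbit is spread evenly over all frames.
step_deg <- 360 / num_frames   # angle a planet moves between frames

# Assumption: the GIF plays every frame once at `fps` frames per second.
duration_sec <- num_frames / fps

cat("Angular step per frame:", step_deg, "degrees\n")
cat("GIF duration:", round(duration_sec, 2), "seconds\n")
```

Under these assumptions, fewer frames means a larger angular step per frame (a jumpier orbit), while fps only changes how fast the same frames are played back.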
An animated solar system plot displaying correlations.
cm <- cor(mtcars)
animSolar(mat = cm, sun = 'mpg', export = FALSE, num_frames = 25)
This function creates either a static or an interactive bar plot of correlations between variables in a dataset.
corrBarplot( mat, interactive = TRUE, pal = colorRampPalette(c("cornflowerblue", "white", "tomato"))(100) )
mat |
A square correlation matrix to visualise. |
interactive |
A logical value specifying whether to create an interactive ggplotly plot, default is TRUE. |
pal |
A colour palette for the bar plot, default is colorRampPalette(c("cornflowerblue", "white", "tomato"))(100). |
Creates a static or interactive bar plot displaying correlation values. In the interactive plot, hovering the mouse over a bar shows the variables and their correlation value.
A static or interactive bar plot displaying correlations.
cm <- cor(mtcars)
corrBarplot(mat = cm, interactive = TRUE)
This function creates an interactive bubble plot of correlations between variables in a dataset.
corrBubble( mat, display = c("all", "upper", "lower"), pal = colorRampPalette(c("cornflowerblue", "white", "tomato"))(100) )
mat |
A square correlation matrix to visualise. |
display |
A character vector specifying which part of the correlation matrix to display: 'all', 'upper', or 'lower', default is 'all'. |
pal |
A color palette for the bubble plot. |
Creates an interactive bubble plot displaying correlation values. Hovering the mouse over a cell shows the variables and their correlation value.
An interactive bubble plot displaying correlations.
cm <- cor(mtcars)
corrBubble(mat = cm, display = 'all')
This function creates a chord plot of correlations between variables in a dataset.
corrChord(mat, threshold = 0, circle = FALSE)
mat |
A square correlation matrix to visualise. |
threshold |
A numeric value indicating the minimum absolute correlation value to display in the plot. |
circle |
A logical value indicating whether to use a circular layout (TRUE) or linear layout (FALSE), default is FALSE. |
When the correlation matrix contains many variables, this plot can quickly become overcomplicated. It is recommended to filter the correlations using the threshold argument to simplify the visualisation.
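The effect of a threshold can be previewed in base R before plotting. The sketch below keeps each variable pair once and retains only pairs whose absolute correlation is at least the threshold. Whether the package compares with >= or > is an assumption here, and strong_pairs is an illustrative name, not a corrviz object.

```r
# Sketch of threshold filtering; not the package's own code.
cm <- cor(mtcars)
threshold <- 0.8

# Take each pair once by looking at the upper triangle only.
# Assumption: pairs with |r| exactly equal to the threshold are kept.
idx <- which(upper.tri(cm) & abs(cm) >= threshold, arr.ind = TRUE)

strong_pairs <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)

# Strongest relationships first.
strong_pairs[order(-abs(strong_pairs$r)), ]
```

Comparing nrow(strong_pairs) with the total number of pairs, choose(ncol(cm), 2), gives a quick sense of how much a given threshold simplifies the plot.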
A chord plot displaying correlations.
cm <- cor(mtcars)
corrChord(mat = cm, threshold = 0.8)
corrChord(mat = cm, threshold = 0.8, circle = TRUE)
This function creates a circular plot of correlations between variables in a dataset.
corrCircle(mat, threshold = 0, ticks = FALSE)
mat |
A square correlation matrix to visualise. |
threshold |
A numeric value indicating the minimum absolute correlation value to display in the plot. |
ticks |
A logical value indicating whether to display ticks (TRUE) or not (FALSE), default is FALSE. |
When the correlation matrix contains many variables, this plot can quickly become overcomplicated. It is recommended to filter the correlations using the threshold argument to simplify the visualisation.
A circular chord plot object displaying the correlations between variables.
cm <- cor(mtcars)
corrCircle(mat = cm, threshold = 0.8)
Create a correlation grid plot to visualize correlations among the columns of a dataset.
corrGrid( mat, display = c("all", "upper", "lower"), type = c("square", "circle", "text", "pie"), showDiag = "TRUE", pal = colorRampPalette(c("darkblue", "white", "darkred"))(100) )
mat |
A square correlation matrix to visualise. |
display |
A character string, specifying the display type, one of "all", "upper", or "lower" (default: "all"). |
type |
A character string, specifying the shape of the correlation coefficients, one of "square", "circle", "text", or "pie" (default: "square"). |
showDiag |
A logical value, if TRUE (default), the diagonal of the correlation matrix is shown. |
pal |
A color palette function, used for the correlation coefficient colors. |
A correlation grid plot.
cm <- cor(mtcars)
corr_grid_plot <- corrGrid(mat = cm, type = 'square')
This function creates an interactive heatmap of correlations between variables in a dataset.
corrHeatmap( mat, display = c("all", "upper", "lower"), reorder = TRUE, pal = colorRampPalette(c("darkblue", "white", "darkred"))(100) )
mat |
A square correlation matrix to visualise. |
display |
A character vector specifying which part of the correlation matrix to display: 'all', 'upper', or 'lower', default is 'all'. |
reorder |
A logical value indicating whether to reorder the heatmap based on hierarchical clustering, default is TRUE. |
pal |
A color palette for the heatmap. |
Creates an interactive heatmap displaying correlation values. Hovering the mouse over a cell shows the variables and their correlation value.
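As a rough illustration of what reordering by hierarchical clustering means, the base R sketch below reorders a correlation matrix so that similar variables sit next to each other. The dissimilarity (1 - r) and the default complete linkage are assumptions chosen for the example; the documentation does not say which distance or linkage corrHeatmap uses.

```r
# Sketch of clustering-based reordering; not the package's own code.
cm <- cor(mtcars)

# Assumption: dissimilarity = 1 - r, so highly correlated variables are "close".
d <- as.dist(1 - cm)

# Assumption: hclust()'s default complete linkage.
hc <- hclust(d)

# Apply the dendrogram order to both rows and columns.
cm_reordered <- cm[hc$order, hc$order]
colnames(cm_reordered)
```

The reordered matrix holds the same values as cm; only the row and column order changes, which tends to group blocks of related variables together in a heatmap.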
An interactive heatmap plot displaying correlations.
cm <- cor(mtcars)
corrHeatmap(mat = cm, display = 'all')
Creates an interactive correlation network visualisation.
corrNetwork( mat, threshold = 0, layout = "layout_nicely", width = "100%", height = "400px", physics = TRUE )
mat |
A square correlation matrix to visualise. |
threshold |
A numeric value indicating the minimum absolute correlation value to display in the plot. |
layout |
Any |
width |
The width of the viewing window. |
height |
The height of the viewing window. |
physics |
A logical value indicating whether to use physics-based layout. Default is TRUE. |
Each node in the network represents a variable, and the width of each connecting edge represents the absolute value of the correlation between the two variables it joins. Positive correlations have red edges, whereas negative correlations have blue edges.
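The edge encoding described above can be sketched as a plain edge table in base R. The column names (from, to, width, colour) and the use of the raw absolute correlation as the width are assumptions for illustration; the package may scale widths differently.

```r
# Sketch of the edge encoding; not the package's own code.
ci <- cor(iris[1:4])

# One edge per variable pair (upper triangle only).
idx <- which(upper.tri(ci), arr.ind = TRUE)

edges <- data.frame(
  from   = rownames(ci)[idx[, "row"]],
  to     = colnames(ci)[idx[, "col"]],
  # Assumption: width is the unscaled absolute correlation.
  width  = abs(ci[idx]),
  # Red for positive, blue for negative, as described in the text.
  colour = ifelse(ci[idx] > 0, "red", "blue")
)
edges
```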
A network plot displaying correlations.
ci <- cor(iris[1:4])
corrNetwork(mat = ci, threshold = 0.5)
# Another example
cm <- cor(mtcars)
corrNetwork(mat = cm, threshold = 0.8, layout = 'layout_on_grid', physics = FALSE)
This function creates a pairwise correlation plot with annotated correlation coefficients and optional coloring by a specified variable. The plot can be interactive or static.
corrPairs( data, method = c("pearson", "kendall", "spearman"), interactive = TRUE, col_by = NULL )
data |
A data frame containing the variables to be plotted. |
method |
A character string specifying the correlation method. One of "pearson", "kendall", or "spearman". Default is "pearson". |
interactive |
A logical value indicating whether the output plot should be interactive (TRUE) or static (FALSE). Default is TRUE. |
col_by |
An optional character string specifying the name of the column in the data frame to be used for coloring points. Default is NULL. |
A ggplotly object (if interactive = TRUE) or a ggplot object (if interactive = FALSE) displaying the pairwise correlation plot with annotated correlation coefficients and optional coloring by the specified variable.
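The three method names match those accepted by base R's cor(). As a hedged illustration of how the choice can change the annotated coefficient, the sketch below computes all three for one pair of variables. It assumes, without confirmation from the documentation, that corrPairs computes coefficients in the same way as stats::cor().

```r
# Compare correlation methods for a single variable pair using stats::cor().
dat <- mtcars[, 1:4]
methods <- c("pearson", "kendall", "spearman")

coefs <- sapply(methods, function(m) cor(dat$mpg, dat$disp, method = m))
round(coefs, 3)
```

Pearson measures linear association, while Kendall and Spearman are rank-based, so they can differ noticeably when the relationship is monotonic but not linear.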
corrPairs(data = mtcars[,1:4], method = "pearson", interactive = TRUE, col_by = "cyl")
Create an interactive Sankey diagram to visualize correlations.
corrSankey(mat, threshold = 0, colour = FALSE)
mat |
A square correlation matrix to visualise. |
threshold |
A numeric value indicating the minimum absolute correlation value to include in the diagram. Default is 0 (include all correlations). |
colour |
A logical value indicating whether to color the links based on positive or negative correlation. Default is FALSE (links are grey). |
This function generates a Sankey diagram for a given correlation matrix and correlation threshold, with an optional colour parameter.
A plotly Sankey diagram object.
cm <- cor(mtcars)
corrSankey(mat = cm, threshold = 0.6)
corrSankey(mat = cm, threshold = 0.8, colour = TRUE)
Correlation Explorer Shiny App. This function creates a Shiny application to explore the correlation between variables in a given dataset.
corrShiny( data, x_var, y_var, color_var = NULL, size_var = NULL, correlation_method = "pearson" )
data |
A data frame with the variables to be analyzed. |
x_var |
The name of the variable to be plotted on the X-axis. |
y_var |
The name of the variable to be plotted on the Y-axis. |
color_var |
The name of the variable to be used for coloring the points on the scatter plot. |
size_var |
The name of the variable to be used for sizing the points on the scatter plot. |
correlation_method |
The method to be used for computing the correlation coefficient, must be one of "pearson", "spearman" or "kendall". |
A Shiny app that displays a scatter plot and the correlation coefficient between two variables.
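A possible call is sketched below, built from the usage signature above. It assumes the variable arguments are given as column names (character strings), which the argument descriptions suggest but do not state outright; the chosen mtcars columns are arbitrary. The app is launched only in an interactive session with corrviz installed.

```r
# Assumption: variable arguments are passed as column names (strings).
app_args <- list(
  data = mtcars,
  x_var = "wt",
  y_var = "mpg",
  color_var = "cyl",
  size_var = "hp",
  correlation_method = "spearman"
)

# Launch only in an interactive session with corrviz installed.
if (interactive() && requireNamespace("corrviz", quietly = TRUE)) {
  do.call(corrviz::corrShiny, app_args)
}
```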
This function creates a solar system plot of correlations between variables in a dataset.
corrSolar(mat, sun = NULL)
mat |
A square correlation matrix to visualise. |
sun |
A character string specifying the column name in the dataset to be treated as the 'sun' in the solar system plot. |
In a solar system correlation plot, the dependent variable of interest is positioned at the center, represented as the sun. The explanatory variables are depicted as planets orbiting around the sun, with their distance from the sun corresponding to the absolute value of their correlation with the dependent variable. Therefore, the greater the distance of a planet from the sun, the weaker the correlation between the explanatory variable and the dependent variable.
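The relationship between correlation strength and orbit distance can be sketched in base R. The mapping used here, radius = 1 - |r|, is a hypothetical choice that matches the description (weaker correlation, larger distance); the exact mapping used by the package is not documented here.

```r
# Hypothetical mapping, not the package's own code: radius = 1 - |r|.
cm <- cor(mtcars)
sun <- "mpg"

# Correlations of every other variable with the sun variable.
r <- cm[setdiff(colnames(cm), sun), sun]

# Strong correlations (|r| near 1) sit close to the sun.
orbit_radius <- sort(1 - abs(r))
round(orbit_radius, 2)
```

Under this mapping, the sign of the correlation does not affect the distance; a strongly negative and a strongly positive correlation both place a planet close to the sun.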
A solar system plot displaying correlations.
cm <- cor(mtcars)
corrSolar(mat = cm, sun = 'mpg')
Convert a Matrix to Long Format.
matrix2long(mat)
mat |
A matrix to be converted into long format. |
This function converts a matrix into a long format data frame. The resulting data frame contains four columns: row, column, value, and id. The 'id' column assigns a unique identifier to each column group, making it easier to identify and analyze the data by column groups.
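For readers who want to see the shape of the result without running the package, a rough base R equivalent is sketched below. It reproduces the four documented columns; treating id as an integer index of the column group is an assumption, and long_sketch is an illustrative name rather than the function's actual output.

```r
# Base R sketch of a wide-to-long conversion; not the package's own code.
mat <- matrix(1:9, nrow = 3, ncol = 3,
              dimnames = list(c("A", "B", "C"), c("X", "Y", "Z")))

long_sketch <- data.frame(
  row    = rep(rownames(mat), times = ncol(mat)),
  column = rep(colnames(mat), each = nrow(mat)),
  value  = as.vector(mat),   # column-major order, matching the two rep() calls
  # Assumption: id is an integer index of the column group.
  id     = rep(seq_len(ncol(mat)), each = nrow(mat))
)
long_sketch
```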
A data frame in long format with columns: row, column, value, and id.
# Create a matrix
mat <- matrix(data = 1:9, nrow = 3, ncol = 3, dimnames = list(c("A", "B", "C"), c("X", "Y", "Z")))
long_format <- matrix2long(mat)
long_format
# Using correlation matrix
matrix2long(cor(mtcars))