---
title: "colouR"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{colouR}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
# Introduction to the colouR Package
Welcome to the `colouR` package, a useful tool for analyzing and utilizing the colors in images, as well as providing color palettes inspired by Radiohead and Taylor Swift album covers. Whether you are a designer looking for inspiration, a data analyst searching for unique ways to visualize data, or a music lover wanting to incorporate your favorite album colors into your projects, this package is for you. It is recommended to view this instructional guide via the GitHub page: https://alaninglis.github.io/colouR/articles/colouR.html
## Package Overview
The `colouR` package provides a set of functions that allows you to:
* Extract color values from an image (either JPG or PNG).
* Obtain the top $n$ colors in the image based on their frequency.
* Group and average colors in the image.
* Utilize color palettes inspired by Radiohead and Taylor Swift album covers.
* These functionalities make the `colouR` package a versatile and easy-to-use tool for exploring and working with colors in images.
## Key Functions
Some of the main functions included in the `colouR` package are:
* `getTopCol()`: Extracts the top n colors from an image, with options to exclude black and white shades, and to group and average colors.
* `colPalette()`: Creates a color palette based on a specified album cover from either Radiohead or Taylor Swift discography.
* `scaleColor()`: Provides a ggplot2-compatible color scale based on the selected album cover palette, for both discrete and continuous data.
* `scaleFill()`: Provides a ggplot2-compatible fill scale based on the selected album cover palette, for both discrete and continuous data.
* `groupCols`: This function takes a vector of hex color values and groups them using k-means clustering in the RGB color space.
* `avgHex`: This function takes a data frame with two columns: one for the hex color values and another for the group labels. It calculates the average color for each group and returns a data frame with the group labels and their corresponding average hex colors.
* `img2pal`: Creates a Colour Palette from an input image.
* `plotPalette`: This function takes a data frame with a column of colors and plots the colors as a color palette.
In addition, we provide several utility functions, all of which are demonstrated in this document.
## Getting Started
To begin using the `colouR` package, simply install it from GitHub, load it into your R session, and start exploring the world of colors in images.
```{r}
# install.packages("devtools")
#devtools::install_github("AlanInglis/colouR")
# Load the package
library(colouR)
# Load ggplot2 for making some plots
library(ggplot2)
```
# Get the top n colours in an image
The first function we demonstrate is the `getTopCol` function. This function reads an image file, extracts the colors, and returns the top n colors based on their frequency in the image. Optionally, black and white shades can be excluded, and the colors can be grouped and averaged (more on colour averaging later!). This function can take in a .jpg, .jpeg, or .png or a url pointing to an image using any of these formats, via the `path` argument and returns the top `n` colours used in the image.
## Function Arguments
The arguments for this function are:
* `path` Character, the path to the image file (either jpg or png).
* `n` Integer, the number of top colors to return. If NULL (default), return all colors.
* `exclude` Logical, whether to exclude black and white shades. Default is TRUE.
* `sig` Integer, the number of decimal places for the color percentage. Default is 4.
* `avgCols` Logical, whether to average the colors by groups. Default is TRUE.
* `n_clusters` Integer, the number of clusters to use for grouping colors. Default is 5.
* `customExclude` Character vector. Optional vector of custom color codes in HEX format to be excluded.
## Input Image
To begin, lets first take a look at a raw image:
```{r, out.height='30%', out.width='30%', fig.align="center"}
knitr::include_graphics("https://raw.githubusercontent.com/AlanInglis/colouR/master/images/bender.png")
```
Figure 1: The _"towering inferno of physical perfection"_ that is Bender.
## Top 10 colours in image
In the code below we obtain the top 10 most frequent colours used in the image without using any colour grouping or averaging by setting `avgCols = FALSE`. Additionally, we chose not to exclude any black or white shades by setting `exclude = FALSE` (note: the `exclude` argument excludes many black and white shades, however this list is far from exhaustive and, consequently, blacks and white will most likely still be included. However we do allow you to provide additional black and white hex codes to be included in the exclude function... more on that below). The output of the `getTopCols` when setting the outlined parameters is a data frame with three columns, that is, the top colors, their frequency, and percentage in the image.
```{r, top10cols}
set.seed(1701) # for reproducability
top10 <- getTopCol(path = "https://raw.githubusercontent.com/AlanInglis/colouR/master/images/bender.png",
n = 10,
avgCols = FALSE,
exclude = FALSE)
# take a look at the top 100 most frequent colours in the image:
top10
```
## Plot top 10 colours
Plotting the top 10 colours, we can see that the colour white dominates the image with over 51% of the image being white. Since most of this white is probably from the background of the image, this result is not very useful.
```{r, top10colsPlot, out.height='70%', out.width='70%', fig.align="center"}
# order factors
top10$hex <- factor(top10$hex, levels = top10$hex)
# plot
ggplot(top10, aes(x = hex, y = freq)) +
geom_bar(stat = 'identity', fill = top10$hex) +
theme_dark() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
xlab('HEX colour code') +
ylab('Frequency')
```
Figure 2: Frequency of colours used in the image. We can see that white dominates the image.
## Exclude unwanted colours
Since most of this white is probably from the background of the image, this result is not very useful. To exclude white and black shades we set `exclude = TRUE` (more on this below).
```{r, top10colsExclude}
set.seed(1701) # for reproducability
top10exclude <- getTopCol(path = "https://raw.githubusercontent.com/AlanInglis/colouR/master/images/bender.png",
n = 10,
avgCols = FALSE,
exclude = TRUE,
customExclude = NULL)
```
Now, plotting these colours gives a more truer representation of the colours used in the image:
```{r, top10colsPlotExclude, out.height='70%', out.width='70%', fig.align="center"}
# order factors
top10exclude$hex <- factor(top10exclude$hex, levels = top10exclude$hex)
# plot
ggplot(top10exclude, aes(x = hex, y = freq)) +
geom_bar(stat = 'identity', fill = top10exclude$hex) +
theme_bw() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
xlab('HEX colour code') +
ylab('Frequency')
```
Figure 3: Frequency of colours used in the image with white and black colours excluded.
## Average top colours
In Figure 3, we can see that there are a lot of similar colours. That is, many of the colours are different shades of a metallic blue. By setting `avgCols = TRUE`, we can group together colours with similar shades into $n$ groups via the `n_clusters` argument and average over them to produce a single colour. In this case, we are setting `n_clusters = 5` (this eliminates the need to set the `n` argument).
```{r, top10colsAvg}
set.seed(1701) # for reproducability
top10avg <- getTopCol(path = "https://raw.githubusercontent.com/AlanInglis/colouR/master/images/bender.png",
avgCols = TRUE,
exclude = TRUE,
n_clusters = 5)
```
```{r, top10colsPlotAvg, out.height='70%', out.width='70%', fig.align="center"}
# order factors
top10avg$avg_color <- factor(top10avg$avg_color, levels = top10avg$avg_color)
# plot
ggplot(top10avg, aes(x = avg_color, y = freq)) +
geom_bar(stat = 'identity', fill = top10avg$avg_color) +
theme_bw() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
xlab('Average colour') +
ylab('Frequency')
```
Figure 4: Frequency of averaged colours used in the image with white and black colours excluded.
# Excude custom colours
In Figure 4, we can see that several black shade have slipped through the `exclude` filter. However, we can provide additional hex codes by passing them to the `excludeCols` argument. To illustrate this point, we will use the previously created dataframe of top 10 colours, with the inbuilt black and white shades removed. Examining the colours we have:
```{r, look_at_top10}
top10exclude
```
However, if we want to exclude any of these colours, we can pass them as a character vector of hex values to the `customExclude` argument, as follows:
```{r, exclude_custom}
coloursToExclude <- c("#A9C5DA", "#7CA5C1", "#C9E0F0", "#5A8595", "#FFFAC2")
top10exclude <- getTopCol(path = "https://raw.githubusercontent.com/AlanInglis/colouR/master/images/bender.png",
n = 10,
avgCols = FALSE,
exclude = TRUE,
customExclude = coloursToExclude)
```
Now when we look at the `top10exclude` object, it should not contain any of the colours selected.
```{r, exclude_view}
top10exclude
```
# Grouping colours
As we have already seen, in `colouR` we provide an option to group and average colours in the `getTopCol` function. The function used to group the colours is the `groupCols` function. This function takes a vector of hex color values and converts them to the RGB colour space. It then groups them into `n_clusters` using k-means clustering. For example, if we take a vector of colours like the one below, we can see that there are some unique colours and some colours that are similar. To begin, lets take a look at the colour palette, we can do this by using the `plotPalette` function:
```{r, palette, out.height='70%', out.width='70%', fig.align="center"}
hex_colors <- c("#FF0000", "#00FF00", "#0000FF", "#FFFF00", "#FF00FF", "#1050FF", "#ffff50")
plotPalette(hex_colors)
```
Figure 5: Colour palette.
To group the colours into, say, 4 groups we set `n_clusters = 4`. The output is a data frame with two columns. One containing the hex value and another containing the group number.
```{r, group}
cols <- c("#FF0000", "#00FF00", "#0000FF", "#FFFF00", "#FF00FF", "#1050FF", "#ffff50")
set.seed(1701) # for reproducability
grCol <- groupCols(hex_colors = cols, n_clusters = 4)
```
## Plot Groups
Arranging the data frame by group and plotting gives us:
```{r, groupPlot, out.height='70%', out.width='70%', fig.align="center"}
# order factors
grCol$hex_color <- factor(grCol$hex_color, levels = grCol$hex_color)
# plot
ggplot(grCol, aes(x = hex_color, y = group)) +
geom_bar(stat = 'identity', fill = grCol$hex_color) +
theme_bw() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
xlab('Average colour') +
ylab('Group')
```
Figure 6: Grouped colours.
or using the `plotPalette` function:
```{r, palette2, out.height='70%', out.width='70%', fig.align="center"}
plotPalette(df = grCol, color_col = 'hex_color')
```
Figure 7: Grouped colour palette.
We can see that the green colour is in a single group, the two blue colours are grouped together, along with the two yellow colours. The red and violet colours are also in a group.
# Average colours
The `avgHex` function takes a data frame with two columns: one for the hex color values and another for the group labels. It calculates the average color for each group and returns a data frame with the group labels and their corresponding average hex colors. Using the grouped colours from before we get:
```{r, avg, out.height='70%', out.width='70%', fig.align="center"}
set.seed(1701)
avgCl <- avgHex(df = grCol, group_col = 'group', hex_col = 'hex_color')
avgCl
plotPalette(df = avgCl, color_col = 'avg_color')
```
Figure 8: Averaged colour palette.
In Figure 8, we can see that the four groups have been averaged into single colours.
# Build Colour Palette from Image
The `img2pal` function automates some of the above processes and creates a custom palette, ready to use, directly from an input image. The function arguments mirror those from the `gettpCol` function. In the example below, we are creating a colour palette of the top 10 most frequent colours, while grouping into 15 clusters and averaging the colours.
Using the same image of Bender from above, we can do the following:
```{r, img2palette, out.height='70%', out.width='70%', fig.align="center"}
pal <- img2pal(path = "https://raw.githubusercontent.com/AlanInglis/colouR/master/images/bender.png",
n = 10,
avgCols = TRUE,
exclude = TRUE,
n_clusters = 15,
customExclude = NULL)
```
Figure 9: Colour palette created directly from input image.
And we can take a look at the hex codes for the colour palette by checking the newly created `pal`opbject:
```{r, look_at_palette}
pal
```
# Prebuilt Palettes
One useful feature of taking in an image and return the top $n$ colours is the ability to turn that image into a colour palette. For fun, we provide colour palettes based on all the studio albums of both Radiohead and Taylor Swift. The palettes can be accessed by indexing either `radiohead_palettes` or `taylor_palettes`, as shown in the code below. It should be noted, that when creating these custom palettes, the top 10 average colours were chosen.
## Radiohead
The full list of names for Radiohead are:
* `pabloHoney`: Pablo Honey
* `Bends` : The Bends
* `okComputer`: OK Computer
* `KID_A` : Kid A
* `Amnesiac` : Amnesiac
* `httt` : Hail to the Theif
* `inRainbows`: In Rainbows
* `tkol` : The King of Limbs
* `amsp` : A Moon Shaped Pool
## Taylor Swift
The full list of names for Taylor Swift are:
* `tSwift` : Taylor Swift
* `fearless` : Fearless
* `speakNow` : Speak Now
* `red` : Red
* `1989` : 1989
* `reputation`: Reputation
* `lover` : Lover
* `folklore` : Folklore
* `evermore` : Evermore
* `midnights` : Midnights
```{r, pals}
radiohead_palettes$pabloHoney
taylor_palettes$red
```
## Palette Plots
To view any of the palettes, we can use the `plotPalette` function:
```{r, rhpal, out.height='70%', out.width='70%', fig.align="center"}
plotPalette(radiohead_palettes$okComputer)
```
Figure 10: Radiohead OK Computer colour palette.
```{r, typal, out.height='70%', out.width='70%', fig.align="center"}
plotPalette(taylor_palettes$red)
```
Figure 11: Taylor Swift Red colour palette.
Additionally, to create a larger colour palette we provide the `colPalette` function. This function generates a custom color palette based on the specified `palette` name. The color palettes are sourced from two predefined lists: `taylor_palettes` and `radiohead_palettes`. For example
```{r, typal20, out.height='70%', out.width='70%', fig.align="center"}
# Create a color palette based on a Taylor Swift album cover
tswift_palette <- colPalette(palette = "evermore")
tpal <- tswift_palette(20)
plotPalette(tpal)
```
Figure 12: Taylor Swift Repuatation colour palette.
# Use in ggplot
For convenience, we also provide functionality to use these palettes as either a scale fill or scale colour (similar to the `ggplot2` `scale_color` and `scale_fill` functions).
```{r rhpalgg, out.height='70%', out.width='70%', fig.align="center"}
library(dplyr)
# Apply a Radiohead color scale to a ggplot2 plot
# Create a summary data frame with counts per manufacturer for the mpg data
manufacturer_counts <- mpg %>%
group_by(manufacturer) %>%
summarize(count = n())
# sort the data
mpgsort <- manufacturer_counts[order(manufacturer_counts$count, decreasing = TRUE), ]
# order factors
mpgsort$manufacturer <- factor(mpgsort$manufacturer, levels = mpgsort$manufacturer)
# Create the plot using a Radiohead palette
ggplot(mpgsort, aes(x = manufacturer, y= count, fill = manufacturer)) +
geom_bar(stat = 'identity') +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
scaleFill(palette = "pabloHoney", guide = "none")
```
Figure 13: Displaying data using a Radiohead colour palette and `scaleFill`.
To do the same using a Taylor Swift palette:
```{r, typalgg, out.height='70%', out.width='70%', fig.align="center"}
# Create the plot using a Taylor Swift palette
ggplot(mpgsort, aes(x = manufacturer, y= count, fill = manufacturer)) +
geom_bar(stat = 'identity') +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
scaleFill(palette = "evermore", guide = "none")
```
Figure 14: Displaying data using a Taylor Swift colour palette and `scaleFill`.
Similarly, to use `scaleColor`:
```{r, rhpalgg2, out.height='70%', out.width='100%', fig.align="center"}
# Create the plot using a Radiohead palette
ggplot(mpg[1:122,], aes(x = displ, y = cty, color = manufacturer)) +
geom_point(size = 2) +
scaleColor(palette = 'tkol') +
theme_minimal()
```
Figure 15: Displaying data using a Radiohead colour palette and `scaleColor`.
And using a Taylor Swift palette:
```{r, typalgg2, out.height='70%', out.width='100%', fig.align="center"}
# Create the plot using a Taylor Swift palette
ggplot(mpg[1:122,], aes(x = displ, y = cty, color = manufacturer)) +
geom_point(size = 2) +
scaleColor(palette = 'tSwift') +
theme_minimal()
```
Figure 16: Displaying data using a Taylor Swift colour palette and `scaleColor`.
Of course, we can use these palettes in a more _traditional_ way by passing the palette to ggplot. For example:
```{r, tyhm, out.height='70%', out.width='100%', fig.align="center"}
# Dummy data
x <- LETTERS[1:20]
y <- paste0("var", seq(1,20))
data <- expand.grid(X=x, Y=y)
data$Z <- runif(400, 0, 5)
# Set a Taylor Swift palette of two colours
pal <- taylor_palettes$tSwift[c(6,5)]
# Create a heatmap
ggplot(data, aes(x = X, y = Y)) +
geom_tile(aes(fill = Z)) +
scale_fill_gradientn(
colors = pal, name = "Z value",
guide = guide_colorbar(
order = 1,
frame.colour = "black",
ticks.colour = "black"
), oob = scales::squish
) +
xlab('') + ylab('') +
theme_bw()
```
Figure 17: Heatmap displaying data using a Taylor Swift colour palette.
# Utility Functions
In this section we take a brief look at some of the utility functions used in `colouR`. Thes include a useful little function that returns the file extension of a given file. For example:
```{r, ext}
fileName <- "https://raw.githubusercontent.com/AlanInglis/colouR/master/images/bender.png"
getExtension(file = fileName)
# another example
getExtension(file = "example.txt")
```
We can see that the returned values are .png and .txt, respectivley.
Additionally, we provide a function that reads an image file (PNG or JPG) from a URL and returns the image data. This is done via the `read_image_from_url` function. It returns an object containing the image data. If the image is a JPG, the object will be of class "array". If the image is a PNG, the object will be of class "matrix".
Using the image of Bender from before we can get the image data. The resulting object can then be used, for example:
```{r, url, out.height='100%', out.width='100%', fig.align="center"}
urlName <- "https://raw.githubusercontent.com/AlanInglis/colouR/master/images/bender.png"
image <- read_image_from_url(path = urlName)
# set up a plot
plot(c(100, 250), c(300, 550), type = "n", xlab = "", ylab = "")
rasterImage(image,100,300,150,550)
```
Figure 18: Reading in an image from a url.