This post covers the basics of building a stacked area chart with R and ggplot2. It takes several input formats into account and shows how to customize the output.
The data frame used as input to build a stacked area chart requires 3 columns:
x: numeric variable used for the X axis, often a time.
y: numeric variable used for the Y axis, i.e. the quantity we are looking at.
group: one shape is drawn per group.
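As an illustration of this layout, here is a minimal sketch that turns a wide table (one column per group) into the three-column long format described above. The table `wide` and its values are made up for the example, and base R is used only to keep the sketch dependency-free.

```r
# Hypothetical wide table: one row per time point, one column per group
wide <- data.frame(time = 1:3, A = c(10, 20, 30), B = c(5, 15, 25))

# Long format: one row per (time, group) pair, with the 3 required columns
long <- data.frame(
  time  = rep(wide$time, times = 2),           # x axis
  value = c(wide$A, wide$B),                   # y axis
  group = rep(c("A", "B"), each = nrow(wide))  # one shape per group
)

print(long)
```

The resulting `long` data frame can be passed to ggplot() with x=time, y=value and fill=group.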
The chart is built using the geom_area() function.
# Packages
library(ggplot2)
library(dplyr)
# create data
time <- as.numeric(rep(seq(1,7),each=7)) # x Axis
value <- runif(49, 10, 100) # y Axis
group <- rep(LETTERS[1:7],times=7) # group, one shape per group
data <- data.frame(time, value, group)
# stacked area chart
ggplot(data, aes(x=time, y=value, fill=group)) +
geom_area()
The gallery offers a post dedicated to reordering with ggplot2. This step can be tricky, but the code below shows how to:
give a specific order with the factor() function.
order alphabetically using sort().
order following the values at a specific time.
# Give a specific order:
data$group <- factor(data$group , levels=c("B", "A", "D", "E", "G", "F", "C") )
# Plot again
ggplot(data, aes(x=time, y=value, fill=group)) +
geom_area()
# Note: you can also sort levels alphabetically:
myLevels <- levels(data$group)
data$group <- factor(data$group , levels=sort(myLevels) )
# Note: sort following values at time = 6
myLevels <- data %>%
filter(time==6) %>%
arrange(value)
data$group <- factor(data$group , levels=myLevels$group )
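To see why setting the levels changes the order, here is a minimal sketch on a toy vector. The names `g` and `f` are illustrative only and not part of the post's data. ggplot2 relies on the level order of a factor, not on the order in which values appear in the data, when it stacks groups and builds the legend.

```r
# Toy character vector, in arbitrary order
g <- c("A", "B", "C", "A")

# By default, levels are sorted alphabetically
default_levels <- levels(factor(g))

# An explicit `levels` argument imposes a custom order
f <- factor(g, levels = c("C", "A", "B"))
custom_levels <- levels(f)

# The integer codes follow the custom order: C = 1, A = 2, B = 3
codes <- as.integer(f)

print(default_levels)
print(custom_levels)
print(codes)
```

The values themselves are unchanged; only the order in which they are ranked, and therefore drawn, is different.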
In a proportional stacked area graph, the values at each time point always sum to one hundred, and the value of each group is shown as a percentage.
To build it, you first have to compute these percentages. This can be done using dplyr or with base R.
# Compute percentages with dplyr
library(dplyr)
data <- data %>%
group_by(time, group) %>%
summarise(n = sum(value)) %>%
mutate(percentage = n / sum(n)) # share of each group within a time point (0 to 1)
# Plot
ggplot(data, aes(x=time, y=percentage, fill=group)) +
geom_area(alpha=0.6 , size=1, colour="black")
# Note: compute percentages without dplyr.
# Run this on the original data (it needs the value column), in place of the dplyr step.
# The result is on a 0-100 scale.
my_fun <- function(vec){
as.numeric(vec[2]) / sum(data$value[data$time==vec[1]]) *100
}
data$percentage <- apply(data , 1 , my_fun)
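As a sanity check on the idea, here is a minimal worked sketch on a small hand-made data frame. `toy` and its values are invented for the example, and ave() is a base R alternative swapped in here, not the method used above. Each value is divided by the total of its time point, so the percentages sum to 100 within each time point.

```r
# Hypothetical toy data: 2 time points, 2 groups
toy <- data.frame(
  time  = c(1, 1, 2, 2),
  group = c("A", "B", "A", "B"),
  value = c(30, 70, 20, 60)
)

# ave() returns, for each row, the sum of `value` over rows sharing the same `time`
totals <- ave(toy$value, toy$time, FUN = sum)

# Percentage of each group within its time point
toy$percentage <- toy$value / totals * 100

# Check: percentages add up to 100 at every time point
check <- tapply(toy$percentage, toy$time, sum)

print(toy)
print(check)
```

At time 2, for instance, the total is 80, so group A gets 20 / 80 = 25% and group B gets 60 / 80 = 75%.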
Let’s improve the general appearance of the chart:
viridis color scale
theme_ipsum of the hrbrthemes package
title added with ggtitle
# Library
library(viridis)
library(hrbrthemes)
# Plot
ggplot(data, aes(x=time, y=value, fill=group)) +
geom_area(alpha=0.6 , size=.5, colour="white") +
scale_fill_viridis(discrete = TRUE) +
theme_ipsum() +
ggtitle("The race between ...")