An Introduction to ggplot2
University of Wisconsin-Madison
July 8, 2026
ggplot2 is a popular R package for creating data visualizations
“Grammar of Graphics” - build visualization by combining independent components
Key Strengths:
Data: Specifies the dataset for the plot
Mapping: set of instructions on how to “map” data
aes() to define aesthetics of plot
Layers: defines how to display the mapped data
geom_*(): defines geometry to determine how data is displayed (points, lines, etc.). More on this soon!
Installation Requirements:
💡 Tip: Update R and packages regularly for latest features
The penguins dataset from the palmerpenguins package contains body measurements of penguins on three islands in the Palmer Archipelago, Antarctica. We will use this dataset as an example of how to build figures using ggplot2.
# A tibble: 344 × 8
species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
<fct> <fct> <dbl> <dbl> <int> <int>
1 Adelie Torgersen 39.1 18.7 181 3750
2 Adelie Torgersen 39.5 17.4 186 3800
3 Adelie Torgersen 40.3 18 195 3250
4 Adelie Torgersen NA NA NA NA
5 Adelie Torgersen 36.7 19.3 193 3450
6 Adelie Torgersen 39.3 20.6 190 3650
7 Adelie Torgersen 38.9 17.8 181 3625
8 Adelie Torgersen 39.2 19.6 195 4675
9 Adelie Torgersen 34.1 18.1 193 3475
10 Adelie Torgersen 42 20.2 190 4250
# ℹ 334 more rows
# ℹ 2 more variables: sex <fct>, year <int>
You may need to run install.packages("palmerpenguins") if the package is not already installed on your system.
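A minimal sketch of loading the data, assuming both packages are already installed. Printing penguins should give a preview like the one above, although the exact print format can depend on which packages are loaded.

```r
# install.packages("palmerpenguins")  # run once if the package is missing
library(ggplot2)
library(palmerpenguins)

# Print a preview of the dataset
penguins
```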
What is the relationship between flipper length and body mass of penguins?
Step 1: begin with the function ggplot(), telling it what data to use
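As a rough sketch of Step 1, the call below supplies only the data. With no mapping or geom yet, the result should be an empty plotting canvas. The object name p is just an illustrative choice.

```r
library(ggplot2)
library(palmerpenguins)

# Step 1: tell ggplot() which data to use; nothing is drawn yet
p <- ggplot(data = penguins)
p
```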
Step 2: add mapping layer, which defines the aesthetics (layout) of your plot, including defining your x and y axes
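A sketch of Step 2, mapping flipper length to the x axis and body mass to the y axis. The axes should now appear, but no data is displayed because no geom has been added. The name p is illustrative.

```r
library(ggplot2)
library(palmerpenguins)

# Step 2: map variables to the x and y axes
p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
)
p
```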
The geom_*() layer defines your plot type. We will create a scatterplot with penguins.
| Plot type | geom_ |
|---|---|
| Bar chart | geom_bar() |
| Line chart | geom_line() |
| Boxplot | geom_boxplot() |
| Scatterplot | geom_point() |
| Violin plot | geom_violin() |
| Histogram | geom_histogram() |
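Putting the pieces so far together, a scatterplot could be sketched as follows. Because the dataset contains rows with missing measurements, ggplot2 is expected to warn that some rows were removed.

```r
library(ggplot2)
library(palmerpenguins)

# Data + mapping + geom_point() gives a scatterplot
p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point()
p
```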
Does the relationship between flipper length and body mass differ by species?
[1] "species" "island" "bill_length_mm"
[4] "bill_depth_mm" "flipper_length_mm" "body_mass_g"
[7] "sex" "year"
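The listing above looks like the column names of the dataset. Assuming that is the case, it could be reproduced with names(), which is one of several ways to inspect the available variables.

```r
library(palmerpenguins)

# List the variables available for mapping to aesthetics
cols <- names(penguins)
cols
```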
Modify aesthetics to categorize species by color.
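One way to sketch this is to add color = species to the global mapping, so that every layer inherits it. ggplot2 should add a legend for species automatically.

```r
library(ggplot2)
library(palmerpenguins)

# Color points by species via the global aesthetic mapping
p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)
) +
  geom_point()
p
```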
Next we will add smooth curves displaying the relationship between body mass and flipper length by species.
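A possible sketch: because color is mapped globally, geom_smooth() fits one curve per species. The choice of method = "lm" (a straight-line fit) is an assumption taken from the code later in these slides; omitting it would let ggplot2 pick its default smoother.

```r
library(ggplot2)
library(palmerpenguins)

# Global color mapping: points and fitted lines are both split by species
p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)
) +
  geom_point() +
  geom_smooth(method = "lm")
p
```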
What if we want a single line of best fit, but still want to categorize by species?
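One way to do this is to move the color mapping out of ggplot() and into geom_point() only. In this sketch, the points are still colored by species, while geom_smooth() sees no grouping and fits a single line to all the data.

```r
library(ggplot2)
library(palmerpenguins)

# Local color mapping: only the points are split by species
p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(aes(color = species)) +
  geom_smooth(method = "lm")
p
```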
People perceive colors differently, so it is generally not a good idea to categorize only by color. In addition to color, we can also categorize by shapes.
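As a sketch, species can be mapped to both color and shape inside geom_point(), so each species gets its own color and its own point symbol.

```r
library(ggplot2)
library(palmerpenguins)

# Encode species twice: once by color, once by point shape
p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(aes(color = species, shape = species)) +
  geom_smooth(method = "lm")
p
```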
We likely do not want the variable name in our dataset to be the title of our axes. We can specify the “labels” of our plot using labs() in a new layer. We can also add a title and subtitle in this function.
ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(aes(color = species, shape = species)) +
  geom_smooth(method = "lm") +
  labs(
    title = "Body mass and flipper length",
    subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
    x = "Flipper length (mm)",
    y = "Body mass (g)",
    color = "Species",
    shape = "Species"
  )
Once you are satisfied with the content of your figure, you can save it as an object. This allows you to quickly call and reproduce the figure.
my.plot <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(aes(color = species, shape = species)) +
  geom_smooth(method = "lm") +
  labs(
    title = "Body mass and flipper length",
    subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
    x = "Flipper length (mm)",
    y = "Body mass (g)",
    color = "Species",
    shape = "Species"
  )
Optional component: controls visuals of the plot that are not controlled by the data
Built-in themes: theme_*() functions
| theme_*() | Description |
|---|---|
| theme_grey() | Default theme: light grey background with white gridlines |
| theme_bw() | Black and white theme with white background and gridlines |
| theme_minimal() | Minimalist: white background and grey gridlines |
| theme_classic() | Classic R: white background with solid axes and no gridlines |
| theme_void() | Empty theme: removes all backgrounds, axes, gridlines and labels |
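A built-in theme is added like any other layer. The sketch below uses theme_minimal() on a simple scatterplot. Any of the functions in the table could be swapped in, and the name p is illustrative.

```r
library(ggplot2)
library(palmerpenguins)

# Add a built-in theme as the final layer of the plot
p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(aes(color = species, shape = species)) +
  theme_minimal()
p
```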
theme() allows you to manually adjust your theme based on your preferences. For example, you can adjust the locations of the title, subtitle, and legend.
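A hedged sketch of such adjustments: hjust = 0.5 centers the title and subtitle, and legend.position moves the legend. The specific positions chosen here are illustrative, not a recommendation.

```r
library(ggplot2)
library(palmerpenguins)

p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(aes(color = species, shape = species)) +
  labs(
    title = "Body mass and flipper length",
    subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins"
  ) +
  theme(
    plot.title = element_text(hjust = 0.5),     # center the title
    plot.subtitle = element_text(hjust = 0.5),  # center the subtitle
    legend.position = "bottom"                  # legend below the plot
  )
p
```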
The function element_blank() removes an element entirely. Setting panel.grid.major and panel.grid.minor to element_blank() will remove them from your figure.
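For instance, removing both sets of gridlines could be sketched like this:

```r
library(ggplot2)
library(palmerpenguins)

p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(aes(color = species, shape = species)) +
  theme(
    panel.grid.major = element_blank(),  # drop major gridlines
    panel.grid.minor = element_blank()   # drop minor gridlines
  )
p
```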
What if you don’t want gridlines but still want a solid line to signify your x and y axes? You can tell ggplot to draw axis lines using axis.line = element_line().
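A sketch combining both ideas: gridlines are removed with element_blank(), and axis lines are drawn with element_line() using its default appearance.

```r
library(ggplot2)
library(palmerpenguins)

p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(aes(color = species, shape = species)) +
  theme(
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    axis.line = element_line()  # solid lines along the x and y axes
  )
p
```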
Just like you saved your graph as an object, you can also save your theme as an object. This will allow you to easily add the same theme to other graphs.
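As a sketch, the theme settings can be assigned to a name and then added to any plot with +. The name my.theme matches the one used later in these slides, but the particular settings stored in it here are an assumption.

```r
library(ggplot2)
library(palmerpenguins)

# Store theme settings once (the settings chosen here are illustrative)
my.theme <- theme(
  plot.title = element_text(hjust = 0.5),
  legend.position = "bottom",
  panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  axis.line = element_line()
)

# Reuse the stored theme on any plot
p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(aes(color = species, shape = species)) +
  my.theme
p
```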
Facets can be used to separate small multiples or different subsets of data based on one or more variables. This is a quick and powerful way to show patterns and trends within subsets of data.
facet_wrap(~ species) allows us to look at the relationship between body mass and flipper length for different species in separate figures.
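A minimal sketch of a faceted version of the scatterplot. Each species should get its own panel, and by default the panels share the same axis scales.

```r
library(ggplot2)
library(palmerpenguins)

# One panel per species
p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point(aes(color = species, shape = species)) +
  geom_smooth(method = "lm") +
  facet_wrap(~ species)
p
```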
Apply my.theme to your new figure!
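One possible solution, assuming my.theme is a theme object created with theme(). The settings below are placeholders, so substitute your own.

```r
library(ggplot2)
library(palmerpenguins)

# Placeholder theme; replace with the my.theme you defined earlier
my.theme <- theme(
  panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  axis.line = element_line()
)

# Faceted plot with the saved theme added as the last layer
p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(aes(color = species, shape = species)) +
  facet_wrap(~ species) +
  my.theme
p
```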
# Install and load ggpubr package
install.packages("ggpubr")
library(ggpubr)
# Density plot
ggdensity(
  penguins,
  x = "body_mass_g",
  add = "mean",
  color = "species",
  fill = "species"
)
# Boxplot
ggboxplot(
  penguins,
  x = "flipper_length_mm",
  y = "body_mass_g",
  add = "jitter",
  color = "species"
)
# Lollipop chart
ggdotchart(
  penguins,
  x = "flipper_length_mm",
  y = "body_mass_g",
  color = "species",
  sorting = "ascending", # sorts data in ascending order
  add = "segments", # draws lines from y=0 to data point
  ggtheme = theme_pubr()
)
Complete description and additional plots: rpkgs.datanovia.com/ggpubr/
Work through the activity for your level — and continue to the next if you finish early!
Open visualization_activities.qmd in RStudio to get started
| Level | Group | Activity |
|---|---|---|
| 🟢 Beginner | Activity 1 | Fill in the blanks |
| 🟡 Intermediate | Activity 2 | Update the facet plots |
| 🔴 Advanced | Activity 3 | Create figure using new dataset |
Finished early? Try the bonus question or move to the next level!
R for Lifestyle and Brain Health (R-LAB)