A scatterplot displays the
relationship between two numeric variables. Each dot represents an observation, and its position on the X
(horizontal) and Y (vertical) axes gives the values of the two variables. With ggplot2, scatterplots are built using the
geom_point geom. If you're not familiar with ggplot2 at all,
try this course as an introduction.
Scatterplots are built with ggplot2 using the
geom_point() function. Discover a basic use case in graph #272, and learn how to customize it with the examples below.
The most basic scatterplot you can build with R and ggplot2.
Simply explains how to call the geom_point() function.
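As a minimal sketch of such a basic chart, the snippet below assumes the built-in iris dataset (not named in this page); any data frame with two numeric columns would work the same way.

```r
library(ggplot2)

# Basic scatterplot: one dot per row of the data frame.
# iris is used here only as a convenient built-in example dataset.
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()

p
```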
Custom marker features
The geom_point() function has options to customize color, stroke, shape, size and more. Learn how to call them.
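The options listed above can be sketched as follows. The dataset (iris) and the specific values are illustrative assumptions; note that fill and stroke only have a visible effect for shapes 21 to 25.

```r
library(ggplot2)

# Fixed (non-mapped) marker features are passed outside aes()
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(
    color  = "black",    # outline color of the marker
    fill   = "#69b3a2",  # inner color (used by shapes 21 to 25)
    shape  = 21,         # filled circle with an outline
    size   = 4,          # marker size
    stroke = 1.5,        # outline width
    alpha  = 0.7         # transparency
  )

p
```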
Map marker feature to variable
ggplot2 makes it a breeze to map a variable to a marker feature. Here is an example where marker color depends on its category.
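A possible version of this mapping, assuming the iris dataset with Species as the category: the feature moves inside aes() so that ggplot2 assigns one color per group.

```r
library(ggplot2)

# Mapping color to a categorical variable: the color argument goes inside aes()
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point(size = 3)

p
```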
Map to several features
An extension of the previous concept: several features can be mapped to variables at the same time.
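For instance, color, shape and size could all be mapped in one call. This is only a sketch under the assumption that iris is the data; the choice of which variable drives which feature is arbitrary.

```r
library(ggplot2)

# Several marker features mapped at once:
# color and shape follow the category, size follows a numeric variable
p <- ggplot(
  iris,
  aes(
    x = Sepal.Length, y = Sepal.Width,
    color = Species, shape = Species, size = Petal.Length
  )
) +
  geom_point(alpha = 0.7)

p
```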
Annotate with geom_text
geom_text() lets you add annotations to one, several or all markers of your chart.
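A hedged sketch of labeling every marker, assuming the first ten rows of the built-in mtcars dataset and using the row names as label text (the car column is created here for illustration):

```r
library(ggplot2)

# Keep a few rows and store the row names in a regular column for labeling
data <- head(mtcars, 10)
data$car <- rownames(data)

p <- ggplot(data, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_text(
    aes(label = car),
    nudge_y = 0.5,         # shift labels slightly above the markers
    check_overlap = TRUE,  # skip labels that would overlap a previous one
    size = 3
  )

p
```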
Annotate with geom_label
Very close to geom_text(), geom_label() produces a label wrapped in a rectangle. This example also explains how to apply labels to a selection of markers.
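One way to label only a selection is to pass a filtered data frame to the label layer. The sketch below assumes mtcars and an arbitrary threshold (mpg above 30) to pick the markers to annotate.

```r
library(ggplot2)

data <- mtcars
data$car <- rownames(data)

# Arbitrary selection rule for illustration: only the most fuel-efficient cars
to_label <- subset(data, mpg > 30)

p <- ggplot(data, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_label(
    data = to_label,   # the label layer only sees the selected rows
    aes(label = car),
    nudge_x = 0.4,
    size = 3
  )

p
```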
Scatterplot with rug
Add a rug on the X and Y axes to describe the distribution of each numeric variable. Shows how geom_rug() works.
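A short sketch of a rug on both axes, again assuming iris as example data:

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point() +
  # sides = "bl" draws the rug on the bottom (X) and left (Y) axes
  geom_rug(sides = "bl", alpha = 0.4)

p
```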
Add marginal distributions around your scatterplot with ggExtra and its ggMarginal() function.
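As a rough sketch, ggMarginal() takes an existing ggplot2 scatterplot and wraps it with marginal charts. The dataset and the choice of histograms are assumptions made for the example.

```r
library(ggplot2)
library(ggExtra)

# Build the scatterplot first
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()

# Then add marginal histograms on the top and right sides
p_marg <- ggMarginal(p, type = "histogram")

p_marg
```

Other values of type, such as "density" or "boxplot", follow the same pattern.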
Base R is also a good option for building a scatterplot, using the
plot() function. Chart #13 below will guide you through its basic usage. The following examples allow a greater level of customization.
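A minimal base R sketch, assuming the iris dataset; the styling arguments are illustrative choices rather than required ones.

```r
# Two numeric vectors to compare
x <- iris$Sepal.Length
y <- iris$Sepal.Width

# plot() draws a scatterplot when given two numeric vectors
plot(
  x, y,
  xlab = "Sepal length",
  ylab = "Sepal width",
  pch  = 19,                     # filled circle
  col  = rgb(0.4, 0.4, 0.8, 0.6) # semi-transparent color
)
```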
The lattice xyplot() function automatically builds one scatterplot for each level of a factor.
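In a formula such as y ~ x | factor, the part after the bar defines the panels. The sketch below assumes iris, with Species as the conditioning factor.

```r
library(lattice)

# One panel per level of Species
p <- xyplot(
  Sepal.Width ~ Sepal.Length | Species,
  data   = iris,
  pch    = 20,
  layout = c(3, 1)  # three panels on a single row
)

print(p)
```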
Correlation of discrete variables
Make the circle size proportional to the number of data points when working with discrete variables.
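One possible approach in base R is to count the observations for each pair of values and scale the marker with that count. The sketch assumes the cyl and gear columns of mtcars as the two discrete variables, and uses a square root so that circle area (not radius) grows with the count.

```r
# Count observations for each (cyl, gear) combination
counts <- as.data.frame(
  table(cyl = mtcars$cyl, gear = mtcars$gear),
  stringsAsFactors = FALSE
)
counts <- counts[counts$Freq > 0, ]      # drop combinations with no data
counts$cyl  <- as.numeric(counts$cyl)
counts$gear <- as.numeric(counts$gear)

# cex scales the marker; sqrt() keeps the circle area proportional to the count
plot(
  counts$cyl, counts$gear,
  cex  = sqrt(counts$Freq),
  pch  = 19,
  col  = rgb(0.2, 0.5, 0.7, 0.6),
  xlab = "Number of cylinders",
  ylab = "Number of gears"
)
```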
The mtext() function
mtext() lets you add text in the margin. Useful for adding a single title to several charts.
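A sketch of a shared title over two charts, assuming iris as data. The outer margin (oma) has to be reserved first so that mtext() has room to write.

```r
# Two charts side by side, with room reserved in the top outer margin
op <- par(mfrow = c(1, 2), oma = c(0, 0, 2, 0))

plot(iris$Sepal.Length, iris$Sepal.Width, pch = 19,
     xlab = "Sepal length", ylab = "Sepal width")
plot(iris$Petal.Length, iris$Petal.Width, pch = 19,
     xlab = "Petal length", ylab = "Petal width")

# outer = TRUE writes in the outer margin, so the title spans both charts
mtext("Iris measurements", side = 3, outer = TRUE, line = 0.5, cex = 1.3)

# Restore the previous graphical parameters
par(op)
```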
A Manhattan plot is a particular type of scatterplot used in genomics. The X axis displays the position of a genetic variant on the genome. Each chromosome is usually represented using a different color. The Y axis shows the p-value of the association test with a phenotypic trait, usually on a -log10 scale.
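The layout described above can be sketched with ggplot2. Everything in the data frame below (the gwas name, five chromosomes, random positions and p-values) is simulated purely for illustration and carries no biological meaning.

```r
library(ggplot2)

# Simulated data, for illustration only: 5 chromosomes, 200 variants each
set.seed(1)
n_per_chr <- 200
gwas <- data.frame(
  chr = rep(1:5, each = n_per_chr),
  pos = unlist(lapply(1:5, function(i) sort(sample(1e6, n_per_chr)))),
  p   = runif(5 * n_per_chr)
)

# Shift each chromosome by the total length of the previous ones,
# so that all chromosomes sit next to each other on the X axis
chr_len <- tapply(gwas$pos, gwas$chr, max)
offset  <- c(0, cumsum(as.numeric(chr_len))[-length(chr_len)])
gwas$pos_cum <- gwas$pos + offset[gwas$chr]

# Alternate two colors between consecutive chromosomes
p <- ggplot(gwas, aes(x = pos_cum, y = -log10(p), color = factor(chr %% 2))) +
  geom_point(size = 1) +
  scale_color_manual(values = c("grey40", "skyblue3"), guide = "none") +
  labs(x = "Position on the genome", y = "-log10(p-value)")

p
```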
The web is full of astonishing R charts made by awesome bloggers. The R graph gallery tries to display some of the best creations and explain how their source code works. If you want to display your work here, please drop me a word or even better, submit a Pull Request!
ggrepel automatically adds multiple labels with no overlap. Here is a good-looking scatterplot using it!
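As a small sketch of the idea (not the featured chart itself), geom_text_repel() from ggrepel can replace geom_text() to push labels away from each other. The mtcars subset and the car column are assumptions for the example.

```r
library(ggplot2)
library(ggrepel)

data <- head(mtcars, 15)
data$car <- rownames(data)

p <- ggplot(data, aes(x = wt, y = mpg)) +
  geom_point(color = "red") +
  # Labels are repositioned automatically to avoid overlapping
  geom_text_repel(aes(label = car), size = 3)

p
```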