27 Position scales
Scales are added to ggplot2 plots in much the same way as layers: first a +, followed by the name of the scale. Options are specified within the parentheses of the scale name.
Scales are either continuous or discrete, depending on the data. A bar chart contains one continuous variable (a count) and one discrete variable (the different categories, generally one for each bar), whereas a line chart generally has two continuous variables.
The most common position scales are scale_x_continuous and scale_y_continuous, which simply map linearly from a data value to a location on the plot. If you plot continuous data on an axis, ggplot will automatically add these to your plot. You only need to specify them explicitly if you want to change any of the default options.
In this chart, which we have seen before, ggplot has automatically created continuous scales to represent the years and gdpPercap.
Each point is placed in a position based on these scales. The scales are drawn on the chart as the x and y axes, so we know what each position refers to.
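As a minimal sketch of what ggplot does automatically, the code below builds a scatter plot without naming any scale, then writes the default scales out explicitly. The data frame df and its values are made up for illustration and only stand in for the chapter's data.

```r
library(ggplot2)

# Made-up illustrative values standing in for the chapter's data
df <- data.frame(
  year = seq(1952, 2007, by = 5),
  gdpPercap = seq(10000, 43000, by = 3000)
)

# No scale is named here, so ggplot adds continuous x and y scales itself
p_default <- ggplot(df, aes(x = year, y = gdpPercap)) +
  geom_point()

# Writing the default scales out explicitly should give the same plot
p_explicit <- p_default +
  scale_x_continuous() +
  scale_y_continuous()

p_explicit
```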
There are two useful things we can change with the default continuous scale: limits and breaks.
Limits specify the limits of the scale - the start and end points of the x and y axes.
Breaks specify which points and ticks should be drawn on the x and y axes.
To make changes to the default scale, we add scale_x_continuous or scale_y_continuous to the plot, following a + sign.
Limits
To change the limits, add limits = followed by the start and end points you want, within a vector created using c(), as in the example below:
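A minimal sketch of setting limits, assuming a small made-up data frame df with year and gdpPercap columns in place of the chapter's data. With the default settings, ggplot drops points that fall outside the limits and prints a warning.

```r
library(ggplot2)

# Made-up illustrative values standing in for the chapter's data
df <- data.frame(
  year = seq(1952, 2007, by = 5),
  gdpPercap = seq(10000, 43000, by = 3000)
)

# Restrict the x axis to the years 1980 to 2000
p <- ggplot(df, aes(x = year, y = gdpPercap)) +
  geom_point() +
  scale_x_continuous(limits = c(1980, 2000))

p
```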
You can see that it has limited the years to 1980 at earliest and 2000 at the latest.
Try it yourself: set the limits of the y axis to 10000 and 50000.
Breaks
Breaks control which axis ticks, lines and labels are shown. ggplot picks defaults according to the data, but you can override them.
Change the breaks by adding breaks = to the scale, followed by a vector of the values you want to show.
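As a rough sketch, the breaks on the x axis could be set by hand like this. The data frame df and the chosen break values are made up for illustration.

```r
library(ggplot2)

# Made-up illustrative values standing in for the chapter's data
df <- data.frame(
  year = seq(1952, 2007, by = 5),
  gdpPercap = seq(10000, 43000, by = 3000)
)

# Draw ticks, labels and gridlines only at these three years
p <- ggplot(df, aes(x = year, y = gdpPercap)) +
  geom_point() +
  scale_x_continuous(breaks = c(1960, 1980, 2000))

p
```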
Try it yourself: set the breaks for the y axis to every 5000 instead of 10000.
It would quickly get tedious if you wanted to specify a large number of axis ticks. R has a function called seq(), which allows you to create a sequence of numbers according to simple rules. Here we give seq() a from, a to and a by argument. For example, we could specify that the plot show values every two years by adding the following:
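A minimal sketch of breaks generated with seq(), assuming a made-up data frame df whose years run from 1952 to 2007. The start and end values passed to seq() are chosen to fit that illustrative range.

```r
library(ggplot2)

# Made-up illustrative values standing in for the chapter's data
df <- data.frame(
  year = seq(1952, 2007, by = 5),
  gdpPercap = seq(10000, 43000, by = 3000)
)

# One break every two years: 1952, 1954, ..., 2006
year_breaks <- seq(from = 1952, to = 2006, by = 2)

p <- ggplot(df, aes(x = year, y = gdpPercap)) +
  geom_point() +
  scale_x_continuous(breaks = year_breaks)

p
```

With this many breaks the labels may start to crowd each other, so a larger by value can be a better choice on a narrow plot.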
Minor breaks
Note that changing the breaks on the x axis also changes the vertical gridlines in the plot. By default, ggplot draws a gridline for each break, as well as a fainter gridline between each pair of breaks; these fainter lines are called minor breaks. You can override the positions of the minor breaks, too, using the minor_breaks argument.
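A short sketch of overriding the minor breaks with the minor_breaks argument of scale_x_continuous. The data frame df and the break positions are made up for illustration.

```r
library(ggplot2)

# Made-up illustrative values standing in for the chapter's data
df <- data.frame(
  year = seq(1952, 2007, by = 5),
  gdpPercap = seq(10000, 43000, by = 3000)
)

# Labelled breaks every 20 years, fainter unlabelled gridlines every 5 years
p <- ggplot(df, aes(x = year, y = gdpPercap)) +
  geom_point() +
  scale_x_continuous(
    breaks = seq(1960, 2000, by = 20),
    minor_breaks = seq(1950, 2010, by = 5)
  )

p
```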
You can also set different labels for your axis, using the "labels" argument:
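A minimal sketch of custom labels, assuming made-up data and label text. When labels is given as a vector, it needs one entry for each value in breaks, in the same order.

```r
library(ggplot2)

# Made-up illustrative values standing in for the chapter's data
df <- data.frame(
  year = seq(1952, 2007, by = 5),
  gdpPercap = seq(10000, 43000, by = 3000)
)

# Each break on the y axis gets the label at the same position in labels
p <- ggplot(df, aes(x = year, y = gdpPercap)) +
  geom_point() +
  scale_y_continuous(
    breaks = c(10000, 20000, 30000, 40000),
    labels = c("10k", "20k", "30k", "40k")
  )

p
```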
Scales with bar charts
Bar charts using geom_col have a discrete scale instead of a continuous scale on the x axis. Each bar represents a category rather than a numeric value. This scale is called scale_x_discrete, and ggplot adds it automatically.
By default, the categories are ordered alphabetically. Reordering them is a very common data visualisation task. Take the following example:
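A minimal sketch of the default ordering, using a made-up data frame named countries. The four countries and their values are illustrative only.

```r
library(ggplot2)

# Made-up illustrative values, not real figures
countries <- data.frame(
  country = c("Norway", "Brazil", "Japan", "Chile"),
  gdpPercap = c(49000, 9000, 32000, 13000)
)

# The discrete x scale is added automatically; bars appear in
# alphabetical order: Brazil, Chile, Japan, Norway
p <- ggplot(countries, aes(x = country, y = gdpPercap)) +
  geom_col()

p
```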
In many cases we would want to reorder the countries by the value on the y axis, so that the highest values come first. We do this not within the scale but by using reorder in the code of the plot itself:
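One way to sketch this, assuming the made-up countries data frame below and desc() from the dplyr package: reorder() sorts the categories by the second argument in ascending order, so wrapping the values in desc() puts the largest bar first.

```r
library(ggplot2)
library(dplyr)  # assumed here only for desc()

# Made-up illustrative values, not real figures
countries <- data.frame(
  country = c("Norway", "Brazil", "Japan", "Chile"),
  gdpPercap = c(49000, 9000, 32000, 13000)
)

# Order the bars by descending gdpPercap: Norway, Japan, Chile, Brazil
p <- ggplot(countries, aes(x = reorder(country, desc(gdpPercap)), y = gdpPercap)) +
  geom_col() +
  labs(x = "country")  # replace the default axis title, which would show the reorder() call

p
```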
coord_flip()
A final useful function is coord_flip. This flips the coordinates of a plot, swapping the positions of the x and y axes so that the x variable is drawn vertically and the y variable horizontally.
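A minimal sketch of a flipped bar chart, again using a made-up countries data frame. The categories are ordered by ascending value here, because after flipping the first category is drawn at the bottom of the axis.

```r
library(ggplot2)

# Made-up illustrative values, not real figures
countries <- data.frame(
  country = c("Norway", "Brazil", "Japan", "Chile"),
  gdpPercap = c(49000, 9000, 32000, 13000)
)

# Ascending order plus coord_flip() puts the largest bar at the top
p <- ggplot(countries, aes(x = reorder(country, gdpPercap), y = gdpPercap)) +
  geom_col() +
  coord_flip() +
  labs(x = "country")

p
```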
Note that we have removed the desc() so that the highest value is at the top rather than the bottom.
This is particularly useful for a chart with lots of values, because the labels are much easier to read when they are placed along the y-axis.