Building on the answers above plus some fine-tuning (and for whatever it's worth), here is a way of achieving two scales via sec_axis:
Assume a simple (and purely fictional) data set dt: for five days, it tracks the number of interruptions vs. productivity:
when numinter prod
1 2018-03-20 1 0.95
2 2018-03-21 5 0.50
3 2018-03-23 4 0.70
4 2018-03-24 3 0.75
5 2018-03-25 4 0.60
(The ranges of the two columns differ by a factor of about 5.)
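For anyone who wants to reproduce this, here is a minimal sketch of how dt could be built. The column names and values are taken from the printout above; the construction via data.frame() and the use of max() to compare the two ranges are my assumptions, not necessarily how the data was originally created.

```r
# Rebuild the fictional data set from the printout above
dt <- data.frame(
  when     = as.Date(c("2018-03-20", "2018-03-21", "2018-03-23",
                       "2018-03-24", "2018-03-25")),
  numinter = c(1, 5, 4, 3, 4),
  prod     = c(0.95, 0.50, 0.70, 0.75, 0.60)
)

# Ratio of the two maxima: roughly 5, which is why 5 is a convenient factor
range_ratio <- max(dt$numinter) / max(dt$prod)
range_ratio
```

The when column has to be a Date (not a character vector), otherwise scale_x_date will refuse it.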
The following code draws both series so that they use up the whole y axis:
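    library(ggplot2)

    # dt is the data frame shown above (columns: when, numinter, prod)
    ggplot(dt, aes(x = when)) +
      # bars: interruptions on the primary (left) y axis
      geom_bar(aes(y = numinter), stat = "identity", fill = "grey") +
      # line: productivity, multiplied by 5 to span the same range as the bars
      # (on ggplot2 >= 3.4.0, linewidth = 2 is the preferred spelling of size = 2)
      geom_line(aes(y = prod*5), size = 2, color = "blue") +
      scale_x_date(name = "Day", labels = NULL) +
      scale_y_continuous(name = "Interruptions/day",
        # secondary (right) axis: undo the multiplication, then format as percent
        sec.axis = sec_axis(~./5, name = "Productivity % of best",
          labels = function(b) { paste0(round(b * 100, 0), "%") })) +
      theme(
        axis.title.y = element_text(color = "grey"),
        axis.title.y.right = element_text(color = "blue"))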
Here's the result (above code + some color tweaking):
The point (aside from using sec_axis when specifying the y scale) is to multiply each value of the 2nd data series by 5 when specifying the series. To get the labels right in the sec_axis definition, the value then needs to be divided by 5 (and formatted). So the crucial parts of the above code are really *5 in the geom_line and ~./5 in sec_axis (a formula dividing the current value . by 5).
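To make the two halves of that trick explicit, here is a small sketch in base R. The names scale_factor, to_primary, to_secondary and fmt_pct are made up for illustration and are not part of ggplot2; they only mirror the *5, the ~./5 and the labels function from the code above.

```r
scale_factor <- 5  # same factor as in the plot code (assumed fixed here)

# What the geom_line mapping does: stretch productivity onto the left axis
to_primary <- function(p) p * scale_factor

# What sec_axis(~./5) does: map a left-axis position back to productivity
to_secondary <- function(y) y / scale_factor

# Same label formatting as in the sec_axis call
fmt_pct <- function(b) paste0(round(b * 100, 0), "%")

prod_values <- c(0.95, 0.50, 0.70, 0.75, 0.60)
positions <- to_primary(prod_values)         # where the line is drawn
axis_text <- fmt_pct(to_secondary(positions))  # what the right axis shows
axis_text
```

The only thing that matters is that both directions use the same factor. If the data change, the factor could presumably be derived from the data (for instance from the ratio of the two maxima) instead of being hard-coded.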
In comparison (I don't want to judge the approaches here), this is what two charts on top of one another look like:
You can judge for yourself which one conveys the message better (“Don’t disrupt people at work!”). Guess that's a fair way to decide.
The full code for both images (it's not really more than what's above, just complete and ready to run) is here: https://gist.github.com/sebastianrothbucher/de847063f32fdff02c83b75f59c36a7d and a more detailed explanation is here: https://sebastianrothbucher.github.io/datascience/r/visualization/ggplot/2018/03/24/two-scales-ggplot-r.html