[r] Histogram with Logarithmic Scale and custom breaks

I'm trying to generate a histogram in R with a logarithmic scale for y. Currently I do:

hist(mydata$V3, breaks=c(0,1,2,3,4,5,25))

This gives me a histogram, but the density between 0 to 1 is so great (about a million values difference) that you can barely make out any of the other bars.

Then I've tried doing:

mydata_hist <- hist(mydata$V3, breaks=c(0,1,2,3,4,5,25), plot=FALSE)
plot(rpd_hist$counts, log="xy", pch=20, col="blue")

It gives me sorta what I want, but the bottom shows me the values 1-6 rather than 0, 1, 2, 3, 4, 5, 25. It's also showing the data as points rather than bars. barplot works but then I don't get any bottom axis.

This question is related to r histogram logarithm

The answer is


Run the hist() function without making a graph, log-transform the counts, and then draw the figure.

hist.data = hist(my.data, plot=F)
hist.data$counts = log(hist.data$counts, 2)
plot(hist.data)

It should look just like the regular histogram, but the y-axis will be log2 Frequency.


Another option would be to use the package ggplot2.

ggplot(mydata, aes(x = V3)) + geom_histogram() + scale_x_log10()

Here's a pretty ggplot2 solution:

library(ggplot2)
library(scales)  # makes pretty labels on the x-axis

breaks=c(0,1,2,3,4,5,25)

ggplot(mydata,aes(x = V3)) + 
  geom_histogram(breaks = log10(breaks)) + 
  scale_x_log10(
    breaks = breaks,
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  )

Note that to set the breaks in geom_histogram, they had to be transformed to work with scale_x_log10


I've put together a function that behaves identically to hist in the default case, but accepts the log argument. It uses several tricks from other posters, but adds a few of its own. hist(x) and myhist(x) look identical.

The original problem would be solved with:

myhist(mydata$V3, breaks=c(0,1,2,3,4,5,25), log="xy")

The function:

myhist <- function(x, ..., breaks="Sturges",
                   main = paste("Histogram of", xname),
                   xlab = xname,
                   ylab = "Frequency") {
  xname = paste(deparse(substitute(x), 500), collapse="\n")
  h = hist(x, breaks=breaks, plot=FALSE)
  plot(h$breaks, c(NA,h$counts), type='S', main=main,
       xlab=xlab, ylab=ylab, axes=FALSE, ...)
  axis(1)
  axis(2)
  lines(h$breaks, c(h$counts,NA), type='s')
  lines(h$breaks, c(NA,h$counts), type='h')
  lines(h$breaks, c(h$counts,NA), type='h')
  lines(h$breaks, rep(0,length(h$breaks)), type='S')
  invisible(h)
}

Exercise for the reader: Unfortunately, not everything that works with hist works with myhist as it stands. That should be fixable with a bit more effort, though.


It's not entirely clear from your question whether you want a logged x-axis or a logged y-axis. A logged y-axis is not a good idea when using bars because they are anchored at zero, which becomes negative infinity when logged. You can work around this problem by using a frequency polygon or density plot.


Dirk's answer is a great one. If you want an appearance like what hist produces, you can also try this:

buckets <- c(0,1,2,3,4,5,25)
mydata_hist <- hist(mydata$V3, breaks=buckets, plot=FALSE)
bp <- barplot(mydata_hist$count, log="y", col="white", names.arg=buckets)
text(bp, mydata_hist$counts, labels=mydata_hist$counts, pos=1)

The last line is optional, it adds value labels just under the top of each bar. This can be useful for log scale graphs, but can also be omitted.

I also pass main, xlab, and ylab parameters to provide a plot title, x-axis label, and y-axis label.


Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to histogram

Why isn't this code to plot a histogram on a continuous value Pandas column working? Make Frequency Histogram for Factor Variables Overlay normal curve to histogram in R Plotting histograms from grouped data in a pandas DataFrame save a pandas.Series histogram plot to file changing default x range in histogram matplotlib How does numpy.histogram() work? Fitting a histogram with python Bin size in Matplotlib (Histogram) Plot two histograms on single chart with matplotlib

Examples related to logarithm

ValueError: math domain error How do you do natural logs (e.g. "ln()") with numpy in Python? Python math module How to expand and compute log(a + b)? Log to the base 2 in python How do you calculate log base 2 in Java for integers? Histogram with Logarithmic Scale and custom breaks Plot logarithmic axes with matplotlib in python