[r] Overlay normal curve to histogram in R

This is an implementation of aforementioned StanLe's anwer, also fixing the case where his answer would produce no curve when using densities.

This replaces the existing but hidden hist.default() function, to only add the normalcurve parameter (which defaults to TRUE).

The first three lines are to support roxygen2 for package building.

#' @noRd
#' @exportMethod hist.default
#' @export
hist.default <- function(x,
                         breaks = "Sturges",
                         freq = NULL,
                         include.lowest = TRUE,
                         normalcurve = TRUE,
                         right = TRUE,
                         density = NULL,
                         angle = 45,
                         col = NULL,
                         border = NULL,
                         main = paste("Histogram of", xname),
                         ylim = NULL,
                         xlab = xname,
                         ylab = NULL,
                         axes = TRUE,
                         plot = TRUE,
                         labels = FALSE,
                         warn.unused = TRUE,
                         ...)  {

  # https://stackoverflow.com/a/20078645/4575331
  xname <- paste(deparse(substitute(x), 500), collapse = "\n")

  suppressWarnings(
    h <- graphics::hist.default(
      x = x,
      breaks = breaks,
      freq = freq,
      include.lowest = include.lowest,
      right = right,
      density = density,
      angle = angle,
      col = col,
      border = border,
      main = main,
      ylim = ylim,
      xlab = xlab,
      ylab = ylab,
      axes = axes,
      plot = plot,
      labels = labels,
      warn.unused = warn.unused,
      ...
    )
  )

  if (normalcurve == TRUE & plot == TRUE) {
    x <- x[!is.na(x)]
    xfit <- seq(min(x), max(x), length = 40)
    yfit <- dnorm(xfit, mean = mean(x), sd = sd(x))
    if (isTRUE(freq) | (is.null(freq) & is.null(density))) {
      yfit <- yfit * diff(h$mids[1:2]) * length(x)
    }
    lines(xfit, yfit, col = "black", lwd = 2)
  }

  if (plot == TRUE) {
    invisible(h)
  } else {
    h
  }
}

Quick example:

hist(g)

enter image description here

For dates it's bit different. For reference:

#' @noRd
#' @exportMethod hist.Date
#' @export
hist.Date <- function(x,
                      breaks = "months",
                      format = "%b",
                      normalcurve = TRUE,
                      xlab = xname,
                      plot = TRUE,
                      freq = NULL,
                      density = NULL,
                      start.on.monday = TRUE,
                      right = TRUE,
                      ...)  {

  # https://stackoverflow.com/a/20078645/4575331
  xname <- paste(deparse(substitute(x), 500), collapse = "\n")

  suppressWarnings(
    h <- graphics:::hist.Date(
      x = x,
      breaks = breaks,
      format = format,
      freq = freq,
      density = density,
      start.on.monday = start.on.monday,
      right = right,
      xlab = xlab,
      plot = plot,
      ...
    )
  )

  if (normalcurve == TRUE & plot == TRUE) {
    x <- x[!is.na(x)]
    xfit <- seq(min(x), max(x), length = 40)
    yfit <- dnorm(xfit, mean = mean(x), sd = sd(x))
    if (isTRUE(freq) | (is.null(freq) & is.null(density))) {
      yfit <- as.double(yfit) * diff(h$mids[1:2]) * length(x)
    }
    lines(xfit, yfit, col = "black", lwd = 2)
  }

  if (plot == TRUE) {
    invisible(h)
  } else {
    h
  }
}

Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to plot

Fine control over the font size in Seaborn plots for academic papers Why do many examples use `fig, ax = plt.subplots()` in Matplotlib/pyplot/python Modify the legend of pandas bar plot Format y axis as percent Simple line plots using seaborn Plot bar graph from Pandas DataFrame Plotting multiple lines, in different colors, with pandas dataframe Plotting in a non-blocking way with Matplotlib What does the error "arguments imply differing number of rows: x, y" mean? matplotlib get ylim values

Examples related to histogram

Why isn't this code to plot a histogram on a continuous value Pandas column working? Make Frequency Histogram for Factor Variables Overlay normal curve to histogram in R Plotting histograms from grouped data in a pandas DataFrame save a pandas.Series histogram plot to file changing default x range in histogram matplotlib How does numpy.histogram() work? Fitting a histogram with python Bin size in Matplotlib (Histogram) Plot two histograms on single chart with matplotlib

Examples related to gaussian

Overlay normal curve to histogram in R Gaussian fit for Python Plotting of 1-dimensional Gaussian distribution function Gaussian filter in MATLAB