[r] Adding a regression line on a ggplot

I'm trying hard to add a regression line on a ggplot. I first tried with abline but I didn't manage to make it work. Then I tried this...

data = data.frame(x.plot=rep(seq(1,5),10),y.plot=rnorm(50))
ggplot(data,aes(x.plot,y.plot))+stat_summary(fun.data=mean_cl_normal) +
   geom_smooth(method='lm',formula=data$y.plot~data$x.plot)

But it is not working either.

This question is related to r ggplot2 regression linear-regression

The answer is


I found this function on a blog

 ggplotRegression <- function (fit) {

    `require(ggplot2)

    ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) + 
      geom_point() +
      stat_smooth(method = "lm", col = "red") +
      labs(title = paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 5),
                         "Intercept =",signif(fit$coef[[1]],5 ),
                         " Slope =",signif(fit$coef[[2]], 5),
                         " P =",signif(summary(fit)$coef[2,4], 5)))
    }`

once you loaded the function you could simply

ggplotRegression(fit)

you can also go for ggplotregression( y ~ x + z + Q, data)

Hope this helps.


If you want to fit other type of models, like a dose-response curve using logistic models you would also need to create more data points with the function predict if you want to have a smoother regression line:

fit: your fit of a logistic regression curve

#Create a range of doses:
mm <- data.frame(DOSE = seq(0, max(data$DOSE), length.out = 100))
#Create a new data frame for ggplot using predict and your range of new 
#doses:
fit.ggplot=data.frame(y=predict(fit, newdata=mm),x=mm$DOSE)

ggplot(data=data,aes(x=log10(DOSE),y=log(viability)))+geom_point()+
geom_line(data=fit.ggplot,aes(x=log10(x),y=log(y)))

As I just figured, in case you have a model fitted on multiple linear regression, the above mentioned solution won't work.

You have to create your line manually as a dataframe that contains predicted values for your original dataframe (in your case data).

It would look like this:

# read dataset
df = mtcars

# create multiple linear model
lm_fit <- lm(mpg ~ cyl + hp, data=df)
summary(lm_fit)

# save predictions of the model in the new data frame 
# together with variable you want to plot against
predicted_df <- data.frame(mpg_pred = predict(lm_fit, df), hp=df$hp)

# this is the predicted line of multiple linear regression
ggplot(data = df, aes(x = mpg, y = hp)) + 
  geom_point(color='blue') +
  geom_line(color='red',data = predicted_df, aes(x=mpg_pred, y=hp))

Multiple LR

# this is predicted line comparing only chosen variables
ggplot(data = df, aes(x = mpg, y = hp)) + 
  geom_point(color='blue') +
  geom_smooth(method = "lm", se = FALSE)

Single LR


The simple solution using geom_abline:

geom_abline(slope = coef(data.lm)[[2]], intercept = coef(data.lm)[[1]])

Where data.lm is an lm object, and coef(data.lm) looks something like this:

> coef(data.lm)
(Intercept)    DepDelay 
  -2.006045    1.025109 

The numeric indexing assumes that (Intercept) is listed first, which is the case if the model includes an intercept. If you have some other linear model object, just plug in the slope and intercept values similarly.


Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to ggplot2

Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph Saving a high resolution image in R Change bar plot colour in geom_bar with ggplot2 in r Remove legend ggplot 2.2 Remove all of x axis labels in ggplot Changing fonts in ggplot2 Explain ggplot2 warning: "Removed k rows containing missing values" Error: package or namespace load failed for ggplot2 and for data.table In R, dealing with Error: ggplot2 doesn't know how to deal with data of class numeric

Examples related to regression

how to use the Box-Cox power transformation in R Find p-value (significance) in scikit-learn LinearRegression Run an OLS regression with Pandas Data Frame fitting data with numpy Adding a regression line on a ggplot Quadratic and cubic regression in Excel Extract regression coefficient values How to force R to use a specified factor level as reference in a regression? What is the difference between Multiple R-squared and Adjusted R-squared in a single-variate least squares regression?

Examples related to linear-regression

Accuracy Score ValueError: Can't Handle mix of binary and continuous target TensorFlow: "Attempting to use uninitialized value" in variable initialization gradient descent using python and numpy Adding a regression line on a ggplot How to calculate the 95% confidence interval for the slope in a linear regression model in R What is the difference between linear regression and logistic regression? Multiple linear regression in Python Add regression line equation and R^2 on graph Linear regression with matplotlib / numpy How to force R to use a specified factor level as reference in a regression?