Skip to content

ggplot2 - regression statistics how to display on plot

3 messages · Durant, James T. (ATSDR/DTEM/PRMSB), Bryan Hanson, Dennis Murphy

#
Jim, you can use the function appended below, which is part of package HandyStuff.  If you want an example, see ?lmEqn after installing HandyStuff, available at github.com/bryanhanson/HandyStuff.  Bryan

lmEqn <-
function(df = NULL, y = NULL, x = NULL,
	method = "lm", leg.loc = c(0, 0),
	xlab = NULL, ylab = NULL, title = NULL, ...) {
	
# work out a linear model first

	y1 <- df[,y]
	x1 <- df[,x]
	if (method == "lm") mod <- lm(y1 ~ x1)
	if (method == "rlm") mod <- rlm(y1 ~ x1)
	m <- mod$coef[2]
	b <- mod$coef[1]
	r2 <- round(cor(x1, y1)^2, 4)
	lab <- paste("m =", round(m, 3), "b =", round(b, 3), "r^2 =", r2, sep = "  ")
	if (method == "lm") lab <- paste("linear model: ", lab)
	if (method == "rlm") lab <- paste("robust linear model: ", lab)

# now plot the data

	p <- ggplot(df, aes_string(x = x, y = y))
	p <- p + geom_point()
	if (method == "lm") p <- p + geom_smooth(method = "lm")
	if (method == "rlm") p <- p + geom_smooth(method = "rlm")
	p <- p + geom_text(aes(x, y, label = lab), size = 5, hjust = 0, vjust = 0,
	data = data.frame(x = leg.loc[1], y = leg.loc[2]), label = lab)
	
	if (!is.null(title)) p <- p + opts(title = title)
    if (!is.null(xlab)) p <- p + xlab(xlab)
    if (!is.null(ylab)) p <- p + ylab(ylab)

	p
	
	}
On Nov 10, 2011, at 8:45 AM, Durant, James T. (ATSDR/DTEM/PRMSB) wrote:

            
#
Hi:

Here's an example of how one might do this in a specific example using
geom_text().

# Some fake data:
df <- data.frame(x = 1:10, y = 0.5 + (1:10) + rnorm(10))

# Fit a linear model to the data and save the model object:
mod <- lm(y ~ x, data = df)

# Create a list of character strings - the first component
# produces the fitted model, the second produces a
# string to compute R^2, but in plotmath syntax.

rout <- list(paste('Fitted model: ', round(coef(mod)[1], 3), ' + ',
                               round(coef(mod)[2], 3), ' x', sep = ''),
              paste('R^2 == ', round(summary(mod)[['r.squared']], 3),
sep = '')  )

# This looks ugly, but I'm using the round() function to make the
# equations look more sane. coef(mod) extracts the model
# coefficients (intercept, then slope), and
# summary(mod)[['r.squared']] extracts R^2.

# See what they look like:
rout

# Notice that the first component of rout is simply a text string
# that can be passed as is, but the second string needs to be
# wrapped inside an expression, which is what parse = TRUE
# in geom_text() does.

# Now construct the plot; given the (x, y) extent of the data, the
# coordinates make sense, but you have to adapt them to
# your data.

ggplot(df, aes(x, y)) +
   geom_smooth(method = 'lm') +
   geom_text(aes(x = 2, y = 10, label = rout[[1]]), hjust = 0) +
   geom_text(aes(x = 2, y = 9.5, label = rout[[2]]), hjust = 0, parse = TRUE )

hjust = 0 makes the text strings flush left, parse = TRUE in the
second call converts the second string in rout into an expression.

This is fine if your plot is a one-off or two-off deal; the code above
gives you a sense of the programming required. OTOH, if you need to do
this a lot, then you're better off with a function that can automate
the process, which Bryan was kind enough to provide.

I might add that there is a dedicated ggplot2 listserv on google:
ggplot2 at googlegroups.com, for which questions like these are well
suited.

HTH,
Dennis

On Thu, Nov 10, 2011 at 5:45 AM, Durant, James T. (ATSDR/DTEM/PRMSB)
<hzd3 at cdc.gov> wrote: