Skip to content

R package to automatically produce combination plot?

16 messages · Jason Rupert, stephen sefick, Patrizio Frederic +3 more

#
By any chance is there an R package that automatically produces the plot shown at the following link:
http://n2.nabble.com/Can-R-produce-this-plot--td2489288.html

That is an R package to produce on plot that has the following:
(a) a vertically oriented histogram, 
(b) associated barplot, and
(c) quantile-quantile plot (Q-Q Plot).

This is based on a class lecture from University of Pennsylvania: 
stat.wharton.upenn.edu/~mcjon/stat-431/lecture-02.pdf

I am pretty confident I can put one together, but just wanted to check that there does not already exist an R package to output such a plot. 

Thanks again.
#
I guess no reply means there is not an existing package to produce the plot?

I will post the results of my script to hopefully help others who are trying to formulate the same plot. 

Thanks again.
--- On Mon, 3/16/09, Jason Rupert <jasonkrupert at yahoo.com> wrote:

            
#
Here is what I have so far:
I noticed that I can rotate a boxplot via "horizontal", but apparently "hist" does not have that functionality.

I tried stacking the plots vertically:
test_data<-rnorm(100)
par(mfrow=c(3,1)) # 3 rows by 1 columns layout of plots
hist(test_data)
boxplot(test_data, horizontal=TRUE)
qqnorm(test_data)

However, I would have to rotate the QQnorm plot, which would be pretty confusing and I think non-standard.   

Thank you again for any feedback and insight regarding trying to reproduce the JMP figure shown at:
http://n2.nabble.com/Can-R-produce-this-plot--td2489288.html
--- On Tue, 3/17/09, Jason Rupert <jasonkrupert at yahoo.com> wrote:

            
#
I believe that hist will return a vector that could be passed to  
barplot:

  h.islands <- hist(islands)

 > barplot(h.islands$intensities, horiz=TRUE)  # or
 > barplot(h.islands$counts, horiz=TRUE)

David Winsemius
On Mar 17, 2009, at 11:38 AM, Jason Rupert wrote:

            
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
#
I don't know if this will help, but look at the R graph gallery.
There may be something there, with code, that will work.
On Tue, Mar 17, 2009 at 10:39 AM, Jason Rupert <jasonkrupert at yahoo.com> wrote:

  
    
#
Jason,
be carefully to the order of intensities (counts or densities):

x=rnorm(1000)
par(mfrow=c(2,2))
h=hist(x,breaks=bk<-c(-5,-3,-2,-1,-.5,0,1,3,5))
barplot(rev(h$intensities),rev(bk[2:9]-bk[1:8]),space=0,horiz=T) # compare to
axis(2)
barplot(h$intensities,bk[2:9]-bk[1:8],space=0,horiz=T)
axis(2)

Patrizio

2009/3/17 David Winsemius <dwinsemius at comcast.net>:
#
Awesome.  

This seems to produce the vertical histogram as needed, but is there then a way to come back and add on a probability distribution line?
That is, add something like the following to the barplot:
points(density(x), type='l', lwd=3, col='red')


Thanks again.
--- On Tue, 3/17/09, Patrizio Frederic <frederic.patrizio at gmail.com> wrote:

            
#
Can you reverse the x and y columns of the density object?

plot(density(c(-20,rep(0,98),20))$y, density(c(-20,rep(0,98),20))$x,
              xlim = c(0,1), ylim=c(-4,4), type='l', lwd=3, col='red')

--  
David Winsemius
On Mar 17, 2009, at 2:38 PM, Jason Rupert wrote:

            
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
#
Hi
Jason Rupert wrote:
I don't know of an existing function that does that particular
arrangement of plots, but here's a start ...


breaks <- seq(-3.5, 3.5, by=.5)
yrange <- range(breaks)

histdata <- hist(y, breaks=breaks, plot=FALSE)
densitydata <- density(y)

par(oma=rep(3, 4), mar=rep(0, 4))

layout(matrix(1:3, ncol=3), widths=c(.2, .1, 1), respect=TRUE)

barplot(histdata$density, horiz=TRUE, space=0,
        xlim=c(0, max(histdata$density)*1.1), axes=FALSE)
par(new=TRUE)
plot(densitydata$y, densitydata$x,
     xlim=c(0, max(histdata$density)*1.1), ylim=yrange,
     type="l", ann=FALSE, axes=FALSE)
axis(2)
box()
boxplot(y, ylim=yrange, axes=FALSE)
box()
qqnorm(y, axes=FALSE, ylim=yrange, xlim=yrange, main="")
abline(0, 1)
axis(1)
box()


... the important bits are the call to layout() to get the plots at
different sizes and the numerous efforts made to make sure that the
coordinate systems of the various plots are coherent.

Paul

  
    
#
Paul, 

This is great!  Like you said it is really close.  I made a few changes and for some reason the y-axis label magically came back.  I tried to remove it but it wouldn't stay away.  

Also, for some reason the title exceeds the margins of the layout.  I am going ot mess around with this a bit more, but any suggestions for fixing this are also greatly appreciated.


Thank you again.


y<-rnorm(1000)

yrange <- range(y)

#histdata <- hist(y, breaks=breaks, plot=FALSE)
histdata <- hist(y, plot=FALSE)
densitydata <- density(y)

par(oma=rep(3, 4), mar=rep(0, 4))

layout(matrix(1:3, ncol=3), widths=c(.2, .1, 1), respect=TRUE)

barplot(histdata$density, horiz=TRUE, space=0,
        xlim=c(0, max(histdata$density)*1.1), axes=FALSE)
par(new=TRUE)
plot(densitydata$y, densitydata$x,
     xlim=c(0, max(histdata$density)*1.1), ylim=yrange,
     type="l", ann=FALSE, axes=FALSE)
axis(2)
box()

boxplot(y, ylim=yrange, axes=FALSE)
box()

# qqnorm(y, axes=FALSE, ylim=yrange, xlim=yrange, main="")
qqnorm(y, , main="")
axis(2, labels = FALSE)
#abline(0, 1)
qqline(y, col="red", lwd=3)
box()

kurtosis_val <-(sum((y - mean(y))^4 ))/(var(y)*var(y))/length(y)-3

skewness<-function(x)
{
    m_skew=mean(x)
    me_skew=median(x)
    s_skew=sqrt(var(x))
    sk_skew=(m_skew-me_skew)/s_skew
    return(sk_skew)
}


title_text<-paste("Title Text", "\n Mean = ", format(mean(y), digits=4, scientific=F), " Standard Deviation = ", format(sd(y), digits=4, scientific=F), "\n Skewness = ", format(skewness(y), digits=4, scientific=F), "Kurtosis =", format(kurtosis_val, digits=4, scientific=F), sep="")

mtext(title_text, NORTH<-3, line=0, adj=0.5, cex=1.2, col="red", outer=TRUE)
--- On Tue, 3/17/09, Paul Murrell <p.murrell at auckland.ac.nz> wrote:

            
#
Hi
Jason Rupert wrote:
You have removed the 'axes=FALSE' from the qqplot() call.

Also, you have removed some of the explicit setting of 'ylim', 'xlim',
and 'breaks', which means that your plots are using subtly different
coordinate systems.
The title fits (just) for me.  If you need more room, bump up the value
in the par(oma) setting.  You could also reduce the 'line' value in your
mtext() call.

Paul

  
    
#
Using ggplot2:

library(ggplot2)
test_data<-rnorm(100)
a=data.frame(obs=test_data,condition='None')
p1=qplot(
	data=a
	,x=obs
	,geom='histogram'
	)+coord_flip()
p2=qplot(
	data=a
	,y=obs
	,x=condition
	,geom='boxplot'
	)+opts(
		axis.text.y=theme_blank()
		,axis.title.y=theme_blank()
	)+coord_flip()
p3=qplot(
	sample=test_data
	,stat='qq'
	,distribution=qnorm
	)+coord_flip()

print(p1,vp=viewport(width=1,height=1/3,x=.5,y=1/3*.5))
print(p2,vp=viewport(width=1,height=1/3,x=.5,y=1/3+1/3*.5))
print(p3,vp=viewport(width=1,height=1/3,x=.5,y=2/3+1/3*.5))
On Tue, Mar 17, 2009 at 12:38 PM, Jason Rupert <jasonkrupert at yahoo.com> wrote:

  
    
#
Nice work, Mike. Actually I think he was looking for it done this way.

  library(ggplot2)
test_data<-rnorm(100)
a=data.frame(obs=test_data,condition='None')
p1=qplot(
	data=a
	,x=obs
	,geom='histogram'
	)+coord_flip()
p2=qplot(
	data=a
	,y=obs
	,x=condition
	,geom='boxplot'
	)+opts(
		axis.text.y=theme_blank()
		,axis.title.y=theme_blank()
	)
p3=qplot(
	sample=test_data
	,stat='qq'
	,distribution=qnorm
	)

print(p1,vp=viewport(width=1/6,height=1,y=.5,x=1/6*.5))
print(p2,vp=viewport(width=1/6,height=1,y=.5,x=1/6+1/6*.5))
print(p3,vp=viewport(width=2/3,height=1,y=.5,x=1/3+2/3*.5))

Perhaps with a red line through the hinge points. And probably need to  
get the axes aligned proerly but it's very close.
#
I actually pasted the wrong code! This attempts to replicate the
original request to replicate a JMP graphic:

library(ggplot2)
test_data<-rnorm(100,mean=10,sd=4)
a = data.frame(obs = test_data,condition = 'None')
p1 = ggplot(
	data = a
	,aes(
		x = obs
	)
)+geom_histogram(
	aes(
		y = ..density..
		)
)+geom_density(
)+scale_x_continuous(
	limits = range(a$obs)
)+opts(
	panel.grid.minor = theme_blank()
	,panel.grid.major = theme_blank()
	,panel.background = theme_rect()
)+coord_flip(
)
p2 = ggplot(
	data = a
	,aes(
		x = condition
		,y = obs
	)
)+geom_boxplot(
)+scale_y_continuous(
	limits = range(a$obs)
)+scale_x_discrete(
	name = ''
	,labels = ''
)+opts(
	panel.grid.minor = theme_blank()
	,panel.grid.major = theme_blank()
	,panel.background = theme_rect()
	,axis.ticks = theme_blank()
	,axis.text.y = theme_blank()
	,axis.title.y = theme_blank()
)
p3 = ggplot(
	data = a
	,aes(
		sample = (obs-mean(obs))/sd(obs)
	)
)+stat_qq(
	distribution=qnorm
)+geom_abline(
	intercept=0
	,slope=1
)+opts(
	panel.grid.minor = theme_blank()
	,panel.grid.major = theme_blank()
	,panel.background = theme_rect()
	,axis.ticks = theme_blank()
	,axis.text.y = theme_blank()
	,axis.title.y = theme_blank()
)
	

print(p1,vp = viewport(width = 1/3,height = 1,x = 1/3*.5,y = .5))
print(p2,vp = viewport(width = 1/3,height = 1,x = 1/3+1/3*.5,y = .5))
print(p3,vp = viewport(width = 1/3,height = 1,x = 2/3+1/3*.5,y = .5))
On Tue, Mar 17, 2009 at 6:36 PM, David Winsemius <dwinsemius at comcast.net> wrote:

  
    
#
Paul, 

I really appreciate all your help.

Here is what I ended up with:
http://n2.nabble.com/Can-R-produce-this-plot--td2489288.html#a2494992

This is really close.  I might try to figure out a way to add the median to the "boxplot", but this is really close and captures a ton of information on one plot. 

Thank you again for your help.  This is great for a newbie like me to see the outstanding capability.
--- On Tue, 3/17/09, Paul Murrell <p.murrell at auckland.ac.nz> wrote:

            
8 days later
#
In case anyone is still interested, a slight improvement is to plot
both density and normal distributions on top of the empirical
histogram (previous version plotted only density):

library(ggplot2)
test_data<-rnorm(100,mean=10,sd=4)
a = data.frame(obs = test_data,condition = 'None')
p1 = ggplot(
	data = a
	,aes(
		x = obs
	)
)+geom_histogram(
	aes(
		y = ..density..
		)
)+stat_density(
	mapping=aes(ymax=max(..density..))
	,geom='path'
	,colour='red'
)+stat_function(
	fun = dnorm
	,args = list(
		m=mean(a$obs)
		,sd=sd(a$obs)
	)
	,colour = 'green'
)+scale_x_continuous(
	limits = range(a$obs)
)+opts(
	panel.grid.minor = theme_blank()
	,panel.grid.major = theme_blank()
	,panel.background = theme_rect()
)+coord_flip(
)
p2 = ggplot(
	data = a
	,aes(
		x = condition
		,y = obs
	)
)+geom_boxplot(
)+scale_y_continuous(
	limits = range(a$obs)
)+scale_x_discrete(
	name = ''
	,labels = ''
)+opts(
	panel.grid.minor = theme_blank()
	,panel.grid.major = theme_blank()
	,panel.background = theme_rect()
	,axis.ticks = theme_blank()
	,axis.text.y = theme_blank()
	,axis.title.y = theme_blank()
)
p3 = ggplot(
	data = a
	,aes(
		sample = (obs-mean(obs))/sd(obs)
	)
)+stat_qq(
	distribution=qnorm
)+geom_abline(
	intercept=0
	,slope=1
)+opts(
	panel.grid.minor = theme_blank()
	,panel.grid.major = theme_blank()
	,panel.background = theme_rect()
	,axis.ticks = theme_blank()
	,axis.text.y = theme_blank()
	,axis.title.y = theme_blank()
)


print(p1,vp = viewport(width = 1/3,height = 1,x = 1/3*.5,y = .5))
print(p2,vp = viewport(width = 1/3,height = 1,x = 1/3+1/3*.5,y = .5))
print(p3,vp = viewport(width = 1/3,height = 1,x = 2/3+1/3*.5,y = .5))