ggplot2 geom_boxplot limits

4 messages · Jacob Wegelin, Thierry Onkelinx

#
With base graphics, one can use the "ylim" argument to zoom in on a boxplot.

With ggplot2, using "limits" to try to zoom in on a boxplot *changes the box*.  Since the box usually indicates the 25th and 75th percentiles of a quantitative variable, this is puzzling.

The toy code below demonstrates this. In ggplot2, "zooming in" causes the two boxes to overlap, when they did not overlap in the full plot.  Also, the center lines --- which usually indicate the median of the variable --- change when one zooms in.

In base graphics, "zooming in" does not cause the boxes to overlap or, as far as I can see, the median line to move relative to the scale.

What is going on here?

pdf(file="toy-example.pdf")
set.seed(1)
# Two groups of 500 normal draws with different means
toy1<-data.frame(Y=rnorm(500, mean=3), A="one")
toy2<-data.frame(Y=rnorm(500, mean=1.6), A="two")
toy<-rbind(toy1,toy2)
toy$A<-factor(toy$A)
library(ggplot2)
mybreaks<-signif(seq(from=min(toy$Y),to=max(toy$Y),by=0.5),digits=2)
mylimits<-c(0.61,3.7)
# ggplot2: full plot, then the attempted "zoom" via scale limits
print(myplot<-ggplot(toy, aes(x=A,y=Y)) + geom_boxplot()+scale_y_continuous(breaks=mybreaks)+theme_bw())
print(myplot+scale_y_continuous(breaks=mybreaks,limits=mylimits))
# base graphics: full plot, then zoomed with ylim
boxplot(toy1$Y,toy2$Y)
boxplot(toy1$Y,toy2$Y, ylim=mylimits)
graphics.off()

R version 3.2.1 (2015-06-18)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.4 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] ggplot2_1.0.1

loaded via a namespace (and not attached):
  [1] MASS_7.3-40      colorspace_1.2-6 scales_0.2.5     magrittr_1.5     plyr_1.8.3       tools_3.2.1      gtable_0.1.2     reshape2_1.4.1
  [9] Rcpp_0.11.6      stringi_0.5-5    grid_3.2.1       stringr_1.0.0    digest_0.6.8     proto_0.3-10     munsell_0.4.2


Jacob A. Wegelin
#
Setting limits on a scale converts values outside those limits to NA
before the statistics are computed. Hence the boxplots, smoothers, etc.
change. Use coord_cartesian() to "zoom in" without discarding any data.
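
To make the effect concrete, here is a minimal base-R sketch of what the boxplot statistics see once out-of-range values have become NA. It reuses the seed and limits from the original post. The names y_one and y_censored are illustrative only, and the quantile() call is an approximation of the summary ggplot2 computes, not its exact internals.

```r
# Same first group as in the original post (same seed, same first draw).
set.seed(1)
y_one <- rnorm(500, mean = 3)
mylimits <- c(0.61, 3.7)

# Mimic scale limits: values outside the limits become NA.
y_censored <- ifelse(y_one < mylimits[1] | y_one > mylimits[2], NA, y_one)

# Quartiles of the full data versus the censored data.
probs <- c(0.25, 0.5, 0.75)
full_stats     <- quantile(y_one, probs)
censored_stats <- quantile(y_censored, probs, na.rm = TRUE)

print(rbind(full = full_stats, censored = censored_stats))
cat("values dropped by the limits:", sum(is.na(y_censored)), "\n")
```

Because the upper limit of 3.7 cuts off a sizeable part of the upper tail of a group centred at 3, the censored quartiles shift downward, which is why the box moves. With coord_cartesian(ylim=mylimits) the statistics are still computed on all 500 values and only the viewport changes.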

On 20 Jul 2015 20:29, "Jacob Wegelin" <jacobwegelin at fastmail.fm> wrote:
#
On 2015-07-20 Mon 15:19, Thierry Onkelinx wrote:
Thanks. What do I do if I also want to use coord_flip(), that is, if I want the boxes to lie horizontally *and* to zoom in?

myplot+coord_cartesian(ylim=mylimits) # zooms in

myplot+coord_cartesian(ylim=mylimits) + coord_flip() # flips but does not zoom

myplot + coord_flip()+coord_cartesian(ylim=mylimits) # zooms but does not flip

Jacob Wegelin
#
Here is the answer: http://rpubs.com/INBOstats/zoom_in
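
The linked page is not reproduced here. As a hedged sketch that is not taken from that page: a plot holds only one coordinate system, so adding a second coord replaces the first, which would explain why the two chained attempts each lose one of the two effects. One approach that should work is to pass the limits to coord_flip() directly, since it accepts xlim and ylim arguments. The name flipped_zoomed is illustrative only.

```r
library(ggplot2)

# Rebuild the toy data and base plot from the original post.
set.seed(1)
toy <- rbind(
  data.frame(Y = rnorm(500, mean = 3),   A = "one"),
  data.frame(Y = rnorm(500, mean = 1.6), A = "two")
)
toy$A <- factor(toy$A)
mylimits <- c(0.61, 3.7)
myplot <- ggplot(toy, aes(x = A, y = Y)) + geom_boxplot() + theme_bw()

# A single coord that both flips the axes and zooms on the Y variable.
# ylim refers to the y aesthetic, which is drawn horizontally after the flip.
flipped_zoomed <- myplot + coord_flip(ylim = mylimits)
print(flipped_zoomed)
```

As with coord_cartesian(), the limits here only change the visible region, so the boxes and medians are still computed from the full data.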

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

2015-07-20 22:19 GMT+02:00 Jacob Wegelin <jacobwegelin at fastmail.fm>: