Agustin Lobo wrote:
Gracias!
This is really close. I'm just missing the "notches", whose meaning
is explained in the help page
("If the notches of two plots do not overlap then the
medians are significantly different at the 5 percent level.")
but not their actual definition.
I presume that they are calculated using a t-value but taking the median
and m.a.d. instead of the mean and var, but this is nowhere specified.
With skewed distributions notches often
make funny plots.
Right, and they look funny when n is small. The formula is Median ± 1.57*IQR/sqrt(n)
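For what it's worth, here is a minimal sketch of that formula in R. It assumes the IQR is the distance between the hinges returned by fivenum(), and the variable names are illustrative only. Note that boxplot.stats() documents a multiplier of 1.58 rather than 1.57, so the hand-computed interval comes out slightly narrower than the conf component it returns.

```r
# Sketch: notch limits as median +/- 1.57 * IQR / sqrt(n).
# Assumption: IQR is the distance between the hinges (fivenum);
# y and the variable names below are illustrative only.
set.seed(1)
y <- rnorm(300) + runif(300)
n <- length(y)
h <- fivenum(y)                 # min, lower hinge, median, upper hinge, max
iqr <- h[4] - h[2]
half_width <- 1.57 * iqr / sqrt(n)
notch <- h[3] + c(-1, 1) * half_width

# For comparison: the interval R itself reports (built with 1.58)
notch_r <- boxplot.stats(y)$conf
print(rbind(by_hand = notch, boxplot.stats = notch_r))
```

The 1/sqrt(n) term is why the notches get wide, and can even extend past the ends of the box, when n is small.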
An example using your function, i.e., bp.example(y=rnorm(300)+runif(300)), should be in the help page! Why not send it to rhelp?
Agus
Dr. Agustin Lobo
Instituto de Ciencias de la Tierra (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona SPAIN
tel 34 93409 5410
fax 34 93411 0012
alobo at ija.csic.es
On Fri, 22 Feb 2002, David James wrote:
Hola!
The function below does something close to that.
"bp.example" <-
function(y, xlab = "Fraction (p)", ylab = "Quantiles Q(p)", ...)
{
   ##
   ## adapted from Bill Cleveland's Visualizing Data (1993)
   ##
   ## left screen: quantile plot; right screen: the matching boxplot
   split.screen(figs = rbind(c(0, 2/3, 0, 1), c(2/3, 1, 0, 1)))
   screen(1)
   par(cex = 1); par(mar = c(5.1, 4.1, 2, 1))
   q <- quantile(y, c(0.25, 0.5, 0.75))
   y <- sort(y)
   p <- ppoints(y)
   iq <- q[3] - q[1]
   ## fences lie 1.5 * IQR beyond the quartiles; the adjacent values
   ## are the most extreme data points that are still inside the fences
   bxp.adj <- q[c(1, 3)] + c(-1.5, 1.5) * iq
   lower.adj <- min(y[y >= bxp.adj[1]])
   upper.adj <- max(y[y <= bxp.adj[2]])
   plot(p, y, xlab = xlab, ylab = ylab, ...)
   u <- par("usr")
   ## straight line through (0.25, Q1) and (0.75, Q3)
   b <- (q[3] - q[1])/.50
   a <- q[1] - .25 * b
   abline(a, b)
   abline(h = q, col = 2, lty = 2)
   abline(h = c(lower.adj, upper.adj), col = 3, lty = 2)
   cxy <- par("cxy")
   # lower right annotations
   text(x = u[c(2, 2)] - cxy[1],
        y = c(lower.adj, q[1]) + cxy[2]/2,
        c("lower adjacent", "lower quartile (Q1)"), adj = 1)
   text(x = u[2] - cxy[1], y = lower.adj - 0.75*cxy[2],
        'min(y[y >= Q1 - 1.5 * IQR])', adj = 1)
   # upper left annotations
   text(x = u[c(1, 1, 1)] + cxy[1],
        y = c(q[-1], upper.adj) + cxy[2]/2,
        c("median (Q2)", "upper quartile (Q3)", "upper adjacent"), adj = 0)
   text(x = u[1] + cxy[1], y = upper.adj - 0.75*cxy[2],
        'max(y[ y <= Q3 + 1.5 * IQR])', adj = 0)
   screen(2)
   ylim <- range(y)
   par(cex = 1); par(mar = c(5.1, 1.0, 2, 1))
   boxplot(y, ylim = ylim, axes = FALSE, col = 2)
   axis(2, labels = FALSE)
   box()
   close.screen(all.screens = TRUE)
   ## return the plotted values invisibly; must be the last expression
   invisible(list(x = p, y = y, a = a, b = b))
}
Agustin Lobo wrote:
I've always thought that it would be most useful to have a graphic example of a boxplot, including some text pointing to the main features of the boxplot that would define and explain these features. Perhaps this could be made a simple function (i.e., boxplot.example()) and this function be included in the help entry. Then the user would just run boxplot.example() to see a graphic and commented example. It's more difficult to understand a text describing the boxplot function than just seeing a commented graphic example.
Agus
On 22 Feb 2002, Peter Dalgaard BSA wrote:
Jay Pfaffman <pfaffman at relaxpc.com> writes:
Another naive stats question. I'm trying to better understand what boxplots are telling me. I think what I see is the median and the boundaries of the 1st and 3rd quartiles. The whiskers represent the range of the data unless there are points which are outside "range" (default: 1.5) times the distance from the median to that quartile. Is that right?
Not quite. 1.5 times the length of the entire box.
I've read the documentation for boxplot numerous times, but don't quite understand it well enough to communicate it to my professor who's helping me with this project. (You'll be relieved to know that neither of us fancies ourself a statistician!)
boxplot.stats.Rd had a typo and got updated recently in the
development and patch versions to read
\item{coef}{this determines how far the plot ``whiskers'' extend out
from the box. If \code{coef} is positive, the whiskers extend to the
most extreme data point which is no more than \code{coef} times
the length of the box away from the box. A value of zero causes
the whiskers to extend to the data extremes (and no outliers be returned).}
(for some reason this hasn't yet found its way to the online snapshot
manuals in http://stat.ethz.ch/R-alpha/R-devel/doc/html/ and friends.
Martin?)
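To make the rule concrete, here is a small sketch that works out the whisker ends by hand and compares them with boxplot.stats(). The data vector is made up for illustration, and the box length is taken as the distance between the hinges returned by fivenum().

```r
# Sketch: whisker ends under the "coef times the length of the box" rule.
# The data and variable names are illustrative assumptions.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)
coef <- 1.5

h <- fivenum(x)                  # min, lower hinge, median, upper hinge, max
box_len <- h[4] - h[2]           # length of the entire box
lo <- h[2] - coef * box_len      # lower limit for whisker points
hi <- h[4] + coef * box_len      # upper limit for whisker points

# whiskers reach the most extreme points still within the limits
whiskers <- range(x[x >= lo & x <= hi])
outliers <- x[x < lo | x > hi]

bs <- boxplot.stats(x, coef = coef)
print(list(by_hand = whiskers, boxplot.stats = bs$stats[c(1, 5)],
           outliers = outliers))
```

With these data the box runs from 3 to 8, so the limits are -4.5 and 15.5: the whiskers stop at 1 and 9, and 30 is reported as an outlier.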
V&R (p. 122) claims that the hinges are "roughly quartiles," so perhaps my naive understanding is close enough.
Yes. The exact definition is slightly peculiar, but in compliance with the original definition by Tukey. So I'm told, anyway.
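A quick way to see how close "roughly quartiles" is: compare the hinges from fivenum() with quantile() at its default settings. The data below are made up for illustration.

```r
# Sketch: hinges (as used by boxplot) versus default quantile() quartiles.
# x is an illustrative vector, not data from this thread.
x <- 1:10
hinges <- fivenum(x)[c(2, 4)]                      # 3 and 8
quartiles <- unname(quantile(x, c(0.25, 0.75)))    # 3.25 and 7.75
print(rbind(hinges = hinges, quartiles = quartiles))
```

The two differ a little for small n like this, and the gap shrinks as the sample grows.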
I've got a relatively small data set (n~=12). I think it would help
to see the data points plotted on top of the boxplots. Here's what
I'm doing now:
par(las=2,ps=14,mar=c(15, 4, 4, 2))
boxplot(split(ranks,c(1:25)), names=items, notch=T, horizontal=F, add=F)
If I could get the points of each of the 25 variables plotted on top
of the box, that'd be great.
Not sure what you're doing there, but maybe some code like this could
help:
x1 <- rnorm(20)
x2 <- rnorm(20)
boxplot(list(x1=x1, x2=x2))
points(cbind(1, x1))
points(cbind(2, x2))
--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
--
David A. James
Statistics Research, Room 2C-253        Phone: (908) 582-3082
Bell Labs, Lucent Technologies          Fax: (908) 582-3340
Murray Hill, NJ 09794-0636