Message-ID: <Pine.LNX.4.63a.0510040858570.5842@homer23.u.washington.edu>
Date: 2005-10-04T16:11:59Z
From: Thomas Lumley
Subject: boxplot statistics
In-Reply-To: <ypx6irwdiqiz.fsf@uracil.uio.no>

On Tue, 4 Oct 2005, Karin Lagesen wrote:
>
> First, how does boxplot determine the size of the box? And is the line
> inside the box the mean or the median (or something completely
> different?) And how does it determine how long out the whiskers should
> go?

Part of the problem is that there are lots of different definitions of the
quartiles (quantile() has 9 of them). If the number of observations is one
more than a multiple of 4 then the definition boxplot() uses (Tukey's
hinges) agrees with the default quantile() quartiles; otherwise the two are
slightly different.
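
To make the difference concrete, here is a minimal Python sketch (not R's
code) of two of those definitions: Tukey's hinges, using the position rule
found in R's fivenum(), and the linear-interpolation quartile that is
quantile()'s default (type 7). The function names and the sample data are
made up for illustration.

```python
import math

def hinges(xs):
    """Lower and upper hinge, using the depth rule from R's fivenum()."""
    x = sorted(xs)
    n = len(x)
    n4 = math.floor((n + 3) / 2) / 2      # 1-based depth of the lower hinge

    def at(depth):
        # Average the order statistics on either side of a (half-)integer depth.
        return 0.5 * (x[math.floor(depth) - 1] + x[math.ceil(depth) - 1])

    return at(n4), at(n + 1 - n4)

def quartiles_type7(xs):
    """Quartiles by linear interpolation between order statistics (type 7)."""
    x = sorted(xs)
    n = len(x)

    def q(p):
        h = (n - 1) * p                   # 0-based fractional index
        lo = math.floor(h)
        hi = min(lo + 1, n - 1)
        return x[lo] + (h - lo) * (x[hi] - x[lo])

    return q(0.25), q(0.75)

nine = [2, 4, 5, 7, 8, 10, 13, 17, 30]    # n = 9, one more than a multiple of 4
ten = nine + [31]                          # n = 10

print(hinges(nine), quartiles_type7(nine))   # the two definitions agree
print(hinges(ten), quartiles_type7(ten))     # the two definitions differ
```

With nine observations both functions return 5 and 13; with ten, the hinges
are 5 and 17 while the interpolated quartiles are 5.5 and 16.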

For the case where the number of observations is one more than a multiple 
of 4 the line in the middle is the median, the ends of the box are the 
upper and lower quartiles, and the whiskers extend to the furthest point 
that is within 1.5 box lengths from the end of the box.
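
The recipe in the paragraph above can be spelled out as a short Python
sketch. It illustrates the rule as described here and is not R's boxplot
code; the function name boxplot_stats, the coef parameter name, and the
sample data are assumptions made for the example, and the hinge positions
follow the rule in R's fivenum().

```python
import math

def boxplot_stats(xs, coef=1.5):
    """Box ends, median, whisker ends and outliers for one sample."""
    x = sorted(xs)
    n = len(x)

    def at(depth):
        # Average the order statistics on either side of a (half-)integer depth.
        return 0.5 * (x[math.floor(depth) - 1] + x[math.ceil(depth) - 1])

    n4 = math.floor((n + 3) / 2) / 2      # 1-based depth of the lower hinge
    lower, median, upper = at(n4), at((n + 1) / 2), at(n + 1 - n4)

    reach = coef * (upper - lower)        # 1.5 box lengths by default
    inside = [v for v in x if lower - reach <= v <= upper + reach]
    outliers = [v for v in x if v < lower - reach or v > upper + reach]

    return {
        "whisker_low": inside[0],         # furthest point within reach below the box
        "lower": lower,
        "median": median,
        "upper": upper,
        "whisker_high": inside[-1],       # furthest point within reach above the box
        "outliers": outliers,
    }

sample = [2, 4, 5, 7, 8, 10, 13, 17, 30]  # n = 9, one more than a multiple of 4
stats = boxplot_stats(sample)
print(stats)
```

Here the box runs from 5 to 13 with the median at 8, so one box length is 8
and the whiskers may reach at most 12 units past each end of the box. They
stop at the data points 2 and 17, and 30 is left outside as an outlier.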

When the number of observations is not one more than a multiple of 4
this is all still true, but you have to be careful about which definition
of "quartile" you mean. To find out which one is used, you can read either
the book referenced on the help page or the code.

 	-thomas