Impaired boxplot functionality - mean instead of median
On Thu, 2005-12-01 at 19:40 +0300, Evgeniy Kachalin wrote:
Martin Maechler wrote:
Boxplots were invented by John W. Tukey and I think should be
counted among the top "small but smart" achievements from the
20th century. Very wisely he did *not* use mean and standard deviations.
Even though it's possible to draw boxplots that are not boxplots
(and people only recently explained how to do this with R on this
mailing list), I'm arguing very strongly against this.
If I see a boxplot - I'd want it to be a boxplot and not have
the silly (please excuse) 10%--------90% whiskers which
declare 20% of the points as outliers {in the boxplot sense}.
If you want the mean +/- sd plot, do *not* misuse boxplots
for them, please!
So I analyze genetics data. I have a factor (gene variant, c(1,2,3)) and a quantitative variable corresponding to that factor. How do I visualize this situation? Compare the means of the samples corresponding to the factor values? If boxplot supported 'mean-in-the-middle', it would fit my needs ideally.

How do I plot a mean +/- SD plot?

There is also a way to rewrite boxplot.stats and replace "fivenum" there with a self-made function. Then I would need to write a self-made boxplot.formula (or boxplot.default?) function, and none of this would be configurable.

I'm still a novice in R, so I need a simple way to pre-visualize my data and estimate an approximate result.
If you want means and SDs, you might want to look at:

1. plotCI() and plotmeans() in the gplots package

2. errbar() in the Hmisc package

3. Use plot() in conjunction with the arrows() or segments() functions, which is what the functions above end up doing, packaged in a more convenient and unified way.

HTH,

Marc Schwartz
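
For illustration, option 3 might look roughly like the following in base R. This is only a minimal sketch: the data are simulated, and the names `variant` and `trait` are made up to stand in for the gene-variant factor and the quantitative variable described above.

```r
# Simulated stand-in data (hypothetical): three gene variants, 20 samples each
set.seed(1)
variant <- factor(rep(1:3, each = 20))
trait   <- rnorm(60, mean = c(10, 12, 15)[variant], sd = 2)

# Per-group mean and standard deviation
m <- tapply(trait, variant, mean)
s <- tapply(trait, variant, sd)
x <- seq_along(m)

# Plot the group means, leaving vertical room for the error bars
plot(x, m, pch = 19, xaxt = "n",
     xlim = c(0.5, length(m) + 0.5),
     ylim = range(m - s, m + s),
     xlab = "Gene variant", ylab = "Trait (mean +/- SD)")
axis(1, at = x, labels = names(m))

# arrows() with angle = 90 and code = 3 draws flat caps at both ends,
# giving error bars from mean - SD to mean + SD
arrows(x, m - s, x, m + s, angle = 90, code = 3, length = 0.05)
```

With real data, replacing `trait` and `variant` by the actual columns should be enough. Swapping `s` for a standard error or a confidence half-width would give a different kind of error bar from the same code.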