Hi,
I have a simple question about quartiles in R, especially how they are calculated by the boxplot function. The quartiles (.25 and .75) in boxplot differ from those of the summary function and also don't match any of the 9 types in the quantile function. See attachment for details.
Can you give me the details on how the boxplot function calculates these values?
Cheers,
Rene Brinkhuis (Netherlands)
Attachment: Quartiles in R.pdf <https://stat.ethz.ch/pipermail/r-help/attachments/20120113/9dadc934/attachment.pdf>
Quantiles in boxplot
5 messages · René Brinkhuis, Sarah Goslee, Peter Dalgaard
The explanation is in ?boxplot.stats (as the help for boxplot states).
Details:
The two 'hinges' are versions of the first and third quartile,
i.e., close to 'quantile(x, c(1,3)/4)'. The hinges equal the
quartiles for odd n (where 'n <- length(x)') and differ for even n.
Whereas the quartiles only equal observations for 'n %% 4 == 1'
(n = 1 mod 4), the hinges do so _additionally_ for 'n %% 4 == 2'
(n = 2 mod 4), and are in the middle of two observations
otherwise.
And so on, with references.
Sarah
2012/1/13 René Brinkhuis <rene.brinkhuis at live.nl>:
Hi, I have a simple question about quartiles in R, especially how they are calculated using the boxplot. Quartiles (.25 and .75) in boxplot are different from the summary function and also don't match with the 9 types in the quantile function. See attachment for details. Can you give me the details on how the boxplot function does calculate these values? Cheers, Rene Brinkhuis (Netherlands)
Sarah Goslee http://www.functionaldiversity.org
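The odd/even behaviour described in ?boxplot.stats can be checked on small vectors. The following is only a sketch: compare_hinges is a hypothetical helper name, the data are simply 1:n, and quantile() is used with its default type.

```r
# Hypothetical helper: boxplot hinges vs. default quantile() for x = 1:n
compare_hinges <- function(n) {
  x <- seq_len(n)
  s <- boxplot.stats(x)$stats          # elements 2 and 4 are the hinges
  c(n = n,
    hinge_lo = s[2],
    hinge_hi = s[4],
    q25 = unname(quantile(x, 0.25)),
    q75 = unname(quantile(x, 0.75)))
}

# One row per sample size, covering n %% 4 == 1, 2, 3, 0
res <- t(sapply(5:8, compare_hinges))
print(res)
```

For n = 5 and n = 7 the hinges and the quartiles coincide; for n = 6 the hinges are the observations 2 and 5 while quantile() gives 2.25 and 4.75, and for n = 8 the hinges fall midway between two observations (2.5 and 6.5).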
Hi,
Thanks for your reply.
That was exactly the piece of information I needed.
Based on your information I created a custom function for calculating the
first and third quartile according to the 'boxplot logic'.
See attachment for additional details.
Kind regards,
René.
BoxplotQuartiles <- function(v) {
  v  <- sort(v)
  n  <- length(v)
  ql <- n / 4      ## Length of a quartile
  p1 <- ql * 1     ## Position Q1 in vector
  p3 <- ql * 3     ## Position Q3 in vector
  f1 <- p1 %% 1    ## Fractional part of p1
  f3 <- p3 %% 1    ## Fractional part of p3
  ## Lower hinge: one observation, or the mean of two neighbours
  if (f1 == 0.25)
    Q1 <- v[p1 + 0.75]
  else if (f1 == 0.50)
    Q1 <- v[p1 + 0.50]
  else if (f1 == 0.75)
    Q1 <- (v[p1 + 0.25] + v[p1 + 0.25 + 1.00]) / 2
  else if (f1 == 0.00)
    Q1 <- (v[p1 + 0.00] + v[p1 + 0.00 + 1.00]) / 2
  else
    stop("Error in calculation Q1")   ## fail instead of returning a string
  ## Upper hinge: mirror image of the lower hinge
  if (f3 == 0.25)
    Q3 <- (v[p3 - 0.25] + v[p3 - 0.25 + 1.00]) / 2
  else if (f3 == 0.50)
    Q3 <- v[p3 + 0.50]
  else if (f3 == 0.75)
    Q3 <- v[p3 + 0.25]
  else if (f3 == 0.00)
    Q3 <- (v[p3 - 0.00] + v[p3 - 0.00 + 1.00]) / 2
  else
    stop("Error in calculation Q3")
  return(c(Q1, Q3))
}
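The four cases per hinge can, as far as I can tell, be folded into a single index computation of the kind fivenum() uses internally. A minimal sketch, with hinges as a hypothetical name and the check against fivenum() run on random data only:

```r
# Hypothetical helper mirroring the index arithmetic in fivenum()
hinges <- function(x) {
  x <- sort(x[!is.na(x)])              # drop NAs, then sort
  n <- length(x)
  if (n == 0) return(c(NA_real_, NA_real_))
  n4 <- floor((n + 3) / 2) / 2         # (possibly half-integer) position of lower hinge
  d  <- c(n4, n + 1 - n4)              # positions of lower and upper hinge
  0.5 * (x[floor(d)] + x[ceiling(d)])  # average the two neighbours when d ends in .5
}

# Compare with the hinges reported by fivenum() for n = 1..20
set.seed(1)
ok <- sapply(1:20, function(n) {
  x <- rnorm(n)
  isTRUE(all.equal(hinges(x), fivenum(x)[c(2, 4)]))
})
print(all(ok))
```

When a position is a whole number, floor and ceiling pick the same observation, so the average is just that observation; this is what replaces the separate branches on the fractional part.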
Attachment: Quartiles in R.pdf <https://stat.ethz.ch/pipermail/r-help/attachments/20120114/3be8d3c2/attachment.pdf>
On Jan 14, 2012, at 16:07 , René Brinkhuis wrote:
Based on your information I created a custom function for calculating the first and third quartile according to the 'boxplot logic'.
A more compact (though not as readable) version is afforded by stats:::fivenum. A convenient description is (I believe) that the hinges are the medians of the bottom and top halves of the sorted observations, with the middle observation counting in both groups if n is odd.
> x <- rnorm(121)
> fivenum(x)
[1] -2.4596038 -0.6034689 0.1105829 0.6686026 2.2580863
> median(sort(x)[1:floor((length(x)+1)/2)])
[1] -0.6034689
> median(sort(x)[ceiling((length(x)+1)/2):length(x)])
[1] 0.6686026
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
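The "medians of the two halves" description can be wrapped in a small function and compared with fivenum() over a range of sample sizes, not just n = 121. This is only a sketch: half_medians is a hypothetical name and the comparison uses random normal data.

```r
# Hypothetical helper: hinges as medians of the lower and upper halves,
# with the middle observation shared by both halves when n is odd
half_medians <- function(x) {
  s <- sort(x)
  n <- length(s)
  lower <- s[1:floor((n + 1) / 2)]
  upper <- s[ceiling((n + 1) / 2):n]
  c(median(lower), median(upper))
}

# Check agreement with fivenum() for n = 1..50
set.seed(42)
agree <- sapply(1:50, function(n) {
  x <- rnorm(n)
  isTRUE(all.equal(half_medians(x), fivenum(x)[c(2, 4)]))
})
print(all(agree))
```

For 1:5 the halves are 1:3 and 3:5, giving hinges 2 and 4; for 1:6 they are 1:3 and 4:6, giving 2 and 5.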
Thanks for your reply! Nice description of the 'hinges' and beautifully compact code.
R.
Attachment: updated document on quartiles in R
Attachment: Quartiles in R.pdf <https://stat.ethz.ch/pipermail/r-help/attachments/20120114/7f21a13e/attachment.pdf>