On Wed, May 31, 2017 at 4:39 PM, Joris Meys <jorismeys at gmail.com> wrote:
Seriously, if a method gives a wrong result, it's wrong. line() does NOT
implement the algorithm of Tukey, not even after the patch. We're not
discussing Excel here, are we?
The method of Tukey is rather clear, and it is NOT using the default
quantile definition from the quantile function. Actually, it doesn't even
use quantiles to define the groups. It just says that the groups should be
more or less equally spaced. As the method of Tukey relies on the medians
of the subgroups, it would make sense to pick a method that is
approximately unbiased with regard to the median. That would be type 8
imho.
To get the size of the outer groups, Tukey would've been more than happy
with:
floor(length(dfr$time) / 3)
[1] 6
There you have the size of your left and right group, and now we can
discuss which median type should be used for the robust fitting.
But I can honestly not understand why anyone in his right mind would
defend a method that is clearly wrong while not working at Microsoft's
spreadsheet department.
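
For illustration only, here is a minimal sketch of the grouping step described above: take floor(n / 3) points for each outer group and compute the slope from the group medians. The vectors x and y are made-up data (18 points, not the dfr from this thread), and the sketch does not try to keep tied x values in the same group, which a full implementation would need to decide on.

```r
# Hypothetical data: 18 observations with ties in x (NOT the thread's dfr)
x <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9)
y <- 2 * x + 1

n <- length(x)
k <- floor(n / 3)                  # size of each outer group

ord   <- order(x)                  # indices that sort x
left  <- ord[seq_len(k)]           # the k smallest x values
right <- ord[seq(n - k + 1, n)]    # the k largest x values

# Slope from the medians of the two outer groups
slope <- (median(y[right]) - median(y[left])) /
         (median(x[right]) - median(x[left]))
```

Which median definition to use inside each group is a separate choice; median() here is simply R's default.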
Cheers
Joris
On Wed, May 31, 2017 at 4:03 PM, Serguei Sokol <sokol at insa-toulouse.fr>
wrote:
On 31/05/2017 at 15:40, Joris Meys wrote:
+ sum(dfr$time <= quantile(dfr$time, 1./3., type = i))
+ })
[1] 8 8 6 6 6 6 8 6 6
Only the default (type = 7) and the first two types give the result
line() gives now. I think there are plenty of reasons why any of
the other 6 types might be better suited to Tukey's method.
So to my mind, changing the definition of line() to give sensible output
that is in accordance with the theory does not imply any inconsistency
with the quantile definition in R. At least not with 6 out of the 9
different ones ;-)
Nice shot.
But OTOE (on the other end ;)
+ sum(dfr$time >= quantile(dfr$time, 2./3., type = i))
+ })
[1] 8 8 8 8 6 6 8 6 6
Here "8" gets 5 votes against 4 for "6". Two defector methods (types 3
and 4) changed their point count between the lower and the upper third
and should be discarded. That leaves a score of 3:4, still in favor of
"6", but in my opinion the default method should prevail.
Serguei.
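
As a rough sketch of the comparison made in this exchange, the two tallies can be computed side by side for all nine quantile types, together with a flag for the types that give the same count in both outer thirds. The vector x is made-up data, not the dfr$time used in the thread, so the counts it produces will differ from those quoted above.

```r
# Hypothetical data with ties (NOT the thread's dfr$time)
x <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9)

types <- 1:9

# Number of points at or below the 1/3 quantile, per quantile type
left_counts <- sapply(types, function(i) {
  sum(x <= quantile(x, 1/3, type = i))
})

# Number of points at or above the 2/3 quantile, per quantile type
right_counts <- sapply(types, function(i) {
  sum(x >= quantile(x, 2/3, type = i))
})

# Types that give equally sized outer groups on this data
symmetric <- types[left_counts == right_counts]

print(data.frame(type = types, left = left_counts, right = right_counts))
```

Whether a type comes out symmetric depends on where the ties fall in the data, so the flag says something about a given data set rather than about the type in general.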
--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics
tel : +32 (0)9 264 61 79
Joris.Meys at Ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php