implementing Grubbs outlier test on a large dataframe

Sending each row of a data frame, dfm, as a vector to a function,
fcn, is as simple as:

apply(dfm, 1, fcn)

e.g.:

 > dfm <- data.frame(x=rnorm(10), y=rnorm(10), z=rnorm(10))
 >
 > apply(dfm, 1, sum)
  [1]  0.7385838 -3.1819193  0.3415670 -0.6552601 -1.3470174 -0.6446259 -0.6544967
  [8]  0.1778169 -0.3330527  0.6246071

And with the second argument set to 2, you would get a columnwise  
application of the function.
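
A minimal sketch of the columnwise case, using the same kind of toy data frame as above (the name col_sums is only for illustration):

```r
# Toy data frame with three numeric columns
dfm <- data.frame(x = rnorm(10), y = rnorm(10), z = rnorm(10))

# MARGIN = 2 passes each column to the function in turn;
# the result is a named vector with one entry per column
col_sums <- apply(dfm, 2, sum)
col_sums
```

For plain sums, colSums(dfm) gives the same numbers; apply() matters when fcn is something without a built-in row or column version.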

You need to show us what your function looks like to go any further. I  
am unclear how one could get a function that only operates on a single  
row to yield an outlier classification.
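
If the intent is to treat the values within each row as the sample, one possible shape for such a function is sketched below. This is only a guess at what you want: grubbs_row is a hypothetical helper, it computes the two-sided Grubbs statistic by hand with a Bonferroni-style p-value from the t distribution, and it assumes every column is numeric with at least 3 values per row. A packaged alternative is grubbs.test() in the 'outliers' package.

```r
# Hypothetical helper: two-sided Grubbs test for a single outlier in one row.
# Assumes roughly normal values, n >= 3, and a non-constant row.
grubbs_row <- function(x) {
  x <- as.numeric(x)                  # drop names, force numeric
  n <- length(x)
  dev <- abs(x - mean(x))
  i <- which.max(dev)                 # position of the most extreme value
  G <- dev[i] / sd(x)                 # Grubbs statistic
  # Convert G to a t statistic on n - 2 df, then multiply the tail by 2n
  denom <- max((n - 1)^2 - n * G^2, 0)
  tstat <- sqrt(n * (n - 2) * G^2 / denom)
  p <- min(1, 2 * n * pt(tstat, df = n - 2, lower.tail = FALSE))
  c(suspect = x[i], G = G, p.value = p)
}

set.seed(1)
dfm <- data.frame(x = rnorm(10), y = rnorm(10), z = rnorm(10), w = rnorm(10))

# apply() returns one column per row of dfm, so transpose the result
res <- t(apply(dfm, 1, grubbs_row))

# Add the suspect value and its p-value as two new columns
dfm$suspect <- res[, "suspect"]
dfm$p.value <- res[, "p.value"]
```

Note that res is computed before the new columns are added, so the test only ever sees the original measurements.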
David Winsemius

On Feb 14, 2009, at 6:01 PM, John Malone wrote:

> Hi!
>
> I'm trying to implement an outlier test once/row in a large dataframe.
> Ideally, I'd do this then add the Pvalue results and the number flagged as
> an outlier as two new separate columns to the dataframe.  Grubbs outlier
> test requires a vector and I'm confused how to make each row of my dataframe
> a vector, followed by doing a Grubbs test for each row containing the vector
> of numbers I want to perform the outlier test on.
>
> I'm new to R and no doubt this is a simple problem. Any help you might
> provide would be greatly appreciated.
>
> Many thanks in advance!!
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

John - you would be making a strong normality assumption.  You might
reject H0 using Grubbs' test just because of non-normality, or you might
fail to reject it just because of non-normality.  Is it really this
straightforward to declare something an outlier?  What does outlier really
mean?
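
A rough simulation sketch of the first point, under my own assumptions: grubbs_p is a hypothetical hand-rolled two-sided Grubbs p-value (Bonferroni-style, via the t distribution), samples have size 10, and the lognormal is just one convenient skewed alternative. No value in either set of samples is contaminated, so every rejection is a false alarm.

```r
# Hypothetical helper: two-sided Grubbs p-value for one vector (n >= 3)
grubbs_p <- function(x) {
  n <- length(x)
  G <- max(abs(x - mean(x))) / sd(x)
  denom <- max((n - 1)^2 - n * G^2, 0)
  tstat <- sqrt(n * (n - 2) * G^2 / denom)
  min(1, 2 * n * pt(tstat, df = n - 2, lower.tail = FALSE))
}

set.seed(42)
n_sim <- 2000   # simulated samples per distribution
n <- 10         # values per sample

# Share of clean samples in which the test "finds" an outlier at the 5% level
rate_norm <- mean(replicate(n_sim, grubbs_p(rnorm(n))) < 0.05)
rate_skew <- mean(replicate(n_sim, grubbs_p(rlnorm(n))) < 0.05)

c(normal = rate_norm, lognormal = rate_skew)
```

Comparing the two rates shows how much of the "outlier" signal can come from the shape of the distribution alone.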

The following is must reading.

@Article{fin06cal,
   author =  {Finney, David J.},
   title =   {Calibration guidelines challenge outlier practices},
   journal = {The American Statistician},
   year =    2006,
   volume =  60,
   pages =   {309-313},
   annote =  {anticoagulant therapy; bias; causation; ethics; objectivity; outliers; guidelines for treatment of outliers; overview of types of outliers; letter to the editor and reply 61:187 May 2007}
}

Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University