Skip to content

using MANOVA in R

10 messages · mandar oak, Peter Dalgaard, Torsten Hothorn +3 more

#
Hi All!

I wish to use R for MANOVA analysis. Where could I get
info on that?

__________________________________________________
Do You Yahoo!?
Send instant messages & get email alerts with Yahoo! Messenger.
http://im.yahoo.com/
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
On Mon, 8 May 2000, mandar oak wrote:

            
Depends exactly what you mean by MANOVA.  Try ?aov which handles multiple
responses.  However, the MANOVA multivariate tests (Pillai, Wilks lambda
...) are not implemented in base R.  We would welcome submission of code
for those, of course.
#
Prof Brian D Ripley <ripley at stats.ox.ac.uk> writes:
At lest some of that sits in contributed packages "multilm" and
"norm". (Both of those are on my list of "things to try when I get the
time"... I believe multilm more or less *is* MANOVA, whereas norm
handles estimation in the presence of missing values. One interesting
question is whether you can do MANOVA with missings by combining the
two.)
#
On 9 May 2000, Peter Dalgaard BSA wrote:

            
Not really any of it!  multilm fits multivariate linear models and does
Hotelling T^2 tests. It does not group terms for AOV, and I think is only
applicable for continuous variates (not factor explantory variables).
Certainly the help and examples only discuss that case.

norm is about multivariate normal data, no regressors, AFAIK.  Various
people have pointed out the dangers of that: how often is iid on a 
multivriate normal population appropriate?
#
Prof Brian D Ripley <ripley at stats.ox.ac.uk> writes:
Had a closer look. The factor expl.var. would seem to be there
case-wise, so you can build the models, but Wilk's Lambda and friends
are absent, so you cannot (easily) do multi-df model reduction tests.
However, it's no too far off. Is the author planning to develop it
further?
Not often, but sometimes things fall apart into several homogeneous
groups and then it can be used as a building block. I've used
something like this for an RCT a while back back.
#
On 9 May 2000, Peter Dalgaard BSA wrote:

            
Now it is my turn, I think :-)

1) "multilm" is on my list, but there are some theoretical problems
(e.g. is the use of Moore-Penrose numerically appropriate ?) Any help by
other distributors would be very welcome!

2) "multilm" was written for the use of the stabilized multivariate tests
by Laeuter and Kropf, T^2 is only a by-product. Therefore the
documentation is heavily biased on this. 

3) I do think that one can take factors as design variables (there is an
Iris example, I think). 

4) I just had my third move this year (this time to Erlangen, Bavaria), so
time is a problem too :-)

Torsten


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
On Tue, 9 May 2000, Torsten Hothorn wrote:

            
One can fit the model, but as the Iris example shows, it assumes three
continuous vars, NOT one factor in a model formula, and so the information
given is not appropriate for a manova as I understand that term.
Manova includes a series of tests of submodels.

Here's an S example of manova:

       wafer.manova <- manova(cbind(pre.mean, post.mean) ~ maskdim +
          visc.tem + spinsp, wafer)

       summary(wafer.manova) # manova table with Pillai's trace

          Df Pillai Trace approx. F num df    den df    P-value 
  maskdim 1  0.42662      4.09229   2         11        0.04693
 visc.tem 2  0.5809       2.45605   4         24        0.07305
   spinsp 2  0.47904      1.88978   4         24        0.14495
Residuals 12                                                   

so factors in the model are grouped and several submodels have been
fitted.
1 day later
#
I hope this could help. It was originally written in Spanish, so several
errors migth have appeared in the translation process.
If so, let me know.


# Let?s define some kind of determinant function
# It coul be useful a function to make void lists with as many components as
the parameter ncomp states.
+{
+   aux  list('name'=NULL)
+   if (ncomp>1)
+     for (i in 2:ncomp)
+        aux  <- c(aux, list('name'=NULL))
+   for (i in 1:ncomp)
+        attributes(aux)$names[i] <- paste('Class',i)
+   return(aux)
+}

# Let?s define a function to evaluate Wilks statistic.
+  {
+   nclass  <-length(data.in)
+   nvars <- ncol(data.in[[1]])
+   aveQ <- DoClassList(nclass)
+   for (i in 1:nclass)
+    aveQ[[i]] <- apply(data.in[[i]],2,mean)
+ SQ  <- lapply(data.in,var)
+ W <- SQ[[1]]*(nrow(data.in[[1]])-1)
+ for (i in 2:nclass)
+   W <- W + SQ[[i]]*(nrow(data.in[[i]])-1)
+   aveTotal <- aveQ[[1]]*nrow(data.in[[1]])
+  for (i in 2:nclass)
+    aveTotal <- aveTotal + aveQ[[i]] *
+     nrow(data.in[[i]])
+ aveTotal <- aveTotal / sum(unlist( lapply(data.in,nrow)))
+ B <- nrow(data.in[[1]]) * as.matrix(aveQ[[1]] - aveTotal) +   %*%
t(as.matrix(aveQ[[1]] - aveTotal))
+ for (i in 2:nclass)
+  B <- B + nrow(data.in[[i]]) * as.matrix(aveQ[[i]] -
+  aveTotal) %*% t(as.matrix(aveQ[[i]] - aveTotal))
+ n  <- nvars
+ s  <- sum(unlist(lapply(data.in,nrow))) - nclass
+ t  <- nclass -1
+ Lambda <- det(W)/det(B+W)
+ w2  <- 1- (nrow(data.in[[i]])*Lambda)/(s+Lambda)
+ w  <-  w2 - (n^2+t^2)/(nrow(data.in[[i]])*3)*(1-w2)
+ if ((n^2+t^2-5) != 0)
+   {
+     a  <- s + t - (n + t + 1)/2
+     b  <- sqrt((n^2*t^2-4)/(n^2+t^2-5))
+     c  <- (n*t-2)/4
+     doubt <-  (a*b - 2*c) /(n*t) * (Lambda^(-1/b)-1)
+     return (list( valor=doubt, F=qf(0.99, n*t, (a*b - 2*c)),
+       L=(doubt < qf(0.99, n*t, (a*b - 2*c))), w=w))
+   }
+ else
+   {
+     doubt <- -( s - 0.5*(n-t+1))*log(Lambda)
+     return (list( value=doubt, chi2=qchisq(0.99, n*t),
+       L=(doubt < qchisq(0.99, n*t)),w=w))
+   }
+}

As an example of use let?s consider two binormal distributions class1 and
class2, with different mean vector but identical covariance matrix.
# Lets represent the two classes
+   col=c(rep("Blue",length(clase1[,1])),
+         rep("Red",length(clase2[,1]))) , pch=".")
# I hope you too have the vcellipse code that was sent to R-help. Nice
function ideed.
# I would have sent the picture but I?m sure that to be allowed in the list.
??

# And apply the test
value          chi2            L            w
9403.4041430    9.2103404          FALSE       0.8047111

As for the example distributions normality condition can be applied, this
result says NO to the null hipothesis and two different classes of behaviour
have been identified.

Of course, in a more general case, you must check that normality and
homogeneity of covariance matrix can be assesed.
But that is another war.... :-)
----------------------------------------------------------------------------
Manuel Castej?n Limas.
Project Management Area.        e-mail:manuel.castejon at dim.unirioja.es
Mechanical Engineering Dept.    http://www.unirioja.es/dptos/dim
University of La Rioja
c/ Luis de Ulloa 20, 26004 Logro?o
La Rioja
Spain.
----------------------------------------------------------------------------





-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
...

    Manuel> # Let´s define some kind of determinant function

      >> det <- function (x)   { prod(eigen(x)$values) }

      ....


We have been there before (a year ago on R-devel, see "eigenvalue"),
and since then had the following in   ?qr :
But since people seem to re-invent that teeny part of the wheel,
something like the following will be in R 1.1.x :

  ### From Doug Bates' 20 Apr 1999 post to R-devel;
  ### "method" idea from J.K.Lindsey's rmutil

  det <- function(x, method = c("qr","eigenvalues"))
  {
      if(!is.matrix(x))
	  stop("x must be a matrix")
      method <- match.arg(method) # ensures a method from above
      if(method == "qr")
	  prod(diag(qr(x)$qr)) *(-1)^(ncol(x)-1)
      else ## if(method == "eigenvalues")
	  Re(prod(eigen(x, only.values=TRUE)$values))
  }


    det                   package:base                   R Documentation

    Calculate the Determinant of a Matrix

    Description:

	 `det' calculates the determinant of a matrix either by `qr'
	 decomposition or from the `eigen'values.

    Usage:

	 det(x, method = c("qr","eigenvalues"))

    Arguments:

	   z: numeric matrix.

      method: `"qr"' (default) or `"eigenvalues"'

    Note:

	 Often, computing the determinant is not what you should be doing
	 to solve a given problem.

	 The `"qr"' method is much faster for large matrices.

    See Also:

	 `eigen', `qr', `svd'

    Examples:

	 (x <- matrix(1:4, ncol=2))
	 det(x)
	 det(x, method="eigenvalues")

	 det(print(cbind(1,1:3,c(2,0,1))))


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
Dear people,

Besides of being looking for outlier detection methods in non normal
multivariate
distributions I am also interested in knowing about the different ways
people manage the bad behaviour of MANOVA as soon as the algorithm is used
with irregular number of samples in each class.

Any help would be appreciated.

And excuse me if I tried to write a det function when I needed to calcute
one and found none. I promise not to do it again :-)
I hope you understand that a year ago I wasn?t in the list, so I missed that
part. Anyway it was not mandatory to use my det function. If it was able to
make you all know that the det function in TestWilk meant determinant, I?m
happy enough.



-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._