I hope this could help. It was originally written in Spanish, so several
errors migth have appeared in the translation process.
If so, let me know.
# Let?s define some kind of determinant function
det <- function (x) { prod(eigen(x)$values) }
# It coul be useful a function to make void lists with as many components as
the parameter ncomp states.
DoClassList <- function(ncomp)
+{
+ aux list('name'=NULL)
+ if (ncomp>1)
+ for (i in 2:ncomp)
+ aux <- c(aux, list('name'=NULL))
+ for (i in 1:ncomp)
+ attributes(aux)$names[i] <- paste('Class',i)
+ return(aux)
+}
# Let?s define a function to evaluate Wilks statistic.
TestWilk function (data.in)
+ {
+ nclass <-length(data.in)
+ nvars <- ncol(data.in[[1]])
+ aveQ <- DoClassList(nclass)
+ for (i in 1:nclass)
+ aveQ[[i]] <- apply(data.in[[i]],2,mean)
+ SQ <- lapply(data.in,var)
+ W <- SQ[[1]]*(nrow(data.in[[1]])-1)
+ for (i in 2:nclass)
+ W <- W + SQ[[i]]*(nrow(data.in[[i]])-1)
+ aveTotal <- aveQ[[1]]*nrow(data.in[[1]])
+ for (i in 2:nclass)
+ aveTotal <- aveTotal + aveQ[[i]] *
+ nrow(data.in[[i]])
+ aveTotal <- aveTotal / sum(unlist( lapply(data.in,nrow)))
+ B <- nrow(data.in[[1]]) * as.matrix(aveQ[[1]] - aveTotal) + %*%
t(as.matrix(aveQ[[1]] - aveTotal))
+ for (i in 2:nclass)
+ B <- B + nrow(data.in[[i]]) * as.matrix(aveQ[[i]] -
+ aveTotal) %*% t(as.matrix(aveQ[[i]] - aveTotal))
+ n <- nvars
+ s <- sum(unlist(lapply(data.in,nrow))) - nclass
+ t <- nclass -1
+ Lambda <- det(W)/det(B+W)
+ w2 <- 1- (nrow(data.in[[i]])*Lambda)/(s+Lambda)
+ w <- w2 - (n^2+t^2)/(nrow(data.in[[i]])*3)*(1-w2)
+ if ((n^2+t^2-5) != 0)
+ {
+ a <- s + t - (n + t + 1)/2
+ b <- sqrt((n^2*t^2-4)/(n^2+t^2-5))
+ c <- (n*t-2)/4
+ doubt <- (a*b - 2*c) /(n*t) * (Lambda^(-1/b)-1)
+ return (list( valor=doubt, F=qf(0.99, n*t, (a*b - 2*c)),
+ L=(doubt < qf(0.99, n*t, (a*b - 2*c))), w=w))
+ }
+ else
+ {
+ doubt <- -( s - 0.5*(n-t+1))*log(Lambda)
+ return (list( value=doubt, chi2=qchisq(0.99, n*t),
+ L=(doubt < qchisq(0.99, n*t)),w=w))
+ }
+}
As an example of use let?s consider two binormal distributions class1 and
class2, with different mean vector but identical covariance matrix.
# I hope you too have the vcellipse code that was sent to R-help. Nice
function ideed.
# I would have sent the picture but I?m sure that to be allowed in the list.
??
# And apply the test
value chi2 L w
9403.4041430 9.2103404 FALSE 0.8047111
As for the example distributions normality condition can be applied, this
result says NO to the null hipothesis and two different classes of behaviour
have been identified.
Of course, in a more general case, you must check that normality and
homogeneity of covariance matrix can be assesed.
But that is another war.... :-)
----------------------------------------------------------------------------
Manuel Castej?n Limas.
Project Management Area. e-mail:manuel.castejon at dim.unirioja.es
Mechanical Engineering Dept. http://www.unirioja.es/dptos/dim
University of La Rioja
c/ Luis de Ulloa 20, 26004 Logro?o
La Rioja
Spain.
----------------------------------------------------------------------------
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._