I sent this in with an old version, but it's in latest version as well. The fix is simple.
In the summary.table function, the parameter is calculated incorrectly
for a test of independence among all cells when the table is more than
2-way table.
Example:
Consider X:
, , c = C1
b
a B1 B2 B3
A1 3 2 3
A2 6 4 5
A3 3 2 0
, , c = C2
b
a B1 B2 B3
A1 0 0 1
A2 7 7 7
A3 3 1 3
, , c = C3
b
a B1 B2 B3
A1 3 3 3
A2 6 3 3
A3 1 0 3
, , c = C4
b
a B1 B2 B3
A1 0 0 2
A2 2 2 7
A3 4 1 0
Now, we construct a table from X (3x3x4):
X.table <- table(X)
summary(X.table)
Number of cases in table: 100
Number of factors: 3
Test for independence of all factors:
Chisq = 30.232, df = 12, p-value = 0.002576
Chi-squared approximation may be incorrect
The df is incorrect. It is calculated as 2*2*3 = 12, but should be 3*3*4 - 1 - (3-1) - (3-1) - (4-1) = 28 (df of interaction terms not in main effects only model). Now consider equivalent model tested in loglin:
loglin(X.table, list(1,2,3))
2 iterations: deviation 3.552714e-15
$lrt
[1] 38.32701
$pearson
[1] 30.23244
$df
[1] 28
$margin
$margin[[1]]
[1] "a"
$margin[[2]]
[1] "b"
$margin[[3]]
[1] "c"
The statistic is identical, but df different. The problem is in summary.table:
Current version of summary.table (INCORRECT VERSION):
function (object, ...)
{
if (!inherits(object, "table"))
stop("object must inherit from class table")
n.cases <- sum(object)
n.vars <- length(dim(object))
y <- list(n.vars = n.vars, n.cases = n.cases)
if (n.vars > 1) {
m <- vector("list", length = n.vars)
for (k in seq(along = m)) {
m[[k]] <- apply(object, k, sum)/n.cases
}
expected <- apply(do.call("expand.grid", m), 1, prod) *
n.cases
statistic <- sum((c(object) - expected)^2/expected)
parameter <- prod(sapply(m, length) - 1) #### Problem is on this line (works only for 2-way tables)
y <- c(y, list(statistic = statistic, parameter = parameter,
approx.ok = all(expected >= 5), p.value = pchisq(statistic,
parameter, lower.tail = FALSE), call = attr(object,
"call")))
}
class(y) <- "summary.table"
y
}
summary.table should be (CORRECTED VERSION)
function (object, ...)
{
if (!inherits(object, "table"))
stop("object must inherit from class table")
n.cases <- sum(object)
n.vars <- length(dim(object))
y <- list(n.vars = n.vars, n.cases = n.cases)
if (n.vars > 1) {
m <- vector("list", length = n.vars)
for (k in seq(along = m)) {
m[[k]] <- apply(object, k, sum)/n.cases
}
expected <- apply(do.call("expand.grid", m), 1, prod) *
n.cases
statistic <- sum((c(object) - expected)^2/expected)
parameter <- prod(sapply(m, length)) - (sum(sapply(m,
length) - 1) + 1) ### This line changed (works for all multiway tables)
y <- c(y, list(statistic = statistic, parameter = parameter,
approx.ok = all(expected >= 5), p.value = pchisq(statistic,
parameter, lower.tail = FALSE), call = attr(object,
"call")))
}
class(y) <- "summary.table"
y
}
Using the corrected summary.table (using the correct code above):
summary(X.table)
Number of cases in table: 100
Number of factors: 3
Test for independence of all factors:
Chisq = 30.232, df = 28, p-value = 0.3522
Chi-squared approximation may be incorrect
These are correct.
--Russell Reeve
rreeve@liposcience.com
--please do not edit the information below--
Version:
platform = i386-pc-mingw32
arch = i386
os = mingw32
system = i386, mingw32
status =
major = 1
minor = 6.2
year = 2003
month = 01
day = 10
language = R
Windows 2000 Professional (build 2195) Service Pack 2.0
Search Path:
.GlobalEnv, package:ctest, Autoloads, package:base
I sent this in with an old version, but it's in latest version as
well. The fix is simple. In the summary.table function, the parameter
is calculated incorrectly for a test of independence among all cells
when the table is more than 2-way table.
As I said last week, this is already fixed in r-devel. What is the
point of sending another bug report?
-k