Unbalanced factorial designs

Fri, Feb 27, 2009 9:34 PM

Dear Ista,

I agree, nice job.  :-)  It is great that you are writing about these
issues, and even better that you are sharing it.

In my classes, I discuss correlated contrasts using the geometric
approach found in many of the classical texts.  I am not sure how
your audience would respond to that.

I didn't notice any glaring errors in your interpretation, but I
didn't study it thoroughly, either.  I trust that some of the masters
on this list will catch mistakes much more efficiently than I do.

A few thoughts that in my view may improve your document:

1) Page 6, paragraph 2, line 5:  "The problem is that it is pretty
obvious...".  When I first read this, it was not obvious.  After
glancing at the table, it was plausible.  But after looking at a
*graph*, yes, it was obvious.  A visual display here might be a good
idea.  One option would be a two-way interaction plot.  There is one
in the HH package, and you can even get one in Rcmdr with
RcmdrPlugin.HH:

library(HH)
interaction2wt(salary ~ education + gender, data = D)

(the dataframe D is dumped at the bottom of this email).

2)  I got the same Type III and Type I tables as you did, but my
Type II table was different...?

library(car)
# contr.Sum comes from car; Type III tests assume sum-to-zero contrasts
options(contrasts=c("contr.Sum", "contr.poly"))

salary.lm <- lm(salary ~ gender + education + gender:education, data = D)
summary(salary.lm)

Anova(salary.lm)  #  Type II tests
anova(salary.lm)  #  Type I tests
Anova(salary.lm, type = "III")  #  Type III tests
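
As a rough illustration of how the tables can disagree in an unbalanced
design, the gender sums of squares can be rebuilt from model comparisons
in base R.  This is only a sketch: the data are the 22 salaries from D
retyped compactly, the short level labels and the helper rss() are my
own, and only the gender term is shown.

```r
# Same 22 salaries as the data frame D in this email, rebuilt compactly.
# The labels "College"/"NoCollege" are shorthand, not the original levels.
salary <- c(24, 26, 25, 24, 27, 24, 27, 23, 15, 17, 20, 16,
            25, 29, 27, 19, 18, 21, 20, 21, 22, 19)
gender <- factor(rep(c("Female", "Male"), times = c(12, 10)))
education <- factor(rep(c("College", "NoCollege", "College", "NoCollege"),
                        times = c(8, 4, 3, 7)))

# Hypothetical helper: residual sum of squares of a fitted linear model
rss <- function(formula) deviance(lm(formula))

# Type I (sequential), gender entered first:
# gender compared against the intercept-only model
ss.gender.I <- rss(salary ~ 1) - rss(salary ~ gender)

# Type II: gender adjusted for education, interaction left out
ss.gender.II <- rss(salary ~ education) - rss(salary ~ education + gender)

c(TypeI = ss.gender.I, TypeII = ss.gender.II)
```

With equal cell sizes the two numbers would coincide; here they do not,
because gender and education are correlated.  Type III goes one step
further and also adjusts gender for the interaction, which is why it
depends on the contrast coding.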


3)  I typically refrain from using the language that you have used
concerning "ignoring" and "controlling", etc.   I stick to the
parametric jargon of column, row, and grand means.  This makes for
precise statements without any fear of misunderstanding, but again,
instructors should always keep the background of the audience in mind.
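
To make the row, column, and grand means concrete, here is a small
sketch in base R.  It assumes the same 22 salaries as in D, retyped
compactly with my own short level labels, and the object names are only
illustrative.  It contrasts marginal means that average the cell means
(unweighted) with marginal means that average the raw observations
(weighted by cell size).

```r
# Same 22 salaries as the data frame D in this email, rebuilt compactly.
# The labels "College"/"NoCollege" are shorthand, not the original levels.
salary <- c(24, 26, 25, 24, 27, 24, 27, 23, 15, 17, 20, 16,
            25, 29, 27, 19, 18, 21, 20, 21, 22, 19)
gender <- factor(rep(c("Female", "Male"), times = c(12, 10)))
education <- factor(rep(c("College", "NoCollege", "College", "NoCollege"),
                        times = c(8, 4, 3, 7)))

# Cell means and cell sizes (rows = gender, columns = education)
cell.means <- tapply(salary, list(gender, education), mean)
cell.n     <- table(gender, education)
cell.means   # Female: 25, 17   Male: 27, 20
cell.n       # Female:  8,  4   Male:  3,  7

# Unweighted marginal means: plain averages of the cell means
row.unweighted <- rowMeans(cell.means)   # Female 21.0, Male 23.5
col.unweighted <- colMeans(cell.means)   # College 26.0, NoCollege 18.5
grand.unweighted <- mean(cell.means)     # 22.25

# Weighted marginal means: averages of the raw observations
row.weighted <- tapply(salary, gender, mean)   # Female 22.33, Male 22.10
grand.weighted <- mean(salary)                 # 22.23
```

Note that the sign of the gender difference flips between the two sets
of row means, which is exactly the situation where precise language
about which means are being compared pays off.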


Regardless of whether you like the ideas or not, the fact that you are
going to the trouble of preparing a document like this for the class
is admirable in its own right.  Keep it up.  :-)

Cheers,
Jay

P.S. a once-over with a spell check would be good (it's going to
students, right?).



# the data

D <-
structure(list(ID = 1:22, salary = c(24L, 26L, 25L, 24L, 27L,
24L, 27L, 23L, 15L, 17L, 20L, 16L, 25L, 29L, 27L, 19L, 18L, 21L,
20L, 21L, 22L, 19L), gender = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L), .Label = c("Female", "Male"), class = "factor"), education =
structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L), .Label = c("Collegeeducation", "Nocollegeeducation"
), class = "factor"), congend = structure(c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L), .Label = c("1", "-1"), class = "factor"), coneduc = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L), .Label = c("1", "-1"), class = "factor"),
    congendeduc = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("1",
    "-1"), class = "factor")), .Names = c("ID", "salary", "gender",
"education", "congend", "coneduc", "congendeduc"), class =
"data.frame", row.names = c(NA, -22L))





***************************************************
G. Jay Kerns, Ph.D.
Associate Professor
Department of Mathematics & Statistics
Youngstown State University
Youngstown, OH 44555-0002 USA
Office: 1035 Cushwa Hall
Phone: (330) 941-3310 Office (voice mail)
-3302 Department
-3170 FAX
E-mail: gkerns at ysu.edu
http://www.cc.ysu.edu/~gjkerns/

Ista Zahn

Sat, Feb 28, 2009 6:58 AM

On Sat, Feb 28, 2009 at 12:34 AM, G. Jay Kerns <gkerns at ysu.edu> wrote:

Thanks, the interaction plot is an excellent suggestion. You're right
that it's much clearer when you plot the data.

I'll investigate and try to figure out why our Type II tables differ, thanks.

I'll probably stick with the current terms ("ignoring", "controlling",
etc.), simply because they are consistent with the textbook we are using.

Thanks for the encouragement!

-Ista