An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20111203/0b946ecf/attachment.pl>
Logistic Regression with genetic component
2 messages · Danielle Duncan, Ben Bolker
Danielle Duncan <dlduncan2 <at> alaska.edu> writes:
Greetings, I have a question that I'd like to get input on. I have a classic toxicology study where I artificially fertilized and exposed embryos to a chemical and counted defects. In addition, I kept track of male-female pairs that I used to artificially fertilize and generate embryos with. I need to use logistic regression to model the response, but also check that the genetics of the pairings did or did not have an effect on the response. My data looks a bit like this: response matrix chemical concentration Genetic Matrix Present Absent Male Female 2 152 0.13 a 1 6 121 1 a 2 21 92 2 b 3 24 89 5 b 4 0 141 10 c 5 5 95 15 c 6 R code: DA<-cbind(Present, Absent) glm<-(DA ~ chemical concentration) If I do glm<-(DA ~ chemical concentration + Male + Female, I get every possible combination, but I only want specific pairs. So, I am thinking about doing: MF<-cbind(Male, Female) glm<-(DA ~ chemical concentration + MF)
You're on the right track. paste() is probably what
you want, although you can also use interaction() to
get the interactions and then droplevels() to get
rid of the unobserved crosses.
d <- read.table(textConnection("
Present Absent conc Male Female
2 152 0.13 a 1
6 121 1 a 2
21 92 2 b 3
24 89 5 b 4
0 141 10 c 5
5 95 15 c 6"),
header=TRUE)
Either of these should give you what you want:
d <- droplevels(transform(d,cross=interaction(Male,Female)))
levels(d$cross)
d <- transform(d,cross=paste(Male,Female,sep="."))
levels(d$cross)
You should be a little careful -- if each cross is exposed
only to a single concentration, and if you treat cross as
a fixed effect, you will overparameterize your model. If
you treat it as a random effect, e.g. using glmer in the
lme4 package:
glmer(cbind(Present,Absent)~conc+(1|cross),data=d)
you will effectively be fitting a model for overdispersion
(see the example in ?cbpp).