Dear all,

For a research project on climate legislation in the U.S., I am analyzing data on the votes that Senators cast on several cap-and-trade bills in the period 2003-2008. For each Senator, we have data on how he or she voted on a given bill (i.e., 'yea' or 'nay'), provided, of course, that the Senator had a seat in Congress in the year the bill was voted upon. We want to explain the voting behavior of these Senators using characteristics of the Senators and of their constituencies, that is, the states they represent, while taking into account the nested structure of the data.

Thus, the data look as follows:

    state  Senator         bill       vote
    FL     'Bill Nelson'   'CSA2003'  'yea'
    FL     'Bob Graham'    'CSA2003'  'yea'
    FL     'Bill Nelson'   'CSA2005'  'yea'
    FL     'Mel Martinez'  'CSA2005'  'nay'

(See attachment for a sample of the data.)

One way to analyze such data seems to be a mixed model with both crossed and nested random factors. First, Senators are expected to behave consistently over time: their votes on different bills should be similar. Second, pairs of Senators represent the same state: for example, in 2003, Bill Nelson and Bob Graham both represented Florida. So there seems to be a random effect of Senators, who are nested in states. Third, there would be a random effect of bill, which is crossed with states and Senators. Finally, the model should be logistic, as votes can be either 'yea' or 'nay'.

1. How should I specify such a model? Is it sufficient to specify the nested random effects of Senator and state, together with the random effect of bill (in analogy to this post: http://r.789695.n4.nabble.com/lmer-crossed-random-effects-specification-td831762.html)?
For example, in the case of a model with only random intercepts for Senator, state and bill:

    library(lme4)

    # Read the tab-separated sample; a vote coded -1 is treated as missing.
    dataSenate <- read.table("sampledata.txt", header = TRUE, sep = "\t",
                             na.strings = c("-1"))

    # Grouping variables must be factors.
    dataSenate$state   <- as.factor(dataSenate$state)
    dataSenate$Senator <- as.factor(dataSenate$Senator)
    dataSenate$bill    <- as.factor(dataSenate$bill)

    # Random intercepts for state, Senator within state, and bill (crossed).
    interceptonly <- glmer(vote ~ 1 + (1 | state/Senator) + (1 | bill),
                           data = dataSenate,
                           family = binomial(link = "logit"))

Or should I use the pdBlocked and pdIdent formulation that is suggested here: http://tolstoy.newcastle.edu.au/R/help/02b/2068.html?

2. This does not seem to be a balanced design: some Senators lost their seat in the period 2003-2008, so many of them did not vote on all three of the bills. In other words, for many Senator-bill combinations, there are no data. Should this affect my interpretation of the results?

Best regards,

Clara Vandeweerdt
Master in Comparative and International Politics, 2013
Faculty of Social Sciences
KU Leuven
Belgium

-------------- next part --------------
state  bill      Senator               vote
WA     CSA2003   Patty Murray           1
WA     CSA2003   Maria Cantwell         1
WA     ACSA2008  Patty Murray           1
WA     ACSA2008  Maria Cantwell         1
WA     CSA2005   Patty Murray           1
WA     CSA2005   Maria Cantwell         1
DE     CSA2003   Joseph Biden           1
DE     CSA2003   Thomas Carper          1
DE     ACSA2008  Joseph Biden          -1
DE     ACSA2008  Thomas Carper          1
DE     CSA2005   Joseph Biden           1
DE     CSA2005   Thomas Carper          1
WI     CSA2003   Herbert Herb Kohl      1
WI     CSA2003   Russell Feingold       1
WI     ACSA2008  Herbert Herb Kohl      1
WI     ACSA2008  Russell Feingold       1
WI     CSA2005   Herbert Herb Kohl      1
WI     CSA2005   Russell Feingold       0
WV     CSA2003   John Jay Rockefeller   1
WV     CSA2003   Robert Byrd            0
WV     ACSA2008  John Jay Rockefeller   1
WV     ACSA2008  Robert Byrd           -1
WV     CSA2005   John Jay Rockefeller   1
WV     CSA2005   Robert Byrd            0
HI     CSA2003   Daniel Akaka           1
HI     CSA2003   Daniel Inouye          1
HI     ACSA2008  Daniel Akaka           1
HI     ACSA2008  Daniel Inouye          1
HI     CSA2005   Daniel Akaka           1
HI     CSA2005   Daniel Inouye          1
FL     CSA2003   Bill Nelson            1
FL     CSA2003   Bob Graham             1
FL     ACSA2008  Bill Nelson            1
FL     ACSA2008  Mel Martinez           1
FL     CSA2005   Bill Nelson            1
FL     CSA2005   Mel Martinez           0
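
To make the imbalance raised in question 2 concrete, here is a minimal sketch in base R that tabulates which Senator-bill combinations actually have an observed vote. It uses only the FL and DE rows from the sample, typed in by hand so that it runs without the file. The object names (observed, cells, billsPerSenator, emptyShare, senatorsNested) are purely illustrative. It is a descriptive check only, not a statement about how the model should be specified or interpreted.

```r
# FL and DE rows copied from the sample data (illustration only).
dataSenate <- data.frame(
  state   = c("FL", "FL", "FL", "FL", "FL", "FL",
              "DE", "DE", "DE", "DE", "DE", "DE"),
  bill    = c("CSA2003", "CSA2003", "ACSA2008", "ACSA2008", "CSA2005", "CSA2005",
              "CSA2003", "CSA2003", "ACSA2008", "ACSA2008", "CSA2005", "CSA2005"),
  Senator = c("Bill Nelson", "Bob Graham", "Bill Nelson", "Mel Martinez",
              "Bill Nelson", "Mel Martinez",
              "Joseph Biden", "Thomas Carper", "Joseph Biden", "Thomas Carper",
              "Joseph Biden", "Thomas Carper"),
  vote    = c(1, 1, 1, 1, 1, 0,
              1, 1, -1, 1, 1, 1),
  stringsAsFactors = TRUE
)

# -1 marks a missing vote, mirroring na.strings = c("-1") in read.table.
dataSenate$vote[dataSenate$vote == -1] <- NA

# Keep only the rows with an observed vote.
observed <- droplevels(dataSenate[!is.na(dataSenate$vote), ])

# Senator-by-bill table of observed votes; a 0 is a combination without data.
cells <- table(observed$Senator, observed$bill)
print(cells)

# Number of bills each Senator has an observed vote on.
billsPerSenator <- rowSums(cells > 0)
print(billsPerSenator)

# Share of Senator-bill combinations without data.
emptyShare <- mean(cells == 0)
print(emptyShare)

# Nesting check: each Senator should appear under exactly one state.
statesPerSenator <- tapply(as.character(observed$state), observed$Senator,
                           function(s) length(unique(s)))
senatorsNested <- all(statesPerSenator == 1)
print(senatorsNested)
```

Run on the full data set instead of this hand-typed subset, the same table would show how many Senators contribute one, two or three votes.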
lmer: Model with crossed and nested factors, unbalanced data