Hi, Mike. I think the terminology is confusing everybody. Let me tell you how I understand what you're saying, and then you can try again to tell us what you want. Since I've just asked a similar kind of question comparing IRT scaling with lmer, this is fresh in my mind.
On Fri, Oct 29, 2010 at 11:23 AM, Mike Lawrence <Mike.Lawrence at dal.ca> wrote:
Hi folks, In some areas of psychology, we encounter binomial response data that, when aggregated to proportions and plotted against a continuous predictor variable, form a sigmoid-like function. It is typical to use OLS to fit a probit function to these data, yielding measures of bias (the mean of the underlying Gaussian) and variability (its SD).
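The "probit via OLS" route described here can be sketched in base R: transform each group's proportion with qnorm() and fit a straight line, then read bias and variability off the intercept and slope (mu = -intercept/slope, sigma = 1/slope). The data below are simulated, and the half-count correction keeps proportions off 0 and 1; this is an illustration under made-up true values, not anyone's actual analysis.

```r
## Simulated grouped psychophysics data: proportion "yes" at each soa level
set.seed(1)
soa   <- seq(-1.5, 1.5, length.out = 7)        # 7 levels of the predictor
n     <- 50                                    # trials per level
mu    <- 0.3; sigma <- 0.9                     # true bias and variability
k     <- rbinom(length(soa), n, pnorm((soa - mu) / sigma))
prop  <- (k + 0.5) / (n + 1)                   # keep proportions off 0 and 1

## "Probit via OLS": straight-line fit on the qnorm-transformed proportions
ols <- lm(qnorm(prop) ~ soa)
a   <- unname(coef(ols)[1])                    # estimates -mu/sigma
b   <- unname(coef(ols)[2])                    # estimates 1/sigma
c(bias = -a / b, variability = 1 / b)          # roughly recovers mu and sigma
```

A weighted fit (WLS) would be the more defensible version of this, since the variance of each transformed proportion depends on the underlying probability and the number of trials.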
That kind of data is commonly called "grouped" data, as opposed to individual-level data. In the olden days, the kind of grouped-data regression you describe was sometimes called a "minimum chi-square" method. I have no idea what you mean by "bias" in this context. I've never seen a probit function fitted with OLS, but I have seen a logistic transformation of the proportions on the left-hand side, leading to a regression like

ln( prop/(1-prop) ) = X b + e

The errors here are heteroskedastic, so even in the olden days you'd have had to use WLS to estimate these coefficients. There is heteroskedasticity because 1) there are different numbers of observations in each group and 2) the variance of the error term is proportional to prop(1-prop).

This model is NOT perfectly equivalent to a probit (or logistic) regression on individual-level data, the kind where Pr(y=1 | x, b) = PHI(Xb), where PHI is the cumulative distribution function of a Normal for probit or of a Logistic for logit. That individual-level model does not exactly coincide with the grouped-proportion model, partly because the "e" term in the OLS regression has no direct counterpart in the individual-level regression. The scaling of the parameters is arbitrary; they will be proportional to one another.

These days, I'd suggest you fit the proportion model with a Beta regression, for which there is an excellent R package (betareg) -- that is, if you want to analyze the grouped-level proportion data at all. Between the grouped-proportion and individual-level probit approaches, the estimates of the b's are, at least in theory, estimating the same thing, up to some scaling. But they never really come out the same.

This fitting is typically done within each individual and condition of interest separately; the resulting parameters are then submitted to 2 ANOVAs: one for bias, one for variability.
This one has me stumped. Can you supply some citations? "Within each individual" is puzzling to me. Variability of what? Bias in the sense of a known mismatch between a "true" parameter value and its estimate? And the 2 separate ANOVAs -- well, I think you need to write it down.

I wonder if this analysis might be achieved more efficiently using a single mixed-effects model, but I'm having trouble figuring out how to approach coding this. Below is an example of data similar to that collected in this sort of research, where individuals fall into two groups (variable "group") and are tested under two conditions (variable "cue") across a set of values of a continuous variable (variable "soa"), with each cue*soa combination tested repeatedly within each individual. A model like

fit = lmer( formula = response ~ (1|id) + group*cue*soa
          , family = binomial( link='probit' )
          , data = a )
This does not have random effects for group, cue, or soa; it gives fixed estimates for group, cue, soa, and all interactions among them, plus only a random intercept for each id. I don't think you mean that.
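For what it's worth, a respecification with by-subject random slopes -- a sketch only, assuming the data frame `a` Mike describes and the lme4 package -- might look like the formula below. The base-R check just confirms which variables the formula draws on.

```r
## Hypothetical respecification with by-subject random slopes. With lme4
## installed, one would fit something like:
##   glmer(response ~ group * cue * soa + (1 + cue * soa | id),
##         family = binomial(link = "probit"), data = a)
## The (1 + cue * soa | id) term lets the intercept and the cue, soa, and
## cue:soa effects vary across subjects; group stays purely fixed because
## it varies only between subjects.
f <- response ~ group * cue * soa + (1 + cue * soa | id)
all.vars(f)   # the variables the model draws on
```

(Note that for a binomial family the modern lme4 entry point is glmer(), not lmer().)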
employs the probit link, but of course yields estimates for the slope and intercept of a linear model on the probit scale, and I'm not sure how (if it's even possible) to convert the conclusions drawn on this scale to conclusions about the bias and variability parameters of interest. Thoughts?
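On the conversion question: if the fitted probit model is Pr(response = 1) = PHI(a + b*soa), it can be rewritten as PHI((soa - mu)/sigma) with mu = -a/b and sigma = 1/b, so the bias and variability parameters fall straight out of the intercept and slope. A minimal base-R sketch with simulated trial-level data (the true mu and sigma here are made up for illustration):

```r
## Simulated trial-level binary responses from a cumulative-Gaussian model
set.seed(7)
soa <- rep(seq(-1.5, 1.5, length.out = 7), each = 40)
mu  <- 0.5; sigma <- 0.8                      # true bias and variability
y   <- rbinom(length(soa), 1, pnorm((soa - mu) / sigma))

## Individual-level probit regression by maximum likelihood
fit <- glm(y ~ soa, family = binomial(link = "probit"))
a   <- unname(coef(fit)[1])                   # estimates -mu/sigma
b   <- unname(coef(fit)[2])                   # estimates 1/sigma
c(bias = -a / b, variability = 1 / b)         # back on the mu/sigma scale
```

The same algebra applies to the fixed-effects part of a mixed probit model, though with random slopes the per-subject curves would each have their own implied mu and sigma.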
I'm inclined to say that the "bias" and "variability" parameters you mention are not sensible, because I've never seen a publication that uses the approach you describe. My guess is that you are trying to replicate nonsense, which is, well, a time-honored tradition :) But in my last year of work in an interdisciplinary statistics center, I've learned that all of the fields have their own nicknames for things, and so it is quite likely we have no idea what you are asking because the nicknames you use are different from the nicknames we use. In particular, I bet your claim of estimating probit models with OLS made some heads spin.
Mike
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas