
psychometric function fitting with lmer?

3 messages · Michael Lawrence, Doran, Harold

#
Yes, what is called an empirical item characteristic curve (eICC)
This is a very odd way to compute bias. First, I don't know how one uses OLS
 to fit a probit model. Probit models are fit using (R)IGLS, not OLS. Second,
 why are you treating the observed data as a parameter estimate? Why don't you
 actually estimate the model parameters (i.e., the item parameters), which are
 asymptotically unbiased under certain estimation conditions? You can do this
 in a number of ways in R; lme4 can do this using lmer, as described here:
 
 http://www.jstatsoft.org/v20/i02
 
 Or you can use JML methods for Rasch in the MiscPsycho package, or MML
 methods in the ltm package. What you seem to be doing is treating the eICC
 as some kind of parameter for the item, but I don't think that is
 reasonable.
I'm not sure I can help you here, as I am unclear on what you are doing
 exactly. Maybe if you elaborate a bit on what you are trying to do above, we
 can do this part next.

#
On Fri, Oct 29, 2010 at 3:29 PM, Doran, Harold <HDoran at air.org> wrote:
I've seen it done. Folks usually collapse responses to
means-per-value-on-the-x-axis, then either use a computationally
intensive search algorithm to minimize the squared error on the
proportion scale, or fit a simple linear function on the probit scale
(when they encounter means of 1 or 0, they "tweak" these values by
either dropping that data entirely or adding/subtracting some
arbitrary value).

Regardless, I suspect we both agree that these are inadvisable ways of
dealing with this data, but I'm not sure we are on the same page with
respect to the underlying paradigm motivating the data analysis.
Whereas the paper you provided appears to be discussing data derived
from questionnaires with different items, etc, I was thinking (and I
apologize for failing to be more clear on this earlier) of data
derived from studies of temporal order judgement and other
psychophysical discrimination studies. Here's an example that I
happened to find while searching google for an article not behind a
pay-wall:

http://www.psych.ut.ee/~jyri/en/Murd-Kreegipuu-Allik_Perception2009.pdf

In such studies, individuals are provided two stimuli and asked "which
one is more X", where the stimuli are manipulated to explore a variety
of values for the difference of X between them. For example, in
temporal order judgements, we ask which of two successive stimuli came
first, right or left, then plot proportion of "right first" responses,
accumulated over many trials, as a function of the amount of time by
which the right stimulus led the left stimulus (SOA, or stimulus
onset asynchrony, where negative values mean the right stimulus
followed the left stimulus). This typically yields a sigmoidal
function where people are unlikely to say "right-first" when the left
stimulus leads by a lot (large negative SOA values) and very likely to
say "right-first" when the right stimulus leads by a lot (large
positive SOA values). The place where this function crosses 50% is
termed the point of subjective simultaneity (PSS) and the slope of the
function indexes the participants' sensitivity (shallow slopes
indicate poor sensitivity, sharp slopes indicate good sensitivity).
Researchers are often then interested in how various experimental
manipulations affect these two characteristics of performance.
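For concreteness, the sigmoid-PSS-slope setup just described can be sketched with a binomial GLM in R. The SOA values and response counts below are invented purely for illustration:

```r
# Hypothetical TOJ data: SOA in ms (positive = right stimulus leads),
# with counts of "right-first" responses out of 30 trials per SOA.
soa   <- c(-120, -80, -40, 0, 40, 80, 120)
right <- c(1, 3, 8, 15, 24, 28, 30)
n     <- rep(30, length(soa))

# Fit a probit psychometric function to the aggregated counts.
fit <- glm(cbind(right, n - right) ~ soa,
           family = binomial(link = "probit"))

# PSS: the SOA at which Pr("right-first") = 0.5, i.e. -intercept/slope.
# The slope coefficient indexes sensitivity (steeper = more sensitive).
pss   <- -coef(fit)[[1]] / coef(fit)[[2]]
slope <- coef(fit)[[2]]
```

Note that maximum-likelihood fitting on the counts sidesteps the 0/1 "tweaking" problem entirely: observed proportions of exactly 0 or 1 pose no difficulty for the binomial likelihood.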
#
The info below is helpful, more comments below.
I think you're right, this seems like an inadvisable way to estimate some model parameters. I'll try and ignore this part of the email and focus on the info below. I will most likely explode if I try and figure out a) how this was done and b) why someone would do it this way when there are well-known ways to estimate model parameters in such cases.
OK, with you here so far, though this kind of thing is a bit away from my field of study. So, let me simplify for the sake of argument: we have binary responses, where 1 = the respondent answered 'right' and 0 = the respondent answered 'left'. You also have some observed characteristics of these individuals; call them x.

Now, you use the terms "likely" and "unlikely" below in something of a colloquial sense, but we can actually quantify this in a model such as:

Pr(1 | \theta, \beta) = 1 / [1 + exp(\beta - \theta)]

This gives the conditional probability that some individual with ability \theta (an aptitude of some form) will choose the answer "right", conditional also on \beta, which is a characteristic of the item/task itself. Now, you state that you have other observed characteristics, such as "time". You can further condition on these observed characteristics to get the conditional probabilities directly, which seems to be what you are after. If this is right, then the methods in the paper I linked are directly related to this problem, just applied to a different data set. It is a general modeling strategy you can employ with lmer.
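The conditional probability above is the Rasch item response function, and it can be computed directly as a sanity check (the theta and beta values here are arbitrary):

```r
# Rasch model: Pr(1 | theta, beta) = 1 / (1 + exp(beta - theta))
p_rasch <- function(theta, beta) 1 / (1 + exp(beta - theta))

p_rasch(theta = 0, beta = 0)  # ability equals difficulty -> 0.5
p_rasch(theta = 2, beta = 0)  # high ability, average item -> ~0.88
```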

What I don't understand is what bias or variance you are trying to get at. Bias refers to the property that \beta - E[\hat{\beta}] = 0, which would not hold if the parameter estimate were biased. Maybe I am still a bit unclear on the issue.
Now, if the slope of the curve matters, and it often does, then lmer cannot be used to estimate such a model, because the model we demonstrate (the Rasch model) assumes all items/tasks have a constant slope. But other models extend the conditional probability above and can do this; I believe you can accomplish this using the ltm package.
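The extension alluded to here is the two-parameter logistic (2PL) model, which adds an item-specific discrimination (slope) parameter. A minimal sketch of its response function, with arbitrary parameter values for illustration:

```r
# 2PL: Pr(1 | theta) = 1 / (1 + exp(-a * (theta - b)))
# a = item discrimination (slope), b = item difficulty.
p_2pl <- function(theta, a, b) 1 / (1 + exp(-a * (theta - b)))

# With a = 1 this reduces to the Rasch form above.
p_2pl(theta = 1, a = 1,   b = 0)  # ~0.73
p_2pl(theta = 1, a = 2.5, b = 0)  # steeper item -> ~0.92
```

Models of this form can be fit to binary response matrices with the ltm() function in the ltm package.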