Extrapolating values from a glm fit

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110126/56d09e7f/attachment.pl>

Dear R-help,

I have fitted a glm logistic function to dichotomous forced choices
responses varying according to time interval between two stimulus. x  
values
are time separation in miliseconds, and the y values are proportion
responses for one of the stimulus. Now I am trying to extrapolate x  
values
for the y value (proportion) at .25, .5, and .75. I have tried several
predict parameters, and they don't appear to be working. Is this  
correct use
and understanding of the predict function? It would be nice to know  
the
parameters for the glm best fit, but all I really need are the  
extrapolated
x values for those proportions. Thank you for your help. Here is the  
code:

x <-
c(-283.9, -267.2, -250.5, -233.8, -217.1, -200.4, -183.7, -167,
-150.3, -133.6, -116.9, -100.2, -83.5, -66.8, -50.1, -33.4, -16.7,
16.7, 33.4, 50.1, 66.8, 83.5, 100.2, 116.9, 133.6, 150.3, 167,
183.7, 200.4, 217.1, 233.8, 250.5, 267.2, 283.9)

y <-
c(0, 0.333333333333333, 0, 0, 0, 0, 0, 0, 0, 0.333333333333333,
0, 0.133333333333333, 0.238095238095238, 0.527777777777778,
0.566666666666667,
0.845238095238095, 0.55, 1, 0.888888888888889, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5)

weight <-
c(1, 3, 2, 5, 4, 4, 3, 5, 5, 4, 5, 11, 22, 11, 15, 16, 11, 7,
14, 10, 16, 19, 11, 5, 4, 5, 6, 9, 4, 2, 5, 5, 2, 2)

mylogit <- glm(y~x,weights=weight, family = binomial)

# now I try plotting the predicted value, and it looks like a good  
fit,
hopefully I can access what the glm is doing

ypred <- predict(mylogit,newdata=as.data.frame(x),type="response")
plot(x, ypred,type="l")
points(x,y)

# so I try to predict the x value when y (proportion) is at .5, but
something is wrong..
Right. Predict goes in the other direction ... x predicts y.

Perhaps if you created a function of y w.r.t. x and then inverted it.

?approxfun  # and see the posting earlier this week "Inverse  
Prediction with splines" where it was demonstrated how to reverse the  
roles of x and y.
predict(mylogit,newdata=as.data.frame(0.5))

	[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110126/17fed947/attachment.pl>
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110127/2c1637c8/attachment.pl>
Even when I try to predict y values from x, let's say I want to predict y at
x=0. Looking at the graph from the provided syntax, I would expect y to be
about 0.85. Is this right:

predict(mylogit,newdata=as.data.frame(0),type="response")
Your original problem was the use of `newdata = as.data.frame(0.5)`.
There are two problems here: i) if you don't name the input (x = 0.5,
say) then you get a data frame with the name(s) "0.5":
as.data.frame(0.5)
0.5
1 0.5

and ii) if you do name it, you still get a data frame with name(s) "0.5"
as.data.frame(x = 0.5)
0.5
1 0.5

In both cases, predict wants to find a variable with the name `x` in the
object supplied to `newdata`. It finds `x` but your `x` in the global
workspace, but warns because it knows that `newdata` was a data frame
with a single row - so there was a mismatch and you likely made a
mistake.

In these cases, `data.frame()` is preferred to `as.data.frame()`:

predict(mylogit, newdata = data.frame(x = 0), type = "response")

or we can use a list, to save a few characters:

predict(mylogit, newdata = list(x = 0), type = "response")

which give:
predict(mylogit, newdata = list(x = 0), type = "response")
1 
0.813069
predict(mylogit, newdata = data.frame(x = 0), type = "response")
1 
0.813069 

In summary, use `data.frame()` or `list()` to create the object passed
as `newdata` and make sure you give the component containing the new
values a *name* that matches the predictor in the model formula.

HTH

G
# I get:

Warning message:
'newdata' had 1 rows but variable(s) found have 34 rows

# Why do I need 34 rows? So I try:

v=vector()
v[1:34]=0
predict(mylogit,newdata=as.data.frame(v),type="response")

# And I get a matrix of 34 values that appear to fit a logistic function,
instead of 0.85..

On Wed, Jan 26, 2011 at 6:59 PM, David Winsemius <dwinsemius at comcast.net>wrote:

On Jan 26, 2011, at 10:52 PM, Ahnate Lim wrote:

 Dear R-help,
I have fitted a glm logistic function to dichotomous forced choices
responses varying according to time interval between two stimulus. x
values
are time separation in miliseconds, and the y values are proportion
responses for one of the stimulus. Now I am trying to extrapolate x values
for the y value (proportion) at .25, .5, and .75. I have tried several
predict parameters, and they don't appear to be working. Is this correct
use
and understanding of the predict function? It would be nice to know the
parameters for the glm best fit, but all I really need are the
extrapolated
x values for those proportions. Thank you for your help. Here is the code:

x <-
c(-283.9, -267.2, -250.5, -233.8, -217.1, -200.4, -183.7, -167,
-150.3, -133.6, -116.9, -100.2, -83.5, -66.8, -50.1, -33.4, -16.7,
16.7, 33.4, 50.1, 66.8, 83.5, 100.2, 116.9, 133.6, 150.3, 167,
183.7, 200.4, 217.1, 233.8, 250.5, 267.2, 283.9)

y <-
c(0, 0.333333333333333, 0, 0, 0, 0, 0, 0, 0, 0.333333333333333,
0, 0.133333333333333, 0.238095238095238, 0.527777777777778,
0.566666666666667,
0.845238095238095, 0.55, 1, 0.888888888888889, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5)

weight <-
c(1, 3, 2, 5, 4, 4, 3, 5, 5, 4, 5, 11, 22, 11, 15, 16, 11, 7,
14, 10, 16, 19, 11, 5, 4, 5, 6, 9, 4, 2, 5, 5, 2, 2)

mylogit <- glm(y~x,weights=weight, family = binomial)

# now I try plotting the predicted value, and it looks like a good fit,
hopefully I can access what the glm is doing

ypred <- predict(mylogit,newdata=as.data.frame(x),type="response")
plot(x, ypred,type="l")
points(x,y)

# so I try to predict the x value when y (proportion) is at .5, but
something is wrong..

Right. Predict goes in the other direction ... x predicts y.

Perhaps if you created a function of y w.r.t. x and then inverted it.

?approxfun  # and see the posting earlier this week "Inverse Prediction
with splines" where it was demonstrated how to reverse the roles of x and y.

predict(mylogit,newdata=as.data.frame(0.5))

       [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT

	[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20110127/60f8d792/attachment.pl>
Thank you for the clarification, this makes sense now! dose.p from
MASS also does the job perfectly, in returning x values for specified
proportions. I'm curious, if I use the other parameter in predict.glm
type="link" (instead of type="response"), in the case of a logistic
binomial, what does this predict?
This is the predicted value(s) on the scale of the linear predictor, or
the link function, depending on terminology, hence "link".

Recall that in the GLM the response and the linear predictor are related
through a link function:

g(y) = eta

so for your model

logit(y) = eta

where eta is the linear predictor: beta_0 + beta_1 * x (in your
example).

beta_0 + beta_1 * x gives us the fitted value but on the untransformed
scale. This is the value given by predict() when type = "link" is used.
To get the predicted value on the response scale, we apply the inverse
of the link function g():

y = g(eta)^-1

The inverse of the logit function is the inverse-logit:

logit(eta)^-1 = exp(eta) / (1 + exp(eta))

So in R, we can see the relationship through a few simple commands.
First, the prediction for x = 0.5 on the response scale:
predict(mylogit, newdata = list(x = 0.5), type = "response")
1 
0.8149848 

Then we compute the same prediction but on the scale of the link
function:
(p1 <- predict(mylogit, newdata = list(x = 0.5), type = "link"))
1 
1.482732 

to which we apply the inverse-logit function giving us the same value we
got earlier for type = "response":
exp(p1) / (1 + exp(p1))
1 
0.8149848 

There is a function to do this for us, however, called plogis()
plogis(p1)
1 
0.8149848

One reason why you might want predictions on the scale of the link
function is for computation of confidence intervals using normal theory
(e.g. 2-sigma ~95% confidence intervals). On the response scale these
should be asymmetric and respect the scale of the response (bounding
etc), so you compute them on the link scale and apply the inverse of the
link function to get them on to the response scale.

HTH

G

On Wed, Jan 26, 2011 at 11:41 PM, Gavin Simpson
<gavin.simpson at ucl.ac.uk> wrote:
        On Wed, 2011-01-26 at 19:25 -1000, Ahnate Lim wrote:
        > Even when I try to predict y values from x, let's say I want
        to predict y at
        > x=0. Looking at the graph from the provided syntax, I would
        expect y to be
        > about 0.85. Is this right:
        >
        > predict(mylogit,newdata=as.data.frame(0),type="response")

        Your original problem was the use of `newdata =
        as.data.frame(0.5)`.
        There are two problems here: i) if you don't name the input (x
        = 0.5,
        say) then you get a data frame with the name(s) "0.5":

        > as.data.frame(0.5)

         0.5
        1 0.5

        and ii) if you do name it, you still get a data frame with
        name(s) "0.5"

        > as.data.frame(x = 0.5)
         0.5
        1 0.5

        In both cases, predict wants to find a variable with the name
        `x` in the
        object supplied to `newdata`. It finds `x` but your `x` in the
        global
        workspace, but warns because it knows that `newdata` was a
        data frame
        with a single row - so there was a mismatch and you likely
        made a
        mistake.

        In these cases, `data.frame()` is preferred to
        `as.data.frame()`:

        predict(mylogit, newdata = data.frame(x = 0), type =
        "response")

        or we can use a list, to save a few characters:

        predict(mylogit, newdata = list(x = 0), type = "response")

        which give:

        > predict(mylogit, newdata = list(x = 0), type = "response")
              1
        0.813069
        > predict(mylogit, newdata = data.frame(x = 0), type =
        "response")
              1
        0.813069

        In summary, use `data.frame()` or `list()` to create the
        object passed
        as `newdata` and make sure you give the component containing
        the new
        values a *name* that matches the predictor in the model
        formula.

        HTH

        G

        >
        > # I get:
        >
        > Warning message:
        > 'newdata' had 1 rows but variable(s) found have 34 rows
        >
        > # Why do I need 34 rows? So I try:
        >
        > v=vector()
        > v[1:34]=0
        > predict(mylogit,newdata=as.data.frame(v),type="response")
        >
        > # And I get a matrix of 34 values that appear to fit a
        logistic function,
        > instead of 0.85..
        >
        >
        >
        >
        > On Wed, Jan 26, 2011 at 6:59 PM, David Winsemius
        <dwinsemius at comcast.net>wrote:
        >
        > >
        > > On Jan 26, 2011, at 10:52 PM, Ahnate Lim wrote:
        > >
        > >  Dear R-help,
        > >>
        > >> I have fitted a glm logistic function to dichotomous
        forced choices
        > >> responses varying according to time interval between two
        stimulus. x
        > >> values
        > >> are time separation in miliseconds, and the y values are
        proportion
        > >> responses for one of the stimulus. Now I am trying to
        extrapolate x values
        > >> for the y value (proportion) at .25, .5, and .75. I have
        tried several
        > >> predict parameters, and they don't appear to be working.
        Is this correct
        > >> use
        > >> and understanding of the predict function? It would be
        nice to know the
        > >> parameters for the glm best fit, but all I really need
        are the
        > >> extrapolated
        > >> x values for those proportions. Thank you for your help.
        Here is the code:
        > >>
        > >> x <-
        > >> c(-283.9, -267.2, -250.5, -233.8, -217.1, -200.4, -183.7,
        -167,
        > >> -150.3, -133.6, -116.9, -100.2, -83.5, -66.8, -50.1,
        -33.4, -16.7,
        > >> 16.7, 33.4, 50.1, 66.8, 83.5, 100.2, 116.9, 133.6, 150.3,
        167,
        > >> 183.7, 200.4, 217.1, 233.8, 250.5, 267.2, 283.9)
        > >>
        > >> y <-
        > >> c(0, 0.333333333333333, 0, 0, 0, 0, 0, 0, 0,
        0.333333333333333,
        > >> 0, 0.133333333333333, 0.238095238095238,
        0.527777777777778,
        > >> 0.566666666666667,
        > >> 0.845238095238095, 0.55, 1, 0.888888888888889, 1, 1, 1,
        1, 1,
        > >> 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5)
        > >>
        > >> weight <-
        > >> c(1, 3, 2, 5, 4, 4, 3, 5, 5, 4, 5, 11, 22, 11, 15, 16,
        11, 7,
        > >> 14, 10, 16, 19, 11, 5, 4, 5, 6, 9, 4, 2, 5, 5, 2, 2)
        > >>
        > >> mylogit <- glm(y~x,weights=weight, family = binomial)
        > >>
        > >> # now I try plotting the predicted value, and it looks
        like a good fit,
        > >> hopefully I can access what the glm is doing
        > >>
        > >> ypred <-
        predict(mylogit,newdata=as.data.frame(x),type="response")
        > >> plot(x, ypred,type="l")
        > >> points(x,y)
        > >>
        > >> # so I try to predict the x value when y (proportion) is
        at .5, but
        > >> something is wrong..
        > >>
        > >
        > > Right. Predict goes in the other direction ... x predicts
        y.
        > >
        > > Perhaps if you created a function of y w.r.t. x and then
        inverted it.
        > >
        > > ?approxfun  # and see the posting earlier this week
        "Inverse Prediction
        > > with splines" where it was demonstrated how to reverse the
        roles of x and y.
        > >
        > >>
        > >> predict(mylogit,newdata=as.data.frame(0.5))
        > >>
        > >>        [[alternative HTML version deleted]]
        > >>
        > >> ______________________________________________
        > >> R-help at r-project.org mailing list
        > >> https://stat.ethz.ch/mailman/listinfo/r-help
        > >> PLEASE do read the posting guide
        > >> http://www.R-project.org/posting-guide.html
        > >> and provide commented, minimal, self-contained,
        reproducible code.
        > >>
        > >
        > > David Winsemius, MD
        > > West Hartford, CT
        > >
        > >
        >
        >       [[alternative HTML version deleted]]
        >
        > ______________________________________________
        > R-help at r-project.org mailing list
        > https://stat.ethz.ch/mailman/listinfo/r-help
        > PLEASE do read the posting guide
        http://www.R-project.org/posting-guide.html
        > and provide commented, minimal, self-contained, reproducible
        code.

        --
        %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
        %~%~%~%
         Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
         ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
         Pearson Building,             [e]
        gavin.simpsonATNOSPAMucl.ac.uk
         Gower Street, London          [w]
        http://www.ucl.ac.uk/~ucfagls/
         UK. WC1E 6BT.                 [w]
        http://www.freshwaters.org.uk
        %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~
        %~%~%~%

%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%