
Extrapolating values from a glm fit

7 messages · David Winsemius, Dennis Murphy, Gavin Simpson +1 more

#
On Jan 26, 2011, at 10:52 PM, Ahnate Lim wrote:

            
Right. Predict goes in the other direction ... x predicts y.

Perhaps you could create a function of y w.r.t. x and then invert it.

?approxfun  # and see the posting earlier this week, "Inverse
Prediction with splines", where it was demonstrated how to reverse the
roles of x and y.
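
A minimal sketch of that inversion idea. The data are simulated because the original data are not shown in this thread, so `x`, `y`, `fit`, the grid limits and the target probability of 0.75 are all illustrative assumptions. It interpolates x as a function of the fitted probability with approxfun(), and compares that with the closed-form inverse that happens to exist for a single-predictor logit model.

```r
# Simulated data standing in for the poster's (assumption: not from the thread)
set.seed(1)
x <- seq(-2, 2, length.out = 200)
y <- rbinom(length(x), size = 1, prob = plogis(1 + 2 * x))
fit <- glm(y ~ x, family = binomial)

# Fitted probabilities over a grid that extends beyond the observed x range
grid <- data.frame(x = seq(-6, 6, length.out = 2001))
p <- predict(fit, newdata = grid, type = "response")

# Reverse the roles of x and y: interpolate x as a function of p
inv <- approxfun(p, grid$x)
x_approx <- inv(0.75)

# Closed-form inverse for this particular model: solve b0 + b1 * x = logit(0.75)
b <- coef(fit)
x_exact <- unname((qlogis(0.75) - b[1]) / b[2])

c(approx = x_approx, exact = x_exact)
```

The interpolation route only works where the fitted curve is monotone in x, and values outside the grid come back as NA unless `rule = 2` is passed to approxfun().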
David Winsemius, MD
West Hartford, CT
#
On Wed, 2011-01-26 at 19:25 -1000, Ahnate Lim wrote:
Your original problem was the use of `newdata = as.data.frame(0.5)`.
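There are two problems here: i) if you don't name the input (x = 0.5,
say), then `as.data.frame(0.5)` gives a data frame with the name(s) "0.5":
  0.5
1 0.5

and ii) if you do name it, as in `as.data.frame(x = 0.5)`, you still get
a data frame with name(s) "0.5", because `x` is just the name of the
first argument of as.data.frame():
  0.5
1 0.5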

In both cases, predict() looks for a variable with the name `x` in the
object supplied to `newdata`. It does not find one there, so it falls
back to the `x` in your global workspace, and warns because `newdata`
was a data frame with a single row while that `x` has a different
length: there was a mismatch, so you likely made a mistake.

In these cases, `data.frame()` is preferred to `as.data.frame()`:

predict(mylogit, newdata = data.frame(x = 0), type = "response")

or we can use a list, to save a few characters:

predict(mylogit, newdata = list(x = 0), type = "response")

both of which give:
       1
0.813069

In summary, use `data.frame()` or `list()` to create the object passed
as `newdata` and make sure you give the component containing the new
values a *name* that matches the predictor in the model formula.
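
The naming problem can be reproduced with a small sketch. The data here are simulated (an assumption, since the original data are not in this thread); only the model name `mylogit` and the predictor name `x` are taken from the discussion above.

```r
# Simulated stand-in for the original data (assumption)
set.seed(1)
x <- seq(-2, 2, length.out = 50)
y <- rbinom(length(x), size = 1, prob = plogis(1 + 2 * x))
mylogit <- glm(y ~ x, family = binomial)

# Neither form produces a column called "x"
names(as.data.frame(0.5))
names(as.data.frame(x = 0.5))

# No column `x` in newdata, so predict() falls back to the `x` visible from
# the formula's environment (50 values) and warns about the row mismatch;
# the warning is suppressed here only to keep the output short
bad <- suppressWarnings(
  predict(mylogit, newdata = as.data.frame(0.5), type = "response")
)
length(bad)

# Correctly named input: a single prediction
good <- predict(mylogit, newdata = data.frame(x = 0.5), type = "response")
length(good)
```

Checking `length()` of the result is a cheap way to notice that `newdata` was ignored.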

HTH

G

  
    
#
On Thu, 2011-01-27 at 00:10 -1000, Ahnate Lim wrote:
This is the predicted value(s) on the scale of the linear predictor, or
the link function, depending on terminology, hence "link".

Recall that in the GLM the response and the linear predictor are related
through a link function g(), which strictly speaking is applied to the
expected value of the response (written y here):

g(y) = eta

so for your model

logit(y) = eta

where eta is the linear predictor: beta_0 + beta_1 * x (in your
example).

beta_0 + beta_1 * x gives us the fitted value, but on the scale of the
link function rather than the scale of the response. This is the value
given by predict() when type = "link" is used. To get the predicted
value on the response scale, we apply the inverse of the link function
g():

y = g^-1(eta)

The inverse of the logit function is the inverse-logit:

logit^-1(eta) = exp(eta) / (1 + exp(eta))

So in R, we can see the relationship through a few simple commands.
First, the prediction for x = 0.5 on the response scale:
        1
0.8149848

Then we compute the same prediction but on the scale of the link
function:
       1
1.482732

to which we apply the inverse-logit function, giving us the same value we
got earlier for type = "response":
        1
0.8149848

There is a function to do this for us, however, called plogis():
        1
0.8149848
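
The sequence above can be sketched end to end as follows. The data are simulated (an assumption, since the original data are not in this thread), so the numbers will differ from those quoted; the object names other than `mylogit` are illustrative. The last line uses the inverse link stored on the model's family object, which should do the same job for links other than the logit.

```r
# Simulated stand-in for the original data (assumption)
set.seed(1)
x <- seq(-2, 2, length.out = 200)
y <- rbinom(length(x), size = 1, prob = plogis(1 + 2 * x))
mylogit <- glm(y ~ x, family = binomial)

new <- data.frame(x = 0.5)

# Prediction on the response (probability) scale
p_response <- predict(mylogit, newdata = new, type = "response")

# Same prediction on the link (logit) scale
eta <- predict(mylogit, newdata = new, type = "link")

# Inverse logit by hand, via plogis(), and via the family's own inverse link
by_hand   <- exp(eta) / (1 + exp(eta))
by_plogis <- plogis(eta)
by_family <- family(mylogit)$linkinv(eta)

# All three should agree with the type = "response" prediction
all.equal(p_response, by_hand, check.attributes = FALSE)
all.equal(p_response, by_plogis, check.attributes = FALSE)
all.equal(p_response, by_family, check.attributes = FALSE)
```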

One reason why you might want predictions on the scale of the link
function is for computation of confidence intervals using normal theory
(e.g. 2-sigma ~95% confidence intervals). On the response scale these
should be asymmetric and respect the scale of the response (bounding
etc), so you compute them on the link scale and apply the inverse of the
link function to get them on to the response scale.
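
A minimal sketch of that interval calculation, again on simulated data (an assumption; the names `fit`, `pr` and `ci` are illustrative). It asks predict() for the standard error on the link scale via `se.fit = TRUE`, forms a 2-standard-error interval there, and back-transforms the limits with plogis().

```r
# Simulated stand-in for the original data (assumption)
set.seed(1)
x <- seq(-2, 2, length.out = 200)
y <- rbinom(length(x), size = 1, prob = plogis(1 + 2 * x))
fit <- glm(y ~ x, family = binomial)

# Prediction and its standard error on the link (logit) scale
pr <- predict(fit, newdata = data.frame(x = 0.5), type = "link", se.fit = TRUE)

# Approximate 95% interval on the link scale: 2 standard errors either side
lower_eta <- pr$fit - 2 * pr$se.fit
upper_eta <- pr$fit + 2 * pr$se.fit

# Back-transform the limits and the estimate to the response scale
ci <- plogis(unname(c(lower_eta, pr$fit, upper_eta)))
names(ci) <- c("lower", "fit", "upper")
ci
```

Because the back-transformation is nonlinear, the resulting interval is generally not symmetric about the fitted probability, and it cannot fall outside the interval from 0 to 1.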

HTH

G