Predictions with GAM

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090116/78729915/attachment-0001.pl>
Dear,

I am trying to get a prediction of my GAM on a response type. So that I
eventually get plots with the correct values on my ylab.
I have been able to get some of my GAM's working with the example shown
below:
*
model1<-gam(nsdall ~ s(jdaylitr2), data=datansd)
newd1 <- data.frame(jdaylitr2=(244:304))
pred1 <- predict.gam(model1,newd1,type="response")*
Hi Robert,

You want predictions for the covariate over range 244:304 for each of
your 8 mapID's, yes?

This is not tested, but why not something like:

newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
                    mapID = rep(levels(datansd$mapID), each = 100))

Then use newd2 in your call to predict.

I am assuming that datansd$mapID is a factor in the above. If it is just
some other indicator variable, then perhaps something like:

newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
                    mapID = rep(sort(unique(datansd$mapID)), 
                                each = 100))

Does that work for you?

HTH

G
The problem I am encountering now is that I cannot seem to get it done for
the following type of model:

*model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)*

My mapID consists of 8 levels of which I get individual plots with *
plot(model3)*. When I do predict with a newdata in it just like my first
model I need all columns to have the same amount of rows or else R will not
except it ofcourse, the col.names need to at least include day and mapID.
This way I can not get a prediction working for this GAM, I am confused
because of this part in the model: *s(day,by=mapID).

*I have been reading through the GAM, an introduction with R book from Wood,
S. but could not find anything about predictions with BY in the model.

I hope someone can help me out with this,

Sincerely yours,

Robbert Langenberg

	[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: This is a digitally signed message part
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090116/4e82c6a8/attachment-0002.bin>
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090116/aeb2aae5/attachment-0001.pl>
Hi,

Robbert Langenberg <mcrelay <at> gmail.com> writes:
I am trying to get a prediction of my GAM on a response type. So that I
eventually get plots with the correct values on my ylab.
The problem I am encountering now is that I cannot seem to get it done for
the following type of model:

*model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)*

My mapID consists of 8 levels of which I get individual plots with *
plot(model3)*. When I do predict with a newdata in it just like my first
model I need all columns to have the same amount of rows or else R will not
except it ofcourse, the col.names need to at least include day and mapID.
This way I can not get a prediction working for this GAM, I am confused
because of this part in the model: *s(day,by=mapID).

I hope someone can help me out with this,

Sincerely yours,

Robbert Langenberg
I'm not sure that this will work for you, but I had a similar
situation and was able to get predict to work (after helpful
advice from Simon Wood) with a by variable by generating 
a model matrix for a model with the interaction of the covariate 
and the by term, something like

model.matrix(~ day:mapID - 1, data = mergeday)

in your case.
I added the appropriate columns into my data frame and 
also to the newdata for predict.  You can see an example 
in the appendix of

http://www.journalofvision.org/8/16/10/

HTH,

Ken
Ken Knoblauch
Inserm U846
Institut Cellule Souche et Cerveau
D?partement Neurosciences Int?gratives
18 avenue du Doyen L?pine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr
Ken Knoblauch <ken.knoblauch <at> inserm.fr> writes:
Robbert Langenberg <mcrelay <at> gmail.com> writes:
I am trying to get a prediction of my GAM on a response type. So that I
eventually get plots with the correct values on my ylab.
The problem I am encountering now is that I cannot seem to get it done for
the following type of model:

*model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)*

My mapID consists of 8 levels of which I get individual plots with *
plot(model3)*. When I do predict with a newdata in it just like my first
model I need all columns to have the same amount of rows or else R will not
except it ofcourse, the col.names need to at least include day and mapID.
This way I can not get a prediction working for this GAM, I am confused
because of this part in the model: *s(day,by=mapID).
Sincerely yours,
Robbert Langenberg
I'm not sure that this will work for you, but I had a similar
situation and was able to get predict to work (after helpful
advice from Simon Wood) with a by variable by generating 
a model matrix for a model with the interaction of the covariate 
and the by term, something like

model.matrix(~ day:mapID - 1, data = mergeday)

in your case.
I added the appropriate columns into my data frame and 
also to the newdata for predict.  You can see an example 
in the appendix of
Let me correct this before someone else does for me,
(It's getting late on Friday afternoon!)
The interaction in the model matrix would not be
with day.  In my case, I had a covariate for the by
variable and a factor with two levels, so I used
the interaction to separate them with respect to
the smooths.  I think that you might be able to get
the columns from the model matrix columns from
the mapID term by itself.  Sorry for the misstep.

Ken
Robbert,
Thanks for the swift reply,

I might have been a bit sloppy with describing my datasets and problem. I
showed the first model as an example of the type of GAM that I had been
able to use the predict function on. What I am looking for is how to
predict my m3:
model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)

When I plot this I get 8 different graphs. Each showing me a different
habitat type with on the x-axis days and on the y-axis s(day,2,81):mapID.
With predict I was hoping to get the scale of the y-axis right for a
selection of days (for example 244,304).

I have tried to reform the script you gave me to match my dataset in m3,
but it all did not seem to work.
--- Can you explain how it didn't work? What exactly was wrong with what 
predict.gam gave you? [when I try predict.gam for a gam with factor by 
variables, I seem to get what I'd expect, so I need a bit more detail to 
figure out what the issue is.] 

-- incidentally, is there any reason for omitting the main effect of `mapID' 
from the model? I'd usually expect to see 
  y_no~s(day,by=mapID)+mapID
as the model formula in this sort of situation. (see ?gam.models for details)

best,
Simon
newd2 <- data.frame(day = rep(seq(244, 304, length = 100), 8),
                   mapID = rep(levels(mergeday$mapID), each = 100))

newd2 <- data.frame(day = rep(seq(244, 304, length = 100), 8),
                   mapID = rep(sort(unique(mergeday$mapID)),
                               each = 100))

I am guessing it must have something to do with the  by in s(day,by=mapID).
I haven't come across any examples that used a GAM with by and then used
the predict function.

(A sample of the dataset:
          mapID                  day   y_no
Urban Areas and Water    25      1
Urban Areas and Water    26      1
Early Succesional Forest  27      0
Agriculture                       28      0
Early Succesional Forest  29      0
Mature Coniferous Forest  30      0)

I am sorry that I have to bother you even more with this, and I hope that
my additional explanation about my problem might help solve it.

Sincerely yours,

Robbert Langenberg

2009/1/16 Gavin Simpson <gavin.simpson at ucl.ac.uk>

On Fri, 2009-01-16 at 12:36 +0100, Robbert Langenberg wrote:
Dear,

I am trying to get a prediction of my GAM on a response type. So that I
eventually get plots with the correct values on my ylab.
I have been able to get some of my GAM's working with the example shown
below:
*
model1<-gam(nsdall ~ s(jdaylitr2), data=datansd)
newd1 <- data.frame(jdaylitr2=(244:304))
pred1 <- predict.gam(model1,newd1,type="response")*
Hi Robert,

You want predictions for the covariate over range 244:304 for each of
your 8 mapID's, yes?

This is not tested, but why not something like:

newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
                   mapID = rep(levels(datansd$mapID), each = 100))

Then use newd2 in your call to predict.

I am assuming that datansd$mapID is a factor in the above. If it is just
some other indicator variable, then perhaps something like:

newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
                   mapID = rep(sort(unique(datansd$mapID)),
                               each = 100))

Does that work for you?

HTH

G

The problem I am encountering now is that I cannot seem to get it done
for

the following type of model:

*model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)*

My mapID consists of 8 levels of which I get individual plots with *
plot(model3)*. When I do predict with a newdata in it just like my
first model I need all columns to have the same amount of rows or else
R will
not

except it ofcourse, the col.names need to at least include day and
mapID. This way I can not get a prediction working for this GAM, I am
confused because of this part in the model: *s(day,by=mapID).

*I have been reading through the GAM, an introduction with R book from
Wood,

S. but could not find anything about predictions with BY in the model.

I hope someone can help me out with this,

Sincerely yours,

Robbert Langenberg

      [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w]
http://www.ucl.ac.uk/~ucfagls/<http://www.ucl.ac.uk/%7Eucfagls/> UK. WC1E
6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

> Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> +44 1225 386603  www.maths.bath.ac.uk/~sw283
Here's a simulated example (although really for this model structure, one 
might as well  fit seperate models for each factor level). 

## Simulate some data with factor dependent smooths
n <- 400
x <- runif(n, 0, 1)
f1 <- 2 * sin(pi * x)
f2 <- exp(2 * x) - 3.75887
f3 <- 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 * x)^3 *
            (1 - x)^10
fac <- as.factor(c(rep(1, 100), rep(2, 100), rep(3, 200)))
fac.1 <- as.numeric(fac == 1)
fac.2 <- as.numeric(fac == 2)
fac.3 <- as.numeric(fac == 3)
f <- f1 * fac.1 + f2 * fac.2 + f3 * fac.3
y <- rpois(f,exp(f/4))

## fit gam, with a smooth of `x' for each level of `fac'
b <- gam(y~s(x,by=fac)+fac,family=poisson)
par(mfrow=c(2,2))
plot(b)

## produce plots on response scale, first the prediction...
np <- 200
newd <- data.frame(x=rep(seq(0,1,length=np),3),
                   fac=factor(c(rep(1,np),rep(2,np),rep(3,np))))
pv <- predict(b,newd,type="response")

## .. now the plotting
par(mfrow=c(2,2))
ind <- 1:np
plot(newd$x[ind],pv[ind],type="l",xlab="x",ylab="f(x,fac=1)")
ind <- ind+np
plot(newd$x[ind],pv[ind],type="l",xlab="x",ylab="f(x,fac=2)")
ind <- ind+np
plot(newd$x[ind],pv[ind],type="l",xlab="x",ylab="f(x,fac=2)")
Thanks for the swift reply,

I might have been a bit sloppy with describing my datasets and problem. I
showed the first model as an example of the type of GAM that I had been
able to use the predict function on. What I am looking for is how to
predict my m3:
model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)

When I plot this I get 8 different graphs. Each showing me a different
habitat type with on the x-axis days and on the y-axis s(day,2,81):mapID.
With predict I was hoping to get the scale of the y-axis right for a
selection of days (for example 244,304).

I have tried to reform the script you gave me to match my dataset in m3,
but it all did not seem to work.

newd2 <- data.frame(day = rep(seq(244, 304, length = 100), 8),
                   mapID = rep(levels(mergeday$mapID), each = 100))

newd2 <- data.frame(day = rep(seq(244, 304, length = 100), 8),
                   mapID = rep(sort(unique(mergeday$mapID)),
                               each = 100))

I am guessing it must have something to do with the  by in s(day,by=mapID).
I haven't come across any examples that used a GAM with by and then used
the predict function.

(A sample of the dataset:
          mapID                  day   y_no
Urban Areas and Water    25      1
Urban Areas and Water    26      1
Early Succesional Forest  27      0
Agriculture                       28      0
Early Succesional Forest  29      0
Mature Coniferous Forest  30      0)

I am sorry that I have to bother you even more with this, and I hope that
my additional explanation about my problem might help solve it.

Sincerely yours,

Robbert Langenberg

2009/1/16 Gavin Simpson <gavin.simpson at ucl.ac.uk>

On Fri, 2009-01-16 at 12:36 +0100, Robbert Langenberg wrote:
Dear,

I am trying to get a prediction of my GAM on a response type. So that I
eventually get plots with the correct values on my ylab.
I have been able to get some of my GAM's working with the example shown
below:
*
model1<-gam(nsdall ~ s(jdaylitr2), data=datansd)
newd1 <- data.frame(jdaylitr2=(244:304))
pred1 <- predict.gam(model1,newd1,type="response")*
Hi Robert,

You want predictions for the covariate over range 244:304 for each of
your 8 mapID's, yes?

This is not tested, but why not something like:

newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
                   mapID = rep(levels(datansd$mapID), each = 100))

Then use newd2 in your call to predict.

I am assuming that datansd$mapID is a factor in the above. If it is just
some other indicator variable, then perhaps something like:

newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
                   mapID = rep(sort(unique(datansd$mapID)),
                               each = 100))

Does that work for you?

HTH

G

The problem I am encountering now is that I cannot seem to get it done
for

the following type of model:

*model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)*

My mapID consists of 8 levels of which I get individual plots with *
plot(model3)*. When I do predict with a newdata in it just like my
first model I need all columns to have the same amount of rows or else
R will
not

except it ofcourse, the col.names need to at least include day and
mapID. This way I can not get a prediction working for this GAM, I am
confused because of this part in the model: *s(day,by=mapID).

*I have been reading through the GAM, an introduction with R book from
Wood,

S. but could not find anything about predictions with BY in the model.

I hope someone can help me out with this,

Sincerely yours,

Robbert Langenberg

      [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w]
http://www.ucl.ac.uk/~ucfagls/<http://www.ucl.ac.uk/%7Eucfagls/> UK. WC1E
6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

> Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> +44 1225 386603  www.maths.bath.ac.uk/~sw283
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090121/0c4c8def/attachment-0001.pl>
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090121/7a414df2/attachment-0001.pl>