An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090116/78729915/attachment-0001.pl>
Predictions with GAM
9 messages · Gavin Simpson, Ken Knoblauch, Simon Wood +1 more
On Fri, 2009-01-16 at 12:36 +0100, Robbert Langenberg wrote:
Dear, I am trying to get a prediction of my GAM on a response type. So that I eventually get plots with the correct values on my ylab. I have been able to get some of my GAM's working with the example shown below: * model1<-gam(nsdall ~ s(jdaylitr2), data=datansd) newd1 <- data.frame(jdaylitr2=(244:304)) pred1 <- predict.gam(model1,newd1,type="response")*
Hi Robert,
You want predictions for the covariate over range 244:304 for each of
your 8 mapID's, yes?
This is not tested, but why not something like:
newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
mapID = rep(levels(datansd$mapID), each = 100))
Then use newd2 in your call to predict.
I am assuming that datansd$mapID is a factor in the above. If it is just
some other indicator variable, then perhaps something like:
newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
mapID = rep(sort(unique(datansd$mapID)),
each = 100))
Does that work for you?
HTH
G
The problem I am encountering now is that I cannot seem to get it done for the following type of model: *model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)* My mapID consists of 8 levels of which I get individual plots with * plot(model3)*. When I do predict with a newdata in it just like my first model I need all columns to have the same amount of rows or else R will not except it ofcourse, the col.names need to at least include day and mapID. This way I can not get a prediction working for this GAM, I am confused because of this part in the model: *s(day,by=mapID). *I have been reading through the GAM, an introduction with R book from Wood, S. but could not find anything about predictions with BY in the model. I hope someone can help me out with this, Sincerely yours, Robbert Langenberg [[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/ UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090116/4e82c6a8/attachment-0002.bin>
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090116/aeb2aae5/attachment-0001.pl>
Hi, Robbert Langenberg <mcrelay <at> gmail.com> writes:
I am trying to get a prediction of my GAM on a response type. So that I eventually get plots with the correct values on my ylab. The problem I am encountering now is that I cannot seem to get it done for the following type of model: *model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)* My mapID consists of 8 levels of which I get individual plots with * plot(model3)*. When I do predict with a newdata in it just like my first model I need all columns to have the same amount of rows or else R will not except it ofcourse, the col.names need to at least include day and mapID. This way I can not get a prediction working for this GAM, I am confused because of this part in the model: *s(day,by=mapID). I hope someone can help me out with this, Sincerely yours, Robbert Langenberg
I'm not sure that this will work for you, but I had a similar situation and was able to get predict to work (after helpful advice from Simon Wood) with a by variable by generating a model matrix for a model with the interaction of the covariate and the by term, something like model.matrix(~ day:mapID - 1, data = mergeday) in your case. I added the appropriate columns into my data frame and also to the newdata for predict. You can see an example in the appendix of http://www.journalofvision.org/8/16/10/ HTH, Ken
Ken Knoblauch Inserm U846 Institut Cellule Souche et Cerveau D?partement Neurosciences Int?gratives 18 avenue du Doyen L?pine 69500 Bron France tel: +33 (0)4 72 91 34 77 fax: +33 (0)4 72 91 34 61 portable: +33 (0)6 84 10 64 10 http://www.sbri.fr
Ken Knoblauch <ken.knoblauch <at> inserm.fr> writes:
Robbert Langenberg <mcrelay <at> gmail.com> writes:
I am trying to get a prediction of my GAM on a response type. So that I eventually get plots with the correct values on my ylab. The problem I am encountering now is that I cannot seem to get it done for the following type of model: *model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)* My mapID consists of 8 levels of which I get individual plots with * plot(model3)*. When I do predict with a newdata in it just like my first model I need all columns to have the same amount of rows or else R will not except it ofcourse, the col.names need to at least include day and mapID. This way I can not get a prediction working for this GAM, I am confused because of this part in the model: *s(day,by=mapID). Sincerely yours, Robbert Langenberg
I'm not sure that this will work for you, but I had a similar situation and was able to get predict to work (after helpful advice from Simon Wood) with a by variable by generating a model matrix for a model with the interaction of the covariate and the by term, something like model.matrix(~ day:mapID - 1, data = mergeday) in your case. I added the appropriate columns into my data frame and also to the newdata for predict. You can see an example in the appendix of
Let me correct this before someone else does for me, (It's getting late on Friday afternoon!) The interaction in the model matrix would not be with day. In my case, I had a covariate for the by variable and a factor with two levels, so I used the interaction to separate them with respect to the smooths. I think that you might be able to get the columns from the model matrix columns from the mapID term by itself. Sorry for the misstep. Ken
2 days later
Robbert,
On Friday 16 January 2009 14:30, Robbert Langenberg wrote:
Thanks for the swift reply, I might have been a bit sloppy with describing my datasets and problem. I showed the first model as an example of the type of GAM that I had been able to use the predict function on. What I am looking for is how to predict my m3: model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday) When I plot this I get 8 different graphs. Each showing me a different habitat type with on the x-axis days and on the y-axis s(day,2,81):mapID. With predict I was hoping to get the scale of the y-axis right for a selection of days (for example 244,304). I have tried to reform the script you gave me to match my dataset in m3, but it all did not seem to work.
--- Can you explain how it didn't work? What exactly was wrong with what predict.gam gave you? [when I try predict.gam for a gam with factor by variables, I seem to get what I'd expect, so I need a bit more detail to figure out what the issue is.] -- incidentally, is there any reason for omitting the main effect of `mapID' from the model? I'd usually expect to see y_no~s(day,by=mapID)+mapID as the model formula in this sort of situation. (see ?gam.models for details) best, Simon
newd2 <- data.frame(day = rep(seq(244, 304, length = 100), 8),
mapID = rep(levels(mergeday$mapID), each = 100))
newd2 <- data.frame(day = rep(seq(244, 304, length = 100), 8),
mapID = rep(sort(unique(mergeday$mapID)),
each = 100))
I am guessing it must have something to do with the by in s(day,by=mapID).
I haven't come across any examples that used a GAM with by and then used
the predict function.
(A sample of the dataset:
mapID day y_no
Urban Areas and Water 25 1
Urban Areas and Water 26 1
Early Succesional Forest 27 0
Agriculture 28 0
Early Succesional Forest 29 0
Mature Coniferous Forest 30 0)
I am sorry that I have to bother you even more with this, and I hope that
my additional explanation about my problem might help solve it.
Sincerely yours,
Robbert Langenberg
2009/1/16 Gavin Simpson <gavin.simpson at ucl.ac.uk>
On Fri, 2009-01-16 at 12:36 +0100, Robbert Langenberg wrote:
Dear, I am trying to get a prediction of my GAM on a response type. So that I eventually get plots with the correct values on my ylab. I have been able to get some of my GAM's working with the example shown below: * model1<-gam(nsdall ~ s(jdaylitr2), data=datansd) newd1 <- data.frame(jdaylitr2=(244:304)) pred1 <- predict.gam(model1,newd1,type="response")*
Hi Robert,
You want predictions for the covariate over range 244:304 for each of
your 8 mapID's, yes?
This is not tested, but why not something like:
newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
mapID = rep(levels(datansd$mapID), each = 100))
Then use newd2 in your call to predict.
I am assuming that datansd$mapID is a factor in the above. If it is just
some other indicator variable, then perhaps something like:
newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
mapID = rep(sort(unique(datansd$mapID)),
each = 100))
Does that work for you?
HTH
G
The problem I am encountering now is that I cannot seem to get it done
for
the following type of model: *model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)* My mapID consists of 8 levels of which I get individual plots with * plot(model3)*. When I do predict with a newdata in it just like my first model I need all columns to have the same amount of rows or else R will
not
except it ofcourse, the col.names need to at least include day and mapID. This way I can not get a prediction working for this GAM, I am confused because of this part in the model: *s(day,by=mapID). *I have been reading through the GAM, an introduction with R book from
Wood,
S. but could not find anything about predictions with BY in the model.
I hope someone can help me out with this,
Sincerely yours,
Robbert Langenberg
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/<http://www.ucl.ac.uk/%7Eucfagls/> UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK > +44 1225 386603 www.maths.bath.ac.uk/~sw283
2 days later
Here's a simulated example (although really for this model structure, one
might as well fit seperate models for each factor level).
## Simulate some data with factor dependent smooths
n <- 400
x <- runif(n, 0, 1)
f1 <- 2 * sin(pi * x)
f2 <- exp(2 * x) - 3.75887
f3 <- 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 * x)^3 *
(1 - x)^10
fac <- as.factor(c(rep(1, 100), rep(2, 100), rep(3, 200)))
fac.1 <- as.numeric(fac == 1)
fac.2 <- as.numeric(fac == 2)
fac.3 <- as.numeric(fac == 3)
f <- f1 * fac.1 + f2 * fac.2 + f3 * fac.3
y <- rpois(f,exp(f/4))
## fit gam, with a smooth of `x' for each level of `fac'
b <- gam(y~s(x,by=fac)+fac,family=poisson)
par(mfrow=c(2,2))
plot(b)
## produce plots on response scale, first the prediction...
np <- 200
newd <- data.frame(x=rep(seq(0,1,length=np),3),
fac=factor(c(rep(1,np),rep(2,np),rep(3,np))))
pv <- predict(b,newd,type="response")
## .. now the plotting
par(mfrow=c(2,2))
ind <- 1:np
plot(newd$x[ind],pv[ind],type="l",xlab="x",ylab="f(x,fac=1)")
ind <- ind+np
plot(newd$x[ind],pv[ind],type="l",xlab="x",ylab="f(x,fac=2)")
ind <- ind+np
plot(newd$x[ind],pv[ind],type="l",xlab="x",ylab="f(x,fac=2)")
On Friday 16 January 2009 14:30, Robbert Langenberg wrote:
Thanks for the swift reply,
I might have been a bit sloppy with describing my datasets and problem. I
showed the first model as an example of the type of GAM that I had been
able to use the predict function on. What I am looking for is how to
predict my m3:
model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)
When I plot this I get 8 different graphs. Each showing me a different
habitat type with on the x-axis days and on the y-axis s(day,2,81):mapID.
With predict I was hoping to get the scale of the y-axis right for a
selection of days (for example 244,304).
I have tried to reform the script you gave me to match my dataset in m3,
but it all did not seem to work.
newd2 <- data.frame(day = rep(seq(244, 304, length = 100), 8),
mapID = rep(levels(mergeday$mapID), each = 100))
newd2 <- data.frame(day = rep(seq(244, 304, length = 100), 8),
mapID = rep(sort(unique(mergeday$mapID)),
each = 100))
I am guessing it must have something to do with the by in s(day,by=mapID).
I haven't come across any examples that used a GAM with by and then used
the predict function.
(A sample of the dataset:
mapID day y_no
Urban Areas and Water 25 1
Urban Areas and Water 26 1
Early Succesional Forest 27 0
Agriculture 28 0
Early Succesional Forest 29 0
Mature Coniferous Forest 30 0)
I am sorry that I have to bother you even more with this, and I hope that
my additional explanation about my problem might help solve it.
Sincerely yours,
Robbert Langenberg
2009/1/16 Gavin Simpson <gavin.simpson at ucl.ac.uk>
On Fri, 2009-01-16 at 12:36 +0100, Robbert Langenberg wrote:
Dear, I am trying to get a prediction of my GAM on a response type. So that I eventually get plots with the correct values on my ylab. I have been able to get some of my GAM's working with the example shown below: * model1<-gam(nsdall ~ s(jdaylitr2), data=datansd) newd1 <- data.frame(jdaylitr2=(244:304)) pred1 <- predict.gam(model1,newd1,type="response")*
Hi Robert,
You want predictions for the covariate over range 244:304 for each of
your 8 mapID's, yes?
This is not tested, but why not something like:
newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
mapID = rep(levels(datansd$mapID), each = 100))
Then use newd2 in your call to predict.
I am assuming that datansd$mapID is a factor in the above. If it is just
some other indicator variable, then perhaps something like:
newd2 <- data.frame(jdaylitr2 = rep(seq(244, 304, length = 100), 8),
mapID = rep(sort(unique(datansd$mapID)),
each = 100))
Does that work for you?
HTH
G
The problem I am encountering now is that I cannot seem to get it done
for
the following type of model: *model3<-gam(y_no~s(day,by=mapID),family=binomial, data=mergeday)* My mapID consists of 8 levels of which I get individual plots with * plot(model3)*. When I do predict with a newdata in it just like my first model I need all columns to have the same amount of rows or else R will
not
except it ofcourse, the col.names need to at least include day and mapID. This way I can not get a prediction working for this GAM, I am confused because of this part in the model: *s(day,by=mapID). *I have been reading through the GAM, an introduction with R book from
Wood,
S. but could not find anything about predictions with BY in the model.
I hope someone can help me out with this,
Sincerely yours,
Robbert Langenberg
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% Dr. Gavin Simpson [t] +44 (0)20 7679 0522 ECRC, UCL Geography, [f] +44 (0)20 7679 0565 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/<http://www.ucl.ac.uk/%7Eucfagls/> UK. WC1E 6BT. [w] http://www.freshwaters.org.uk %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK > +44 1225 386603 www.maths.bath.ac.uk/~sw283
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090121/0c4c8def/attachment-0001.pl>
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090121/7a414df2/attachment-0001.pl>