physical constraint with gam

Thanks for the clarification!

On 12/05/16 02:29, Dominik Schneider wrote:

Hi again,
I'm looking for some clarification on 2 things.
1. On that last note, I realize that s(x1,x2) would be the other obvious
interaction to compare with - and I see that you recommend te(x1,x2) if
they are not on the same scale.

- yes that's right, s(x1,x2) gives an isotropic smooth, which is usually
only appropriate if x1 and x2 are naturally on the same scale.

2. If s(x1,by=x1) gives you a "parameter" value similar to a GLM when you
plot s(x1):x1, why does my function above return the same yhat as
predict(mdl,type='response') ?  Shouldn't each of the terms need to be
multiplied by the variable value before applying
rowSums()+attr(sterms,'constant') ??

predict returns s(x1)*x1 (plot.gam just plots s(x1), because in general
s(x1,by=x2) is not smooth). If you want to get s(x1) on its own you need to
do something like this:

x2 <- x1 ## copy x1
m <- gam(y~s(x1,by=x2)) ## model implementing s(x1,by=x1) using copy of x1
predict(m,data.frame(x1=x1,x2=rep(1,length(x2))),type="terms") ## now
predicted s(x1)*x2 = s(x1)

best,
Simon

Thanks again
Dominik

On Wed, May 11, 2016 at 10:11 AM, Dominik Schneider <
<Dominik.Schneider at colorado.edu>Dominik.Schneider at colorado.edu> wrote:

Hi Simon, Thanks for this explanation.
To make sure I understand, another way of explaining the y axis in my
original example is that it is the contribution to snowdepth relative to
the other variables (the example only had fsca, but my actual case has a
couple others). i.e. a negative s(fsca) of -0.5 simply means snowdepth 0.5
units below the intercept+s(x_i), where s(x_i) could also be negative in
the case where total snowdepth is less than the intercept value.

The use of by=fsca is really useful for interpreting the marginal impact
of the different variables. With my actual data, the term s(fsca):fsca is
never negative, which is much more intuitive. Is it appropriate to compare
magnitudes of e.g. s(x2):x2 / mean(x2) and s(x2):x2 / mean(x2)  where
mean(x_i) are the mean of the actual data?

Lastly, how would these two differ: s(x1,by=x2); or
s(x1,by=x1)*s(x2,by=x2) since interactions are surely present and i'm not
sure if a linear combination is enough.

Thanks!
Dominik

On Wed, May 11, 2016 at 3:11 AM, Simon Wood < <simon.wood at bath.edu>
simon.wood at bath.edu> wrote:

The spline having a positive value is not the same as a glm coefficient
having a positive value. When you plot a smooth, say s(x), that is
equivalent to plotting the line 'beta * x' in a GLM. It is not equivalent
to plotting 'beta'. The smooths in a gam are (usually) subject to
`sum-to-zero' identifiability constraints to avoid confounding via the
intercept, so they are bound to be negative over some part of the covariate
range. For example, if I have a model y ~ s(x) + s(z), I can't estimate the
mean level for s(x) and the mean level for s(z) as they are completely
confounded, and confounded with the model intercept term.

I suppose that if you want to interpret the smooths as glm parameters
varying with the covariate they relate to then you can do, by setting the
model up as a varying coefficient model, using the `by' argument to 's'...

gam(snowdepth~s(fsca,by=fsca),data=dat)

this model is `snowdepth_i = f(fsca_i) * fsca_i + e_i' . s(fsca,by=fsca)
is not confounded with the intercept, so no constraint is needed or
applied, and you can now interpret the smooth like a local GLM coefficient.

best,
Simon

On 11/05/16 01:30, Dominik Schneider wrote:

Hi,
Just getting into using GAM using the mgcv package. I've generated some
models and extracted the splines for each of the variables and started
visualizing them. I'm noticing that one of my variables is physically
unrealistic.

In the example below, my interpretation of the following plot is that
the
y-axis is basically the equivalent of a "parameter" value of a GLM; in
GAM
this value can change as the functional relationship changes between x
and
y. In my case, I am predicting snowdepth based on the fractional snow
covered area. In no case will snowdepth realistically decrease for a
unit
increase in fsca so my question is: *Is there a way to constrain the
spline
to positive values? *

Thanks
Dominik

library(mgcv)
library(dplyr)
library(ggplot2)
extract_splines=function(mdl){
   sterms=predict(mdl,type='terms')
   datplot=cbind(sterms,mdl$model) %>% tbl_df
   datplot$intercept=attr(sterms,'constant')
   datplot$yhat=rowSums(sterms)+attr(sterms,'constant')
   return(datplot)
}
dat=data_frame(snowdepth=runif(100,min =
0.001,max=6.7),fsca=runif(100,0.01,.99))
mdl=gam(snowdepth~s(fsca),data=dat)
termdF=extract_splines(mdl)
ggplot(termdF)+
   geom_line(aes(x=fsca,y=`s(fsca)`))

        [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
<http://www.R-project.org/posting-guide.html>
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Simon Wood, School of Mathematics, University of Bristol BS8 1TW UK
+44 (0)117 33 18273 <%2B44%20%280%29117%2033%2018273>
<http://www.maths.bris.ac.uk/%7Esw15190>
http://www.maths.bris.ac.uk/~sw15190

--
Simon Wood, School of Mathematics, University of Bristol BS8 1TW UK
+44 (0)117 33 18273     http://www.maths.bris.ac.uk/~sw15190

physical constraint with gam

Thread (8 messages)