Hi, Tom and Thierry, Thank you for your advice, the lecture notes are very useful. We will try geoRglm but for now regression kriging using the working residuals gives sensible answers even though there are some issues with using working residuals, i.e. not Normally distributed, occasional very large values and inv.logit(prediction type="link" + working residual) doesn't quite give the observed values. Our final question about this is how to estimate standard errors for the regression kriging predictions of the binary variable? On the logit scale we are using rk prediction (s0) = glm prediction (s0) + kriged residual prediction (s0) for location s0 Is assuming independence of the two components adequate? var rk(s0) ~= var glm prediction (s0) + var kriged residual prediction (s0) Thanks again, David Maxwell & Vanessa Stelzenm?ller david.maxwell at cefas.co.uk -----Original Message----- From: r-sig-geo-bounces at stat.math.ethz.ch [mailto:r-sig-geo-bounces at stat.math.ethz.ch] On Behalf Of Tomislav Hengl Sent: 03 April 2008 09:04 To: r-sig-geo at stat.math.ethz.ch Subject: Re: [R-sig-Geo] question about regression kriging I completely agree with Thierry. Take a look at this also: https://stat.ethz.ch/pipermail/r-sig-geo/2008-February/003176.html The instructions on how to run RK with binary variables in R you can find in sec 4.3.3 (Fig. 4.15) of my lecture notes. Hengl, T., 2007. A Practical Guide to Geostatistical Mapping of Environmental Variables. EUR 22904 EN Scientific and Technical Research series, Office for Official Publications of the European Communities, Luxemburg, 143 pp. http://bookshop.europa.eu/uri?target=EUB:NOTICE:LBNA22904:EN:HTML Tom Hengl http://spatial-analyst.net -----Original Message----- From: r-sig-geo-bounces at stat.math.ethz.ch [mailto:r-sig-geo-bounces at stat.math.ethz.ch] On Behalf Of ONKELINX, Thierry Sent: woensdag 2 april 2008 11:56 To: Vanessa Stelzenmuller (Cefas); r-sig-geo at stat.math.ethz.ch Subject: Re: [R-sig-Geo] question about regression kriging Dear Vanessa, What residuals did you use? The ones in the original scale or in the logit scale? Interpolate the residuals in the logit scale and add these to the model predictions in the logit scale. And the transform those values back to the original scale. This will prevent values outside the 0-1 range. Maybe you should have a loot at the geoRglm package. HTH, Thierry ---------------------------------------------------------------------------- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Oorspronkelijk bericht----- Van: r-sig-geo-bounces at stat.math.ethz.ch [mailto:r-sig-geo-bounces at stat.math.ethz.ch] Namens Vanessa Stelzenmuller (Cefas) Verzonden: woensdag 2 april 2008 11:32 Aan: r-sig-geo at stat.math.ethz.ch Onderwerp: [R-sig-Geo] question about regression kriging Hello, We work on the application of regression kriging to presence / absence data in the context of species distribution modelling. In R in a first step we fit the trend surfaces with logistic regression models. Then we fit a variogram to the regression residuals and interpolate the residuals with OK. Now we face the situation that when combining trend surfaces with residual surfaces for some locations our occurrence probability is <0 or >1. Thus taking into account the spatial structure of the data (residuals) has the potential to convert a predicted high occurrence probability into a low occurrence probability or vice versa. Are there some restriction for presence/ absence data for this approach? How to deal with these estimations (<0 and >1)? Many thanks Vanessa ________________________________ Dr. Vanessa Stelzenm?ller Marine Scientist (GIS), CEFAS Pakefield Road, Lowestoft, NR33 0HT, UK Tel.: +44 (0)1502 527779 www.cefas.co.uk *********************************************************************************** This email and any attachments are intended for the =\...{{dropped:20}}
question about regression kriging
12 messages · David Maxwell (Cefas), ONKELINX, Thierry, Tomislav Hengl +1 more
David Maxwell (Cefas) wrote:
Hi, Tom and Thierry, Thank you for your advice, the lecture notes are very useful. We will try geoRglm but for now regression kriging using the working residuals gives sensible answers even though there are some issues with using working residuals, i.e. not Normally distributed, occasional very large values and inv.logit(prediction type="link" + working residual) doesn't quite give the observed values. Our final question about this is how to estimate standard errors for the regression kriging predictions of the binary variable? On the logit scale we are using rk prediction (s0) = glm prediction (s0) + kriged residual prediction (s0) for location s0 Is assuming independence of the two components adequate? var rk(s0) ~= var glm prediction (s0) + var kriged residual prediction (s0)
In principle, no. The extreme case is prediction at observation locations, where the correlation is -1 so that the final prediction variance becomes zero. I never got to looking how large the correlation is otherwise, but that shouldn't be hard to do in the linear case, as you can get the first and second separately, and also the combined using universal kriging. Another question: how do you transform this variance back to the observation scale? -- Edzer
The two components of the regression-kriging model are not independent, hence you are doing a wrong thing if you are just summing them. You should use instead the universal kriging variance that is derived in gstat. The complete derivation of the Universal kriging variance is available in Cressie (1993; p.154), or even better Papritz and Stein (1999; p.94). See also pages 7-8 of our technical note: Hengl T., Heuvelink G.B.M. and Stein A., 2003. Comparison of kriging with external drift and regression-kriging. Technical report, International Institute for Geo-information Science and Earth Observation (ITC), Enschede, pp. 18. http://www.itc.nl/library/Papers_2003/misca/hengl_comparison.pdf Edzer is right, you can not back-transform prediction variance of the transformed variable (logits). However, you can standardize/normalize the UK variance by diving it with global variance (see e.g. http://dx.doi.org/10.1016/j.geoderma.2003.08.018), so that you can evaluate the success of prediction in relative terms (see also http://spatial-analyst.net/visualization.php). Tom Hengl http://spatial-analyst.net -----Original Message----- From: r-sig-geo-bounces at stat.math.ethz.ch [mailto:r-sig-geo-bounces at stat.math.ethz.ch] On Behalf Of Edzer Pebesma Sent: dinsdag 8 april 2008 20:50 To: David Maxwell (Cefas) Cc: r-sig-geo at stat.math.ethz.ch Subject: Re: [R-sig-Geo] question about regression kriging
David Maxwell (Cefas) wrote:
Hi, Tom and Thierry, Thank you for your advice, the lecture notes are very useful. We will try geoRglm
but for now regression kriging using the working residuals gives sensible answers even though there are some issues with using working residuals, i.e. not Normally distributed, occasional very large values and inv.logit(prediction type="link" + working residual) doesn't quite give the observed values.
Our final question about this is how to estimate standard errors for the regression kriging
predictions of the binary variable?
On the logit scale we are using rk prediction (s0) = glm prediction (s0) + kriged residual prediction (s0) for location s0 Is assuming independence of the two components adequate? var rk(s0) ~= var glm prediction (s0) + var kriged residual prediction (s0)
In principle, no. The extreme case is prediction at observation locations, where the correlation is -1 so that the final prediction variance becomes zero. I never got to looking how large the correlation is otherwise, but that shouldn't be hard to do in the linear case, as you can get the first and second separately, and also the combined using universal kriging. Another question: how do you transform this variance back to the observation scale? -- Edzer _______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Dear David, An other option would be to use sequential gaussian simulation. That will allow to calculate confidence intervals in the logit scale. These can be back-transformed into the original scale because the logit transformation is monotone. HTH, Thierry ------------------------------------------------------------------------ ---- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Oorspronkelijk bericht----- Van: r-sig-geo-bounces at stat.math.ethz.ch [mailto:r-sig-geo-bounces at stat.math.ethz.ch] Namens Edzer Pebesma Verzonden: dinsdag 8 april 2008 20:50 Aan: David Maxwell (Cefas) CC: r-sig-geo at stat.math.ethz.ch Onderwerp: Re: [R-sig-Geo] question about regression kriging
David Maxwell (Cefas) wrote:
Hi, Tom and Thierry, Thank you for your advice, the lecture notes are very
useful. We will try geoRglm but for now regression kriging using the working residuals gives sensible answers even though there are some issues with using working residuals, i.e. not Normally distributed, occasional very large values and inv.logit(prediction type="link" + working residual) doesn't quite give the observed values.
Our final question about this is how to estimate standard errors for
the regression kriging predictions of the binary variable?
On the logit scale we are using rk prediction (s0) = glm prediction (s0) + kriged residual prediction
(s0)
for location s0 Is assuming independence of the two components adequate? var rk(s0) ~= var glm prediction (s0) + var kriged residual
prediction (s0)
In principle, no. The extreme case is prediction at observation locations, where the correlation is -1 so that the final prediction variance becomes zero. I never got to looking how large the correlation is otherwise, but that shouldn't be hard to do in the linear case, as you can get the first and second separately, and also the combined using universal kriging. Another question: how do you transform this variance back to the observation scale? -- Edzer _______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Tom, I'm afraid things are harder than you sketch. In glm's, the parameter estimation is done using iteratively reweighted least squares, where the weights depend on a variance function that links the variance of observations to the mean. So, observations (residuals) are assumed to be unstationary, in principle, and because of the mean-dependency this changes over the iterations. The equations and references you mention afaik all assume a known, and fixed variogram, and one-step solutions, no iteration. Also, you falsly accuse me of claiming one cannot back-transform prediction variances. I did not claim this (I have seen suggestions on how to do this), I just asked how David would do this. -- Edzer
Tomislav Hengl wrote:
The two components of the regression-kriging model are not independent, hence you are doing a wrong thing if you are just summing them. You should use instead the universal kriging variance that is derived in gstat. The complete derivation of the Universal kriging variance is available in Cressie (1993; p.154), or even better Papritz and Stein (1999; p.94). See also pages 7-8 of our technical note: Hengl T., Heuvelink G.B.M. and Stein A., 2003. Comparison of kriging with external drift and regression-kriging. Technical report, International Institute for Geo-information Science and Earth Observation (ITC), Enschede, pp. 18. http://www.itc.nl/library/Papers_2003/misca/hengl_comparison.pdf Edzer is right, you can not back-transform prediction variance of the transformed variable (logits). However, you can standardize/normalize the UK variance by diving it with global variance (see e.g. http://dx.doi.org/10.1016/j.geoderma.2003.08.018), so that you can evaluate the success of prediction in relative terms (see also http://spatial-analyst.net/visualization.php). Tom Hengl http://spatial-analyst.net -----Original Message----- From: r-sig-geo-bounces at stat.math.ethz.ch [mailto:r-sig-geo-bounces at stat.math.ethz.ch] On Behalf Of Edzer Pebesma Sent: dinsdag 8 april 2008 20:50 To: David Maxwell (Cefas) Cc: r-sig-geo at stat.math.ethz.ch Subject: Re: [R-sig-Geo] question about regression kriging David Maxwell (Cefas) wrote:
Hi,
Tom and Thierry, Thank you for your advice, the lecture notes are very useful. We will try geoRglm
but for now regression kriging using the working residuals gives sensible answers even though there are some issues with using working residuals, i.e. not Normally distributed, occasional very large values and inv.logit(prediction type="link" + working residual) doesn't quite give the observed values.
Our final question about this is how to estimate standard errors for the regression kriging
predictions of the binary variable?
On the logit scale we are using
rk prediction (s0) = glm prediction (s0) + kriged residual prediction (s0)
for location s0
Is assuming independence of the two components adequate?
var rk(s0) ~= var glm prediction (s0) + var kriged residual prediction (s0)
In principle, no. The extreme case is prediction at observation locations, where the correlation is -1 so that the final prediction variance becomes zero. I never got to looking how large the correlation is otherwise, but that shouldn't be hard to do in the linear case, as you can get the first and second separately, and also the combined using universal kriging. Another question: how do you transform this variance back to the observation scale? -- Edzer
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo _______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Thierry, how would you setup a Gaussian simulation such that the end result is different from the case where these confidence intervals were directly computed from the kriged mean and variance and a Gaussian assumption on the errors? -- Edzer
ONKELINX, Thierry wrote:
Dear David, An other option would be to use sequential gaussian simulation. That will allow to calculate confidence intervals in the logit scale. These can be back-transformed into the original scale because the logit transformation is monotone. HTH, Thierry ------------------------------------------------------------------------ ---- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Oorspronkelijk bericht----- Van: r-sig-geo-bounces at stat.math.ethz.ch [mailto:r-sig-geo-bounces at stat.math.ethz.ch] Namens Edzer Pebesma Verzonden: dinsdag 8 april 2008 20:50 Aan: David Maxwell (Cefas) CC: r-sig-geo at stat.math.ethz.ch Onderwerp: Re: [R-sig-Geo] question about regression kriging David Maxwell (Cefas) wrote:
Hi,
Tom and Thierry, Thank you for your advice, the lecture notes are very
useful. We will try geoRglm but for now regression kriging using the working residuals gives sensible answers even though there are some issues with using working residuals, i.e. not Normally distributed, occasional very large values and inv.logit(prediction type="link" + working residual) doesn't quite give the observed values.
Our final question about this is how to estimate standard errors for
the regression kriging predictions of the binary variable?
On the logit scale we are using
rk prediction (s0) = glm prediction (s0) + kriged residual prediction
(s0)
for location s0
Is assuming independence of the two components adequate?
var rk(s0) ~= var glm prediction (s0) + var kriged residual
prediction (s0)
In principle, no. The extreme case is prediction at observation locations, where the correlation is -1 so that the final prediction variance becomes zero. I never got to looking how large the correlation is otherwise, but that shouldn't be hard to do in the linear case, as you can get the first and second separately, and also the combined using universal kriging. Another question: how do you transform this variance back to the observation scale? -- Edzer
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Edzer, One assumes a normal distribution of the predictions in every point when calculating a confidence interval based on the kriged mean and variance. AFAIK sequential Gaussian simulation doesn't have to yield normally distributed predictions per point. Therefore can the confidence intervals based on SGS differ from those based on the kriged mean and variance. Or am I missing something? Thierry ------------------------------------------------------------------------ ---- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Oorspronkelijk bericht----- Van: Edzer Pebesma [mailto:edzer.pebesma at uni-muenster.de] Verzonden: woensdag 9 april 2008 10:57 Aan: ONKELINX, Thierry CC: David Maxwell (Cefas); r-sig-geo at stat.math.ethz.ch Onderwerp: Re: [R-sig-Geo] question about regression kriging Thierry, how would you setup a Gaussian simulation such that the end result is different from the case where these confidence intervals were directly computed from the kriged mean and variance and a Gaussian assumption on the errors? -- Edzer
ONKELINX, Thierry wrote:
Dear David, An other option would be to use sequential gaussian simulation. That will allow to calculate confidence intervals in the logit scale. These can be back-transformed into the original scale because the logit transformation is monotone. HTH, Thierry
------------------------------------------------------------------------
---- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no
more
than asking him to perform a post-mortem examination: he may be able
to
say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does
not
ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Oorspronkelijk bericht----- Van: r-sig-geo-bounces at stat.math.ethz.ch [mailto:r-sig-geo-bounces at stat.math.ethz.ch] Namens Edzer Pebesma Verzonden: dinsdag 8 april 2008 20:50 Aan: David Maxwell (Cefas) CC: r-sig-geo at stat.math.ethz.ch Onderwerp: Re: [R-sig-Geo] question about regression kriging David Maxwell (Cefas) wrote:
Hi, Tom and Thierry, Thank you for your advice, the lecture notes are
very
useful. We will try geoRglm but for now regression kriging using the working residuals gives sensible answers even though there are some issues with using working residuals, i.e. not Normally distributed, occasional very large values and inv.logit(prediction type="link" + working residual) doesn't quite give the observed values.
Our final question about this is how to estimate standard errors for
the regression kriging predictions of the binary variable?
On the logit scale we are using rk prediction (s0) = glm prediction (s0) + kriged residual
prediction
(s0)
for location s0
Is assuming independence of the two components adequate?
var rk(s0) ~= var glm prediction (s0) + var kriged residual
prediction (s0)
In principle, no. The extreme case is prediction at observation locations, where the correlation is -1 so that the final prediction variance becomes zero. I never got to looking how large the
correlation
is otherwise, but that shouldn't be hard to do in the linear case, as you can get the first and second separately, and also the combined
using
universal kriging. Another question: how do you transform this variance back to the observation scale? -- Edzer
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Thierry, this is one of the frequent misconceptions in geostatistics. Sequential Gaussian simulation yields for each point a normal distribution, with mean and standard deviation equal (in the limit of very many simulations) to the kriged mean and standard deviation. -- Edzer
ONKELINX, Thierry wrote:
Edzer, One assumes a normal distribution of the predictions in every point when calculating a confidence interval based on the kriged mean and variance. AFAIK sequential Gaussian simulation doesn't have to yield normally distributed predictions per point. Therefore can the confidence intervals based on SGS differ from those based on the kriged mean and variance. Or am I missing something? Thierry ------------------------------------------------------------------------ ---- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Oorspronkelijk bericht----- Van: Edzer Pebesma [mailto:edzer.pebesma at uni-muenster.de] Verzonden: woensdag 9 april 2008 10:57 Aan: ONKELINX, Thierry CC: David Maxwell (Cefas); r-sig-geo at stat.math.ethz.ch Onderwerp: Re: [R-sig-Geo] question about regression kriging Thierry, how would you setup a Gaussian simulation such that the end result is different from the case where these confidence intervals were directly computed from the kriged mean and variance and a Gaussian assumption on the errors? -- Edzer ONKELINX, Thierry wrote:
Dear David,
An other option would be to use sequential gaussian simulation. That
will allow to calculate confidence intervals in the logit scale. These
can be back-transformed into the original scale because the logit
transformation is monotone.
HTH,
Thierry
------------------------------------------------------------------------
----
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be
To call in the statistician after the experiment is done may be no
more
than asking him to perform a post-mortem examination: he may be able
to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data.
~ Roger Brinner
The combination of some data and an aching desire for an answer does
not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
-----Oorspronkelijk bericht-----
Van: r-sig-geo-bounces at stat.math.ethz.ch
[mailto:r-sig-geo-bounces at stat.math.ethz.ch] Namens Edzer Pebesma
Verzonden: dinsdag 8 april 2008 20:50
Aan: David Maxwell (Cefas)
CC: r-sig-geo at stat.math.ethz.ch
Onderwerp: Re: [R-sig-Geo] question about regression kriging
David Maxwell (Cefas) wrote:
Hi,
Tom and Thierry, Thank you for your advice, the lecture notes are
very
useful. We will try geoRglm but for now regression kriging using the
working residuals gives sensible answers even though there are some
issues with using working residuals, i.e. not Normally distributed,
occasional very large values and inv.logit(prediction type="link" +
working residual) doesn't quite give the observed values.
Our final question about this is how to estimate standard errors for
the regression kriging predictions of the binary variable?
On the logit scale we are using
rk prediction (s0) = glm prediction (s0) + kriged residual
prediction
(s0)
for location s0
Is assuming independence of the two components adequate?
var rk(s0) ~= var glm prediction (s0) + var kriged residual
prediction (s0)
In principle, no. The extreme case is prediction at observation
locations, where the correlation is -1 so that the final prediction
variance becomes zero. I never got to looking how large the
correlation
is otherwise, but that shouldn't be hard to do in the linear case, as
you can get the first and second separately, and also the combined
using
universal kriging. Another question: how do you transform this variance back to the observation scale? -- Edzer
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Hi, Both Edzer's example (extreme case when prediction is at observation locations) and Tom's tech report (pg8 "note that the Eq.(19) looks very much like (17), except it will give slightly lower values") suggest that assuming independence of the two variances will give values that are too large, if so this is useful to know. For the follow-up question: how to present the prediction uncertainty? I would follow the usual approach for a binary glm, calculate a confidence interval on the logit scale, then back-transform the limits to the 0,1 scale. If space to present mapped outputs is limited I plan to calculate the width of the confidence interval on the 0,1 scale and map this. Thanks again, this list is an excellent catalyst for learning David david.maxwell at cefas.co.uk -----Original Message----- From: Edzer Pebesma [mailto:edzer.pebesma at uni-muenster.de] Sent: 09 April 2008 09:49 To: Tomislav Hengl Cc: David Maxwell (Cefas); r-sig-geo at stat.math.ethz.ch Subject: Re: [R-sig-Geo] question about regression kriging Tom, I'm afraid things are harder than you sketch. In glm's, the parameter estimation is done using iteratively reweighted least squares, where the weights depend on a variance function that links the variance of observations to the mean. So, observations (residuals) are assumed to be unstationary, in principle, and because of the mean-dependency this changes over the iterations. The equations and references you mention afaik all assume a known, and fixed variogram, and one-step solutions, no iteration. Also, you falsly accuse me of claiming one cannot back-transform prediction variances. I did not claim this (I have seen suggestions on how to do this), I just asked how David would do this. -- Edzer
Tomislav Hengl wrote:
The two components of the regression-kriging model are not independent, hence you are doing a wrong thing if you are just summing them. You should use instead the universal kriging variance that is derived in gstat. The complete derivation of the Universal kriging variance is available in Cressie (1993; p.154), or even better Papritz and Stein (1999; p.94). See also pages 7-8 of our technical note: Hengl T., Heuvelink G.B.M. and Stein A., 2003. Comparison of kriging with external drift and regression-kriging. Technical report, International Institute for Geo-information Science and Earth Observation (ITC), Enschede, pp. 18. http://www.itc.nl/library/Papers_2003/misca/hengl_comparison.pdf Edzer is right, you can not back-transform prediction variance of the transformed variable (logits). However, you can standardize/normalize the UK variance by diving it with global variance (see e.g. http://dx.doi.org/10.1016/j.geoderma.2003.08.018), so that you can evaluate the success of prediction in relative terms (see also http://spatial-analyst.net/visualization.php). Tom Hengl http://spatial-analyst.net -----Original Message----- From: r-sig-geo-bounces at stat.math.ethz.ch [mailto:r-sig-geo-bounces at stat.math.ethz.ch] On Behalf Of Edzer Pebesma Sent: dinsdag 8 april 2008 20:50 To: David Maxwell (Cefas) Cc: r-sig-geo at stat.math.ethz.ch Subject: Re: [R-sig-Geo] question about regression kriging David Maxwell (Cefas) wrote:
Hi,
Tom and Thierry, Thank you for your advice, the lecture notes are very useful. We will try geoRglm
but for now regression kriging using the working residuals gives sensible answers even though there are some issues with using working residuals, i.e. not Normally distributed, occasional very large values and inv.logit(prediction type="link" + working residual) doesn't quite give the observed values.
Our final question about this is how to estimate standard errors for the regression kriging
predictions of the binary variable?
On the logit scale we are using
rk prediction (s0) = glm prediction (s0) + kriged residual prediction (s0)
for location s0
Is assuming independence of the two components adequate?
var rk(s0) ~= var glm prediction (s0) + var kriged residual prediction (s0)
In principle, no. The extreme case is prediction at observation locations, where the correlation is -1 so that the final prediction variance becomes zero. I never got to looking how large the correlation is otherwise, but that shouldn't be hard to do in the linear case, as you can get the first and second separately, and also the combined using universal kriging. Another question: how do you transform this variance back to the observation scale? -- Edzer
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
***********************************************************************************
This email and any attachments are intended for the name...{{dropped:10}}
Edzer, I do not want to get into too much discussion (this is definitively not a place to discuss basic mathematical/statistical theories). My experience is - things are complicated when trying to run regression-kriging with 0/1 variables. In practice, I guess you can fit GLM model, then estimate the variogram for residuals. Imagine that we have a point map showing binary observations (0/1) and a set of raster maps PC1-PC5 (pc.comps). Then, we can make predictions using glms and your package gstat: # regression (GLM) modelling:
occurrences.absences$pr = ifelse(is.na(occurrences.absences$value), 0, 1) vtbin.lm = glm(pr~PC1+PC2+PC3+PC4+PC5, binomial(link=logit), occurrences.absences) vtbin.lm$residual = vtbin.lm$model$occurrence - vtbin.lm$fitted.values
# variogram modelling (original scale):
vt.rebin = variogram(vtbin.lm$residual~1, occurrences.absences) vt.rbvgm = fit.variogram(vt.rebin, vgm(nugget=0.01, model="Exp", range=40000, psill=0.03))
# final predictions (A) regression-kriging:
vt.bin.trend = predict(vtbin.lm, newdata=pc.comps, type="response")
vt.gres = gstat(id=c("occur"), formula=vtbin.lm$residual~1, data=occurrences.absences,
model=vt.rbvgm)
vt.rkbin = predict.gstat(vt.gres, pc.comps, nmax=100, beta=1, BLUE=FALSE) vt.rkbin$pred.bin = vt.bin.trend + vt.rkbin$occur.pred
Now you have the predictions that can exceed 0-1 range, and no estimate of the UK variance (this is the problem David has, I think). What we can do instead is to run UK in gstat on logits first, then back-transform to original 0-1 scale: # Estimate the binary variable at logit-scale (as an average of the output of GLM and original 0/1 value). This is not possible if values are equal to 0/1, so we have to 'smooth' them:
vtbin.lm$prs = (10*occurrences.absences$pr+vtbin.lm$fitted.values)/11
# Now the values can be back-transformed to logit scale
occurrences.absences$logits = log((vtbin.lm$prs)/(1-(vtbin.lm$prs)))
# now fit variogram for residuals (logits):
vt.relogit = variogram(vtbin.lm$residuals~1, occurrences.absences) vt.logitvgm = fit.variogram(vt.relogit, vgm(nugget=0, model="Exp", range=40000, psill=4))
# final predictions (B) UK in gstat:
vt.prs = gstat(id=c("occur"), formula=logits~PC1+PC2+PC3+PC4+PC5, data=occurrences.absences,
model=vt.logitvgm)
vt.rkprs = predict.gstat(vt.prs, pc.comps, nmax=100, beta=1, BLUE=FALSE)
# Back-transform (-20 to 20 values) to original scale:
vt.rkprs$pr = exp(vt.rkprs$occur.pred)/(1+exp(vt.rkprs$occur.pred))
Obviously, predictions with (B) are more attractive than predictions with (A), because the values are within the 0-1 range and we also have the UK variance (but this is hard to interpret). Then, one can standardize the prediction variance using the global variance so that you can do some interpretation:
vt.rkprs$nvar = vt.rkprs$occur.var/var(occurrences.absences$logits)
Of course, (B) is only a short-cut solution. One really needs to develop a sound methodology to solve such problems (we at least offer a detailed discussion in http://dx.doi.org/10.1016/j.geoderma.2007.04.022). PS: How do you back-transform the GLM prediction variance to original scale (I was not aware of this, apologies)? A reference would do. Tom -----Original Message----- From: Edzer Pebesma [mailto:edzer.pebesma at uni-muenster.de] Sent: woensdag 9 april 2008 10:49 To: Tomislav Hengl Cc: 'David Maxwell (Cefas)'; r-sig-geo at stat.math.ethz.ch Subject: Re: [R-sig-Geo] question about regression kriging Tom, I'm afraid things are harder than you sketch. In glm's, the parameter estimation is done using iteratively reweighted least squares, where the weights depend on a variance function that links the variance of observations to the mean. So, observations (residuals) are assumed to be unstationary, in principle, and because of the mean-dependency this changes over the iterations. The equations and references you mention afaik all assume a known, and fixed variogram, and one-step solutions, no iteration. Also, you falsly accuse me of claiming one cannot back-transform prediction variances. I did not claim this (I have seen suggestions on how to do this), I just asked how David would do this. -- Edzer
Tomislav Hengl wrote:
The two components of the regression-kriging model are not independent, hence you are doing a
wrong
thing if you are just summing them. You should use instead the universal kriging variance that is derived in gstat. The complete derivation of the Universal kriging variance is available in
Cressie
(1993; p.154), or even better Papritz and Stein (1999; p.94). See also pages 7-8 of our technical note: Hengl T., Heuvelink G.B.M. and Stein A., 2003. Comparison of kriging with external drift and regression-kriging. Technical report, International Institute for Geo-information Science and
Earth
Observation (ITC), Enschede, pp. 18. http://www.itc.nl/library/Papers_2003/misca/hengl_comparison.pdf Edzer is right, you can not back-transform prediction variance of the transformed variable
(logits).
However, you can standardize/normalize the UK variance by diving it with global variance (see e.g. http://dx.doi.org/10.1016/j.geoderma.2003.08.018), so that you can evaluate the success of prediction in relative terms (see also http://spatial-analyst.net/visualization.php). Tom Hengl http://spatial-analyst.net -----Original Message----- From: r-sig-geo-bounces at stat.math.ethz.ch [mailto:r-sig-geo-bounces at stat.math.ethz.ch] On Behalf
Of
Edzer Pebesma Sent: dinsdag 8 april 2008 20:50 To: David Maxwell (Cefas) Cc: r-sig-geo at stat.math.ethz.ch Subject: Re: [R-sig-Geo] question about regression kriging David Maxwell (Cefas) wrote:
Hi, Tom and Thierry, Thank you for your advice, the lecture notes are very useful. We will try
geoRglm
but for now regression kriging using the working residuals gives sensible answers even though
there
are some issues with using working residuals, i.e. not Normally distributed, occasional very large values and inv.logit(prediction type="link" + working residual) doesn't quite give the observed values.
Our final question about this is how to estimate standard errors for the regression kriging
predictions of the binary variable?
On the logit scale we are using
rk prediction (s0) = glm prediction (s0) + kriged residual prediction (s0)
for location s0
Is assuming independence of the two components adequate?
var rk(s0) ~= var glm prediction (s0) + var kriged residual prediction (s0)
In principle, no. The extreme case is prediction at observation locations, where the correlation is -1 so that the final prediction variance becomes zero. I never got to looking how large the correlation is otherwise, but that shouldn't be hard to do in the linear case, as you can get the first and second separately, and also the combined using universal kriging. Another question: how do you transform this variance back to the observation scale? -- Edzer
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo _______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Tomislav Hengl wrote:
PS: How do you back-transform the GLM prediction variance to original scale (I was not aware of this, apologies)? A reference would do.
1. The white book, "Statistical Models in S" 2. The source of predict.glm (not for the faint of heart) 3. McCullogh & Nelder (I guess it's in there, but my copy is still in a box!) -- Edzer
Sorry, I meant "UK variance of the interpolated logits". I guess the solution is to either: (A) 1. Run 100 SG simulations using logits; 2. Back-transform the values to 0-1 scale; 3. Determine the variance per grid node (now in 0-1 scale). Running simulations could also be rather time-consuming (I often give up running SG simulations in gstat with multiple raster maps as predictors -- memory limit problems, too time-consuming). (B) 1. Determine the upper and lower 1-s confidence intervals -- 2 maps (in logit scale); 2. Back-transform the maps and derive the difference in 0-1 scale (estimation error); But I guess that a method to directly back-transform the UK variance from logit scale to 0-1 scale does not exist. Tom Hengl -----Original Message----- From: Edzer Pebesma [mailto:edzer.pebesma at uni-muenster.de] Sent: woensdag 9 april 2008 14:48 To: Tomislav Hengl Cc: r-sig-geo at stat.math.ethz.ch Subject: Re: [R-sig-Geo] question about regression kriging
Tomislav Hengl wrote:
PS: How do you back-transform the GLM prediction variance to original scale (I was not aware of this, apologies)? A reference would do.
1. The white book, "Statistical Models in S" 2. The source of predict.glm (not for the faint of heart) 3. McCullogh & Nelder (I guess it's in there, but my copy is still in a box!) -- Edzer