
Problem with gstat variogram estimation

Edzer,

Thank you.  I am now looking at REML since I have small data sets and I really need to use the duplicate data.

In the follow-on I will need to accommodate adaptive sampling weights.  The first-stage selection is systematic (with some minor modifications).  The second stage samples areas adjacent to any primary location that exceeds a threshold for one of the variables of interest.  Another reason one might weight, for example, would be to reflect the number of (physical) component samples in a composite soil sample; this may vary from laboratory result to laboratory result.  Is there any way to accommodate this in gstat?

Thanks,
John

John H. Carson Jr., PhD
Senior Statistician
Applied Sciences & Engineering 
Shaw Environmental & Infrastructure
16406 US Rte 224 East
Findlay, OH 45840
Phone 419-425-6156
Fax 419-425-6085
john.carson at shawgrp.com
 
http://www.shawgrp.com/
Shaw(tm) a world of Solutions(tm)
 
 

-----Original Message-----
From: Edzer Pebesma [mailto:edzer.pebesma at uni-muenster.de] 
Sent: Monday, October 12, 2009 10:38 AM
To: Carson, John
Cc: r-sig-geo at stat.math.ethz.ch
Subject: Re: [R-sig-Geo] Problem with gstat variogram estimation

John, thanks for sharing this with r-sig-geo.

As Thierry mentioned, the default model fitting procedure (fit.variogram
in package gstat) uses weighted least squares, with weights proportional
to N_h/(h^2). This explains why the first lag gets so much weight.

For pure nugget models this of course makes little sense; for other
models it often does. The fit.method argument gives you somewhat more
control. Give it value 1 to use N_h weights; give it value 6 for
unweighted (ordinary least squares) fitting (I agree that this
information should be in the fit.variogram documentation). The SSErr
values will not be comparable across different weighting schemes, as
you might expect.
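[A minimal sketch of the fit.method options described above, using the meuse
example data shipped with sp/gstat; the initial vgm() parameter values are
illustrative guesses, not recommendations:]

```r
library(sp)
library(gstat)

data(meuse)
coordinates(meuse) <- ~ x + y

# sample variogram of log(zinc)
v <- variogram(log(zinc) ~ 1, meuse)

# initial model: spherical with guessed psill/range/nugget
m <- vgm(psill = 0.5, model = "Sph", range = 800, nugget = 0.1)

fit.variogram(v, m)                  # default fit.method = 7: weights N_h / h^2
fit.variogram(v, m, fit.method = 1)  # weights N_h
fit.variogram(v, m, fit.method = 6)  # unweighted (OLS)
```

Comparing the three fitted models shows how the weighting scheme pulls the
fit toward (or away from) the short-range lags; the SSErr attribute attached
to each result is computed under its own weights, which is why the values
cannot be compared across schemes.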
--
Edzer