
[DKIM] Re: FW: Spatial sampling design using information more than spatial coordinates [SEC=UNCLASSIFIED]

3 messages · Tomislav Hengl, Li Jin

#
Hi Tom 2  :-),



To be specific, we first need to do the spatial stratification, dividing the survey area into n equal-area strata. The function stratify with equalArea = TRUE in spcosa meets this need well.



And then within each stratum, we need to do further stratification based on a raster layer of 'elevation', which is continuous and needs to be converted into 3 zones. We need to take m samples within each stratum, and these samples need to be spread evenly over the zones. That is, we need to use the area of each zone as a weight to decide how many samples to allocate to each zone, and then do random sampling within each zone.
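A minimal base-R sketch of the area-weighted allocation described above, for a single stratum. The depth values, the zone break points and the helper `allocate()` are all assumptions made up for illustration; cell counts stand in for zone area, which only holds if all raster cells have equal area.

```r
# Hypothetical stand-in for one stratum: 1,000 equal-area raster cells with depths.
set.seed(1)
depth <- runif(1000, min = 10, max = 200)

# Convert the continuous depth layer into 3 zones (break points are assumptions).
zone <- cut(depth, breaks = c(0, 50, 120, Inf),
            labels = c("shallow", "mid", "deep"))

# Largest-remainder allocation of m samples, weighted by zone area.
allocate <- function(area, m) {
  quota <- m * area / sum(area)
  n <- floor(quota)
  short <- m - sum(n)
  if (short > 0) {
    top <- order(quota - n, decreasing = TRUE)[seq_len(short)]
    n[top] <- n[top] + 1
  }
  n
}

m <- 4
n_zone <- allocate(as.vector(table(zone)), m)
names(n_zone) <- levels(zone)

# Simple random sampling of cell indices within each zone.
picked <- unlist(lapply(levels(zone), function(z) {
  cells <- which(zone == z)
  cells[sample.int(length(cells), n_zone[[z]])]
}))
```

With m < 3, at least one of the 3 zones necessarily receives no sample under any such allocation.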



A further challenge is that they plan to take 1 to 4 samples a day over 25 days, so we need to provide four designs with 25, 50, 75 and 100 samples respectively. The samples of each smaller design need to be nested in the samples of the larger designs for operational reasons, as the survey also has to meet other tasks and accommodate weather conditions at sea. As you can now tell, the 'elevation' is bathymetry, i.e. the depth of water. We plan to let n = 25 and m = 1 to 4. But it is hard to stratify if a stratum contains 3 zones and m < 3.
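One possible way to get nested designs, sketched in base R under stated assumptions: draw 4 locations per stratum in a random order, and let design k keep the first k draws of every stratum. The candidate cells and the column names `id`, `stratum` and `visit` are hypothetical, and the depth zones are ignored here.

```r
set.seed(42)
n_strata <- 25
cells_per_stratum <- 200  # hypothetical number of candidate cells per stratum

# Hypothetical candidate cells: an ID and the stratum each cell belongs to.
cells <- data.frame(
  id      = seq_len(n_strata * cells_per_stratum),
  stratum = rep(seq_len(n_strata), each = cells_per_stratum)
)

# Within each stratum, draw 4 cells without replacement; 'visit' is the draw order.
draws <- do.call(rbind, lapply(split(cells, cells$stratum), function(s) {
  s <- s[sample.int(nrow(s), 4), ]
  s$visit <- 1:4
  s
}))

# Design k keeps the first k draws per stratum, so smaller designs are
# subsets of larger ones by construction.
designs <- lapply(1:4, function(k) draws[draws$visit <= k, ])
sapply(designs, nrow)
```

The four designs then hold 25, 50, 75 and 100 locations, and the survey can stop after any full round of visits.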



Hope this clarifies things a bit. Any further suggestions are appreciated!



Jin



-----Original Message-----
From: R-sig-Geo [mailto:r-sig-geo-bounces at r-project.org] On Behalf Of Tom Philippi
Sent: Thursday, 9 April 2015 9:32 AM
To: Tomislav Hengl
Cc: r-sig-geo
Subject: [DKIM] Re: [R-sig-Geo] FW: Spatial sampling design using information more than spatial coordinates [SEC=UNCLASSIFIED]



Jin--

Could you provide more information on what the needs of your client are that cannot be handled by either stratification or unequal probability based on your one or more covariates in spsurvey?  Also, is your sample frame points, polygons, or raster?  I assume your covariates are raster (e.g., DEM for elevation) and polygons (most soil maps)?



Tom 2





On Wed, Apr 8, 2015 at 6:39 AM, Tomislav Hengl <hengl at spatial-analyst.net> wrote:

_______________________________________________

R-sig-Geo mailing list

R-sig-Geo at r-project.org

https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Geoscience Australia Disclaimer: This e-mail (and files transmitted with it) is intended only for the person or entity to which it is addressed. If you are not the intended recipient, then you have received this e-mail by mistake and any use, dissemination, forwarding, printing or copying of this e-mail and its file attachments is prohibited. The security of emails transmitted cannot be guaranteed; by forwarding or replying to this email, you acknowledge and accept these risks.
-------------------------------------------------------------------------------------------------------------------------
1 day later
#
Jin,

When it comes to stratification prior to any model building (hence no 
prior predictions) I think that a robust way to stratify an area would 
be to use a combination of PCA and unsupervised fuzzy k-means. Here is 
an example:

R> library(GSIF)
R> library(plotKML)
R> library(sp)
R> data(eberg_grid)
R> gridded(eberg_grid) <- ~x+y
R> proj4string(eberg_grid) <- CRS("+init=epsg:31467")
R> formulaString <- ~ PRMGEO6+DEMSRT6+TWISRT6+TIRAST6
R> eberg_spc <- spc(eberg_grid, formulaString)
Converting PRMGEO6 to indicators...
Converting covariates to principal components...
R> kmeans.eberg <- kmeans(eberg_spc@predicted@data, 4)
R> eberg_grid$cluster.4 <- as.factor(kmeans.eberg$cluster)
R> spplot(eberg_grid["cluster.4"], col.regions=rainbow(4))
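
For readers without GSIF at hand, the same PCA-then-cluster idea can be sketched in base R. The covariate table below is simulated and its column names are made up; note that `kmeans()` performs hard (not fuzzy) clustering, as in the example above.

```r
set.seed(123)

# Hypothetical covariates for 500 grid cells (simulated, for illustration only).
covs <- data.frame(
  elev = rnorm(500, mean = 300, sd = 50),
  twi  = rnorm(500, mean = 10,  sd = 2),
  temp = rnorm(500, mean = 15,  sd = 3)
)

# Principal components of the centred and scaled covariates.
pcs <- prcomp(covs, center = TRUE, scale. = TRUE)$x

# Unsupervised k-means on the component scores, 4 clusters.
km <- kmeans(pcs, centers = 4, nstart = 10)
covs$cluster.4 <- as.factor(km$cluster)
table(covs$cluster.4)
```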

I had problems following the procedure you described (a combination of 
geographical clustering first, then sampling in feature space). Maybe 
you only need to use the clhs 
(http://www.inside-r.org/packages/cran/clhs/docs) or lhs package, because 
Latin hypercube sampling is a standard statistical method for sampling in 
feature space? It all depends on the purpose of the sampling: 
validation, model building (maximum representation)...

HTH

Tom #1
On 9-4-2015 3:49, Jin.Li at ga.gov.au wrote:
2 days later
#
Tom #1,

After discussion, the needs of the design now have been simplified as:
	1) stratify the survey area based on both spatial coordinates and bathymetry zones and 
	2) then pinpoint the locations for 100 samples. 

The purpose of this survey, to my understanding, is to collect baseline information on seabed sediment and water properties, as no information on these properties is currently available for the region to be surveyed.

I will try what you suggested and see if it makes our client happy. Thank you a lot for your kind help!

BTW, the function spsample.prob does not seem to be available in the current GSIF (I am using R version 3.1.0 and have just updated GSIF). Where could I get it, or should I install the latest version of R?

Regards,
Jin



-----Original Message-----
From: Tomislav Hengl [mailto:hengl at spatial-analyst.net] 
Sent: Friday, 10 April 2015 8:09 PM
To: Li Jin; tephilippi at gmail.com
Cc: r-sig-geo at r-project.org; Ichsani Wheeler
Subject: Re: [DKIM] Re: [R-sig-Geo] FW: Spatial sampling design using information more than spatial coordinates [SEC=UNCLASSIFIED]


Jin,

When it comes to stratification prior to any model building (hence no prior predictions) I think that a robust way to stratify an area would be to use a combination of PCA and unsupervised fuzzy k-means. Here is an example:

R> library(GSIF)
R> library(plotKML)
R> library(sp)
R> data(eberg_grid)
R> gridded(eberg_grid) <- ~x+y
R> proj4string(eberg_grid) <- CRS("+init=epsg:31467")
R> formulaString <- ~ PRMGEO6+DEMSRT6+TWISRT6+TIRAST6
R> eberg_spc <- spc(eberg_grid, formulaString)
Converting PRMGEO6 to indicators...
Converting covariates to principal components...
R> kmeans.eberg <- kmeans(eberg_spc@predicted@data, 4)
R> eberg_grid$cluster.4 <- as.factor(kmeans.eberg$cluster) 
R> spplot(eberg_grid["cluster.4"], col.regions=rainbow(4))

I had problems following the procedure you described (a combination of geographical clustering first, then sampling in feature space). Maybe you only need to use the clhs
(http://www.inside-r.org/packages/cran/clhs/docs) or lhs package, because Latin hypercube sampling is a standard statistical method for sampling in feature space? It all depends on the purpose of the sampling: validation, model building (maximum representation)...

HTH

Tom #1
On 9-4-2015 3:49, Jin.Li at ga.gov.au wrote: