[DKIM] Re: FW: Spatial sampling design using information more than spatial coordinates [SEC=UNCLASSIFIED]

Tom #1,

After discussion, the requirements for the design have been simplified to:
	1) stratify the survey area based on both spatial coordinates and bathymetry zones, and
	2) then pinpoint the locations for 100 samples.
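
The two steps above can be sketched in base R. This is only a rough illustration under stated assumptions, not the actual survey design: the grid, the depth surface, the block and zone break points and the proportional allocation rule are all hypothetical stand-ins.

```r
# Synthetic stand-in for the survey area (hypothetical data, not the real region)
set.seed(42)
grid <- expand.grid(x = seq(0, 99, by = 1), y = seq(0, 49, by = 1))
grid$depth <- 20 + 1.5 * grid$x + rnorm(nrow(grid), sd = 5)  # made-up bathymetry (m)

# Step 1: stratum = spatial block x bathymetry zone (break points are illustrative)
grid$block <- cut(grid$y, breaks = c(-Inf, 24.5, Inf), labels = c("south", "north"))
grid$zone <- cut(grid$depth, breaks = c(-Inf, 50, 100, Inf),
                 labels = c("shallow", "mid", "deep"))
grid$stratum <- droplevels(interaction(grid$block, grid$zone))

# Step 2a: allocate 100 samples in proportion to stratum size,
# using largest-remainder rounding so the total is exactly 100
n_total <- 100
sizes <- vapply(split(seq_len(nrow(grid)), grid$stratum), length, integer(1))
quota <- n_total * sizes / sum(sizes)
alloc <- floor(quota)
extra <- n_total - sum(alloc)
if (extra > 0) {
  top <- order(quota - alloc, decreasing = TRUE)[seq_len(extra)]
  alloc[top] <- alloc[top] + 1
}

# Step 2b: simple random sample of grid cells within each stratum
picked <- unlist(lapply(names(alloc), function(s) {
  rows <- which(grid$stratum == s)
  rows[sample.int(length(rows), alloc[[s]])]
}))
samples <- grid[picked, c("x", "y", "depth", "stratum")]
```

Proportional allocation is only one option; equal allocation per stratum, or a minimum number of samples in small strata, may suit a baseline survey better.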

The purpose of this survey, to my understanding, is to collect baseline information on seabed sediment and water properties, as no information is currently available for these properties in the region to be surveyed.

I will try what you suggested and see if it makes our client happy. Thanks a lot for your kind help!

BTW, the function spsample.prob does not seem to be available in the current GSIF (I am using R version 3.1.0 and have just updated GSIF). Where can I get it, or should I install the latest version of R?
Regards,
Jin



-----Original Message-----
From: Tomislav Hengl [mailto:hengl at spatial-analyst.net] 
Sent: Friday, 10 April 2015 8:09 PM
To: Li Jin; tephilippi at gmail.com
Cc: r-sig-geo at r-project.org; Ichsani Wheeler
Subject: Re: [DKIM] Re: [R-sig-Geo] FW: Spatial sampling design using information more than spatial coordinates [SEC=UNCLASSIFIED]


Jin,

When it comes to stratification prior to any model building (hence no prior predictions) I think that a robust way to stratify an area would be to use a combination of PCA and unsupervised fuzzy k-means. Here is an example:

R> library(GSIF)
R> library(plotKML)
R> library(sp)
R> data(eberg_grid)
R> gridded(eberg_grid) <- ~x+y
R> proj4string(eberg_grid) <- CRS("+init=epsg:31467")
R> formulaString <- ~ PRMGEO6+DEMSRT6+TWISRT6+TIRAST6
R> eberg_spc <- spc(eberg_grid, formulaString)
Converting PRMGEO6 to indicators...
Converting covariates to principal components...
R> kmeans.eberg <- kmeans(eberg_spc@predicted@data, 4)
R> eberg_grid$cluster.4 <- as.factor(kmeans.eberg$cluster) 
R> spplot(eberg_grid["cluster.4"], col.regions=rainbow(4))
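
For anyone without GSIF installed, the same idea (principal components first, then k-means on the component scores) can be sketched in base R. This is a rough stand-in under stated assumptions: the covariates below are simulated and their names are hypothetical, and prcomp() is plain PCA, not GSIF's spc(), which also converts factors to indicators. Because kmeans() starts from random centres, cluster labels can change between runs; set.seed() and nstart are added here to make the result repeatable.

```r
set.seed(1)

# Hypothetical covariates for 500 grid cells (simulated, for illustration only)
covs <- data.frame(
  depth = runif(500, min = 10, max = 200),
  slope = rexp(500, rate = 0.5),
  backscatter = rnorm(500, mean = -20, sd = 4)
)

# Standardise the covariates and rotate them to principal components
pca <- prcomp(covs, center = TRUE, scale. = TRUE)
scores <- pca$x

# Hard k-means with 4 clusters on the component scores;
# nstart = 25 keeps the best of 25 random initialisations
km <- kmeans(scores, centers = 4, nstart = 25)
covs$cluster.4 <- as.factor(km$cluster)

# Number of cells per cluster
table(covs$cluster.4)
```

Note that this is hard k-means, as in the example above; a fuzzy variant would need a different function.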

I had problems following the procedure you described (a combination of geographical clustering first, then sampling in feature space). Maybe you only need the clhs (http://www.inside-r.org/packages/cran/clhs/docs) or lhs package, since Latin hypercube sampling is a standard statistical method for sampling in feature space? It all depends on the purpose of the sampling: validation, model building (maximum representation)...

HTH

Tom #1
On 9-4-2015 3:49, Jin.Li at ga.gov.au wrote: