Error in simulation R-code
On Wed, 15 Jul 2009, Steve Hong wrote:
First, I apologize to all of you for the annoying messages. Since I did not receive the mail I sent, I thought there might have been some errors.
OK. Note that there may be latency issues - occasionally, it takes much longer for the mail servers to process submitted postings. You can also check in gmane, nabble, or the list archives to see whether postings have got through, for this list on:
http://n2.nabble.com/R-sig-geo-f2731867.html
http://news.gmane.org/gmane.comp.lang.r.geo
https://stat.ethz.ch/pipermail/r-sig-geo/
On Wed, Jul 15, 2009 at 5:13 PM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
On Wed, 15 Jul 2009, Steve Hong wrote:
Dear List,
I have a question about simulation code. Here are the code and error message.
sim.sp <- function(data,CM,n,N)
+ {
+ C <- matrix(rep(NA,N),ncol=1)
+ for(i in 1:N)
+ {
+ j <- n
+ xx <- which(colSums(CM[j,])==1)
+ V <- names(xx)
+ V <- paste(V, collapse="+")
+ V <- paste("SBA~", V)
+ rd <- round(nrow(data)*(2/3))
+ d <- sample(seq(1:nrow(data)),rd)
+ dat1 <- data[d,]
+ dat2 <- data[-d,]
+ crd <- cbind(dat1$Longitude,dat1$Latitude)
+ dist80 <- dnearneigh(crd,0,100,longlat=F)
+ dist80sw <- nb2listw(dist80, style="B")
+ fm <- errorsarlm(as.formula(V), data=dat1, listw=dist80sw)
+ pred <- predict(fm,dat2)
+ C[i,1] <- cor(dat2$SBA,pred)
+ out <- cbind(C)
+ }
+ colMeans(out)
+ }
sim.sp(df2007.5k.s2,CM,1,1000)
Error in nb2listw(dist80, style = "B") : Empty neighbour sets found
I guess it means that the random selection process leaves some observations without neighbours. Is there any way to proceed with the simulation, for example by using only the observations that have neighbour sets? Any suggestion will be appreciated!!
This is the fourth separate posting of this message, in addition cross-posted to R-help. Please do consider the bandwidth costs and the negative consequences of cross-posting. If you can't wait for someone else to use their time to solve your (simple) problem, perhaps a little reflection is called for?

Have you read ?dnearneigh and ?nb2listw? The examples in ?dnearneigh show how to use the maximum 1st nearest neighbour distances to set the d2= argument to a value ensuring that each observation has at least one neighbour.

Given your use of Longitude and Latitude as coordinate names, are you sure that longlat= should be FALSE, or are the names simply careless? This could affect what you think 100 units is - if coordinates in m, it is 100m, if in US surveyors feet, then 100ft, if degrees, 100 degrees ... which affects the d2= value. dist80 looks an odd name for d2=100 as well, doesn't it?

You check for zero neighbour counts by:
The names dist80, Longitude, and Latitude are careless. Please ignore that. Actually, Longitude and Latitude are UTM values (in that case longlat=F, right?). I think '100' is 100 km. It was OK when I tried to get predicted values without the simulation.
OK. Please check the ranges of your coordinate vectors; if the y vector is roughly in the single digit thousands, you are that number of km from the Equator, if in single digit millions, they are metres.
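The two checks discussed so far (what units the coordinates are in, and whether a 100-unit threshold leaves any point without a neighbour) can be sketched in base R before any spdep call. This is only an illustration under stated assumptions: the coordinates below are made up, planar, and in the same units as the threshold; `crd`, `D`, `nn1` and the other names are hypothetical and not taken from the poster's data.

```r
# Hypothetical projected coordinates (made-up values, not the poster's data)
crd <- cbind(x = c(0, 30, 60, 90, 400),
             y = c(0, 40, 80, 120, 500))

# Coordinate ranges give a hint about the units before choosing a threshold
coord_ranges <- apply(crd, 2, range)

# Full Euclidean distance matrix; set self-distances to Inf so they are ignored
D <- as.matrix(dist(crd))
diag(D) <- Inf

# Distance from each point to its first nearest neighbour
nn1 <- apply(D, 1, min)
max_nn1 <- max(nn1)

# Neighbour counts under the threshold used in the thread (100 units)
n_nb_100 <- rowSums(D <= 100)
isolated <- which(n_nb_100 == 0)

# Neighbour counts when the threshold is the largest first-neighbour distance
n_nb_safe <- rowSums(D <= max_nn1)
```

Here `isolated` lists the points that would produce empty neighbour sets at a threshold of 100, while using `max_nn1` as the upper distance bound gives every point at least one neighbour, which is the idea behind the ?dnearneigh examples mentioned earlier in the thread.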
any(card(nb) == 0)
In ?nb2listw, you find the zero.policy= argument, which has a default value of FALSE, but which you can set to TRUE, so avoiding the error in your simulation if used consistently in subsequent function calls to functions taking that argument, like errorsarlm(). So:
zp <- any(card(nb) == 0)
..., zero.policy=zp, ...
Where should I add this code? Can I add the first line (zp <- ...) in front of nb2listw? The second one should go in nb2listw(...., zero.policy=zp). Is that correct?
Yes, that's right. Add the ..., zero.policy=zp, ... to all the subsequent commands using the listw object too - the object does not record the fact that its zero.policy was set to TRUE.
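A minimal sketch of where the two pieces go, assuming the spdep package is installed. The coordinates are made up so that one point has no neighbour within 100 units. The flag is set to TRUE exactly when an empty neighbour set is present, since that is the case in which nb2listw() would otherwise stop with the error shown above.

```r
library(spdep)  # provides dnearneigh(), card(), nb2listw()

# Made-up planar coordinates: four points close together, one far away
crd <- cbind(c(0, 30, 60, 90, 400), c(0, 40, 80, 120, 500))

# Distance-based neighbours between 0 and 100 units
nb <- dnearneigh(crd, 0, 100, longlat = FALSE)

# TRUE when at least one observation has no neighbours
zp <- any(card(nb) == 0)

# Binary weights; zero.policy = zp lets the empty neighbour set through
lw <- nb2listw(nb, style = "B", zero.policy = zp)
```

The same `zero.policy = zp` argument would then be passed to later calls that take the listw object, such as errorsarlm() (provided by the spatialreg package in current releases).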
might be OK, unless you also need to check that there are any neighbours at all. I'm not at all sure that this simulation is going to get you anywhere sensible - why are you trying to do it? I do hope you are setting the seed before running it, otherwise you won't know what is going wrong in the situations you choose. You are posting from a gmail address, and so do not give any affiliation in your signature. Is this a homework problem?
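On setting the seed: a small base R sketch of how a fixed seed makes the 2/3 random split repeatable, so that a replicate which fails can be reproduced and inspected. The data frame `dat`, the helper `split_once()` and the seed value are hypothetical and only serve the illustration.

```r
# Hypothetical data frame standing in for the poster's data
dat <- data.frame(id = 1:30, SBA = (1:30) / 10)

# Draw one random 2/3 training sample and keep the rest for testing
split_once <- function(data, frac = 2/3) {
  rd <- round(nrow(data) * frac)
  d <- sample(seq_len(nrow(data)), rd)
  list(train = data[d, ], test = data[-d, ])
}

set.seed(20090715)  # arbitrary seed value
first <- split_once(dat)

set.seed(20090715)  # same seed, so the same rows are drawn again
second <- split_once(dat)

# The two splits contain the same rows in the same order
same_split <- identical(first$train$id, second$train$id)
```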
Fortunately, this is NOT a homework problem. I am a post-graduate. I changed my email address from my work address (an educational institute) to a gmail address, since I wanted to separate R-related emails from my work email. I am subscribed to the R-help, R-sig-geo, R-mixed, and R-ecology lists.
OK. When there are no indications of why work is being done, it has sometimes turned out to be a graduate (or undergraduate) who has been tasked by a clueless supervisor, who has then abandoned the unfortunate person with a tight deadline and no helpful advice.

I'm still not sure that just reporting the correlations between the out-of-sample predictions and observed values gets you anywhere useful, without knowing how the repeated 2/3 samples affect the autoregressive coefficient, which in turn affects the model coefficients. That was more what I was lacking for understanding. In addition, we don't know whether the observations are very clustered in space, which may lead to very dense weights, and poorer performance by the spatial model, especially if those weights were not those that generated the data.

Roger
Again, you only took 40 minutes in sending 4 copies of the same message to two lists. Replying has taken about the same time. Reading two help pages would have taken you much less.
Again, I sincerely apologize for sending the same message several times. It is totally my mistake. I thought it had not got through since I could not see it in my gmail account. If I post from a gmail address, can't I see my own postings? Thank you!!
Roger Bivand
Thank you!
Steve Hong
-- Roger Bivand Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43 e-mail: Roger.Bivand at nhh.no