Different results for same model in GEODA and R?
On Thu, 17 Jan 2008, Ram Pandit wrote:
Dear all, I have found different results while using GEODA (spatial lag and error models) and spdep (lagsarlm and errorsarlm) models for the same data and same weight matrix.
Unless you make your complete data set available, quote the complete, exact R code (history() of the whole session), and add screen dumps (PrtScr) of GeoDa for all the steps taken, your claim is worthless, because nobody can reproduce it. All such claims so far (including those regarding SpaceStat) over almost 10 years have turned out to be user misunderstandings or mistakes, and have been settled in threads on this list and the openspace list.

Please make a bundle of all the files needed to reproduce the problem and put them on a web server, indicating where they can be picked up (or, if they are sensitive, attach them to an email to me off-list).

Step 1. See if a regular linear model can be reproduced; if not, you do not have the same data in both systems.
Step 2. See if the summary numbers for the neighbours agree; if not, the GAL files are not being represented in the same way.
Step 3. See if the weights agree (not so easy, but using a different variable with no autocorrelation, say a random variate, try a univariate Moran's I).

Question: do you have any missing values, and if so, how are they represented?

I replied to this questioner off-list earlier without receiving an acknowledgement. The questioner seems to be in a hurry, and still has not been polite enough to give an affiliation. Please indicate your status (Professor of statistics, master's student in real estate, ...); it does help those who answer grasp why you might not understand.

Seriously, there is an enormous difference between the pleasure of answering a well-constructed question with a reproducible example and the frustration of trying to arrest unsubstantiated and non-reproducible "reports" like this, which in my experience are very likely to be user error, and which certainly could have been checked more thoroughly.

Roger
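The three checks above, plus the missing-value question, can be sketched in R roughly as follows. This is only an illustration under stated assumptions: the real data and GAL file are not available, so a toy 5 x 5 rook grid built with cell2nb() stands in for the neighbour list, and the data frame gdpgi with columns Y, X1 and X2 is simulated (the names are borrowed from the post). In a real session the neighbour list would come from reading the GeoDa-created GAL file instead.

```r
# Sketch only: simulated data and a toy grid stand in for the real files.
library(spdep)

set.seed(1)

# Assumption: a 5 x 5 rook-contiguity grid replaces the GeoDa GAL file.
nb <- cell2nb(5, 5, type = "rook")
n <- length(nb)

# Assumption: simulated variables using the names from the post.
gdpgi <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
gdpgi$Y <- 1 + 0.5 * gdpgi$X1 + rnorm(n)

# Step 1: an ordinary linear model; its coefficients should agree with
# the OLS output of the other program if the data are the same.
ols <- lm(Y ~ X1 + X2, data = gdpgi)
print(coef(summary(ols)))

# Step 2: neighbour summary (number of regions, links, link distribution)
# to compare with what the other program reports for the same GAL file.
summary(nb)
no_nb <- sum(card(nb) == 0)   # count of regions with no neighbours
print(no_nb)

# Step 3: row-standardised weights, then a univariate Moran's I on a
# random variate that has no spatial autocorrelation by construction.
lw <- nb2listw(nb, style = "W", zero.policy = TRUE)
z <- rnorm(n)
mt <- moran.test(z, lw, randomisation = FALSE, zero.policy = TRUE,
                 alternative = "two.sided")
print(mt)

# Missing values: count NAs per column before fitting anything.
print(colSums(is.na(gdpgi)))
```

If each of these outputs matches its counterpart in the other program, the data, neighbours and weights are the same in both systems, and any remaining difference has to come from the model fitting step itself.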
GEODA indicated strong spatial lag and error dependence by Moran's I and
LR tests, and highly significant coefficients for the lagged dependent
variable and lambda. However, in the lagsarlm and errorsarlm models in R,
I found that both Rho and Lambda are insignificant by the LR tests, and
the Moran's I values are also insignificant. The magnitudes of the
coefficient estimates for the other model variables also differ between
the two applications.
What might have caused these differences? Could they arise because the
parameters are estimated by MLE in GEODA and by GLS (except Rho, which is
perhaps estimated by MLE) in R? Why is Moran's I significant in one
(GEODA) but not in the other (R)? What am I missing here?
Any clue or suggestion that would help to find the cause of this
difference would be welcome. The data description and sample model
results follow:
I have used country-level data for 124 countries, including some islands.
I created a GAL file in GEODA and ran simultaneous models in GEODA and R.
1. Sample GEODA output for a model:
DIAGNOSTICS FOR SPATIAL DEPENDENCE
FOR WEIGHT MATRIX : gdpgi07.GAL (row-standardized weights)
TEST                          MI/DF       VALUE         PROB
Moran's I (error)             0.266382    4.1677032     0.0000308
Lagrange Multiplier (lag)     1           20.7711707    0.0000052
Robust LM (lag)               1           9.7078520     0.0018348
Lagrange Multiplier (error)   1           13.0279696    0.0003069
Robust LM (error)             1           1.9646508     0.1610168
Lagrange Multiplier (SARMA)   2           22.7358216    0.0000116
Sample spatial lag model output for the same model in GEODA:
-----------------------------------------------------------------------
Variable    Coefficient    Std.Error     z-value      Probability
-----------------------------------------------------------------------
W_Y         0.5758454      0.05216534    11.03885     0.0000000
CONSTANT    -43.92029      21.60366      -2.033003    0.0420521
X1          0.3878955      0.0528817     7.335156     0.0000000
X2          0.8597154      0.8199795     1.04846      0.2944269
-----------------------------------------------------------------------
DIAGNOSTICS FOR SPATIAL DEPENDENCE
SPATIAL LAG DEPENDENCE FOR WEIGHT MATRIX : gdpgi07.GAL
TEST                     DF    VALUE       PROB
Likelihood Ratio Test    1     37.85142    0.0000000
2. The following are the R results for the same model:
moran.test(Y, gdpgi07.queen, randomisation=FALSE, zero.policy=TRUE, alternative="two.sided")
Moran's I test under normality
data: Y
weights: gdpgi07.queen
Moran I statistic standard deviate = -1.1922, p-value = 0.2332
alternative hypothesis: two.sided
sample estimates:
Moran I statistic       Expectation          Variance
     -0.103175115      -0.008849558       0.006259578
Global Moran's I for regression residuals
data:
model: lm(formula = Y ~X1 + X2 +.......)
weights: gdpgi07.queen
Moran I statistic standard deviate = -0.7751, p-value = 0.4383
alternative hypothesis: two.sided
sample estimates:
Observed Moran's I        Expectation           Variance
      -0.067836924       -0.006025453        0.006359538
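The standard deviates and p-values in the two Moran outputs above can be checked by hand from the quoted estimates, which is a cheap way of confirming that the numbers belong together. The sketch below uses only the values printed above and base R. It also backs out the number of observations implied by the expectation of the first test, since under normality E[I] = -1/(n - 1). The result is 114 rather than 124, which would be consistent with the 10 no-neighbour regions listed in the lag model summary having been dropped from n (the adjust.n = TRUE default of moran.test); that reading is an assumption, not something the post confirms.

```r
# Values copied from the two Moran outputs quoted above.
moran_y   <- c(I = -0.103175115, E = -0.008849558, V = 0.006259578)
moran_res <- c(I = -0.067836924, E = -0.006025453, V = 0.006359538)

# Standard deviate: (observed - expectation) / sqrt(variance).
std_dev <- function(m) unname((m["I"] - m["E"]) / sqrt(m["V"]))

z_y   <- std_dev(moran_y)     # about -1.1922, as reported
z_res <- std_dev(moran_res)   # about -0.7751, as reported

# Two-sided p-values under the normal approximation.
p_y   <- 2 * pnorm(-abs(z_y))
p_res <- 2 * pnorm(-abs(z_res))

# For the univariate test E[I] = -1 / (n - 1), so the expectation implies n.
n_implied <- round(1 - 1 / moran_y[["E"]])

print(c(z_y = z_y, p_y = p_y, z_res = z_res, p_res = p_res,
        n_implied = n_implied))
```

The same back-calculation is not applied to the residual test, because the expectation of Moran's I for regression residuals depends on the explanatory variables as well as on n.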
Spatial lag model results:
model.lag <- lagsarlm(Y~X1+X2+......................, data=gdpgi, gdpgi07.queen, zero.policy=TRUE)
summary(model.lag)
Type: lag
Regions with no neighbours included:
199 98 183 216 99 157 105 143 118 12
Coefficients: (asymptotic standard errors)
               Estimate  Std. Error  z value   Pr(>|z|)
(Intercept) -59.5478780  25.4744792  -2.3376    0.01941
X1            0.4084015   0.0611086   6.6832  2.338e-11
X2            2.2414218   0.9592832   2.3366    0.01946
Rho: -0.056746 LR test value: 0.51535 p-value: 0.47283
Thank you in advance.
Ram
Roger Bivand
Economic Geography Section, Department of Economics,
Norwegian School of Economics and Business Administration,
Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no