Different results for same model in GEODA and R ?

Thu, Jan 17, 2008 1:12 PM

On Thu, 17 Jan 2008, Ram Pandit wrote:

Without making your complete data set available, your claim is worthless, 
because nobody can reproduce it. You have also not quoted the complete, 
exact R code (history() of the whole session), nor added screen dumps 
(PrtScr) of GeoDa for all the steps taken. All such claims so far (including 
those regarding SpaceStat) over almost 10 years have turned out to be user 
misunderstandings or mistakes, and have been settled in threads on this 
list and the openspace list.

Please make a bundle of all the files needed to reproduce the problem and 
put them on a webserver, indicating where they can be picked up (or if 
sensitive attach them to an email to me off-list).

Step 1. See if a regular linear model can be reproduced - if not, you do 
not have the same data in both systems;

Step 2. See if the summary numbers for the neighbours agree, if not, the 
GAL files are not being represented in the same way;

Step 3. See if the weights agree (not so easy, but using a different 
variable with no autocorrelation, say a random variate, try a univariate 
Moran test in both systems).
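
Steps 2 and 3 can be sketched in base R without either package, purely as 
an illustration. The GAL layout assumed here (a header line, then for each 
region an "id count" line followed by a line of neighbour ids), the toy 
five-region file, and the helper names split_ws, parse_gal and moran_i are 
all assumptions made for this example; they are not part of spdep or GeoDa. 
For Step 1, the corresponding check is simply to compare the lm() 
coefficients with the OLS coefficients reported by GeoDa.

```r
# Toy GAL text standing in for an exported weights file (hypothetical data).
# Assumed layout: header "0 n name key", then per region a line "id count"
# followed by a line of neighbour ids (blank when the count is 0).
gal_lines <- c(
  "0 5 toy ID",
  "1 2", "2 3",
  "2 2", "1 3",
  "3 3", "1 2 4",
  "4 1", "3",
  "5 0", ""
)

# Split a line on whitespace, dropping empty tokens.
split_ws <- function(s) {
  tok <- strsplit(trimws(s), "\\s+")[[1]]
  tok[nzchar(tok)]
}

# Parse GAL lines into a named list of neighbour ids (hypothetical helper).
parse_gal <- function(lines) {
  header <- split_ws(lines[1])
  n <- as.integer(if (length(header) == 1) header[1] else header[2])
  nb <- vector("list", n)
  ids <- character(n)
  pos <- 2
  for (k in seq_len(n)) {
    head_tok <- split_ws(lines[pos])
    ids[k] <- head_tok[1]
    count <- as.integer(head_tok[2])
    nb[[k]] <- if (count > 0) split_ws(lines[pos + 1]) else character(0)
    stopifnot(length(nb[[k]]) == count)  # declared count must match the list
    pos <- pos + 2
  }
  names(nb) <- ids
  nb
}

# Step 2: summary numbers for the neighbours, to compare across systems.
nb <- parse_gal(gal_lines)
cards <- lengths(nb)                 # number of neighbours per region
islands <- names(nb)[cards == 0]     # regions with no neighbours
symmetric <- all(vapply(names(nb), function(i) {
  all(vapply(nb[[i]], function(j) i %in% nb[[j]], logical(1)))
}, logical(1)))
print(table(cards))

# Step 3: row-standardised weights and Moran's I for a random variate.
n <- length(nb)
W <- matrix(0, n, n, dimnames = list(names(nb), names(nb)))
for (i in names(nb)) {
  if (cards[[i]] > 0) W[i, nb[[i]]] <- 1 / cards[[i]]
}

# Plain Moran's I using all n regions, islands included (an assumption;
# packages may treat no-neighbour regions differently).
moran_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

set.seed(1)
x <- rnorm(n)
moran_random <- moran_i(x, W)
print(moran_random)
```

If the table of neighbour counts, the list of no-neighbour regions, or the 
Moran's I of the same random variate differ between the two systems, the 
weights are not being represented in the same way.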

Question: do you have any missing values, and if so how are they 
represented?
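
A quick way to answer this on the R side is to count both NA values and 
any numeric codes that might stand in for missing values. The sketch below 
uses a toy data frame and assumes, purely for illustration, that -999 and 
-9999 are the sentinel codes; the real codes, if any, depend on how the 
data were exported.

```r
# Toy data standing in for the real table (hypothetical values).
dat <- data.frame(
  Y  = c(1.2, NA, 3.4, -999, 2.2),
  X1 = c(10, 12, NA, 9, 11),
  X2 = c(0.5, 0.7, 0.2, 0.9, 0.4)
)

# Assumed sentinel codes for "missing"; adjust to the actual data.
sentinels <- c(-999, -9999)

# Count true NAs and sentinel codes per column.
na_per_col <- colSums(is.na(dat))
sentinel_per_col <- vapply(dat, function(v) sum(v %in% sentinels), integer(1))

# Rows that a model would use if only NA marked missingness.
n_complete_raw <- sum(complete.cases(dat))

# Recode sentinels to NA and count again.
dat_clean <- dat
dat_clean[] <- lapply(dat_clean, function(v) replace(v, v %in% sentinels, NA))
n_complete_clean <- sum(complete.cases(dat_clean))

print(rbind(na = na_per_col, sentinel = sentinel_per_col))
print(c(raw = n_complete_raw, clean = n_complete_clean))
```

If the two row counts differ, or differ from the number of regions in the 
weights file, the two systems are unlikely to be fitting the model to the 
same observations.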

I replied to this questioner off-list earlier without receiving an 
acknowledgement; the questioner seems to be in a hurry, and still has not 
been polite enough to give an affiliation. Please indicate your status 
(Professor of statistics, master's student in real estate, ...); it does 
help those who answer grasp why you might not understand.

Seriously, there is an enormous difference between the pleasure of answering 
a well-constructed question with a reproducible example, and the 
frustration of trying to arrest unsubstantiated and non-reproducible 
"reports" like this, which in my experience are very likely to be user 
error, and which certainly could have been checked more thoroughly.

Roger

GEODA indicated strong spatial lag and error dependence, shown by
Moran's I and LR tests and by highly significant coefficients on the lagged
dependent variable and on lambda. However, in the lagsarlm and errorsarlm
models in R I found that both Rho and Lambda are insignificant by the LR
tests, and Moran's I is also insignificant. The magnitudes of the coefficient
estimates for the other model variables also differ between the two
applications.

What might have caused these differences? Could the difference arise because
parameters are estimated by MLE in GEODA but by GLS (except Rho, which is
perhaps estimated by MLE) in R? Why is Moran's I significant in one (GEODA)
but not in the other (R)? What am I missing here?

Any clue or suggestion would be helpful in finding this difference. Following
is the data description and sample model results:
I have used country-based data from 124 countries, including some islands.
I created a GAL file in GEODA and ran simultaneous models in GEODA and R.

1. Sample GEODA output for a model:

DIAGNOSTICS FOR SPATIAL DEPENDENCE
FOR WEIGHT MATRIX : gdpgi07.GAL  (row-standardized weights)
TEST                          MI/DF      VALUE          PROB
Moran's I (error)           0.266382     4.1677032      0.0000308
Lagrange Multiplier (lag)       1       20.7711707      0.0000052
Robust LM (lag)                 1        9.7078520      0.0018348
Lagrange Multiplier (error)     1       13.0279696      0.0003069
Robust LM (error)               1        1.9646508      0.1610168
Lagrange Multiplier (SARMA)     2       22.7358216      0.0000116

sample spatial lag model output for the same model in GEODA:
-----------------------------------------------------------------------
   Variable    Coefficient     Std.Error      z-value    Probability
-----------------------------------------------------------------------
        W_Y      0.5758454    0.05216534     11.03885      0.0000000
   CONSTANT      -43.92029      21.60366    -2.033003      0.0420521
         X1      0.3878955     0.0528817     7.335156      0.0000000
         X2      0.8597154     0.8199795      1.04846      0.2944269
-----------------------------------------------------------------------

DIAGNOSTICS FOR SPATIAL DEPENDENCE
SPATIAL LAG DEPENDENCE FOR WEIGHT MATRIX : gdpgi07.GAL
TEST                                     DF     VALUE         PROB
Likelihood Ratio Test                    1       37.85142     0.0000000

2. The following are the R results for the same model:

moran.test(Y,gdpgi07.queen,randomisation=FALSE,zero.policy=TRUE
,alternative="two.sided")

Moran's I test under normality

data:  Y
weights: gdpgi07.queen

Moran I statistic standard deviate = -1.1922, p-value = 0.2332
alternative hypothesis: two.sided
sample estimates:
Moran I statistic       Expectation          Variance
    -0.103175115      -0.008849558       0.006259578


      Global Moran's I for regression residuals

data:
model: lm(formula = Y ~X1 + X2 +.......)
weights: gdpgi07.queen

Moran I statistic standard deviate = -0.7751, p-value = 0.4383
alternative hypothesis: two.sided
sample estimates:
Observed Moran's I        Expectation           Variance
     -0.067836924       -0.006025453        0.006359538

Spatial lag model results:
model.lag<-lagsarlm(Y~X1+X2+......................,data=gdpgi,gdpgi07.queen,
zero.policy=TRUE)
summary(amph1.lag)
Type: lag
Regions with no neighbours included:
199 98 183 216 99 157 105 143 118 12
Coefficients: (asymptotic standard errors)
              Estimate  Std. Error z value  Pr(>|z|)
(Intercept) -59.5478780  25.4744792 -2.3376   0.01941
X1           0.4084015   0.0611086  6.6832 2.338e-11
X2          2.2414218   0.9592832  2.3366   0.01946

Rho: -0.056746 LR test value: 0.51535 p-value: 0.47283

thank you in advance.

Ram

Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
