Correlation between covariates and intercept (spatstat) - R-SIG-Geo

Sun, Apr 17, 2016 3:36 AM

Virginia Morera Pujol <morera.virginia at gmail.com> writes:

This is about the correlation between *estimates* of the model coefficients - in this case, the correlation between the estimated intercept and the estimated coefficient of the distance covariate. Extremely high correlations could cause problems with the identifiability of the model, but this is probably not a problem here. Moderately high correlations suggest that the t-tests for individual parameters (given in the printout for the model) are not independent. If we want to select the 'significant' covariates, we shouldn't use the model printout to discard more than one variable at a time.

Transformations such as centering or rescaling the covariate will change the correlation. Roughly speaking, that's because when you add a constant to the distance covariate, you are adding a multiple of the intercept onto the covariate.
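
As a rough sketch of that remark, suppose the model is loglinear in the distance covariate and the other covariate is left untouched. The symbols used here (\beta_0 for the intercept, \beta_1 for the distance coefficient, Z for the distance covariate, a for the constant that is subtracted) are notation introduced for illustration only; they do not appear in the model printout.

```latex
\begin{align*}
\log\lambda(u) &= \beta_0 + \beta_1 Z(u) + (\text{other terms}) \\
               &= (\beta_0 + a\beta_1) + \beta_1\bigl(Z(u) - a\bigr) + (\text{other terms}).
\end{align*}
% With the shifted covariate Z' = Z - a the fitted coefficients become
%   \hat\beta_0' = \hat\beta_0 + a\hat\beta_1,  \hat\beta_1' = \hat\beta_1,
% so their variances and covariance transform as
\begin{align*}
\operatorname{Cov}(\hat\beta_0', \hat\beta_1')
  &= \operatorname{Cov}(\hat\beta_0, \hat\beta_1) + a\,\operatorname{Var}(\hat\beta_1), \\
\operatorname{Var}(\hat\beta_0')
  &= \operatorname{Var}(\hat\beta_0) + 2a\,\operatorname{Cov}(\hat\beta_0, \hat\beta_1)
     + a^2\,\operatorname{Var}(\hat\beta_1).
\end{align*}
```

The correlation between the intercept and slope estimates therefore depends on the shift a, and can even change sign, although the fitted intensity itself is unchanged.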

When you say the 'effect' of the covariate has increased, do you mean the coefficient of the covariate has increased, or the *effect term* (= coefficient x covariate value) has increased? I'd be surprised if the effect term changed: the models should be equivalent as regards their fitted intensity, etc.
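
To make the distinction concrete, here is a small worked example under the assumption that the covariate is normalized linearly, Z'' = (Z - a)/b, in a loglinear model. The symbols \beta_0, \beta_1, Z, a and b are notation introduced for illustration; the values a of about 400 and b of about 1000 are taken from the approximate covariate range (400 to 1400) reported in the question.

```latex
\begin{align*}
\beta_0 + \beta_1 Z(u)
  &= \beta_0 + \beta_1\bigl(a + b\,Z''(u)\bigr) \\
  &= \underbrace{(\beta_0 + a\beta_1)}_{\beta_0''}
     + \underbrace{(b\,\beta_1)}_{\beta_1''}\, Z''(u).
\end{align*}
% The coefficient is multiplied by b (roughly 1000 here), while the
% linear predictor, and hence the fitted intensity, is the same:
\[
  \beta_0'' + \beta_1'' Z''(u) = \beta_0 + \beta_1 Z(u).
\]
```

So a coefficient about three orders of magnitude larger is what the rescaling alone would produce; the effect term changes only by the constant a\beta_1, which is absorbed into the intercept.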

Adrian Baddeley


Prof Adrian Baddeley DSc FAA
Department of Mathematics and Statistics
Curtin University, Perth, Western Australia

Virginia Morera Pujol

Mon, Apr 18, 2016 12:24 AM

Hello,

You are right, what changes is the coefficient of the covariate, but its
effect (coefficient x covariate value) is the same in both models. Selecting
the "significant covariates" won't be much of a problem, as I only have 2
covariates in my model.

Thank you very much for your kind help!

Virginia Morera
PhD Student
Department of Animal Biology
University of Barcelona

2016-04-17 12:36 GMT+02:00 Adrian Baddeley <adrian.baddeley at curtin.edu.au>:

Virginia Morera Pujol <morera.virginia at gmail.com> writes:

In trying a spatial model with spatstat I am running into a conceptual
problem. It might be more of a general modelling doubt than a specific
spatial problem, but I hope someone can help.

I am running a ppm() model that includes two covariates (as pixel images):
one is primary productivity at sea, and the other is distance to a point
that is not included in the pattern window. That means there is no 0 value;
the range of values goes from 400 to 1400 approx. When I run the model and
look at the var-covar matrix using 'vcov(model, what = "corr")', there is
a very strong correlation (around -0.85) between the intercept and this
covariate. I am not sure that this is a problem, but [...]

I have tried a couple of things just in case:

1/ centering the covariate values around the mean just changes the sign of
the correlation (from -0.85 to +0.85 approx).

2/ normalizing the covariate values, so the values go from 0 to 1, makes the
correlation between this covariate and the intercept almost 1 (0.99). It
also makes the effect of this covariate three orders of magnitude higher
than the effect of the other covariate, which didn't happen before and was
not expected from the data.
