Log transformation and -Inf values for use in glm()

2 messages · Paul Warren Simonin, Stephen D. Weigand

Fri, Feb 6, 2009 1:25 PM

Hello,
   I am writing regarding log transformation of data in a single  
matrix column, and subsequent use of these data in a glm model fit. I  
have a data matrix in which I am using the log function to transform  
the values. This transformation results in -Inf values in some places,  
though. I then receive an error when this matrix is used in the glm
function, and would like to know how this can be avoided.
   I have attempted several methods already including the use of  
na.exclude commands in the glm statement:

I have also attempted to use the is.finite command:

EarlyLn$yoyras<-EarlyLn[is.finite(EarlyLn$yoyras)==T,]

I know another option would be to use a type of find and replace  
command to remove entire rows of the matrix that contain 0's (before  
log transformation) or -Inf (after transformation), but I do not know  
how this is done.

Thank you for any advice or tips regarding conducting this  
transformation and feeding the data matrix into glm.

Sincerely,
Paul S.

1 day later

Stephen D. Weigand

Sat, Feb 7, 2009 4:59 PM

Paul,

On Fri, Feb 6, 2009 at 3:25 PM, Paul Warren Simonin <Paul.Simonin at uvm.edu> wrote:

In general, use syntax like this:

glm(yoyras ~ log(temp), data = EarlyLn, subset = temp > 0)
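
As a rough sketch of what that call does, here is a small example on made-up data (the values and the EarlyPos name are assumptions for illustration, not your real data). It also shows the row-removal route you asked about: index the data frame by a logical vector and assign the result to a new data frame rather than to a single column.

```r
# Synthetic stand-in for EarlyLn; the values are made up for illustration.
set.seed(1)
EarlyLn <- data.frame(temp = c(0, 0, 1.5, 2, 3.2, 4.1, 5, 6.3, 7, 8.8))
EarlyLn$yoyras <- 2 + 0.5 * EarlyLn$temp + rnorm(10, sd = 0.1)

# log(0) is -Inf, which is what makes the unsubsetted fit fail.
log(EarlyLn$temp[1])

# Option 1: let glm() drop the offending rows via its subset argument.
fit_subset <- glm(yoyras ~ log(temp), data = EarlyLn, subset = temp > 0)

# Option 2: build a filtered data frame first, keeping whole rows.
# EarlyPos is a hypothetical name for the filtered copy.
keep <- is.finite(log(EarlyLn$temp))
EarlyPos <- EarlyLn[keep, ]
fit_filtered <- glm(yoyras ~ log(temp), data = EarlyPos)

# Both fits use the same 8 rows, so the coefficients should agree.
nobs(fit_subset)
all.equal(coef(fit_subset), coef(fit_filtered))
```

Either route leaves the original data frame untouched, so the dropped rows are still available if you change your mind about the transformation.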

However, it's bad statistical practice to use a transformation that
causes you to lose data. One approach is to add a constant to temp
via:

glm(yoyras ~ log(temp + 1), data = EarlyLn)

with the disadvantage being that the constant you choose is arbitrary
but affects your inferences.
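
One way to see how much the constant matters is to refit with a few different constants and compare the slopes. This is only a sketch on made-up data (the data frame d, the column y, and the three constants are all assumptions chosen for illustration); with your data the spread may be larger or smaller.

```r
# Synthetic data, made up for illustration only.
set.seed(2)
d <- data.frame(temp = c(0, 0, 0.5, 1, 2, 3, 5, 8, 12, 20))
d$y <- 1 + 0.8 * log(d$temp + 1) + rnorm(10, sd = 0.2)

# Fit the same model with several added constants and collect each slope.
constants <- c(0.01, 0.5, 1)
slopes <- sapply(constants, function(k) {
  dk <- d
  dk$x <- log(d$temp + k)   # every row is kept, including temp == 0
  coef(glm(y ~ x, data = dk))[["x"]]
})
names(slopes) <- constants

# One slope per constant; they will generally differ from one another.
slopes
```

If the slopes move a lot as the constant changes, that is a sign the conclusions depend on an arbitrary choice.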

Stephen
Rochester, MN USA