Log transformation and -Inf values for use in glm()
Paul,
On Fri, Feb 6, 2009 at 3:25 PM, Paul Warren Simonin <Paul.Simonin at uvm.edu> wrote:
Hello, I am writing regarding log transformation of data in a single matrix column, and subsequent use of these data in a glm model fit. I have a data matrix in which I am using the log function to transform the values. This transformation results in -Inf values in some places, though. I then receive an error when this matrix is used in the glm function, and would like to know how this can be avoided. I have attempted several methods already, including the use of na.exclude in the glm statement:
DistributionT<-glm(EarlyLn$yoyras~EarlyLn$temp,family=gaussian(link = "identity"),na.exclude)
I have also attempted to use the is.finite command:
EarlyLn$yoyras<-EarlyLn[is.finite(EarlyLn$yoyras)==T,]
I know another option would be to use a type of find and replace command to remove entire rows of the matrix that contain 0's (before log transformation) or -Inf (after transformation), but I do not know how this is done. Thank you for any advice or tips regarding conducting this transformation and feeding the data matrix into glm.
Sincerely,
Paul S.
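For reference, the is.finite attempt above assigns a row-filtered data frame to a single column, which is probably not what was intended. A minimal sketch of removing the affected rows instead, assuming EarlyLn is a data frame and yoyras is the log-transformed column; the values below are made up for illustration and EarlyClean is a hypothetical name:

```r
# Hypothetical stand-in for EarlyLn; yoyras holds raw values with zeros.
EarlyLn <- data.frame(yoyras = c(0, 3, 12, 0, 7),
                      temp   = c(10.1, 12.4, 15.0, 9.8, 13.2))

# log(0) is -Inf, so the transformed column contains non-finite values.
EarlyLn$yoyras <- log(EarlyLn$yoyras)

# Keep only the rows whose transformed value is finite.
# The filtered result is a whole data frame, so it is stored as one,
# not assigned back into a single column.
EarlyClean <- EarlyLn[is.finite(EarlyLn$yoyras), ]
```

Filtering on the raw values before transforming (for example, keeping rows where the value is greater than 0) gives the same rows.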
In general, use syntax like this:

glm(yoyras ~ log(temp), data = EarlyLn, subset = temp > 0)

However, it's bad statistical practice to use a transformation that causes you to lose data. One approach is to add a constant to temp, which keeps the rows where temp is 0, via:

glm(yoyras ~ log(temp + 1), data = EarlyLn)

with the disadvantage being that the constant you choose is arbitrary but affects your inferences.

Stephen
Rochester, MN USA
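A minimal sketch of the two approaches on made-up data. The data frame values are hypothetical, the names fit_subset and fit_shift are illustrative, and the constant 1 is as arbitrary as noted above. It assumes, as the reply does, that temp is the variable being logged; if the response is the transformed variable, the same subset pattern should apply with the condition on that column.

```r
# Hypothetical stand-in for EarlyLn with two zeros in temp.
EarlyLn <- data.frame(yoyras = c(2.1, 3.4, 1.8, 4.0, 2.9, 3.7),
                      temp   = c(0, 4.5, 0, 12.0, 6.3, 9.1))

# Approach 1: transform inside the formula and drop the rows where
# log(temp) would be -Inf. The two zero-temp rows are excluded.
fit_subset <- glm(yoyras ~ log(temp), data = EarlyLn,
                  family = gaussian(link = "identity"),
                  subset = temp > 0)

# Approach 2: shift by a constant so every row is kept. No subset is
# needed, because log(temp + 1) is finite when temp is 0.
fit_shift <- glm(yoyras ~ log(temp + 1), data = EarlyLn,
                 family = gaussian(link = "identity"))

nobs(fit_subset)  # number of rows used by approach 1
nobs(fit_shift)   # number of rows used by approach 2
```

Comparing coef(fit_shift) across a few different constants is one quick way to see how much the arbitrary choice moves the estimates.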