data normality
On 05/06/2012 10:54 AM, Yong Shen wrote:
Dear all, I have two questions about data normality. I used stepwise multiple regression to determine which variables contributed to tree growth, and want to build a model to explain tree growth. The sample size is about 50 tree species, which I think is not large, and some variables are not normally distributed. 1. Do I have to transform them to normal distributions before I perform multiple regression?
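No. The only place a Normal assumption comes in is the residuals: the model assumes they are normally distributed, and it makes no such assumption about the explanatory variables themselves. So you can happily fit the model first and only check normality afterwards, on the residuals of the fitted model.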
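To make that concrete, here is a minimal sketch in Python (not something from the original thread). Everything in it is an assumption made for illustration: the simulated data, the variable names `x1`, `x2` and `growth`, and the helper `qq_correlation`, which measures how closely a sample follows a straight line on a normal Q-Q plot (values near 1 suggest approximate normality). The predictors are deliberately non-normal, the model is fitted by ordinary least squares, and normality is then checked on the residuals only.

```python
import numpy as np
from statistics import NormalDist


def qq_correlation(values):
    """Correlation between the sorted sample and standard normal quantiles.

    Hypothetical helper for this sketch: values close to 1 mean the sample
    lies near a straight line on a normal Q-Q plot.
    """
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    probs = (np.arange(1, n + 1) - 0.5) / n  # plotting positions in (0, 1)
    theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
    return float(np.corrcoef(theoretical, x)[0, 1])


# Simulated data (an assumption, not the poster's data): 50 "species",
# two clearly non-normal predictors, and normally distributed errors.
rng = np.random.default_rng(42)
n = 50
x1 = rng.exponential(scale=2.0, size=n)  # right-skewed predictor
x2 = rng.uniform(0.0, 10.0, size=n)      # flat, non-normal predictor
growth = 1.0 + 0.8 * x1 - 0.3 * x2 + rng.normal(0.0, 1.0, size=n)

# Fit the multiple regression by ordinary least squares (intercept + 2 slopes).
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, growth, rcond=None)
residuals = growth - X @ coef

# Check normality on the residuals, not on the raw predictors.
print("Q-Q correlation of predictor x1:", round(qq_correlation(x1), 3))
print("Q-Q correlation of residuals:  ", round(qq_correlation(residuals), 3))
```

In practice you would also look at a Q-Q plot of the residuals. The point of the sketch is only the order of operations: fit first, then inspect the residuals.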
2. Two variables cannot be transformed to normal distributions, although I tried several methods (e.g. log, sqrt, boxcoxfit). What should I do with these two variables?
Leave them as they are. Advice that makes life simpler - always the best sort.

Bob

Bob O'Hara
Biodiversity and Climate Research Centre
Senckenberganlage 25
D-60325 Frankfurt am Main, Germany
Tel: +49 69 798 40226
Mobile: +49 1515 888 5440
WWW: http://www.bik-f.de/root/index.php?page_id=219
Blog: http://blogs.nature.com/boboh
Journal of Negative Results - EEB: www.jnr-eeb.org