Greetings,

My question is more algorithmic than practical. What I am trying to determine is: are the GAM algorithms used in the mgcv package affected by nonnormally-distributed residuals?

As I understand the theory of linear models, the Gauss-Markov theorem guarantees that least-squares regression is optimal over all unbiased estimators iff the data meet the conditions of linearity, homoscedasticity, independence, and normally-distributed residuals. Absent the last requirement, it is optimal but only over unbiased linear estimators.

What I am trying to determine is whether or not it is necessary to check for normally-distributed errors in a GAM from mgcv. I know that the unsmoothed terms, if any, will be fitted by ordinary least squares, but I am unsure whether the default Penalized Iteratively Reweighted Least Squares method used in the package is also based upon this assumption or falls under any analogue of the Gauss-Markov theorem.

Thank you in advance for any help.

Sincerely, Collin Lynch.
Nonnormal Residuals and GAMs
5 messages · David Winsemius, Collin Lynch, COLLINL at pitt.edu
On Nov 6, 2013, at 12:46 PM, Collin Lynch wrote:
What I am trying to determine is whether or not it is necessary to check for normally-distributed errors in a GAM from mgcv. I know that the unsmoothed terms, if any, will be fitted by ordinary least squares, but I am unsure whether the default Penalized Iteratively Reweighted Least Squares method used in the package is also based upon this assumption or falls under any analogue of the Gauss-Markov theorem.
The default functional link for mgcv::gam is "log", so I doubt that your theoretical understanding applies to GAMs in general. When Simon Wood wrote his book on GAMs, his first chapter was on linear models and his second chapter was on generalized linear models, at which point he had written over 100 pages, and only then did he "introduce" GAMs. I think you need to follow the same progression, and this forum is not the correct one for statistics education. Perhaps pose your follow-up questions to CrossValidated.com
David Winsemius Alameda, CA, USA
David, thank you for your advice. Has the default changed for mgcv::gam? Based upon the help pages for the version I have (1.7-27), I had thought that the default family was gaussian() with link "identity". In any event, I will look again at Simon Wood's book and consider CrossValidated in the future.

Best, Collin.
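One way to settle the question of the default without relying on memory is to read it from the function signature. The following is a minimal sketch, assuming mgcv is installed; the variable names `default_family` and `fam` are only illustrative, and the result reflects whatever mgcv version is loaded rather than 1.7-27 specifically.

```r
library(mgcv)

# Default value of the `family` argument, taken from gam's own signature.
# This is an unevaluated call, not yet a family object.
default_family <- formals(mgcv::gam)$family
print(default_family)

# Evaluate the call to get the family object and inspect its link.
fam <- eval(default_family)
print(fam$family)
print(fam$link)
```

The same fields are available on a fitted model through `family(fit)`, which shows the family and link actually used for that fit.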
On Nov 6, 2013, at 5:44 PM, Collin Lynch wrote:
David, thank you for your advice. Has the default changed for mgcv::gam? Based upon the help pages for the version I have (1.7-27), I had thought that the default family was gaussian() with link "identity".
I may have gotten this wrong by only referring to my memory. I'm not able to tell by looking at either ?mgcv::gam or ?gam::gam pages where I picked up this notion.
David Winsemius Alameda, CA, USA
Ok, thanks.
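For readers arriving at this thread with the original question, mgcv does ship residual diagnostics for fitted GAMs. The following is only a sketch of how one might look at them for a Gaussian, identity-link fit. It assumes simulated data from mgcv's `gamSim`; the model formula and the choice of `method = "REML"` are illustrative, not something taken from the thread. It shows how to inspect the residuals and does not settle whether non-normality matters for a given analysis.

```r
library(mgcv)

set.seed(1)
# Simulated example data from mgcv (columns y, x0, x1, x2, x3, ...).
dat <- gamSim(eg = 1, n = 200, dist = "normal", scale = 2, verbose = FALSE)

# Gaussian GAM with identity link: x0 enters as an unsmoothed (parametric)
# term, x1 and x2 enter as smooth terms.
fit <- gam(y ~ x0 + s(x1) + s(x2), data = dat, method = "REML")

# Deviance residuals of the fit, for any custom checks.
res <- residuals(fit, type = "deviance")

# QQ plot of the residuals against the quantiles implied by the family.
qq.gam(fit)

# Standard diagnostic plots (QQ plot, residuals vs. linear predictor,
# histogram of residuals, response vs. fitted) plus basis-dimension checks.
gam.check(fit)
```

Run non-interactively, the plots go to the default graphics device (typically a PDF file) rather than to the screen.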