Increasing number of observations worsen the regression model
Yes, it is important that it only happens with certain BLAS libraries, so it is probably not really an R issue. However, there has been some concern over the C/Fortran interfaces lately, so if you could narrow it down to a specific BLAS routine, it could prove useful for the developers. One fairly easy thing to do would be to find the breakdown point, i.e. the smallest number of observations at which the fit goes wrong. I speculate that it could be at 16384 (=2^14) and that some sort of endianness or integer width declaration is the cause. (It would in turn suggest that MKL is using 16-bit integers somehow, which doesn't really seem credible, but you never know.) I'm moving this to the r-devel list. It certainly is not for r-help. -pd
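For anyone wanting to locate the breakdown point mechanically, a bisection over the number of observations is probably the cheapest approach. The following is only a sketch in Python under stated assumptions: `find_breakdown` and `hypothetical_fit_is_bad` are names made up for illustration, the failure is assumed to be monotone (once the fit breaks at some n, it stays broken for larger n), and the stand-in predicate simply hard-codes the 2^14 guess from above. A real predicate would run the regression with n observations under the suspect BLAS and report whether the result is wrong.

```python
from typing import Callable


def find_breakdown(is_bad: Callable[[int], bool], lo: int, hi: int) -> int:
    """Return the smallest n in (lo, hi] for which is_bad(n) is True.

    Assumes is_bad(lo) is False, is_bad(hi) is True, and that the
    failure is monotone in n (once broken, it stays broken).
    """
    if is_bad(lo) or not is_bad(hi):
        raise ValueError("need is_bad(lo) == False and is_bad(hi) == True")
    # Invariant: is_bad(lo) is False and is_bad(hi) is True.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid
    return hi


def hypothetical_fit_is_bad(n: int) -> bool:
    # Stand-in for illustration only: pretends the fit breaks at 2^14
    # observations. Replace with a check that actually fits the model
    # on n observations and compares against known coefficients.
    return n >= 16384


print(find_breakdown(hypothetical_fit_is_bad, 100, 100_000))  # prints 16384
```

With a range of 100 to 100,000 this needs about 17 fits, so it stays cheap even if each fit has to be run in a fresh session against the suspect library.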
On 27 May 2019, at 10:47, Ivan Krylov <krylov.r00t at gmail.com> wrote:
On Sat, 25 May 2019 14:38:07 +0200 Raffa <raffamaiden at gmail.com> wrote:
I have tried to ask for example in CrossValidated <https://stats.stackexchange.com/questions/410050/increasing-number-of-observations-worsen-the-regression-model> but the code works for them. Any help?
In the comments you note that the problem went away after you replaced Intel MKL with OpenBLAS. This is important. The code that fits linear models in R is somewhat complex[*]; if you want to get to the bottom of the problem, you may have to take parts of it and feed them differently-sized linear regression problems until you narrow it down to a specific set of calls to BLAS or LAPACK functions which Intel MKL provides. One option would be to ask at Intel MKL forums[**].

--
Best regards,
Ivan

[*] https://madrury.github.io/jekyll/update/statistics/2016/07/20/lm-in-R.html
[**] https://software.intel.com/en-us/forums/intel-math-kernel-library/
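When feeding differently-sized problems to the suspect code, it may help to have a reference fit that does not go through BLAS or LAPACK at all, so that any disagreement can be attributed to the library. The sketch below is an assumption-laden illustration, not the poster's model: it covers only the single-predictor case, uses the textbook closed-form formulas, and the function names, true coefficients, noise level, and sizes are all made up for the example.

```python
import math
import random


def simple_ols(x, y):
    """Intercept and slope of y ~ x via closed-form formulas (no BLAS)."""
    n = len(x)
    mx = math.fsum(x) / n
    my = math.fsum(y) / n
    sxx = math.fsum((xi - mx) ** 2 for xi in x)
    sxy = math.fsum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def max_coef_error(n, intercept=1.0, slope=2.0, seed=0):
    """Simulate n observations and return the worst coefficient error."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [intercept + slope * xi + rng.gauss(0.0, 0.1) for xi in x]
    b0, b1 = simple_ols(x, y)
    return max(abs(b0 - intercept), abs(b1 - slope))


# Sizes chosen (as an assumption) to straddle the speculated 2^14 boundary.
for n in (1_000, 16_383, 16_384, 16_385, 50_000):
    print(n, max_coef_error(n))
```

The reference error should shrink roughly like 1/sqrt(n); a fit from the suspect library that gets worse as n grows, on the same simulated data, would point at the library rather than at the data.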
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com