
randomForest: Numeric deviation between 32-/64-bit Windows builds

2 messages · Rosenberger George, Brian Ripley

Dear R Developers

I'm using the great randomForest package (4.6-7) in many projects and recently stumbled upon a problem while writing unit tests for one of them:

On Windows, there are small numeric deviations between results from the 32-bit and 64-bit builds of R; this does not seem to be a problem on Linux or Mac.
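The test boils down to something like the following minimal script (a sketch, not the exact test code; the seed is arbitrary, and any fixed seed shows the effect):

library(randomForest)  # version 4.6-7

set.seed(42)                                   # arbitrary fixed seed
fit <- randomForest(Species ~ ., data = iris)  # classification forest on iris
importance(fit)                                # prints the MeanDecreaseGini values below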

R64 on Windows produces the same results as R64/R32 on Linux or Mac:
             MeanDecreaseGini
Sepal.Length         9.452470
Sepal.Width          2.037092
Petal.Length        43.603071
Petal.Width         44.116904

R32 on Windows produces the following:
             MeanDecreaseGini
Sepal.Length         9.433986
Sepal.Width          2.249871
Petal.Length        43.594159
Petal.Width         43.941870
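
For unit tests, an exact comparison of these values fails across builds; a tolerance-based check along these lines would absorb the deviation (a sketch; the tolerance value is an arbitrary choice):

ref <- c(9.452470, 2.037092, 43.603071, 44.116904)   # x86_64 reference values
got <- unname(importance(fit)[, "MeanDecreaseGini"])
# all.equal() compares numeric vectors up to a (mean relative) tolerance,
# unlike identical(); 1e-2 is arbitrary slack covering the differences above
stopifnot(isTRUE(all.equal(got, ref, tolerance = 1e-2)))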

Is there a reason why this differs between the Windows builds? Are the compilers on Windows doing different things for the 32- and 64-bit builds than the ones on Linux or Mac?

Thank you very much for your help.

Best regards,
George
On 15/10/2013 14:00, Rosenberger George wrote:
Yes, no (in answer to your two questions; but these are not R issues).

There are bigger differences in the OS's equivalent of libm on Windows. You did not tell us what CPUs your compilers targeted on Linux and OS X (sic), but generally they assume more than the i386 baseline assumed by Microsoft's 32-bit Windows. On the other hand, all x86_64 OSes, including Windows, can assume more, as all such CPUs have post-i686 features. Remember that Windows XP is still supported, and it was released in 2001.

Based on much wider experience than a single example gives (e.g. reference results from R itself and the recommended packages), deviations from x86_64 results are increasingly likely on OS X, then i686 Linux, and then i386 Windows.