Different PCA results under Windows and Linux
Hi Jathine,

And then to see things more clearly still, you can do something like this on your test results:

format(formatC(p1$var$coord, digits=15, format="f"), justify="right")

and

format(formatC(p1$var$coord, digits=16, format="f"), justify="right")

Though I do hope that the second command doesn't begin to concern you even more.

Regards, Mark.
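A minimal sketch of the effect those two commands expose, using made-up values rather than the PCA object from the thread (x and y are hypothetical names). It assumes only base R: two numbers that are equal on paper are stored slightly differently, print identically at 15 decimal places, and only separate when more digits are requested than a double reliably holds.

```r
# Hypothetical values for illustration; not taken from p1$var$coord.
x <- 0.1 + 0.2
y <- 0.3

# Exact comparison fails because neither value is stored exactly.
x == y

# Comparison with a numerical tolerance treats them as equal.
isTRUE(all.equal(x, y))

# At 15 decimal places the two values print identically ...
formatC(c(x, y), digits = 15, format = "f")

# ... while 17 decimal places exposes the representation error.
formatC(c(x, y), digits = 17, format = "f")
```

The same reasoning applies to any pair of results that agree to about 15 significant digits: the trailing digits are an artefact of storage and arithmetic order, not of the analysis.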
Mark Difford wrote:
Hi Jathine,
I hope this can explain the problem a bit more clearly. Why does PCA give different results on the two different platforms?
What is amazing, Jathine, is how nearly exactly identical the two sets of results are, not that they begin to differ at the 16th decimal place. To assuage your concerns, do the following on the results from your two trials:

round(p1$var$coord, 15)
?round
## And read the famous FAQ on floating point arithmetic

It also isn't a very good idea to be doing PCAs on 0s and 1s.

Regards, Mark.

jathine wrote:
Thank you for your reply. Here is some more info; I hope this can explain the problem a bit more clearly. Why does PCA give different results on the two different platforms?

freqtest.txt file line text:

M1 M2 M3 M4 M5 M6 M7 M8
-1 -1 -1 -1 -1 -1 -1 -1
 0  0  0  0 -1 -1  1  1
-1 -1 -1 -1 -1 -1 -1 -1
 0  0  0  0 -1 -1  1  1

******Linux R script result and sessionInfo()
library(FactoMineR)
x1=read.table("freqtest.txt", header=TRUE)
xrcc2=x1[,1:8]
p1=PCA(xrcc2, graph=FALSE)
p1$var
$coord
Dim.1 Dim.2 Dim.3
M1 1 -3.925231e-16 -2.287663e-48
M2 1 7.850462e-17 -3.600641e-32
M3 1 7.850462e-17 9.001602e-33
M4 1 7.850462e-17 9.001602e-33
M5 0 0.000000e+00 0.000000e+00
M6 0 0.000000e+00 0.000000e+00
M7 1 7.850462e-17 9.001602e-33
M8 1 7.850462e-17 9.001602e-33
$cor
Dim.1 Dim.2 Dim.3
M1 1 -3.925231e-16 -2.287663e-48
M2 1 7.850462e-17 -3.600641e-32
M3 1 7.850462e-17 9.001602e-33
M4 1 7.850462e-17 9.001602e-33
M5 NaN NaN NaN
M6 NaN NaN NaN
M7 1 7.850462e-17 9.001602e-33
M8 1 7.850462e-17 9.001602e-33
$cos2
Dim.1 Dim.2 Dim.3
M1 1 1.540744e-31 5.233404e-96
M2 1 6.162976e-33 1.296462e-63
M3 1 6.162976e-33 8.102884e-65
M4 1 6.162976e-33 8.102884e-65
M5 NaN NaN NaN
M6 NaN NaN NaN
M7 1 6.162976e-33 8.102884e-65
M8 1 6.162976e-33 8.102884e-65
$contrib
Dim.1 Dim.2 Dim.3
M1 16.66667 83.333333 3.229346e-31
M2 16.66667 3.333333 8.000000e+01
M3 16.66667 3.333333 5.000000e+00
M4 16.66667 3.333333 5.000000e+00
M5 0.00000 0.000000 0.000000e+00
M6 0.00000 0.000000 0.000000e+00
M7 16.66667 3.333333 5.000000e+00
M8 16.66667 3.333333 5.000000e+00
sessionInfo()
R version 2.7.1 (2008-06-23)
x86_64-redhat-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] FactoMineR_1.09
******Windows R script result and sessionInfo()
library(FactoMineR)
x1=read.table("freqtest.txt", header=TRUE)
xrcc2=x1[,1:8]
p1=PCA(xrcc2, graph=FALSE)
p1$var
$coord
Dim.1 Dim.2 Dim.3
M1 1 2.458061e-16 -4.590163e-49
M2 1 -4.916122e-17 -4.750455e-32
M3 1 -4.916122e-17 1.187614e-32
M4 1 -4.916122e-17 1.187614e-32
M5 0 0.000000e+00 0.000000e+00
M6 0 0.000000e+00 0.000000e+00
M7 1 -4.916122e-17 1.187614e-32
M8 1 -4.916122e-17 1.187614e-32
$cor
Dim.1 Dim.2 Dim.3
M1 1 2.458061e-16 -4.590163e-49
M2 1 -4.916122e-17 -4.750455e-32
M3 1 -4.916122e-17 1.187614e-32
M4 1 -4.916122e-17 1.187614e-32
M5 NaN NaN NaN
M6 NaN NaN NaN
M7 1 -4.916122e-17 1.187614e-32
M8 1 -4.916122e-17 1.187614e-32
$cos2
Dim.1 Dim.2 Dim.3
M1 1 6.042064e-32 2.106959e-97
M2 1 2.416826e-33 2.256682e-63
M3 1 2.416826e-33 1.410426e-64
M4 1 2.416826e-33 1.410426e-64
M5 NaN NaN NaN
M6 NaN NaN NaN
M7 1 2.416826e-33 1.410426e-64
M8 1 2.416826e-33 1.410426e-64
$contrib
Dim.1 Dim.2 Dim.3
M1 16.66667 83.333333 7.469228e-33
M2 16.66667 3.333333 8.000000e+01
M3 16.66667 3.333333 5.000000e+00
M4 16.66667 3.333333 5.000000e+00
M5 0.00000 0.000000 0.000000e+00
M6 0.00000 0.000000 0.000000e+00
M7 16.66667 3.333333 5.000000e+00
M8 16.66667 3.333333 5.000000e+00
sessionInfo()
R version 2.7.2 (2008-08-25)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] FactoMineR_1.09
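To put a number on the difference between the two listings, here is a rough sketch using the Dim.2 coordinates copied by hand from the printed Linux and Windows output above. The values therefore carry only the 7 significant digits that were printed, and the vector names linux and windows are made up for this illustration.

```r
# Dim.2 coordinates for M1..M8 as printed in the two listings
# (hand-copied, so limited to the printed precision).
linux <- c(-3.925231e-16, 7.850462e-17, 7.850462e-17, 7.850462e-17,
           0, 0, 7.850462e-17, 7.850462e-17)
windows <- c(2.458061e-16, -4.916122e-17, -4.916122e-17, -4.916122e-17,
             0, 0, -4.916122e-17, -4.916122e-17)
names(linux) <- names(windows) <- paste0("M", 1:8)

# Largest absolute difference between platforms: about 6.4e-16.
max(abs(linux - windows))

# Rounded to 15 decimal places, every coordinate agrees.
all(round(linux, 15) == round(windows, 15))

# all.equal() compares with a tolerance and reports no difference.
isTRUE(all.equal(linux, windows))
```

Both sets of Dim.2 values are zero to within machine precision, so the differing signs and magnitudes carry no information about the data.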
Steven McKinney wrote:
Not likely that anyone can explain, as there is not enough information in your email.

Including the contents of the freqtest.txt file was a good idea, as the posting guide suggests (the posting guide is that clearly labeled bit at the bottom that looks like this:

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

Check it out! It is cool.)

Additionally, include the command sessionInfo() and its output from all machines you refer to so maintainers know which versions of software you are running.

Also, include the output you obtained from your code (with your code being a self-contained and reproducible set of R commands).

Finally, describe what the difference is and why the difference is problematic (i.e. don't report machine precision differences, or sign differences for PCA results - PCA vector directions are arbitrary modulo 180 degrees).

I also tried mean(xrcc2) and sd(xrcc2) on both machines, the results are the same. Please explain.

The R maintainers do an amazing job of creating numerically stable platform-independent software, so you get the same results almost everywhere. (Thank you R core!)

HTH

Steve McKinney

-----Original Message-----
From: r-help-bounces at r-project.org on behalf of jathine
Sent: Tue 9/16/2008 2:19 PM
To: r-help at r-project.org
Subject: [R] Different PCA results under Windows and Linux

I ran the following R script under both Linux and Windows, and got 2 different results. Linux R version 2.7.1 and Windows R version 2.7.2.

library(FactoMineR)
x1=read.table("freqtest.txt",header=TRUE)
xrcc2=x1[,1:8]
p1=PCA(xrcc2, graph=FALSE)
p1$var

freqtest.txt file lines of text:

M1 M2 M3 M4 M5 M6 M7 M8
-1 -1 -1 -1 -1 -1 -1 -1
 0  0  0  0 -1 -1  1  1
-1 -1 -1 -1 -1 -1 -1 -1
 0  0  0  0 -1 -1  1  1

I also tried mean(xrcc2) and sd(xrcc2) on both machines, the results are the same. Please explain.

--
View this message in context: http://www.nabble.com/Different-PCA-results-under-Windows-and-Linux-tp19520449p19520449.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
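On the point raised in the thread that PCA vector directions are arbitrary modulo 180 degrees, the following is a small sketch of why a sign flip is not a real difference. It uses base R's prcomp() on made-up random data rather than FactoMineR's PCA() on freqtest.txt, so the object names and the data are assumptions for illustration only.

```r
# Made-up data for illustration; not the thread's freqtest.txt.
set.seed(1)
X <- matrix(rnorm(40), nrow = 10, ncol = 4)

# Base R PCA on centred, scaled data (stand-in for FactoMineR::PCA).
pc <- prcomp(X, center = TRUE, scale. = TRUE)

# Flip the sign of the first component in both loadings and scores.
rotation_flipped <- pc$rotation
scores_flipped <- pc$x
rotation_flipped[, 1] <- -rotation_flipped[, 1]
scores_flipped[, 1] <- -scores_flipped[, 1]

# Both versions reconstruct the same centred, scaled data ...
recon_original <- pc$x %*% t(pc$rotation)
recon_flipped <- scores_flipped %*% t(rotation_flipped)
isTRUE(all.equal(recon_original, recon_flipped))

# ... and the standard deviation of each component is unchanged.
isTRUE(all.equal(apply(scores_flipped, 2, sd), pc$sdev,
                 check.attributes = FALSE))
```

Since either orientation describes the data equally well, two platforms (or two linear algebra libraries) may legitimately return components with opposite signs.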