[Bioc-devel] affxparser: Core dump with R 2.14.x on OSX [take #2]
On Fri, Jan 20, 2012 at 3:51 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
On Fri, Jan 20, 2012 at 3:20 PM, Henrik Bengtsson <hb at biostat.ucsf.edu> wrote:
[bringing back to the list, because we could need some help from other developers with access to various OSX versions] Hi Dan, thanks for looking into this. On Fri, Jan 20, 2012 at 2:17 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
On Fri, Jan 20, 2012 at 1:57 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
On Fri, Jan 20, 2012 at 1:44 PM, Dan Tenenbaum <dtenenba at fhcrc.org> wrote:
Hi Henrik, On Fri, Jan 20, 2012 at 10:33 AM, Henrik Bengtsson <hb at biostat.ucsf.edu> wrote:
Hi, this is a kind request for the BioC team to have another look at fixing the binary affxparser builds. Quite a few OSX users on R v2.14.0 have R crashing because of this problem.
Thanks for the prompt and the detailed problem report.
Since thread 'Re: [Bioc-devel] affxparser: Core dump with R 2.14.0 on OSX' on Nov 7, 2011 [https://stat.ethz.ch/pipermail/bioc-devel/2011-November/002969.html] became cluttered with mistakes, I'm starting a new thread on the same topic. PROBLEM: The binary build of affxparser v1.26.2 for OSX provided by Bioconductor is broken and causes R v2.14.0 to crash ("core dump", "abort trap", ...) on OSX 10.6 ("Snow Leopard") and (I assume; someone please confirm) OSX 10.7 ("Lion"),
I can confirm that it happens on Lion too.
but not OSX 10.5 ("Leopard"). A
reproducible example is:
library("affxparser");
readCdfHeader("Mapping10K_Xba142.cdf");
which should return a named header. (Download CDF file:
http://www.aroma-project.org/data/annotationData/chipTypes/Mapping10K_Xba142/Mapping10K_Xba142.CDF.gz
; 2.2Mb). Another example is
[http://www.aroma-project.org/data/annotationData/chipTypes/Mapping250K_Nsp/Mapping250K_Nsp.cdf.gz]:
library("affxparser");
readCdfHeader("Mapping250K_Nsp.cdf");
CURRENT WORKAROUNDS:
- Install affxparser from source
[http://bioconductor.org/packages/2.9/bioc/src/contrib/affxparser_1.26.2.tar.gz].
- Install Kasper Hansen's binary build (not universal?)
[http://www.braju.com/R/repos/osx_10.6/affxparser_1.26.2.tgz] that
works on (at least) OSX 10.6.8.
See also aroma.affymetrix thread 'OSX 10.6 & 10.7 users: Workaround
for faulty BioC build of affxparser v1.26.2' on Jan 14, 2012
[https://groups.google.com/forum/#!topic/aroma-affymetrix/lEfDanThLEA/discussion]
TROUBLESHOOTING:
I can confirm that installing from source works on an OSX 10.6.8
machine with R v2.14.1
(http://cran.r-project.org/bin/macosx/R-2.14.1.pkg). Installing
Kasper's binary build also works. I have a limited understanding of the
different types of OSX package binaries and only have access to OSX 10.6.8,
making it hard for me to do any more troubleshooting, but as far as I
understand there is something wrong with the way affxparser is built
on the Bioconductor servers.
An important fact to bear in mind is that the BioC Mac build servers are running Leopard (OS X 10.5.8). It's a bit tricky to debug since it works fine on the platform it's built on... but using primitive means (Rprintf() statements), I was able to narrow down the problem to the FileHeaderReader::ReadMagicNumber() function in
affxparser/src/fusion_sdk/calvin_files/parsers/src/FileHeaderReader.cpp

In that function, the expression
if (fileMagicNumber != DATA_FILE_MAGIC_NUMBER)
evaluates to true, and therefore an affymetrix_calvin_exceptions::InvalidFileTypeException is thrown. I don't really know why the magic number is wrong, or would vary between operating systems, but perhaps this gives you something to go on?

BTW, the trace is:
R:
readCdfHeader()
C++:
R_affx_get_cdf_file_header()
FusionCDFData::ReadHeader()
FusionCDFData::CreateObject()
FusionCDFData::IsCalvinCompatibleFile()
GenericFileReader::ReadFileHeaderNoDataGroupHeader()
FileHeaderReader::Read()
FileHeaderReader::ReadMagicNumber()

Hope this helps. If I can be of assistance in further debugging this, please let me know.
I should also have mentioned that in the ReadMagicNumber() function, fileMagicNumber == 67 with the file Mapping10K_Xba142.cdf, and the expected magic number is 59.
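The magic number reported above can be checked independently of affxparser. The following is a minimal sketch, assuming (as suggested elsewhere in this thread) that the magic number is the first byte of the file. The names readFirstByte, hasExpectedMagic and kExpectedMagic are hypothetical and are not part of the Fusion SDK; only the values 59 and 67 come from the debug output in this thread.

```cpp
#include <fstream>
#include <string>

// Expected magic number as reported in this thread (59).
// The constant name is hypothetical, not the SDK's.
constexpr int kExpectedMagic = 59;

// Return the first byte of the file as a value in 0..255,
// or -1 if the file cannot be opened or is empty.
int readFirstByte(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    char c;
    if (!in.get(c)) {
        return -1;
    }
    return static_cast<unsigned char>(c);
}

// True only if the file could be read and its first byte
// equals the expected magic number.
bool hasExpectedMagic(const std::string& path) {
    return readFirstByte(path) == kExpectedMagic;
}
```

Calling readFirstByte("Mapping10K_Xba142.cdf") on each OSX version would show whether the byte on disk is read the same way everywhere, separately from whatever the package build does afterwards.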
Sorry for all the emails, but here's one more piece of info: If I run the package with the debug statements on pitt, our Leopard build machine, it works fine, as expected, but it also reports that fileMagicNumber is 67. So the exception is still thrown, but execution continues.
So when "execution continues" despite the incorrect magic number, do you still get a valid CDF header readout at the R prompt?
Yes. Or at least, I assume it is valid. But no other errors are displayed. Here is what it displays:
$ncols
[1] 658
$nrows
[1] 658
$nunits
[1] 10208
$nqcunits
[1] 9
$refseq
[1] ""
$chiptype
[1] "Mapping10K_Xba142"
$filename
[1] "./Mapping10K_Xba142.CDF"
$rows
[1] 658
$cols
[1] 658
$probesets
[1] 10208
$qcprobesets
[1] 9
$reference
[1] ""
Whereas on my Lion machine, execution ends (after a pause) with "Abort trap: 6". So I am not sure whether this exception is really part of the problem, or just a red herring.
It could be a red herring; the incorrectly read magic header (first byte in the file) may just be a side effect of something more complicated, but it is definitely a start. It is also a hint that we could/should update affxparser to at least catch this and give an error instead of crashing (but I'm not sure if we should play with such updates while troubleshooting the real cause).

There is one more important clue available. This problem started to occur with BioC 2.9 and R v2.14.x. Previous BioC builds of affxparser did not cause this, and even when forcing an installation of the old affxparser v1.24.0 binaries on R v2.14.1 on OSX 10.6.8
(http://bioconductor.org/packages/2.8/bioc/bin/macosx/leopard/contrib/2.13/affxparser_1.24.0.tgz)
it works. So, something "happened" between:

affxparser_1.24.0.tgz:
Packaged: 2011-04-15 09:35:06 UTC; biocbuild
Built: R 2.13.0; universal-apple-darwin9.8.0; 2011-04-15 16:46:28 UTC; unix
Archs: i386, ppc, x86_64

and

affxparser_1.26.2.tgz:
Packaged: 2011-11-17 06:38:13 UTC; biocbuild
Built: R 2.14.0; universal-apple-darwin9.8.0; 2011-11-17 15:39:53 UTC; unix
Archs: i386, ppc, x86_64

(the first known report on this problem is from Nov 7, 2011 [http://goo.gl/ZqBsW], which is before the date of the latter). There is only one real update in affxparser v1.26.1, but that is in pure R code and, more importantly, not in code used in this bug report. So, rebuilding affxparser v1.24.0 on the BioC server will most likely cause the same crash as affxparser v1.26.2 does.

BTW, are you planning to update to R v2.14.1 on the BioC OSX servers? With some luck, maybe that will fix it.
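The suggestion above, that affxparser could catch the failure and report an error instead of crashing, can be sketched as follows. This is only an illustration under stated assumptions: InvalidFileType, checkMagic and tryCheckMagic are hypothetical stand-ins, not the Fusion SDK's classes or functions, and the thread does not establish that an escaping exception is what triggers the "Abort trap".

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for
// affymetrix_calvin_exceptions::InvalidFileTypeException.
struct InvalidFileType : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for the magic-number check: throws on mismatch.
void checkMagic(int fileMagic, int expectedMagic) {
    if (fileMagic != expectedMagic) {
        throw InvalidFileType("unexpected magic number " +
                              std::to_string(fileMagic));
    }
}

// Boundary wrapper: catches anything thrown by the check and turns it
// into a message. Returns an empty string on success.
std::string tryCheckMagic(int fileMagic, int expectedMagic) {
    try {
        checkMagic(fileMagic, expectedMagic);
        return "";
    } catch (const std::exception& e) {
        return e.what();
    } catch (...) {
        return "unknown C++ exception";
    }
}
```

With the values reported in this thread, tryCheckMagic(67, 59) returns a non-empty message rather than throwing; code at the R entry point could then pass such a message to R's error mechanism instead of letting the exception travel further.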
pitt is already running R 2.14.1, see: http://bioconductor.org/checkResults/release/bioc-LATEST/pitt-NodeInfo.html
It would be great if someone else with OSX 10.5.8 ("Leopard") could
build/install affxparser v1.26.2 from source and share it with us for
testing on OSX 10.6 & 10.7; that would help narrow down the source of
the problem. If such a build works, then it is much more likely that
there is something wrong with the BioC OSX 10.5.8 server setup, whereas if
it also crashes, then we might have to search for the problem
elsewhere.
Sounds good, although we already tried this test, with another Leopard machine we had, and the resulting package also crashed on newer OSes. But it is probably still valuable to have someone else try this, and if they create a package that works on newer OSes, then we can start to look at differences in compilers, etc.

BTW, the last known 'good' version was built on pelham, here is the node info for that machine:
http://bioconductor.org/checkResults/2.8/bioc-20111021/pelham-NodeInfo.html

It looks like the same versions of C and C++ compilers were used on both. One difference is that R CMD config CXX on pelham is "g++-4.2 -arch i386" whereas on pitt it is "g++ -arch i386", however, on pitt,
ls -l /usr/bin/g++
lrwxr-xr-x  1 root  wheel  7 Jun 29  2011 /usr/bin/g++ -> g++-4.2
so I guess they are really equivalent. (The same is true of gfortran, if that matters.)

This makes me wonder if the "something" that "happened" could have happened between R 2.13 and 2.14. Of course, the problem seems to be deep inside C++ code that is not even using SEXPs, but it's possible something could have changed in .Call()...

I'm in the process of doing further testing (trying to build my debug version on Lion and seeing if the magic number mismatch occurs). Will let you know what happens.
A couple more data points: If I build affxparser on Lion, the magic number mismatch still occurs, but the R function call completes without error. If I comment out the "throw" in FileHeaderReader::ReadMagicNumber(), and build it on pitt, the same error occurs on Lion. So it doesn't really matter, apparently, whether that exception is thrown, but that function is as far as I am able to trace before the "abort trap". FWIW, Dan
Dan
Thanks, Henrik
Hope this is helpful.... Dan
Dan
Thanks, Dan
Thanks Henrik
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel