Wrong config check for __libc_stack_end

12 messages · Martin Maechler, Alba Pompeo, Simon Urbanek

#
    > Here is my log from 'make check' using an Intel i5 64-bit
    > processor - http://pastebin.com/raw/N6SYAuFX Here is
    > Isaac's log from 'make check' using an Intel Atom 32-bit
    > processor - http://pastebin.com/raw/sey6DEk9

    > We are both on Alpine Linux, which uses the musl
    > libc. http://www.musl-libc.org/

    > Thank you very much.

It probably would have helped to choose a different subject,
which I now do.

    > On Thu, Jan 28, 2016 at 9:54 AM, Alba Pompeo
    > <albapompeo at gmail.com> wrote:
    >> Hello, developers of R.
    >> 
    >> I have been unsuccessfully trying to build R on a musl
    >> libc system for the last days.  ./configure works, but
    >> make fails. The command that errors out is here -
    >> http://pastebin.com/raw/UwFRsiqT
    >> 
    >> It was brought to my attention that this is a (very
    >> longstanding) abuse of a private glibc symbol in R.
    >> 
    >> In R 3.2.3, it seems that configure is trying to test for
    >> it on Linux.  It apparently fails to accurately test (as
    >> demonstrated by the link error), perhaps because the test
    >> program does not actually *use* __libc_stack_end so it
    >> gets optimized out. (See line 35500 or so in
    >> R-3.2.3/configure.)  Ideally, the test program would
    >> check that a pointer to __libc_stack_end is non-null, but
    >> that's an autoconf bug.

So, ideally someone who knows autoconf much better than I do
should submit a bug report to the autoconf maintainers.
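
For illustration only, a link test that cannot be optimized away might look like the sketch below. It is not the test that R's configure actually runs, and the function name stack_end_probe is made up here. Reading the variable into a volatile forces a real reference, so the link step should fail on a libc that does not export __libc_stack_end (musl, per this thread) and succeed on glibc.

```c
#include <stdint.h>

/* glibc exports this private variable; a libc without it makes the
   link step fail, which is what a configure check wants to observe. */
extern void *__libc_stack_end;

/* Hypothetical helper: returns 1 if the symbol is usable, 0 otherwise.
   The volatile store keeps the compiler from dropping the reference. */
int stack_end_probe(void)
{
    volatile uintptr_t end = (uintptr_t) __libc_stack_end;
    return end != 0;
}
```

A configure-style test program would call stack_end_probe() from main() and use the result to decide its exit status.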

Back to R: I'm not familiar with that part of the code, neither
the configuration nor the usage (in  R/src/unix/system.c ).
However, that code seems to be using a widely available glibc
"feature" which does help make R startup (a very tiny bit ??)
faster.

    >> A work around was to 'export r_cv_libc_stack_end=no'
    >> before configuring R.  

which *does* solve that problem, right?

    >> However, there are a couple little issues with non-ASCII
    >> text and a *lot* of math differences, many of which say
    >> "*no* convergence: NOTIFY R-core!".

Hmm, I may be off, but these would look entirely unrelated to
the libc_stack_end availability, wouldn't they ?

Maybe you / the musl developers should try to make those C
libraries more "standard", notably because I would see math
differences as something pretty grave for R, and indeed, I would
not want to use a platform where R's math functions work
incompatibly with all other platforms ... but maybe I
misunderstand completely.

Hmm... I've found this,

http://wiki.musl-libc.org/wiki/Functional_differences_from_glibc#Floating-point_and_mathematical_library

which makes what you say above more relevant/interesting.

Still, from this thread I get that the C source code of R needs
considerable configuration patches before R can work with musl.
But that needs another thread, something like  'Building R with musl'.

    >> Until these are resolved, R can't be packaged for
    >> distributions that use musl, such as Alpine Linux.

which I agree would not be ideal.
Martin

--
Martin <Maechler at stat.math.ethz.ch>  http://stat.ethz.ch/people/maechler
Seminar für Statistik, ETH Zürich
#
On Feb 1, 2016, at 4:16 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:

            
Agreed. Since there is actually no abuse, the report was easily dismissed as bogus given the subject.

@Alba, can you please check that your hypothesis actually holds true and that the latest R from trunk fixes the check for you?

No, it's actually very crucial, as it is used to detect stack overflows.

Cheers,
Simon
#
@Simon. Here's what I did.
I checked out R revision 70059.
Ran export r_cv_libc_stack_end=no. (otherwise it would give that error
we talked about before)
Ran ./configure --without-recommended-packages. (otherwise it would
complain of not finding ./src/library/Recommended/MASS_*.tar.gz)
Ran make.
Ran make check. Log is here - http://pastebin.com/raw/cGJgqB8p

What do you think? Is there anything else I can do to help solve this issue?



On Mon, Feb 1, 2016 at 11:36 AM, Simon Urbanek
<simon.urbanek at r-project.org> wrote:
#
On Feb 1, 2016, at 9:56 AM, Alba Pompeo <albapompeo at gmail.com> wrote:

            
No, the point was that you use a clean checkout (do NOT build in the sources) and don't override anything.
#
On Feb 1, 2016, at 9:56 AM, Alba Pompeo <albapompeo at gmail.com> wrote:

            
No, the whole point was to test this behavior. I see that the fix is in configure.ac but not configure so you'll need to run something like
aclocal -I m4 && autoconf
to update it.

Also please don't build in the sources - you'll have trouble making sure they are clean. It is recommended to build in a separate directory (see the docs).
I guess you forgot to run
tools/rsync-recommended 
perhaps? It doesn't matter either way for the above issues, but it's probably better to build with recommended packages.

Cheers,
Simon
#

        
> On Feb 1, 2016, at 4:16 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
[..............]   

    >> Back to R: I'm not familiar with that part of the code, neither
    >> the configuration nor the usage (in  R/src/unix/system.c ).
    >> However, that code seems to be using a widely available glibc
    >> "feature" which does help make R startup (a very tiny bit ??)
    >> faster.
    >> 

    > No, it's actually very crucial as it is used to detect stack overflows.

    > Cheers,
    > Simon


Well, I think you misunderstood what I meant to say (or else I'd be
happy for clarification if I misunderstood you):

The  #if ... #elif ... #else ... #endif
branch which *uses* the __libc_stack_end "variable" would
hopefully be a speedup in comparison with the alternatives; from
system.c  mentioned above:


#if defined(HAVE_LIBC_STACK_END)
    R_CStackStart = (uintptr_t) __libc_stack_end;
#elif defined(HAVE_KERN_USRSTACK)
    {
	/* Borrowed from mzscheme/gc/os_dep.c */
	int nm[2] = {CTL_KERN, KERN_USRSTACK};
	void * base;
	size_t len = sizeof(void *);
	(void) sysctl(nm, 2, &base, &len, NULL, 0);
	R_CStackStart = (uintptr_t) base;
    }
#else
    if(R_running_as_main_program) {
	/* This is not the main program, but unless embedded it is
	   near the top, 5540 bytes away when checked. */
	R_CStackStart = (uintptr_t) &i + (6000 * R_CStackDir);
    }
#endif
    if(R_CStackStart == -1) R_CStackLimit = -1; /* never set */

    /* printf("stack limit %ld, start %lx dir %d \n", R_CStackLimit,
	      R_CStackStart, R_CStackDir); */
}
#endif

so I'd hope that typically  R_CStackStart  would be set usefully
also when the  __libc_stack_end  is not available.

If not, that would mean that for the 'musl' lovers, R would not
be able to detect stack overflows.... which would probably be
quite undesirable.
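
To make the discussion concrete, here is a minimal sketch of how a recorded stack start can be used to detect an impending overflow. It is not R's implementation: stack_info, stack_usage, stack_would_overflow and depth_until_limit are hypothetical names, and the start is assumed to be the address of a local variable in an outer frame (in the spirit of the #else branch above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping, loosely analogous to R_CStackStart and
   R_CStackLimit; this is an illustration, not R's code. */
typedef struct {
    uintptr_t start;  /* estimated address near the top of the stack */
    size_t    limit;  /* bytes we allow ourselves to use */
} stack_info;

/* Distance between the recorded start and the current frame.  Taking
   the absolute difference makes the growth direction irrelevant. */
static size_t stack_usage(const stack_info *si)
{
    volatile char marker;
    uintptr_t here = (uintptr_t) &marker;

    assert(si != NULL);
    return (size_t) (here > si->start ? here - si->start
                                      : si->start - here);
}

/* Nonzero if using `extra` more bytes would exceed the limit. */
static int stack_would_overflow(const stack_info *si, size_t extra)
{
    return stack_usage(si) + extra > si->limit;
}

/* Recurse with a 1 KiB frame until the check fires; returns the depth
   at which it fired, or -1 if max_depth was reached first. */
static int depth_until_limit(const stack_info *si, int depth, int max_depth)
{
    volatile char pad[1024];
    int found;

    pad[0] = 0;
    if (depth >= max_depth) return -1;
    if (stack_would_overflow(si, 0)) return depth;
    found = depth_until_limit(si, depth + 1, max_depth);
    pad[1023] = (char) found;  /* use after the call prevents a tail call */
    return found;
}
```

A caller would record the address of a local in its outermost frame as start, pick a limit (for example 64 * 1024 bytes), and call stack_would_overflow() before recursing further. How good the answer is depends entirely on how good the estimate of start is.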
#
Here's what I did.

svn checkout https://svn.r-project.org/R/trunk/
cd ./trunk
aclocal -I m4 && autoconf
tools/rsync-recommended
cd ..
mkdir build
cd build
../trunk/configure
make
make check

On make check it gives an error.
Here's the log.
http://pastebin.com/raw/1qfjqQY2


On Mon, Feb 1, 2016 at 1:53 PM, Simon Urbanek
<simon.urbanek at r-project.org> wrote:
#
But it looks like R is working. I found the R binary at build/bin/R,
ran it, and it works.
Should I be worried about the make check log?

@Isaac Dunham
Can you please test this on your system too?
Maybe R can be packaged soon?

Ciao.
On Mon, Feb 1, 2016 at 3:33 PM, Alba Pompeo <albapompeo at gmail.com> wrote:
#
> Here's what I did.
    > svn checkout https://svn.r-project.org/R/trunk/
    > cd ./trunk
    > aclocal -I m4 && autoconf
    > tools/rsync-recommended
    > cd ..
    > mkdir build
    > cd build
    > ../trunk/configure
    > make
    > make check

    > On make check it gives an error.
    > Here's the log.
    > http://pastebin.com/raw/1qfjqQY2

Thank you.  It shows some output differences for complex
arithmetic, which *may* be a bad sign for the  musl routines, or
the (also alternative ??)  math lib  you have on your platform.
But these differences were not what led to the failure;
rather, the reason is close to the end of the log:
------------------------------------------------
make[3]: *** [reg-tests-1c.Rout] Error 1
------------------------------------------------

and these are the very latest regression checks, so they should not fail.
If you want, you can also make the
   tests/reg-tests-1c.Rout.fail

file available via a link as above,
but to me, it currently looks like there needs to be a bit more work
on your system libraries (or possibly on our configuration) side
before you should bundle R with your Alpine Linux.

I'd call it "unsafe" for now.
Martin

--
Martin Maechler, ETH Zurich and R Core Team.
>>>> On Feb 1, 2016, at 4:16 AM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>>>> 
    >>>>>>>>>> Alba Pompeo <albapompeo at gmail.com>
    >>>>>>>>>> on Fri, 29 Jan 2016 08:23:26 -0200 writes:

  [.........]

#
Here is tests/reg-tests-1c.Rout.fail -
http://pastebin.com/raw/3QVDUBwT

About the libm, I don't know which one R uses.
musl has its own libm. http://git.musl-libc.org/cgit/musl/tree/src/math
I think I also have openlibm installed, but I don't think that's used.

Any more information I can give to help debug this?

Thanks.


On Mon, Feb 1, 2016 at 3:49 PM, Martin Maechler
<maechler at stat.math.ethz.ch> wrote:
#
On Feb 1, 2016, at 12:32 PM, Martin Maechler <maechler at stat.math.ethz.ch> wrote:

            
But how do you know that the kernel method will work there? What I meant is that the facility it provides is important; it's unclear under what circumstances it is necessary or not, since that depends on the kernel, OS, hardware etc. Hence __libc_stack_end is not just a speed hack (which your comment seemed to imply): when available it is reliable and thus preferred, whereas the fallback methods may or may not work, and with varying degrees of reliability.

Cheers,
Simon
#
AP> Here is tests/reg-tests-1c.Rout.fail -
    AP> http://pastebin.com/raw/3QVDUBwT


Thank you .... so it fails only at the very very end,
where I had added regression checks for a very recent bug
fix... but when I see your result ... embarrassingly ... I guess
that the *check* itself may have been wrong.

Otherwise inspecting that file:

- shows some possible printf
  differences between musl and glibc  but they don't look problematic.

- considerably more problematic seems to me that the random
  numbers differ in some places *AFTER* a set.seed() call.  That
  should not happen {unless we call a randomized algorithm
  somewhere before that, and the algorithm does not converge in
  the same number of steps ... that's possible... e.g., kmeans()
  is "famous" for that}
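
The mechanism in the second point can be sketched with a toy generator. Everything below is an assumption made for illustration: toy_rng is a plain 64-bit linear congruential generator (constants from Knuth's MMIX), not one of R's generators, and first_draw_after is a hypothetical stand-in for "a randomized algorithm that consumes one draw per iteration until it converges".

```c
#include <assert.h>
#include <stdint.h>

/* Toy 64-bit linear congruential generator; a stand-in for
   illustration, not one of R's generators. */
typedef struct { uint64_t state; } toy_rng;

static void toy_seed(toy_rng *g, uint64_t seed) { g->state = seed; }

static uint64_t toy_next(toy_rng *g)
{
    g->state = g->state * 6364136223846793005ULL + 1442695040888963407ULL;
    return g->state;
}

/* Hypothetical randomized algorithm: consumes one draw per iteration
   and "converges" after `iterations` steps.  Returns the first draw
   the caller would see afterwards. */
static uint64_t first_draw_after(uint64_t seed, int iterations)
{
    toy_rng g;
    int i;

    assert(iterations >= 0);
    toy_seed(&g, seed);
    for (i = 0; i < iterations; i++)
        (void) toy_next(&g);
    return toy_next(&g);
}
```

With the same seed, first_draw_after(42, 10) and first_draw_after(42, 11) differ, so a platform on which the algorithm needs one extra iteration would see different "random" numbers from that point on, even though the seed was set identically.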
  

    AP> About the libm, I don't know which one R uses.  musl has
    AP> its own
    AP> libm. http://git.musl-libc.org/cgit/musl/tree/src/math
    AP> I think I also have openlibm installed, but I don't think
    AP> that's used.

    AP> Any more information I can give to help debug this?

To confirm my suspicion above, please send me privately
the result of 'f1' from that R script.

Martin





    AP> On Mon, Feb 1, 2016 at 3:49 PM, Martin Maechler
AP> <maechler at stat.math.ethz.ch> wrote:
>>>>>>> Alba Pompeo <albapompeo at gmail.com> on Mon, 1 Feb
    >>>>>>> 2016 15:33:11 -0200 writes:
    >> [.........]