R crashing after successfully running compiled code
8 messages · R. Michael Weylandt, Duncan Murdoch, Brian Ripley, and 3 more
On Oct 31, 2012, at 3:13 AM, Adam Clark <atclark at umn.edu> wrote:
I'm running R 2.15.1x64, though the same problem persists for 2.13.0x32 and
2.13.0x64.
I am trying to run compiled C code using the .C convention. The
code compiles without problems, dynamically loads within the R
workspace with no problems, and even runs and gives correct results with
no problems.
However, R will randomly crash within a few minutes of successfully using
the compiled function.
For example, if I run my compiled function using:

dyn.load("mycfun.dll")
answer <- .C("mycfun", parameters...)

I get a completely sensible result that gets stored to "answer".
However, if I try to do too many things to "answer", the R exits
without warning.
I've tried dyn.unload in hopes that R would become stable afterwards, but
in this case using the function crashes R without fail.
Usually, I can plot, view, or save "answer" to a file - but never
take more than a single action before R exits. This does not appear to
depend on how long R has been open. Initially, I thought it was a bug in
the "inline" function, but I'm finding the same problem now that I'm using
the dynamically loaded file directly. I'm used to R being insanely stable,
and am somewhat mystified by this whole problem.
My next move is to learn the ".Call" convention, as I suspect that
my problem is related to my "C" function using memory that R doesn't
know is used. But - before I invest a whole lot more time on this, I'd
like to know whether anybody thinks this is likely to solve the problem.
If not, I may just want to run my code entirely in C, and forget the
R problem.
Hi Adam,

Can you make a minimal reproducible example of your C sources available? I'm relatively certain that the problem is in the memory management therein, but I obviously can't say more without seeing the code.

Michael
--
Adam Clark
University of Minnesota, EEB
100 Ecology Building
1987 Upper Buford Circle
St. Paul, MN 55108
(857)-544-6782
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
I think your C code has a bug in it. The bug might go away when you rewrite the function to work within the .Call convention, but it is probably easier to find the bug and fix it with the current code. Things to look for:

- Are you fully allocating all arrays in R before passing them to C? The C code receives a pointer and will happily write to it, whether that makes sense or not.

- Are you careful with your limits on vectors? In R, a vector is indexed from 1 to n, but the same vector in C is indexed from 0 to n-1. If the C code writes to entry n, that will eventually cause problems.

- Are you allocating memory in your C code? There are several ways to do that, depending on how you want it managed. If you do it one way and expect it to be managed in a different way, you'll get problems.

Duncan Murdoch
On Wed, 31 Oct 2012, Duncan Murdoch wrote:
I think your C code has a bug in it. The bug might go away when you rewrite the function to work within the .Call convention, but it is probably easier to find the bug and fix it with the current code. Things to look for:

- Are you fully allocating all arrays in R before passing them to C? The C code receives a pointer and will happily write to it, whether that makes sense or not.

- Are you careful with your limits on vectors? In R, a vector is indexed from 1 to n, but the same vector in C is indexed from 0 to n-1. If the C code writes to entry n, that will eventually cause problems.
Using R-devel and the following new feature
* There is a new option, options(CBoundsCheck=), which controls how
.C() and .Fortran() pass arguments to compiled code. If true
(which can be enabled by setting the environment variable
R_C_BOUNDS_CHECK to yes), raw, integer, double and complex
arguments are always copied, and checked for writing off either
end of the array on return from the compiled code (when a second
copy is made). This also checks individual elements of character
vectors passed to .C().
This is not intended for routine use, but can be very helpful in
finding segfaults in package code.
makes checking these two points a lot easier.
Are you allocating memory in your C code? There are several ways to do that, depending on how you want it managed. If you do it one way and expect it to be managed in a different way, you'll get problems.
If you can run your code under valgrind (see 'Writing R Extensions') you will usually get pointed to the exact cause. But that's for Linux, and with some care MacOS X.
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
As long as you use C (or C++ or Fortran ...), using memory that you don't own is possible. This is one reason people use languages like R. (If you program microprocessors or write operating system code, then C's willingness to let you read or write at any address is essential.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message-----
From: Adam Clark
Sent: Wednesday, October 31, 2012 8:47 AM
To: Adam Clark
Cc: r-help at r-project.org; Prof Brian Ripley
Subject: Re: [R] R crashing after successfully running compiled code

Aha - got it. My problem was that I had a pointer (*Aest) that had less memory allocated to it than I ended up storing in it (e.g. I had *Aest = 1:100, but stored into it values at positions 5:105). With that fixed, all works flawlessly. Thanks a lot for the help.

What I hadn't realized was that .C allowed you to exceed memory allocations - I'd have assumed that it would crash as soon as the function ran out of space while it was running. Instead, I guess it must have been writing beyond the space allocated for *Aest, and not running into any trouble until R tried to store something else in the same spot later (e.g. while plotting a figure). It was immensely helpful to hear from you all that the function could still have a bug, even if it ran successfully.

As I understand it, the way ".Call" passes variables, this sort of mistake would not be possible. Is this true? I'm tempted to learn the new syntax, but I also like how ".C" allows me to keep things looking more or less like normal C code.

Adam

PS - As you mention, Prof. Ripley, I am insane for trying to do this without a good debugger. Also, as you point out, valgrind doesn't run on Cygwin. If you know of any useful PC debuggers, I'd be most grateful - but if all else fails for debugging, I can just run Ubuntu in an Oracle VirtualBox.

On Wed, Oct 31, 2012 at 9:38 AM, Adam Clark <atclark at umn.edu> wrote:
Thanks for the advice.
I'll go ahead and dig through my C code. It's helpful to know that my C
code can cause R to crash AFTER successfully implementing the code.
I have made sure to account for C's vector indexing, and I think I'm
allocating my C memory, and passing information back and forth between C
and R, as I should. I'm including my input/output stuff below.
I tried including options(CBoundsCheck=TRUE) in the script both before
and after loading the C function, which doesn't seem to do much. To get it
to work, do I actually need to go into the R configuration file and edit
the default?
It would be exceedingly helpful if anybody could give me tips on where I'm
misusing pointers in the example below. That said, I certainly don't expect
the R community to debug my C code for me. If I come up with a solution,
I'll email it out over the list.
In R, I run the script:
dyn.load("mycfun.dll")
set.seed(1031)
A<-1:100
B<-runif(100)
myfunC <- function(A, B, M, N) {
  result <- as.double(rep(0, length(A) - N - (M + 1)))
  plengtht <- as.integer(length(A))
  Aest <- as.numeric(rep(0, length(A) - N - (M + 1)))
  distances <- as.numeric(rep(0, length(A)))
  neighbors <- as.integer(rep(0, M + 1))
  u <- as.numeric(rep(0, M + 1))
  w <- as.numeric(rep(0, M + 1))
  return(.C("mycfun", as.double(A), as.double(B), as.integer(M), as.integer(N),
            result = as.double(result), as.integer(plengtht), as.double(Aest),
            as.double(distances), as.integer(neighbors), as.double(u),
            as.double(w))$result)
}
fun_result<-myfunC(A,B,3,1)
This corresponds to the C code (input output only):
#include <R.h>
#include <Rmath.h>
void mycfun(double *A, double *B, int *pM, int *pN, double *result,
            int *plengtht, double *Aest, double *distances, int *neighbors,
            double *u, double *w) {
int t, i, j, n, from, to, nneigh;
double distsv, sumu, sumaest, corout;
int M = *pM;
int N = *pN;
int lengtht= *plengtht;
n=0;
/* ... running various loops over variables ... */
result[n]=corout;
n=n+1;
}
/* END */
I also have two sub-functions that manipulate "neighbors" and "distances"
- I can send the i/o for those as well, but they seem much more
straightforward, since I don't need to pass all my arguments as pointers. I
pass the pointers to internal variables at the beginning because I couldn't
index any C arrays using *pM or *pN.
Many thanks,
Adam
Adam Clark <atclark at umn.edu> wrote:
I'll go ahead and dig through my C code.
My problem was that I had a pointer (*Aest) that had less memory allocated to it than I ended up storing in it ...
On Oct 31, 2012, at 9:04 AM, William Dunlap wrote:
As long as you use C (or C++ or Fortran ...), using memory that you don't own is possible. This is one reason people use languages like R.
This seems suitable for the fortunes collection. Any seconds to this nomination?
David Winsemius, MD Alameda, CA, USA