
problem with glm (PR#452)

2 messages · Brad McNeney, Peter Dalgaard

#
I looked into the glm problem some more. It seems to be a memory problem
that occurs just after garbage collection. The easiest way for me to
reproduce the error is with lapply: I take a list and ask lapply for the
descriptive summary of each element, several times with the same list.
Most often it gives the correct summary, but just after allocSExp calls
R_gc, the summary for one of the list elements will be NULL.

E.g., for a data frame dat with five columns I get the following (having
added print statements in summary.default and in allocSExp, just after the
test for whether gc is necessary):
[1] "exiting summary.default"
R_FreeSEXP == NULL  
[1] "exiting summary.default"
[1] "exiting summary.default"
[1] "exiting summary.default"
[1] "exiting summary.default"
$Delta
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.00    1.00    1.00    1.19    2.00    2.00 

$D
NULL

$R
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.00    1.00    1.00    0.86    1.00    1.00 

$V
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
0.004552 0.740700 1.575000 1.754000 3.205000 3.390000 

$X
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    0.0     0.0     0.5     0.5     1.0     1.0 


I don't know if the following is relevant, but for some reason I've found 
I can avoid this particular problem if I make a small change in lapply in 
its main loop.  Around line 50 of apply.c:

    PROTECT(ans = allocVector(VECSXP, n));
    for(i = 0; i < n; i++) {
        INTEGER(ind)[0] = i + 1;
        VECTOR(ans)[i] = eval(R_fcall, rho);
    }

If I change it to assign the result to a temporary variable first,
everything seems to work fine:

    PROTECT(ans = allocVector(VECSXP, n));
    for(i = 0; i < n; i++) {
        INTEGER(ind)[0] = i + 1;
        tem = eval(R_fcall, rho);
        VECTOR(ans)[i] = tem;
    }

This version hasn't given me any NULLs yet.
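
One way the temporary could matter, sketched in self-contained C under an
assumption that nothing above confirms: that the storage behind VECTOR(ans)
can be relocated by a garbage collection triggered inside eval. C does not
specify whether the destination address in `x[i] = f()` is computed before
or after the call to f, so the one-statement form may write through a stale
pointer. The names slots, init_slots, compute_and_maybe_move and
fill_with_temporary are hypothetical stand-ins, not R internals.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for storage that can move, e.g. under a
   compacting garbage collector. This is NOT R's allocator. */
static int *slots = NULL;
static size_t nslots = 0;

/* (Re)create the movable storage with n zeroed slots. */
static void init_slots(size_t n)
{
    free(slots);
    slots = calloc(n, sizeof *slots);
    assert(slots != NULL);
    nslots = n;
}

/* Simulates a callee that relocates `slots` before returning a value,
   the way a collection inside eval might (an assumption). */
static int compute_and_maybe_move(size_t i)
{
    int *moved = malloc(nslots * sizeof *moved);
    assert(moved != NULL);
    memcpy(moved, slots, nslots * sizeof *moved);
    free(slots);            /* any previously computed &slots[i] is now stale */
    slots = moved;
    return (int)(10 * (i + 1));
}

/* Safe pattern: complete the call first, then form the destination
   address from the current value of `slots`. */
static void fill_with_temporary(void)
{
    for (size_t i = 0; i < nslots; i++) {
        int tem = compute_and_maybe_move(i);
        slots[i] = tem;
        /* The one-statement form
               slots[i] = compute_and_maybe_move(i);
           lets the compiler read `slots` before the call, in which
           case the store would land in freed memory. */
    }
}
```

Calling init_slots(5) and then fill_with_temporary() leaves slots holding
10, 20, 30, 40, 50. Whether the one-statement form misbehaves depends on the
compiler and optimization level, so it is left as a comment rather than
executed.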
On 20 Feb 2000, Peter Dalgaard BSA wrote:

            
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
Brad McNeney <mcneney@cs.sfu.ca> writes:
Now that is pretty darn odd... Looks quite a bit like a compiler bug.
Those two pieces of code should be equivalent as far as I can see.

One thing that you might do: objdump -dS apply.o allows you to see
what code the compiler generates for the two cases. Also, of course,
you can try reducing the level of optimization.


Alternatively, the mere addition of "tem" changes the local memory use
of that function so that something that got clobbered before doesn't
get clobbered now.

The effect doesn't seem to be present on Intels:
[1] FALSE