Arules: R Crashes when running eclat with tidLists=TRUE

5 messages · Matias Salibian-Barrera, Tom D. Harray, Brian Ripley +1 more

#
Hello,

I'm using the eclat function of the arules package (1.0-6) to
identify frequent itemsets. I need the tidLists, but if I set
tidLists=TRUE in the call, R crashes (Windows XP Professional SP3,
32 bit, R version 2.12.1 (2010-12-16), reproducible on two different
computers) with one of two different error messages or none at all.
Minimal examples are:

	library(arules)
	data("Adult")

	eclat(Adult, parameter=list(support=0.05, tidLists=FALSE)) # OK
	eclat(Adult, parameter=list(support=0.50, tidLists=TRUE))  # OK
	eclat(Adult, parameter=list(support=0.05, tidLists=TRUE))  # crashes

I started to search for the cause and found that the problem must be
located in the external function reclat, which is called in eclat.R
(from the source of the arules package):

    result <- .Call("reclat",
        ## transactions
        items@p,
        items@i,
        items@Dim,
        ## parameter
        parameter, control,
        data@itemInfo,
        PACKAGE = "arules")

Then I looked into the source code of reclat.c, trying to follow the
tidLists=TRUE parameter, but I have no C programming experience. It
looks to me as if the problem may be in the item set report function
"_report_R", where memory is allocated and/or released (see code
fragments below).

My questions are:

a) Can anyone else confirm the problem?

b) Does anyone know whether this is a problem of the binary package
(e.g. how it was compiled) or a problem in the source code of reclat?

c) Does anyone know whether the problem can be fixed, and how?

Thanks and regards,

Dirk


/*----------------------------------------------------------------------
  Item Set Report Function
----------------------------------------------------------------------*/

static void _report_R (int *ids, int cnt, int supp, int *tal, void *data)
{
(...)

	if (flags & OF_LIST) {
			  vec1 = (int*)realloc(ruleset->trnb, size1 *sizeof(int));
			  if (!vec1) {
				  if (vec) free(vec);
				  if (vec2) free(vec2);
				  _cleanup(); error(msg(E_NOMEM));}
			  ruleset->trnb = vec1;
		  }
(...)
	if (flags & OF_LIST) {        /* if to list the transactions, */
	  h = ruleset->trtotal;
	  if (supp < 0) {             /* if bit vector representation */
		  for (i = 0; i < tacnt; i++) {  /* traverse the bit vector */
			  if (h >= size2) {
				  size2 += (size2 > BLKSIZE) ? (size2 >> 1) : BLKSIZE;
				  vec1 = (int*)realloc(ruleset->trans, size2 *sizeof(int));
				  if (!vec1) {
					  if (vec) free(vec);
					  if (vec2) free(vec2);
					  _cleanup(); error(msg(E_NOMEM));}
				  ruleset->trans = vec1;
			  }
			  if (tal[i >> BM_SHIFT] & (1 << (i & BM_MASK))) {
				  /*Rprintf(" %d", i+1);*/
				  ruleset->trans[h] = i;
				  h++;
			  }
		  }
	  }                       /* print the indices of set bits */
	  else {                      /* if list of transaction ids */
		  if ((h + supp) >= size2) {
			  while ((h + supp) >= size2) {
				  size2 += (size2 > BLKSIZE) ? (size2 >> 1) : BLKSIZE;
			  }
			  vec1 = (int*)realloc(ruleset->trans, size2 *sizeof(int));
			  if (!vec1) {
				  if (vec) free(vec);
				  if (vec2) free(vec2);
				  _cleanup(); error(msg(E_NOMEM));}
			  ruleset->trans = vec1;
		  }
		  for (i = 0; i < supp; i++) {
			  /*Rprintf(" %d", tal[i]); */
			  ruleset->trans[h] = tal[i];
			  h++;
		  }                           /* traverse and print */
	  }
	  ruleset->trtotal = ruleset->trnb[ruleset->rnb] = h;
	}                             /* the transaction identifiers */
(...)
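
For reference, the grow-then-write pattern used in the fragments above can be sketched in isolation. This is only a minimal sketch, not the arules code: IntBuf, grow_cap, intbuf_append and intbuf_append_set_bits are hypothetical names, and the values of BLKSIZE, BM_SHIFT and BM_MASK are assumptions, since their real definitions are not quoted here. It shows the invariant such code has to maintain: the capacity is checked against the index about to be written before every write, and the old pointer stays valid if realloc fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed values; the real definitions in reclat.c are not shown in the post. */
#define BLKSIZE  256      /* allocation block size (in ints)            */
#define BM_SHIFT 5        /* log2 of bits per word, assuming 32-bit words */
#define BM_MASK  0x1f     /* bit index within a word                    */

/* Hypothetical growable int buffer; not a type from arules. */
typedef struct {
    int    *data;
    size_t  len;   /* number of ints in use    */
    size_t  cap;   /* number of ints allocated */
} IntBuf;

/* Same growth rule as the fragment: +BLKSIZE while small, then +50%,
 * repeated until the capacity covers `needed`. */
size_t grow_cap(size_t cap, size_t needed)
{
    while (cap < needed)
        cap += (cap > BLKSIZE) ? (cap >> 1) : BLKSIZE;
    return cap;
}

/* Append n ints. Returns 0 on success, -1 if realloc fails
 * (in which case the buffer is left unchanged and still valid). */
int intbuf_append(IntBuf *b, const int *src, size_t n)
{
    assert(b != NULL && (src != NULL || n == 0));
    size_t needed = b->len + n;
    if (needed > b->cap) {
        size_t cap = grow_cap(b->cap, needed);
        int *p = realloc(b->data, cap * sizeof *p);
        if (!p) return -1;        /* keep the old block; caller cleans up */
        b->data = p;
        b->cap  = cap;
    }
    if (n > 0) memcpy(b->data + b->len, src, n * sizeof *src);
    b->len = needed;
    return 0;
}

/* Bit-vector case: append the index of every set bit among the
 * first nbits bits, checking capacity before each single write. */
int intbuf_append_set_bits(IntBuf *b, const unsigned *bits, size_t nbits)
{
    for (size_t i = 0; i < nbits; i++) {
        if (bits[i >> BM_SHIFT] & (1u << (i & BM_MASK))) {
            int idx = (int)i;
            if (intbuf_append(b, &idx, 1) != 0) return -1;
        }
    }
    return 0;
}
```

A caller would start from a zero-initialised IntBuf, call intbuf_append for the list-of-ids case or intbuf_append_set_bits for the bit-vector case, and free the data pointer at the end. Whether "_report_R" breaks this invariant somewhere in the elided parts cannot be told from the fragments alone.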
#
Hello,

This simple SVD calculation (commands copied immediately below) crashes on my Ubuntu machine (R 2.13.0). However, it works fine on my Windows 7 machine, so I suspect there's a problem with (my?) Ubuntu and/or R. Can anybody else reproduce it (with Ubuntu 11.04)? Thanks in advance.

p <- 500
n <- 300
set.seed(1234)
x <- matrix(rnorm(n*p), n, p)
sih <- var(x)
b <- svd(sih)

produces:

 *** caught illegal operation ***
address 0x42b8c9, cause 'illegal operand'

Traceback:
 1: .Call("La_svd", jobu, jobv, x, double(min(n, p)), u, v, "dgsedd", PACKAGE = "base")
 2: La.svd(x, nu, nv)
 3: svd(sih)

I'm using Ubuntu 11.04 and
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status
major          2
minor          13.0
year           2011
month          04
day            13
svn rev        55427
language       R
version.string R version 2.13.0 (2011-04-13)

Thanks,

Matias
#
Update

I had the chance to test the issue tonight using R version 2.12.2
on a Linux (Ubuntu 10.04.2, x86_64, kernel 2.6.32-32-generic) system:

It also crashes (with a support of 0.01 instead of the 0.05 posted
earlier) when running

	eclat(Adult, parameter=list(support=0.01, tidLists=TRUE))

but I got some more information, which supports my initial guess:

	 *** caught segfault ***
	address 0xa481b70, cause 'memory not mapped'

	Traceback:
	 1: .Call("reclat", items@p, items@i, items@Dim, parameter,
                   control, data@itemInfo, PACKAGE = "arules")
	 2: eclat(Adult, parameter = list(support = 0.01,
                   tidLists = TRUE))

It also crashes on a virtual x86_32 Windows XP Home SP3 running on the
same linux machine ...


Does anyone know whether the problem can be fixed, and how?


Thanks and regards,

Dirk
#
On Fri, 3 Jun 2011, Matias Salibian-Barrera wrote:

            
There is no evidence here that 'R crashes' rather than one of those 
crashed R.

You don't tell us whether you compiled R yourself or used someone 
else's pre-compiled distribution -- if the latter, ask on r-sig-debian 
as this is most likely a problem with the distribution, since 
Debian/Ubuntu builds normally replace R's LAPACK/BLAS with that from 
the OS.

It works correctly on a vanilla R build on i686 Fedora 14.

  
    
#
On Fri, Jun 3, 2011 at 7:03 PM, Matias Salibian-Barrera
<msalibian at yahoo.ca> wrote:
Works fine for me with Ubuntu 11.04 (amd_64) and the pre-compiled R-2.13.0
$ wajig list r-base-core
ii  r-base-core  2.13.0-2natty0  GNU R core of statistical computation and graphics system
List of 3
 $ d: num [1:500] 5.04 4.94 4.92 4.83 4.82 ...
 $ u: num [1:500, 1:500] -0.03663 0.05414 0.00182 -0.02847 -0.00117 ...
 $ v: num [1:500, 1:500] -0.03663 0.05414 0.00182 -0.02847 -0.00117 ...
R version 2.13.0 (2011-04-13)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base