a quick Q about memory limit in R - R-help

Yan Yu

Tue, May 20, 2003 2:23 PM

Hello, there,
I got this error when I tried to run
"data.kr <- surf.gls(2, expcov, data, d=0.7);":

"Error: cannot allocate vector of size 382890 Kb
Execution halted"

My data is a 100x100 grid.

the following is the summary of "data":

       x                y                z
 Min.   :  1.00   Min.   :  1.00   Min.   :-1.0172
 1st Qu.: 26.00   1st Qu.: 25.75   1st Qu.: 0.6550
 Median : 51.00   Median : 50.50   Median : 0.9657
 Mean   : 50.99   Mean   : 50.50   Mean   : 1.0000
 3rd Qu.: 76.00   3rd Qu.: 75.25   3rd Qu.: 1.2817
 Max.   :100.00   Max.   :100.00   Max.   : 2.6501

I have 2 questions:
(1) For a 100x100 grid, why did R try to allocate such a HUGE vector,
382890 Kb?

(2) What decides the memory limit in R, and how can I increase it?

Many thanks,
yan

Uwe Ligges

Tue, May 20, 2003 11:47 PM

Yan Yu wrote:

Because you perform some calculations with the data that consume more
memory than the data itself, e.g. by generating some matrices, temporary
objects and copies of the data.
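
To see this effect at a small scale, here is a minimal sketch. The data frame `grid` is a hypothetical stand-in for the poster's data, and `dist()` is used only as an example of an intermediate object whose size grows with the square of the number of points; it is not what surf.gls computes internally.

```r
# Hypothetical stand-in for the poster's data: a 100 x 100 grid with a z value.
grid <- data.frame(
  x = rep(1:100, times = 100),
  y = rep(1:100, each = 100),
  z = rnorm(100 * 100)
)

# Work on a 1000-point subset so the example stays small.
sub <- grid[1:1000, c("x", "y")]

# All pairwise distances: n * (n - 1) / 2 doubles for n points.
d <- dist(sub)

data_bytes <- as.numeric(object.size(sub))
dist_bytes <- as.numeric(object.size(d))

cat("subset of the data:", data_bytes, "bytes\n")
cat("pairwise distances:", dist_bytes, "bytes\n")
cat("ratio:", round(dist_bytes / data_bytes), "\n")
```

The intermediate object is far larger than the data it was computed from, and the gap widens quickly as the number of points grows.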

a) See ?Memory and the R FAQ 7.1, for example. If you are on Windows,
see also the R for Windows FAQ 2.7. These are obvious places to look, aren't they?

b) By buying some more memory for your machine.

Uwe Ligges

Roger Bivand

Wed, May 21, 2003 1:38 AM

On Tue, 20 May 2003, Yan Yu wrote:

This is, I think, where the problem is. You have n=10000 observations, and 
if you do debug(surf.gls) before running, you will probably find that it 
stops at:

    Z <- .C("VR_gls", as.double(x), as.double(y), as.double(z), 
        as.integer(n), as.integer(np), as.integer(npar), as.double(f), 
        l = double((n * (n + 1))/2), r = double((npar * (npar + 
            1))/2), beta = double(npar), wz = double(n), yy = double(n), 
        W = double(n), ifail = as.integer(0), l1f = double(n * 
            npar), PACKAGE = "spatial")

because (n * (n + 1))/2 in your case is 50,005,000, times 8 bytes per double,
so l is a very large object ("On output L contains the Cholesky factor of
the covariance matrix of the observations" - comment in spatial/src/kr.c).
Do you need to have so many observations? If so, perhaps you could
consider using other packages that permit the search area to be restricted
to close neighbours of your observations?
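
The arithmetic can be checked directly. The sketch below computes the size of the packed triangular array that the call requests for `l`; the helper name `packed_kb` is made up for this illustration. Note that n = 9900 reproduces the 382890 Kb in the error message, which suggests that the data have about 9900 rows rather than the full 10000. That is an inference from the numbers, not something stated in the thread.

```r
# Size in Kb (1 Kb = 1024 bytes) of double((n * (n + 1)) / 2),
# the packed triangular array passed as `l` in the .C call above.
packed_kb <- function(n, bytes_per_double = 8) {
  (n * (n + 1) / 2) * bytes_per_double / 1024
}

packed_kb(10000)   # about 390664 Kb for a full 100 x 100 grid
packed_kb(9900)    # about 382890 Kb, the size reported in the error
packed_kb(1000)    # about 3910 Kb: 10x fewer points, ~100x less memory
```

Because the requirement grows with the square of n, thinning the grid or restricting the search to close neighbours reduces memory use far more than proportionally.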

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help

Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Breiviksveien 40, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 93 93
e-mail: Roger.Bivand at nhh.no

Paul Gilbert

Wed, May 21, 2003 10:12 AM

Yan Yu wrote:

...

In recent versions of R this is controlled by the operating system, unless you
start R with an option that sets a lower limit than the OS allows. In Linux and
Unix this is controlled by

1/ ulimit (or limit in some shells). This can typically be relaxed by a normal
user to the system limits. On Linux the default max memory is usually not
limited (by ulimit) but it is sometimes necessary to relax the default stack
size.

2/ A combination of the amount of memory and amount of swap space. Roughly, on
Linux, these are added together to give the limit. This has changed over the
years in Linux and may vary on different versions of Unix, but typically swap
space increases the size of problem you can handle. Physical memory is faster,
but swap works. Given prices these days you might consider having these add to
around 4G if you want to work on large problems with R.

3/ The architecture of the processor (e.g. 32-bit vs 64-bit). A program cannot
exceed the address space of the architecture (2^32 = 4G bytes on a typical PC
32-bit processor, 2^64=a lot more on a 64-bit workstation). The OS itself needs
some of this, so I believe the practical limit on 32-bit Linux is around 3G. (In
any case, there is not much point in having more than 4G of swap+memory on a
32-bit machine.) On a 64-bit architecture with a 64-bit Unix (most workstations)
the application (R) must be compiled as a 64-bit application. Thus the fixing of
Solaris bugs in gcc 3.2.3 has meant that much larger problems can now be handled
with R on Solaris (I believe this was possible before with Sun compilers.) It
should also be possible to compile 64-bit R under (64-bit) Linux on Intel
Itanium and AMD Opteron processors. I have no experience with this (but would be
interested in hearing from anyone that does).
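
A running R session can report whether it was built as a 32-bit or 64-bit application, which is a quick way to check the third point. This sketch uses `.Machine$sizeof.pointer`, which is available in current versions of R (it may not have existed in the R of 2003). The `ulimit -a` call assumes a Unix-alike system whose shell provides `ulimit`.

```r
# Pointer size in bytes: 4 for a 32-bit build of R, 8 for a 64-bit build.
ptr_bits <- 8 * .Machine$sizeof.pointer
cat("This R is a", ptr_bits, "bit build\n")

# Address space of a 32-bit process, in gigabytes (2^30 bytes).
addr32_gb <- 2^32 / 2^30
cat("32-bit address space:", addr32_gb, "GB\n")

# On Unix-alikes, show the shell resource limits that R inherits.
if (.Platform$OS.type == "unix") {
  print(system("ulimit -a", intern = TRUE))
}
```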

On Windows the situation is different and I am much less familiar with it (and
look forward to being corrected). I believe applications must fit into physical
memory on Windows, that is, they can be swapped out but not partly swapped out.
This means that it is necessary to buy more memory to run bigger R problems. (Of
course, with physical memory, problems will run much faster, so you should
consider buying more memory even in Unix.) Windows itself demands some of the
memory, so I believe the practical limit for applications in Windows is 2G
bytes. I understand there is a 64-bit version of Windows under development, but
I don't think it has been released yet.

Paul Gilbert