Message-ID: <BANLkTi=i-Odb-2EjGVgwO1QGF9d_L4HEiQ@mail.gmail.com>
Date: 2011-05-20T14:13:02Z
From: Dimitri Liakhovitski
Subject: Memory capacity question for a specific situation
In-Reply-To: <4DD674A0.2050207@statistik.tu-dortmund.de>

Thanks a lot, Uwe!

2011/5/20 Uwe Ligges <ligges at statistik.tu-dortmund.de>:
>
>
> On 20.05.2011 15:33, Dimitri Liakhovitski wrote:
>>
>> Hello!
>>
>> I am trying to figure out if my latest R for 64 bits on a 64-bit
>> Windows 7 PC, RAM = 6 GB could read in a dataset with:
>>
>> ~64 million rows
>> ~30 columns, about half of which contain integers (between 1 and 3
>> digits) and half of which contain numeric data (tens to thousands).
>>
>> Or is it too much data?
>> And even if it could read it in - will there be any memory left to
>> conduct, for example, cluster analysis on that data set...
>
>
> Let us ask R:
>
>> 64e6 * (15*8 + 15*4)
> [1] 1.152e+10
>
> That means you will need roughly 12 GB just to store the data in memory. To
> work with the data, you should have at least 3 times that amount of memory
> available. Hence a 32 GB machine is a minimal requirement if you cannot
> restrict yourself to fewer observations or variables.
>
> Uwe Ligges
>
>
>>
>> Thanks a lot!
>>
>
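For anyone reading this thread later, Uwe's back-of-the-envelope estimate can be spelled out in R. The 15/15 split of numeric (double, 8 bytes) and integer (4 bytes) columns is an assumption taken from the original question, and the 3x working-memory factor is his stated rule of thumb, not a hard limit:

```r
rows <- 64e6                           # ~64 million rows
bytes_per_row <- 15 * 8 + 15 * 4       # 15 numeric cols (8 B each) + 15 integer cols (4 B each)

data_gb <- rows * bytes_per_row / 1e9  # GB needed just to hold the data in RAM
working_gb <- 3 * data_gb              # rule of thumb: ~3x for copies made during analysis

data_gb     # 11.52  -> "roughly 12 GB"
working_gb  # 34.56  -> hence a ~32 GB machine as a minimum
```

On the 6 GB machine from the original question, the raw data alone would not fit, so reducing rows/columns, sampling, or an out-of-memory approach would be needed before any cluster analysis.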
-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com