
memory problem in handling large dataset

Hi, Jim:
Thanks for the calculation. I think you won't mind if I cc the reply
to r-help too so that I can get more info.

I assume you use 4 bytes for integer and 8 bytes for float, so
300x8+50x4=2600 bytes for each observation, right?
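The arithmetic above can be sketched quickly in Python. This is just a back-of-the-envelope estimate using the figures from the message (300 floats at 8 bytes, 50 integers at 4 bytes per observation); it ignores R's per-object overhead, so real memory use will be higher.

```python
# Rough per-observation memory estimate, using the sizes assumed above:
# 8 bytes per float column, 4 bytes per integer column.
FLOAT_BYTES = 8
INT_BYTES = 4

def bytes_per_row(n_floats=300, n_ints=50):
    """Raw storage per observation, ignoring interpreter overhead."""
    return n_floats * FLOAT_BYTES + n_ints * INT_BYTES

def dataset_gib(n_rows, n_floats=300, n_ints=50):
    """Total raw size in GiB for n_rows observations."""
    return n_rows * bytes_per_row(n_floats, n_ints) / 2**30

print(bytes_per_row())                    # 2600 bytes per observation
print(round(dataset_gib(1_000_000), 2))   # ~2.42 GiB for a million rows
```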

I wish I could have 500x8 G memory :) just kidding.. definitely,
sampling will be the first step. Some feature selection (filtering,
mainly) will be applied. Following Berton's suggestion, I will
probably use Python to do the sampling, since whenever I hit "slow"
situations like this, Python never fails me. (I am not saying R is
bad, though.)
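One way to do that sampling step in Python, without ever loading the whole file into memory, is reservoir sampling (Algorithm R): stream the file line by line and keep a uniform random subset of fixed size. This is a hedged sketch, not anything from the original thread; the filename is hypothetical.

```python
import random

def reservoir_sample(lines, k, seed=0):
    """Uniform random sample of k items from a stream of unknown
    length (Algorithm R); only k items are held in memory at once."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    sample = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(line)
        else:
            # Replace a reservoir slot with probability k/(i+1).
            j = rng.randrange(i + 1)
            if j < k:
                sample[j] = line
    return sample

# Usage: stream a large file line by line (hypothetical filename).
# with open("big_dataset.txt") as f:
#     subset = reservoir_sample(f, 10_000)
```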

I understand "you get what you pay for" here. But more information or
experience on handling large datasets in R (like using RMySQL) would
be appreciated.

regards,

Weiwei
On 10/27/05, jim holtman <jholtman at gmail.com> wrote:
--
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III