Memory problem for R -- Summary
Thank you very much for the replies you sent me regarding the memory
problem.
The following is a summary.
(I tried to read through all the messages. I apologize if I overlooked your
message.)
Cheers,
Yun-Fang
----------------------------
Background:
a. Data: 1 million rows with 73 numeric attributes
b. Environment: R 1.7.1 on FreeBSD 4.3 with 2GB memory and dual CPU
Pentium III/Pentium III Xeon/Celeron,
with a data seg size limit of 1572864 kbytes
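For scale, a rough back-of-the-envelope estimate of the data size, assuming every value is stored as an 8-byte double and ignoring any per-object overhead. The variable names are illustrative only.

```r
# Rough size of the data if held as doubles (8 bytes per value).
n_rows <- 1e6
n_cols <- 73
bytes_per_double <- 8

matrix_bytes <- n_rows * n_cols * bytes_per_double
matrix_mib <- matrix_bytes / 2^20

# Data segment limit quoted above, converted from kbytes to MiB.
limit_mib <- 1572864 / 1024

cat(sprintf("one copy: %.0f MiB, limit: %.0f MiB, full copies that fit: %d\n",
            matrix_mib, limit_mib, as.integer(limit_mib %/% matrix_mib)))
```

One copy comes to about 557 MiB against a 1536 MiB limit, so only two full copies fit. Fitting a model typically needs working copies of the data on top of the original, which is probably why the limit is reached so quickly.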
Suggested Solutions:
z. Use SAS, since SAS does not try to read all the data into RAM.
a. Take a random sample from the large data set, i.e. 10% of the 1 million rows.
(The option singular.ok=TRUE can be used in lm for singular matrices.)
b. Use a Kalman filter with migration variance = 0 (see the dse package for
details).
c. Add the following configuration: options(object.size=1e8)
Results: still out of memory (OOM)
d. If the data is all numeric, add colClasses="numeric" in read.table()
Results: read.table read in the data successfully, but I could not access
the dataset after loading
(even dataset[1:10,] didn't work)
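One way suggestion (a) could be tried without ever loading the full file is to read it in chunks and keep a random fraction of each chunk. The following is only a sketch under assumptions: the helper name sample_rows, the chunk size, and the headerless whitespace-separated file layout are all hypothetical, and the demo runs on a small generated file, not the real data.

```r
# Hypothetical helper: keep roughly `frac` of the rows of a headerless,
# whitespace-separated, all-numeric file, reading `chunk_rows` lines at a time.
sample_rows <- function(path, n_cols, frac = 0.1, chunk_rows = 10000) {
  con <- file(path, open = "r")
  on.exit(close(con))
  kept <- list()
  repeat {
    chunk <- tryCatch(
      read.table(con, nrows = chunk_rows,
                 colClasses = rep("numeric", n_cols)),
      error = function(e) NULL  # read.table errors once no lines remain
    )
    if (is.null(chunk) || nrow(chunk) == 0) break
    keep <- runif(nrow(chunk)) < frac
    kept[[length(kept) + 1]] <- chunk[keep, , drop = FALSE]
    if (nrow(chunk) < chunk_rows) break  # short chunk means end of file
  }
  do.call(rbind, kept)
}

# Small demo file standing in for the real data (assumed layout).
set.seed(1)
demo_file <- tempfile(fileext = ".txt")
write.table(matrix(rnorm(5000 * 4), ncol = 4), demo_file,
            row.names = FALSE, col.names = FALSE)

sub <- sample_rows(demo_file, n_cols = 4, frac = 0.1, chunk_rows = 1000)

# With singular.ok = TRUE, aliased coefficients are reported as NA
# instead of stopping the fit.
fit <- lm(V1 ~ ., data = sub, singular.ok = TRUE)
print(coef(fit))
unlink(demo_file)
```

Each row is kept independently with probability frac, so the sample size is only approximately 10% of the rows. Columns that are zero almost everywhere can still end up constant in the sample, which is the singularity issue mentioned in (a).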
----- Original Message -----
From: "Liaw, Andy" <andy_liaw at merck.com>
To: "'Yun-Fang Juan'" <yunfang at yahoo-inc.com>; "Prof Brian Ripley"
<ripley at stats.ox.ac.uk>
Cc: <r-help at stat.math.ethz.ch>
Sent: Friday, January 30, 2004 11:44 AM
Subject: RE: [R] memory problem for R
You still have not read the posting guide, have you? See more below.
From: Yun-Fang Juan
[...]
I tried a 10% sample and it turned out the matrix became singular after I did that. The reason is that some of the attributes have only zero values most of the time. The data I am using is web log data and, after some transformation, they are all numeric. Can we specify some parameters in read.table so that the program will treat all the vars as numeric (in this context, hopefully that will reduce the memory consumption)?
And you clearly have not read my (private) reply, either, in which I told you *exactly* how to do that, via the colClasses argument to read.table(). Please take the help given to you seriously. If you want attention, you have to pay attention.
Andy
thanks a lot, Yun-Fang
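For reference, a minimal sketch of the colClasses approach mentioned in the reply, assuming a headerless whitespace-separated file with 73 numeric columns. The file here is a small generated stand-in, and the nrows and comment.char settings are optional hints, not something the thread prescribes.

```r
n_cols <- 73  # number of numeric attributes, as in the background above

# Small stand-in file (assumed layout: no header, whitespace-separated).
demo_file <- tempfile(fileext = ".txt")
write.table(matrix(runif(100 * n_cols), ncol = n_cols), demo_file,
            row.names = FALSE, col.names = FALSE)

dat <- read.table(demo_file,
                  header = FALSE,
                  colClasses = rep("numeric", n_cols),  # skip type guessing
                  nrows = 100,                          # known row count
                  comment.char = "")                    # no comment scanning

str(dat[1:3, 1:5])
print(object.size(dat))
unlink(demo_file)
```

Declaring the column classes up front saves read.table from guessing each column's type, and giving nrows lets it size the result in advance. Neither reduces the size of the final data frame, so the full data set may still be too large for the limit described above.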