
Large data files

3 messages · cstrato@EUnet.at, Thomas Lumley, Andy Elvey

#
Dear R and S-Plus users:

Currently I am using:
at work: "S-Plus 2000 Pro" on a PC: Pentium II/350MHz, 256 MB RAM,
running Win NT
at home: "R" on my Mac PowerBook G3/292MHz, 128 MB RAM, running LinuxPPC

Currently, at home I am trying to import a table (nrow=302500, ncol=6),
which I have to do one column at a time because of memory problems.
Some of the columns I use as they are; others I have to convert into
550 x 550 matrices for calculations.
Ultimately, I have to import many (ca. 20-100) of these tables, which
will be impossible on my current machines due to memory limitations.

My question now is the following:

At work I have access to the following multiprocessor machines:
a, Compaq Proliant Server: 4 x Pentium II/450MHz, 2 GB RAM, Win NT
b, Sun Enterprise 450 Server: 4 x SPARC/??MHz, 2 GB RAM, Solaris 2.6

For testing purposes I would like to install "R":
1, Can R take advantage of multiprocessor machines?
2, Which machine would be better suited to run R on?

Finally, the question is:
Is R or S-Plus better suited for handling such large data?
Would "S-Plus 2000" for Win NT or "S-Plus 5" for Unix be better suited?
Can S-Plus take advantage of multiprocessor machines?

Thank you in advance for your help
and Happy New Year 2000 (hopefully not 1900)
Christian Stratowa, Ph.D., Vienna


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
On Wed, 29 Dec 1999 cstrato at EUnet.at wrote:

            
> 1, Can R take advantage of multiprocessor machines?

Not really. You can run multiple copies of R, which lets you get four
things done at once, but R is not multithreaded.

> 2, Which machine would be better suited to run R on?

Either would work. We have done some very limited comparisons of speed on
machines here: the various test suites for the survival5 package run at
about the same speed on a new Sun Enterprise server and on a Pentium
II/400 under Linux, run faster on a Pentium III/500 under WinNT, and
run slower on an eighteen-month-old Sun Enterprise 450 server.

The speeds are close enough that other factors are probably more important
(which system you prefer, how many other people you will annoy by taking
over the machine)

If you are doing a lot of simple linear algebra the Sun Workshop compilers
might be expected to have some advantages over gcc: I haven't found any
examples where it matters, but I don't work with very large matrices much.

> Is R or S-Plus better suited for handling such large data?
> Can S-Plus take advantage of multiprocessor machines?

Neither R nor S-PLUS is particularly suited to handling large data.  I
believe S-PLUS has some multithreading, but that its main computations are
still done by a single processor. However, this is perhaps not the best
list to get information about S-PLUS.

You would be better off splitting the data into pieces using some other
program.  Either S-PLUS or R will handle 550x550 matrices perfectly
happily if you have that much memory.
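
As a rough illustration of the splitting step, here is a minimal Python
sketch that streams a table and writes one file per column, so that only
one input line is in memory at a time. It assumes a whitespace-delimited
text file with no header row and six columns; the function name
split_columns, the "col" file prefix, and the output layout are all
hypothetical choices for this example, not anything taken from the
original data. For scale, 550 x 550 = 302500 values, which is about
2.4 MB per column if stored as 8-byte doubles.

```python
import os


def split_columns(src_path, out_dir, ncol=6, prefix="col"):
    """Stream a whitespace-delimited table and write one file per column.

    Assumptions (not from the original post): no header row, fields
    separated by whitespace, exactly `ncol` fields on every data line.
    Returns the list of output paths in column order.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = [os.path.join(out_dir, f"{prefix}{i + 1}.txt") for i in range(ncol)]
    outs = [open(p, "w") for p in paths]
    try:
        with open(src_path) as src:
            for lineno, line in enumerate(src, start=1):
                fields = line.split()
                if not fields:
                    continue  # skip blank lines
                if len(fields) != ncol:
                    raise ValueError(
                        f"line {lineno}: expected {ncol} fields, got {len(fields)}"
                    )
                # Append each field to the file for its column.
                for out, value in zip(outs, fields):
                    out.write(value + "\n")
    finally:
        for out in outs:
            out.close()
    return paths
```

Each resulting file holds a single column, which could then be read in R
with scan() and shaped with matrix(x, nrow=550, ncol=550). Note that
matrix() fills column by column unless byrow=TRUE is given, so the right
choice depends on how the original table was laid out.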


Thomas Lumley
Assistant Professor, Biostatistics
University of Washington, Seattle

#
Thomas Lumley wrote:

            
One other possibility that may be worth a try is the language "Yorick",
which is specifically designed with array/matrix processing in mind.
Try the following URL:

 ftp://ftp-icf.llnl.gov/pub/Yorick/yorick-ad.html

(Hope this suggestion doesn't offend on an R-help mailing list... I am a
keen (although new) R user but am also aware of a few other ways of
solving problems... :-) )



