R: machine for moderately large data
On Fri, Oct 5, 2012 at 12:09 PM, PIKAL Petr <petr.pikal at precheza.cz> wrote:
> Hi
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Skála, Zdeněk (INCOMA GfK)
> Sent: Friday, October 05, 2012 3:38 PM
> To: r-help at r-project.org
> Subject: [R] machine for moderately large data
>
> Dear all,
>
> I would like to ask your advice about a suitable computer for the following usage. I am starting to work with moderately big data in R:
> - ca. 2-20 million rows * 100-1000 columns (market basket data)
> - mainly clustering, classification trees, association analysis (e.g. packages rpart, cba, proxy, party)
> If I compute correctly, such a big matrix (20e6 * 1000) needs about 160 GB just to be held in memory. Are you prepared for this?
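For reference, the arithmetic behind that estimate (a quick sketch; R stores a numeric matrix as double precision, 8 bytes per cell):

```r
# Back-of-envelope check of the 160 GB figure for a 20e6 x 1000
# numeric matrix: 8 bytes per double-precision cell.
rows  <- 20e6
cols  <- 1000
bytes <- rows * cols * 8
bytes / 1e9    # 160 (GB, decimal); bytes / 2^30 gives roughly 149 GiB
```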
This is not as outrageous as one might think -- you can get a Mac Pro with 32 GB of memory for around $3,500.

Best,
Ista
> Maybe some suitable database interface would be preferable.
>
> Regards
> Petr
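A minimal sketch of that database route, using the DBI and RSQLite packages (the table and column names here are invented for illustration; a real dataset would use a file path rather than `:memory:`):

```r
library(DBI)
# Keep the data in SQLite and pull only what one analysis step needs,
# instead of holding the full matrix in RAM. ":memory:" makes the
# sketch self-contained.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "transactions",
             data.frame(basket_id = c(1L, 2L, 2L),
                        item      = c("bread", "milk", "bread"),
                        qty       = c(1, 2, 1)))
# Aggregation happens inside the database; only the small result
# comes back into the R session.
chunk <- dbGetQuery(con,
  "SELECT item, SUM(qty) AS total FROM transactions GROUP BY item")
dbDisconnect(con)
chunk
```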
> Can you recommend a sufficient computer for this volume? I am routinely working in Windows but feel that a Mac or some Linux machine might be needed.
>
> Please respond directly to my email.
>
> Many thanks!
>
> Zdenek Skala
> zdenek.skala at gfk.com
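One further note: market basket data is typically mostly zeros, so a sparse representation can shrink the footprint well below the dense estimate. A minimal sketch with the Matrix package (the 1% density figure is an assumption for illustration, not a property of the poster's data):

```r
library(Matrix)
# A dense 1e4 x 1e3 double matrix would take 1e7 * 8 bytes (~80 MB).
# A sparse matrix stores only the non-zero cells; at ~1% density
# that is a small fraction of the dense size.
set.seed(1)
nr  <- 1e4; nc <- 1e3
nnz <- round(0.01 * nr * nc)
m <- sparseMatrix(i = sample(nr, nnz, replace = TRUE),
                  j = sample(nc, nnz, replace = TRUE),
                  x = 1, dims = c(nr, nc))
print(object.size(m), units = "MB")   # a fraction of the dense ~80 MB
```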
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.