R: machine for moderately large data
Hi
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Skála, Zdeněk (INCOMA GfK)
Sent: Friday, October 05, 2012 3:38 PM
To: r-help at r-project.org
Subject: [R] R: machine for moderately large data

Dear all,

I would like to ask your advice about a suitable computer for the following usage. I (am starting to) work with moderately big data in R:
- approx. 2 - 20 million rows * 100 - 1000 columns (market basket data)
- mainly clustering, classification trees, association analysis (e.g. libraries rpart, cba, proxy, party)
If I compute correctly, such a big matrix (20e6 * 1000) needs about 160 GB just to be held in memory, assuming 8 bytes per numeric value. Are you prepared for this? Perhaps a suitable database interface would be preferable.

Regards
Petr
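For what it is worth, the estimate above can be checked in a few lines of base R. This is only a rough sketch: it assumes a dense matrix in which every cell is stored as an 8-byte double, and `dense_matrix_gb` is a made-up helper name for illustration, not a function from any package.

```r
# Rough memory estimate for a dense numeric matrix.
# Assumption: each cell is an 8-byte double; R's small per-object
# overhead and any working copies made during analysis are ignored.
dense_matrix_gb <- function(rows, cols, bytes_per_cell = 8) {
  rows * cols * bytes_per_cell / 1e9  # decimal gigabytes
}

# Lower and upper ends of the sizes mentioned in the original post
small_gb <- dense_matrix_gb(2e6, 100)
large_gb <- dense_matrix_gb(20e6, 1000)

cat(sprintf("2e6 x 100:   %.1f GB\n", small_gb))
cat(sprintf("20e6 x 1000: %.0f GB\n", large_gb))
```

The lower end of the stated range comes to about 1.6 GB and the upper end to about 160 GB, so the answer depends heavily on where in that range the real data sit. Many R functions also make temporary copies of their input, so the memory actually needed during an analysis is likely to be higher than the size of the matrix itself.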
Can you recommend a sufficient computer for this volume? I routinely work in Windows but feel that a Mac or some Linux machine might be needed. Please respond directly to my email.

Many thanks!

Zdenek Skala
zdenek.skala at gfk.com
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.