Decision trees in R with big data
2 messages · Supriya Jain, Rich Calaway
Revolution R Enterprise, a commercial distribution of R, includes external memory algorithm implementations of both decision trees and decision forests. These are geared for "tall" data--the two million rows wouldn't be a problem, nor would two billion, but the 20,000 attributes would probably challenge them. It's probably worth a look (and is available for free for academic use):

www.revolutionanalytics.com

Hope this helps!

--Rich Calaway
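For a rough sense of why the sizes quoted in this thread exhaust memory, here is a back-of-the-envelope sketch. It assumes the data is held as a dense table of 8-byte doubles (R's default numeric type) and ignores any working copies a fitting routine might make, so real usage is likely higher. The "small" dimensions (5,000 rows, 300 attributes) are hypothetical stand-ins for "a few thousand rows and a few hundred attributes"; the function name is illustrative only.

```python
# Rough estimate only: assumes a dense table of 8-byte doubles and
# ignores any temporary copies made while fitting a model.
BYTES_PER_DOUBLE = 8


def dense_size_gb(rows, cols, bytes_per_value=BYTES_PER_DOUBLE):
    """Return the size in GB (10^9 bytes) of a dense rows x cols numeric table."""
    return rows * cols * bytes_per_value / 1e9


# Hypothetical stand-in for "a few thousand rows, a few hundred attributes".
small = dense_size_gb(5_000, 300)

# Sizes quoted in the thread: ~2 million rows, ~20,000 attributes.
large = dense_size_gb(2_000_000, 20_000)

print(f"small case: {small:.3f} GB")
print(f"large case: {large:.0f} GB")
```

Under these assumptions the small case is about 0.012 GB, while the large case is about 320 GB, far more than typical workstation RAM. The column count multiplies every row, which fits the remark that the 20,000 attributes, rather than the row count, are the hard part.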
On Mon, Apr 14, 2014 at 9:53 AM, Supriya Jain <sjsjsj2009 at gmail.com> wrote:
Hi,
I have successfully used rpart but with a few thousand rows, and a few
hundred input attributes. When using data with ~2 million rows (instances),
and ~20,000 input attributes (typical data sizes in my application), I get
memory problems when using rpart.
Does anyone know of a decision tree algorithm that works in R with big
data?
Thanks!
_______________________________________________
R-sig-hpc mailing list
R-sig-hpc at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
Rich Calaway
Documentation Manager
Revolution Analytics, Inc.
1505 Westlake Ave North Suite 520
Seattle, WA 98109
richcalaway at revolutionanalytics.com
ph: 206-456-6086 (direct line)
855-GET-REVO