
Using R for Production - Discussion

I worked on a project where we used a random forest classifier to
predict a binary response. We trained the model on EC2 with 3
million observations and 44 features, and stored the fitted model
from R using save(mymodel,file="model.Rdata"). We now use
model.Rdata locally to predict new observations.
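A minimal sketch of that save/load round trip, under stated assumptions: glm() on the built-in mtcars data stands in for the random forest so the example runs with base R alone, and the temporary file path and the new_obs values are hypothetical. With a randomForest fit, the predict() call would typically use type="prob" instead.

```r
# Stand-in model: a logistic regression replaces the random forest here
# purely so the sketch has no package dependencies (an assumption).
mymodel <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Training side: serialize the fitted model object to disk.
model_path <- file.path(tempdir(), "model.Rdata")  # hypothetical location
save(mymodel, file = model_path)

# Prediction side: load() restores the object under its original name.
rm(mymodel)
load(model_path)

# Score one new observation (hypothetical values).
new_obs <- data.frame(wt = 2.5, hp = 110)
p <- predict(mymodel, newdata = new_obs, type = "response")
print(p)
```

Note that load() restores the object under the name it was saved with, so the prediction code has to know that name (mymodel here).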
On our local system, we built a parser in Perl that generates the CSV
representation of the observation we want to predict, and we use
RSPerl to communicate between Perl and R. One detail matters here:
instead of loading the random forest model (model.Rdata) every time we
want to predict a new observation, we keep an R console running as a
daemon with model.Rdata already loaded, and send each observation
to be predicted from Perl to R. If anyone has better
solutions or ideas, please feel free to share.
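The "load once, predict many" idea can be sketched roughly as follows. This is not the RSPerl mechanism described above: as an assumption, it swaps in a plain loop that reads one CSV line at a time from a connection. The function name serve_predictions, the column names, and the glm() stand-in model are all hypothetical.

```r
# Read one CSV line per observation from `input`, score it with an
# already-loaded model, and write one prediction per line to `output`.
serve_predictions <- function(model, input, output, col_names) {
  repeat {
    line <- readLines(input, n = 1)
    if (length(line) == 0) break   # end of input: stop serving
    obs <- read.csv(text = line, header = FALSE, col.names = col_names)
    p <- predict(model, newdata = obs, type = "response")
    writeLines(format(unname(p)), output)
  }
}

# Demo: the model is built (or loaded) once, outside the loop.
# glm() stands in for the random forest (an assumption).
model <- glm(am ~ wt + hp, data = mtcars, family = binomial)
input <- textConnection(c("2.5,110", "3.8,200"))  # hypothetical observations
serve_predictions(model, input, stdout(), c("wt", "hp"))
close(input)
```

In a long-running process, input could instead be file("stdin", open = "r") or a named pipe, so that the model stays in memory between requests and only the per-observation parsing and predict() call are repeated.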
Thanks,
Saeed

On Mon, Nov 1, 2010 at 9:04 PM, Santosh Srinivas
<santosh.srinivas at gmail.com> wrote: