-----Original Message-----
From: Dimitri Liakhovitski [mailto:ld7631 at gmail.com]
Sent: Monday, April 13, 2009 4:35 PM
To: Liaw, Andy
Cc: R-Help List
Subject: Re: [R] Random Forests: Question about R^2
Andy,
thank you very much!
One clarification question:
If MSE = sum(residuals) / n, then
in the formula (1 - mse / Var(y)) - shouldn't one square mse before
dividing by variance?
Dimitri
On Mon, Apr 13, 2009 at 10:52 AM, Liaw, Andy
<andy_liaw at merck.com> wrote:
MSE is the mean of the squared residuals. For the training data, the OOB
estimate is used (i.e., residual = data - OOB prediction, MSE =
sum(residuals) / n, where the OOB prediction is the mean of the predictions
from the trees for which the case is OOB). It is _not_ the average OOB MSE
of the trees in the forest.
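
The distinction drawn above can be illustrated with a small sketch. It is
plain Python, not the randomForest source, and the names oob_mse,
mean_per_tree_oob_mse, tree_preds and in_bag are hypothetical. It assumes
the residuals are squared before averaging (as "mean squared residuals"
implies) and that a case which is never OOB is simply skipped.

```python
def oob_mse(y, tree_preds, in_bag):
    """MSE of the aggregated OOB predictions.

    y[i]             observed response for case i
    tree_preds[t][i] prediction of tree t for case i
    in_bag[t][i]     True if case i was in tree t's bootstrap sample
    """
    sq_resid = []
    for i in range(len(y)):
        # Predictions from the trees that did NOT see case i.
        oob = [tree_preds[t][i] for t in range(len(tree_preds))
               if not in_bag[t][i]]
        if not oob:
            continue  # assumption: a case that is never OOB is skipped
        oob_pred = sum(oob) / len(oob)
        sq_resid.append((y[i] - oob_pred) ** 2)
    return sum(sq_resid) / len(sq_resid)


def mean_per_tree_oob_mse(y, tree_preds, in_bag):
    """The quantity the OOB MSE is NOT: average of each tree's own OOB MSE."""
    per_tree = []
    for t in range(len(tree_preds)):
        sq = [(y[i] - tree_preds[t][i]) ** 2
              for i in range(len(y)) if not in_bag[t][i]]
        if sq:
            per_tree.append(sum(sq) / len(sq))
    return sum(per_tree) / len(per_tree)


# Toy data: 3 cases, 3 trees, each case in-bag for exactly one tree.
y = [1.0, 2.0, 3.0]
tree_preds = [[1.0, 2.5, 3.0],
              [2.0, 2.0, 2.0],
              [1.0, 1.5, 4.0]]
in_bag = [[True, False, False],
          [False, True, False],
          [False, False, True]]

print(oob_mse(y, tree_preds, in_bag))
print(mean_per_tree_oob_mse(y, tree_preds, in_bag))
```

On this toy data the two numbers differ: averaging the predictions first
gives a smaller error than averaging the per-tree errors.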
I hope there's no question about how the pseudo R^2 is computed on a
test set? If you understand how that's done, I assume the question is
only how the OOB MSE is formed.
Best,
Andy
From: Dimitri Liakhovitski
Dear Random Forests gurus,
I have a question about R^2 provided by randomForest (for regression).
I have not succeeded in finding this information.
In the help file for randomForest under "Value" it says:
rsq: (regression only) - "pseudo R-squared": 1 - mse / Var(y).
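
As a rough illustration of that formula, here is a minimal plain-Python
sketch. It is not the package's code, and the name pseudo_r2 is
hypothetical. It assumes mse is the mean of the squared residuals and that
Var(y) uses the divisor n; the help text quoted above does not state the
divisor.

```python
def pseudo_r2(y, pred):
    """Pseudo R-squared: 1 - mse / Var(y)."""
    n = len(y)
    # mse: mean of the squared residuals, so it is already in squared
    # units of y and is not squared again.
    mse = sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / n
    mean_y = sum(y) / n
    # Assumption: variance with divisor n, to match the divisor in mse.
    var_y = sum((yi - mean_y) ** 2 for yi in y) / n
    return 1 - mse / var_y


y = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.0, 2.5, 4.0]  # e.g. OOB or test-set predictions
print(pseudo_r2(y, pred))    # approximately 0.9
```

Because mse and Var(y) are both in squared units of y, their ratio is
unitless. The value can be negative when the predictions do worse than
simply predicting the mean of y.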
Could someone please explain in somewhat more detail how it
is calculated?
Is "mse" mean squared error for prediction?
Is "mse" an average of mse's for all trees run on out-of-bag
holdout samples?
In other words - is this R^2 based on out-of-bag samples?
Thank you very much for clarification!
--
Dimitri Liakhovitski
MarketTools, Inc.
Dimitri.Liakhovitski at markettools.com