substitute NA values
On Fri, 2007-03-30 at 16:25 +0200, Sergio Della Franca wrote:
This is what I obtained. Isn't there a method to replace the NA values only for a character variable?
This is R, there is always a way (paraphrasing an R-Helper the name of
whom I forget just now). If you mean a canned function, not that I'm
aware of.
Here is one way:
## some example data - not exactly like yours
set.seed(1234)
dat <- data.frame(test = sample(c("t","f"), 9, replace = TRUE),
num = c(10,14,25,NA,40,45,44,47,NA))
## add an NA to dat$test to match your example
dat$test[8] <- NA
## print out dat
dat
## count the various options in $test and return the name of
## the most frequent
freq <- names(which.max(table(dat$test)))
## replace NA in $test with most frequent
dat$test[is.na(dat$test)] <- freq
## print out dat again to show this worked
dat
There may be better ways - the names(which.max(table(...))) seems a bit
clunky to me but it is Friday afternoon and it's been a long week...
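One thing to be aware of with names(which.max(table(...))): which.max() returns the first maximum, and table() lists its counts in sorted level order, so a tie is silently resolved by that order rather than flagged. A small sketch using the Test_Result column from the original question, where "t" and "f" each occur four times (the variable names x and counts are only illustrative):

```r
## Test_Result from the original post: four "t", four "f", one NA
x <- c("t", "f", "f", "f", "f", "t", "t", NA, "t")

## table() drops NA by default and sorts the values ("f" before "t")
counts <- table(x)
counts

## which.max() returns the first of the tied maxima, here "f"
names(which.max(counts))

## to see every value that shares the top count instead
top <- names(counts)[counts == max(counts)]
top
```

If a tie matters for your data, inspecting top before filling in the NAs is probably worth the extra line.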
And, as this /is/ R, you could wrap that into a function for you to use on
other data sets, but I'll leave that bit up to you.
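For what it's worth, a minimal sketch of what such a wrapper might look like, following the rule in the original question (median for numeric columns, most frequent value for character or factor columns). The names most_frequent and impute_na are hypothetical, not from any package, and the column #_Test is renamed N_Test here only so that it is a syntactic R name:

```r
## Hypothetical helper: most frequent non-NA value, returned as a string.
## Ties go to the first value in sorted order (that is how which.max works).
most_frequent <- function(x) {
    names(which.max(table(x)))
}

## Hypothetical wrapper: fill NA column by column
##  - numeric columns: median of the observed values
##  - character/factor columns: most frequent observed value
##  - other types (e.g. logical) and all-NA columns are left untouched
impute_na <- function(df) {
    for (nm in names(df)) {
        col <- df[[nm]]
        miss <- is.na(col)
        if (!any(miss) || all(miss)) next
        if (is.numeric(col)) {
            col[miss] <- median(col, na.rm = TRUE)
        } else if (is.character(col) || is.factor(col)) {
            col[miss] <- most_frequent(col)
        }
        df[[nm]] <- col
    }
    df
}

## data from the original question (N_Test stands in for #_Test)
y <- data.frame(Test_Result = c("t", "f", "f", "f", "f", "t", "t", NA, "t"),
                N_Test = c(10, 14, 25, NA, 40, 45, 44, 47, NA))
fixed <- impute_na(y)
fixed
```

Because the function works on one column at a time, impute_na(y["Test_Result"]) should also do for a single character column, which is the case that na.roughfix refused.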
HTH
G
2007/3/30, Gabor Grothendieck <ggrothendieck at gmail.com>:
I assume you are referring to na.roughfix in randomForest. I don't think it works for logical vectors or for factors outside of data frames:
library(randomForest)
DF <- data.frame(a = c(T, F, T, NA, T), b = c(1:3, NA, 5))
na.roughfix(DF)
Error in na.roughfix.data.frame(DF) : na.roughfix only works for numeric or factor
DF$a <- factor(DF$a)
na.roughfix(DF$a)
Error in na.roughfix.default(DF$a) : roughfix can only deal with numeric data.
na.roughfix(DF)
      a   b
1  TRUE 1.0
2 FALSE 2.0
3  TRUE 3.0
4  TRUE 2.5
5  TRUE 5.0

On 3/30/07, Sergio Della Franca <sergio.della.franca at gmail.com> wrote:
Dear R-Helpers,

I have the following data set (y):

Test_Result   #_Test
t             10
f             14
f             25
f             NA
f             40
t             45
t             44
<NA>          47
t             NA

I want to replace the NA values with the following method:
- for the numeric variable, replace NA with the median
- for the character variable, replace NA with the most frequent level

If I use x<-na.roughfix(y) the NA values are correctly replaced. But if I use
x<-na.roughfix(y$Test_Result) I obtain the following error:
roughfix can only deal with numeric data.

How can I solve this problem, which I meet every time I want to replace only
the NA values of a column (type character)?
Thank you in advance.
Sergio Della Franca
[[alternative HTML version deleted]]
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson                 [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%