imputing the numerical columns of a dataframe, returning the rest unchanged

Hi,

?sapply will tell you

....
     'sapply' is a user-friendly version of 'lapply' by default
     returning a vector or matrix if appropriate.
....

so each column 'x' has lost its class once sapply() simplifies the result; e.g.

## iris is a data.frame
'data.frame':   150 obs. of  5 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
## but sapply() will coerce it into a numeric matrix
num [1:150, 1:5] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:5] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" ...
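
The calls that produced the output above did not survive in this copy of the message, so here is a rough sketch of how to reproduce it. The identity function function(x) x is my assumption, not necessarily what was originally typed. The lapply() call at the end is added for contrast.

```r
# iris ships with base R (package 'datasets')
str(iris)                      # a data.frame: 4 numeric columns and 1 factor

# Even when each column is returned unchanged, sapply() simplifies the
# resulting list of columns into a single matrix, so the factor is
# reduced to its integer codes.
m <- sapply(iris, function(x) x)   # function(x) x is an assumed example
str(m)

# lapply() skips the simplification step and returns a list of columns,
# so each column keeps its own class.
l <- lapply(iris, function(x) x)
class(l$Species)
```

So the class is lost in the simplification step, not in the function applied to each column.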

I'd suggest you get the class of each column first, then apply
impute() to these columns (i.e. DF[, sapply(DF, class) == "numeric"])
and assign the new values to the original columns.
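
A minimal sketch of that suggestion, under some assumptions: impute_mean() is a hypothetical stand-in for whatever impute() you are using, the toy DF is made up, and I select columns with is.numeric() rather than class == "numeric" so that integer columns are picked up as well.

```r
# Hypothetical stand-in for impute(): replace NAs with the column mean.
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# Made-up example data: two numeric columns and one factor, all with NAs.
DF <- data.frame(
  a = c(1, NA, 3),
  b = c(NA, 2, 4),
  g = factor(c("x", "y", NA))
)

# One TRUE/FALSE per column; vapply() guarantees a logical vector.
num_cols <- vapply(DF, is.numeric, logical(1))

# lapply() returns a list of columns, and assigning into DF[num_cols]
# replaces only those columns, so DF stays a data.frame and the
# factor column is returned unchanged.
DF[num_cols] <- lapply(DF[num_cols], impute_mean)

str(DF)
```

Using lapply() on the selected columns avoids the matrix coercion that sapply() would cause.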

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building,
Renmin University of China, Beijing, 100872, China
On Mon, Dec 22, 2008 at 11:38 PM, Mark Heckmann <mark.heckmann at gmx.de> wrote: