impute missing values in correlated variables: transcan?
On 11/30/04 13:21, Frank E Harrell Jr wrote:
> Jonathan Baron wrote:
>> I would like to impute missing data in a set of correlated
>> variables (columns of a matrix). It looks like transcan() from
>> Hmisc is roughly what I want. It says, "transcan automatically
>> transforms continuous and categorical variables to have maximum
>> correlation with the best linear combination of the other
>> variables." And, "By default, transcan imputes NAs with "best
>> guess" expected values of transformed variables, back transformed
>> to the original scale."
>>
>> But I can't get it to work. I say
>>
>> m1 <- matrix(1:20+rnorm(20),5,) # four correlated variables
>> colnames(m1) <- paste("R",1:4,sep="")
>> m1[c(2,19)] <- NA # simulate some missing data
>> library(Hmisc)
>> transcan(m1,data=m1)
>>
>> and I get
>>
>> Error in rcspline.eval(y, nk = nk, inclx = TRUE) :
>>   fewer than 6 non-missing observations with knots omitted
>
> Jonathan - you would need many more observations to be able to fit
> flexible additive models as transcan does. Also note that single
> imputation has problems and you may want to consider multiple
> imputation as done by the Hmisc aregImpute function, if you had
> more data.
Thanks. But the models don't _need_ to be as flexible as the ones
transcan fits. Linear would be OK, but I can't find an option for
that in transcan.
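For reference, a purely linear single imputation can be sketched in base R without transcan. This is a minimal sketch, not anything Hmisc provides: the helper name impute_linear, the mean-fill starting values, and the default of five passes are all assumptions made for illustration. Each column with missing cells is regressed on the other columns with lm(), and the missing cells are replaced by the fitted predictions.

```r
# Hypothetical helper (not part of Hmisc): iterative linear-regression imputation.
impute_linear <- function(m, n_iter = 5) {
  stopifnot(is.matrix(m), is.numeric(m))
  miss <- is.na(m)
  filled <- m
  # Start from column means so every regression has complete predictors.
  for (j in seq_len(ncol(m))) {
    filled[miss[, j], j] <- mean(m[, j], na.rm = TRUE)
  }
  for (iter in seq_len(n_iter)) {
    for (j in which(colSums(miss) > 0)) {
      obs <- !miss[, j]
      d <- as.data.frame(filled)
      # Regress column j on all other columns, using rows where j was observed.
      fit <- lm(reformulate(names(d)[-j], response = names(d)[j]),
                data = d[obs, , drop = FALSE])
      # Overwrite only the originally missing cells with linear predictions.
      filled[!obs, j] <- predict(fit, newdata = d[!obs, , drop = FALSE])
    }
  }
  filled
}

set.seed(1)
m1 <- matrix(1:80 + rnorm(80), ncol = 4)   # 20 rows, four correlated columns
colnames(m1) <- paste("R", 1:4, sep = "")
m1[c(2, 19)] <- NA                          # both missing cells fall in R1
m1_filled <- impute_linear(m1)
m1_filled[c(2, 19), ]
```

Observed values are never changed; only the cells that were NA are filled. Like any single imputation, this understates the uncertainty in the filled-in values.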
We _will_ have more data, about 50 applicants rated by the time
we start making decisions. So I tried my little simulation with
more data, and it didn't give an error message. So that was the
problem. Here is the new one:
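m1 <- matrix(1:80 + rnorm(80), ncol = 4)   # 20 rows, four correlated variables
colnames(m1) <- paste("R", 1:4, sep = "")
m1[c(2, 19)] <- NA                          # simulate missing data (both cells are in R1)
library(Hmisc)
# imputed=TRUE asks transcan to return the imputed values
t1 <- transcan(m1, data = m1, long = TRUE, imputed = TRUE)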
I've used aregImpute, and I notice it has a "defaultlinear"
option, which is good. Thus, it may work better once I figure
out how to get a single value out of it for each missing datum
(which doesn't look too hard).
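One way to reduce several imputations to a single value per missing datum is to average the draws. The sketch below is a hand-rolled stand-in in base R, not aregImpute itself: it bootstraps the complete rows, refits a linear model, and draws a value for each missing cell. The helper name draw_R1 and the choice of 10 draws are assumptions. With aregImpute the analogous step would presumably be a row mean over the draws it stores for each variable, but that should be checked against the Hmisc documentation.

```r
set.seed(2)
m1 <- matrix(1:80 + rnorm(80), ncol = 4)
colnames(m1) <- paste("R", 1:4, sep = "")
m1[c(2, 19)] <- NA                       # both missing cells are in R1
d <- as.data.frame(m1)
miss <- is.na(d$R1)

# Hypothetical helper: one imputation draw for R1 from a bootstrap refit.
draw_R1 <- function(d, miss) {
  complete <- d[!miss, ]
  boot <- complete[sample(nrow(complete), replace = TRUE), ]
  fit <- lm(R1 ~ R2 + R3 + R4, data = boot)
  # Add residual noise so the draws reflect prediction uncertainty.
  predict(fit, newdata = d[miss, ]) + rnorm(sum(miss), sd = summary(fit)$sigma)
}

n_draws <- 10
draws <- replicate(n_draws, draw_R1(d, miss))  # rows: missing cells, columns: draws
single <- rowMeans(draws)                       # one value per missing datum
d$R1[miss] <- single
```

Averaging discards the between-draw variability, which is acceptable when the goal is a filled-in table rather than inference.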
This is not about statistical inference, which seems to me to be
where the main advantage of multiple imputation lies. But
probably it won't do any harm.
Jon
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
R search page: http://finzi.psych.upenn.edu/