impute missing values in correlated variables: transcan?

Jonathan Baron · 2004-11-30T19:50:26Z

On 11/30/04 13:21, Frank E Harrell Jr wrote:
>Jonathan Baron wrote:
>> I would like to impute missing data in a set of correlated
>> variables (columns of a matrix). It looks like transcan() from
>> Hmisc is roughly what I want. It says, "transcan automatically
>> transforms continuous and categorical variables to have maximum
>> correlation with the best linear combination of the other
>> variables." And, "By default, transcan imputes NAs with "best
>> guess" expected values of transformed va

Thanks.  But they don't _need_ to be as flexible as what transcan
does.  Linear would be OK, but I can't find an option for that in
transcan.

We _will_ have more data, about 50 applicants rated by the time
we start making decisions.  So I tried my little simulation with
more data, and it didn't give an error message.  So that was the
problem.  Here is the new one:

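# 20 x 4 matrix: each column is a rising sequence plus noise, so columns correlate
m1 <- matrix(1:80 + rnorm(80), ncol = 4)
colnames(m1) <- paste("R", 1:4, sep = "")
m1[c(2, 19)] <- NA   # two missing values, both in column R1
library(Hmisc)
t1 <- transcan(m1, data = m1, long = TRUE, imputed = TRUE)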

I've used aregImpute, and I notice it has a "defaultlinear"
option, which is good.  Thus, it may work better once I figure
out how to get a single value out of it for each missing datum
(which doesn't look too hard).

This is not about statistical inference, which seems to me to be
where the main advantage of multiple imputation lies.  But
probably it won't do any harm.
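
For the purely linear case mentioned above, a minimal base-R sketch might look like the following. It is not what transcan() or aregImpute() do internally; it simply regresses each incomplete column on the other columns and fills in the fitted values. The helper name impute_linear and the seed are assumptions made for illustration only.

```r
# Sketch only: single linear-regression imputation in base R.
# This is a swapped-in technique, not transcan() or aregImpute().
set.seed(1)                                  # arbitrary seed, for reproducibility
m <- matrix(1:80 + rnorm(80), ncol = 4)      # same shape as the simulation above
colnames(m) <- paste("R", 1:4, sep = "")
m[c(2, 19)] <- NA                            # two missing values in column R1

# impute_linear() is a hypothetical helper, not an Hmisc function.
impute_linear <- function(x) {
  d <- as.data.frame(x)
  out <- d
  for (v in names(d)) {
    miss <- is.na(d[[v]])
    if (!any(miss)) next                     # nothing to impute in this column
    # Fit v on all other columns, using rows where v is observed;
    # lm() drops rows in which a predictor is missing.
    f <- reformulate(setdiff(names(d), v), response = v)
    fit <- lm(f, data = d[!miss, , drop = FALSE])
    # Predictions stay NA wherever a predictor is itself missing.
    out[miss, v] <- predict(fit, newdata = d[miss, , drop = FALSE])
  }
  as.matrix(out)
}

m_imp <- impute_linear(m)
m_imp[c(2, 19), ]                            # the two rows that had missing values
```

Each missing entry gets exactly one value, which suits a setting where the goal is a filled-in matrix and not inference. Observed entries are left unchanged.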

Jon

Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
R search page: http://finzi.psych.upenn.edu/
