Imputation
On 05/04/05 11:13, Ramesh Kolluru wrote:
> I have time-series data for some factors, and some of those factors
> have missing values. I want to impute the missing values without
> disturbing the distribution of each factor, while maintaining its
> correlation with the other factors. Please suggest some imputation
> methods. I tried some functions in R, such as aregImpute and transcan.
> After the imputation I am unable to retrieve the data with the imputed
> values. Please give me some way to get the data with imputed values.

Here is one way to do it with transcan(), but I'm looking forward to seeing other answers. The data are in s.m, and the missing values are NA. The imputed values are in s.imp$imputed, in order, and the third line simply replaces the NAs with these values. (I posted this before. You might have found it by searching the R search page below.)

This is for the simplest possible sort of imputation. I'm not sure that it meets your requirements. (In fact, I'm pretty sure it doesn't.) So you'd have to change the options for transcan, or do something else.

    s.imp <- transcan(s.m, asis="*", data=s.m, imputed=T, long=T, pl=F)
    s.na <- is.na(s.m)  # which data are imputed
    s.m[which(s.na)] <- unlist(s.imp$imputed)

As for aregImpute(), that has to be more difficult, because aregImpute() does multiple imputation. Very roughly, it produces a whole set of imputed values, for the purpose of statistical inference. I don't know how to get a single best estimate out of this set, or even whether this is a good idea.

Jon
Jonathan Baron, Professor of Psychology, University of Pennsylvania Home page: http://www.sas.upenn.edu/~baron R search page: http://finzi.psych.upenn.edu/
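
The replacement step in the reply above relies on R filling a matrix in column-major order, so a list of per-column imputed values, once flattened with unlist(), lines up with the NA positions returned by which(). The sketch below shows only that indexing mechanism on made-up data. It does not call transcan(): the object `imputed` is a hypothetical stand-in for s.imp$imputed, filled with column means purely to keep the example free of package dependencies, and `s.filled` is an illustrative name.

```r
# Toy data: a numeric matrix with missing values (stand-in for s.m).
s.m <- cbind(a = c(1, NA, 3, 4),
             b = c(10, 20, NA, NA),
             c = c(5, 6, 7, 8))

# Hypothetical stand-in for s.imp$imputed: one vector of fill-in values per
# column, in row order, and an empty vector for columns without NAs.
# Column means are an assumption for illustration only; this is NOT what
# transcan() computes.
imputed <- lapply(seq_len(ncol(s.m)), function(j) {
  col <- s.m[, j]
  rep(mean(col, na.rm = TRUE), sum(is.na(col)))
})

s.na <- is.na(s.m)                         # logical matrix marking the gaps
s.filled <- s.m                            # keep the original untouched
s.filled[which(s.na)] <- unlist(imputed)   # fill the gaps, column by column

print(s.filled)
```

Working on a copy (s.filled) rather than overwriting s.m keeps the NA mask available afterwards, which is handy for checking which cells were imputed.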