Hi all,

I would like to impute missing values in a data set based on the distribution of the other values of the variable. Imagine that 30% of the observed values are 1, 20% are 2 and 50% are 3. In effect I'd like to do the following:

    df$var[is.na(df$var)] <- 1  # for 30% of the NA occurrences
    df$var[is.na(df$var)] <- 2  # for 20% of the NA occurrences
    df$var[is.na(df$var)] <- 3  # for 50% of the NA occurrences

Can anybody help?

John
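One way the proportional fill described above might be sketched in base R is to draw each replacement at random from the observed (non-missing) values, so that the fills follow the empirical distribution of the variable. This is only a sketch under stated assumptions: the data frame contents and the helper name `impute_from_observed` are made up for illustration, and the seed is there purely to make the draw repeatable.

```r
# Hypothetical example data: var takes the values 1, 2, 3 (30% / 20% / 50%
# of the observed values) and has four missing entries.
set.seed(42)  # only so that the random draw is reproducible
df <- data.frame(var = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 3, NA, NA, NA, NA))

# Hypothetical helper: fill NAs by sampling from the observed values.
impute_from_observed <- function(x) {
  missing <- is.na(x)       # note: x == NA always yields NA, so use is.na()
  observed <- x[!missing]
  # Nothing to do if there are no NAs, or nothing to sample from.
  if (length(observed) == 0 || !any(missing)) return(x)
  # Sample positions rather than values: sample(observed, ...) would treat a
  # single number n as 1:n, which breaks when only one value is observed.
  idx <- sample.int(length(observed), size = sum(missing), replace = TRUE)
  x[missing] <- observed[idx]
  x
}

df$var <- impute_from_observed(df$var)
table(df$var)
```

Because the replacements are random draws, the filled-in values match the 30/20/50 split only on average, not exactly, and the observed values themselves are left untouched. If exact proportions among the NAs are required, a different allocation step would be needed.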
Imputation of missing values
1 message · John Tomkinson