Adding NA values in random positions in a dataframe

An essentially identical approach that may be a tad clearer -- but
requires additional space -- first creates a logical vector for the
locations of the NA's in the unlisted data.frame. Further NA positions
are randomly added and then the augmented vector is used as a logical
matrix to index where the NA's should go in the data frame:

df <- data.frame(a = c(1:3, NA, 4:6),
                 b = c(letters[1:6], NA),
                 c = c(1, NA, runif(5)))

nr <- nrow(df); nc <- ncol(df)
p <- .3 ## desired total proportion of NA's

ina <- is.na(unlist(df)) ## logical vector, TRUE corresponds to NA positions
n2 <- floor(p*nr*nc) - sum(ina)  ## number of new NA's

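## sample only from positions that are not already NA
## (ina itself contains no NA's, so !is.na(ina) would be TRUE everywhere)
ina[sample(which(!ina), n2)] <- TRUE
df[matrix(ina, nrow = nr, ncol = nc)] <- NA ## using matrix indexing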

df
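
If this is needed more than once, the same steps could be wrapped in a small function. The following is only a sketch: the name add_random_na and its arguments are made up for illustration and are not part of base R or of the code above. It assumes 0 <= p <= 1, samples only from positions that are not already NA, and adds nothing when the data frame already holds at least the requested proportion of NA's.

```r
## Hypothetical helper (illustrative name, not from the original post):
## raise the total proportion of NA's in 'df' to about 'p'.
add_random_na <- function(df, p) {
  nr <- nrow(df); nc <- ncol(df)
  ina <- is.na(unlist(df, use.names = FALSE))  ## TRUE where df already has an NA
  n2 <- max(0, floor(p * nr * nc) - sum(ina))   ## new NA's still needed, never negative
  free <- which(!ina)                           ## positions that are not yet NA
  ## index into 'free' so that a length-1 'free' is not read as 1:free by sample()
  ina[free[sample.int(length(free), n2)]] <- TRUE
  df[matrix(ina, nrow = nr, ncol = nc)] <- NA   ## logical matrix indexing
  df
}

set.seed(42)  ## only to make the example reproducible
df0 <- data.frame(a = c(1:3, NA, 4:6),
                  b = c(letters[1:6], NA),
                  c = c(1, NA, runif(5)))
out <- add_random_na(df0, 0.3)
sum(is.na(out))  ## 6, i.e. floor(0.3 * 7 * 3): 3 original NA's plus 3 new ones
```

Going through sample.int() on the length of the candidate vector sidesteps the documented behaviour of sample(x, n) when x is a single positive number, which would otherwise sample from 1:x.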

Cheers,
Bert
On Fri, Nov 29, 2013 at 10:09 AM, arun <smartpink111 at yahoo.com> wrote: