
how to introduce missing data for complete data

3 messages · dila radi, Bert Gunter, MacQueen, Don

#
1. You need to define more explicitly exactly what you mean by "randomly."

2. You need to make an honest effort to learn basic R, e.g. by
spending time with the "Introduction to R" document that ships with R
or an online tutorial (there are many good ones).
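
To illustrate why the definition of "randomly" matters, here is a minimal sketch of two common readings. The vector x, the count k, and the probability p are made up for illustration and do not come from the original question.

```r
set.seed(1)            # only so the example is reproducible
x <- as.numeric(1:20)  # hypothetical complete data

# Reading 1: exactly k values go missing, every position equally likely
k <- 5
x1 <- x
x1[sample(length(x1), k)] <- NA
sum(is.na(x1))         # always equal to k

# Reading 2: each value goes missing independently with probability p
p <- 0.25
x2 <- x
x2[runif(length(x2)) < p] <- NA
sum(is.na(x2))         # varies from run to run
```

The first reading fixes the number of missing values; the second fixes only the expected proportion. Which one is appropriate depends on what the missing data are meant to simulate.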

Cheers,
Bert
On Sun, Nov 10, 2013 at 10:31 PM, dila radi <dilaradi21 at gmail.com> wrote:

  
    
#
Here's a suggestion.

The sample() function takes random samples of sets. See
  ?sample
The set you want to take a random sample from is the rows of your data.
Represent the rows by their row numbers.
To get a vector of row numbers, you can use the seq() function. See
  ?seq

Let's suppose your data is in a data frame named 'mydat', and you want to
introduce 10 instances of missing data.

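nr <- nrow(mydat)                        # number of rows in the data
set.to.missing <- sample( seq(nr) , 10)  # 10 distinct row numbers, chosen at random
mydat$Amount[set.to.missing] <- NA       # set 'Amount' to missing in those rows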
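
The same steps can be tried end to end on made-up data. This is only a sketch: the data frame below, its 'Day' column, and the name n.missing are hypothetical stand-ins, since the original data were not posted. It uses seq_len(nr), which gives the same row numbers as seq(nr) whenever there is at least one row.

```r
set.seed(42)  # only so the example is reproducible

# Hypothetical stand-in for the poster's data
mydat <- data.frame(Day = 1:30, Amount = round(runif(30, 0, 50), 1))

n.missing <- 10
nr <- nrow(mydat)
stopifnot(n.missing <= nr)  # sampling without replacement needs enough rows

# Pick n.missing distinct row numbers and blank out 'Amount' there
set.to.missing <- sample(seq_len(nr), n.missing)
mydat$Amount[set.to.missing] <- NA

sum(is.na(mydat$Amount))  # 10
```

Because sample() draws without replacement by default, the row numbers are distinct, so exactly n.missing values end up as NA.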


A simplified example of the core idea, showing a vector before and after
its third element is set to NA:
[1]  1  2  3  4  5  6  7  8  9 10
[1]  1  2 NA  4  5  6  7  8  9 10
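
Commands along these lines would print output like the two lines above. The vector name x is an assumption made for illustration.

```r
x <- 1:10
x           # the complete vector
x[3] <- NA  # set the third element to missing
x           # the same vector, now with one NA
```

In the data frame case the fixed index 3 is replaced by the randomly sampled row numbers.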