Not missing at random
Hi Blaz,
What do you do if the number of values sampled to be set missing (e.g., 4) is greater than the number of values in a given case that fall below your threshold of 3? If that situation needs no special handling, I do not see why you cannot apply to MNAR the same technique you used below for MCAR.
Best regards,
Josh
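One possible way to handle the case raised above is to cap the sampled count at the number of eligible values in each selected row. The following is only a sketch under that assumption, not something proposed in the thread: the cap, the seed, and the names `below`, `nSel`, and `xMiss` are illustrative choices, and `m = 4` is taken from the example number in the question.

```r
set.seed(1)   # assumption: fixed seed, only so the sketch is reproducible

# same 7 x 7 example matrix as used further down in the thread
x <- matrix(c(rep(1:5, 9), 3, 3, 3, 4), nrow = 7, ncol = 7, byrow = TRUE)
m <- 4        # maximum number of values to set missing within a case
pMiss <- 30   # percent of cases to select

below <- x < 3                            # eligible values
candidate <- which(rowSums(below) > 0)    # rows with at least one eligible value
nSel <- min(length(candidate), floor(nrow(x) * pMiss / 100))
idMiss <- candidate[sample.int(length(candidate), nSel)]

# for each selected row, choose up to m of its eligible columns
ids <- do.call(rbind, lapply(idMiss, function(i) {
  cols <- which(below[i, ])
  k <- min(sample.int(m, 1), length(cols))   # cap at what the row offers
  cbind(i, cols[sample.int(length(cols), k)])
}))

xMiss <- x
xMiss[ids] <- NA   # two-column (row, column) index, as in the MCAR code
xMiss
```

Indexing with `sample.int(length(...), k)` rather than `sample(cols, k)` avoids the documented behaviour where `sample()` treats a single number `v` as `1:v`.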
On Tue, Jun 7, 2011 at 12:17 AM, Blaz Simcic <blazsimcic at yahoo.com> wrote:
Josh,
thanks for the answer, it really helped me. I have another question, in case you know how to do it. I would also like to sample the number of missing values within the selected cases, as I did with MCAR (see below). Can you help me with this?
Thanks,
Blaz from Slovenia
Here is my code for MCAR:
N <- 1000      #### number of cases
n <- 12        #### number of variables
X <- matrix(rnorm(N * n), N, n)      #### matrix
pMiss <- 0.20  #### proportion of cases with missing values
idMiss <- sample(1:N, N * pMiss)     #### sample cases
nMiss <- length(idMiss)
m <- 3         #### maximum number of missing values within selected cases
howmanyMiss <- sapply(idMiss, function(x) sample(1:m, 1))
howmanyMiss    #### number of missing values within selected cases
varMiss <- lapply(howmanyMiss, function(x) sample(1:n, x))    #### which values are missing
ids <- cbind(rep(idMiss, howmanyMiss), unlist(varMiss))
Xmiss <- X
Xmiss[ids] <- NA
Xmiss
________________________________
From: Joshua Wiley <jwiley.psych at gmail.com>
To: Blaz Simcic <blazsimcic at yahoo.com>
Cc: r-help at r-project.org
Sent: Mon, June 6, 2011 10:34:38 PM
Subject: Re: [R] Not missing at random
Hi Blaz,
See below.
x <- matrix(
  c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,3,3,3,4),
  nrow = 7, ncol = 7, byrow = TRUE)  #### matrix
pMiss <- 30     #### percent of missing values
N <- dim(x)[1]  #### number of cases
#### I want to sample all cases with at least 1 value lower than 3, so I have to find candidates
candidate <- which(x[,1]<3 | x[,2]<3 | x[,3]<3 | x[,4]<3 | x[,5]<3 | x[,6]<3 | x[,7]<3)
## easier to use this
## find all x < 3 and return their row and column indices
## select only row indices, and then find unique
candidate <- unique(which(x < 3, arr.ind = TRUE)[, "row"])
idMiss <- sample(candidate, N * pMiss / 100)  #### I sampled cases
## from the subset of x cases that will be missing
## find all that are < 3 and set to NA
x[idMiss, ][x[idMiss, ] < 3] <- NA
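The one-line assignment above is compact but dense. As a minimal sketch on a toy matrix (the matrix, the selected rows, and the names `mask`, `y`, and `z` are made up for illustration), it can be compared with building an explicit logical mask over the whole matrix:

```r
x <- matrix(c(1, 4, 2, 5, 3, 1), nrow = 3)   # toy 3 x 2 matrix
idMiss <- c(1, 3)                            # hypothetical selected rows

# explicit mask: value is below 3 AND its row was selected
mask <- (x < 3) & (row(x) %in% idMiss)
y <- x
y[mask] <- NA

# the nested-index form used in this message
z <- x
z[idMiss, ][z[idMiss, ] < 3] <- NA

identical(y, z)
```

The mask form also behaves sensibly when no rows are selected, since `row(x) %in% idMiss` is then `FALSE` everywhere.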
## If you are going to do this a lot, consider a function
nmar <- function(x, op = "<", value = 3, p = 30) {
  op <- get(op)
  candidate <- unique(which(op(x, value), arr.ind = TRUE)[, "row"])
  idMiss <- sample(candidate, nrow(x) * p / 100)
  x[idMiss, ][op(x[idMiss, ], value)] <- NA
  return(x)
}
nmar(x)
## has the advantage that you can easily change
## p, the cut off value, the operator (e.g., "<", ">", "<=", etc.)
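Two edge cases are worth guarding against if the function is reused on other data: `sample()` treats a single number `v` as `1:v`, so one lone candidate row can yield the wrong rows, and it stops with an error when asked for more rows than there are candidates. A hedged sketch of a guarded variant follows; `nmar_safe` is a hypothetical name, and capping the sample size at the number of candidates is an assumption rather than part of the function above.

```r
nmar_safe <- function(x, op = "<", value = 3, p = 30) {
  op <- match.fun(op)                        # look up the operator as a function
  hit <- op(x, value) & !is.na(x)            # values that satisfy the condition
  candidate <- which(rowSums(hit) > 0)       # rows with at least one such value
  # assumption: never request more rows than there are candidates
  size <- min(length(candidate), floor(nrow(x) * p / 100))
  # sample positions, not values, so a single candidate is handled correctly
  idMiss <- candidate[sample.int(length(candidate), size)]
  x[hit & (row(x) %in% idMiss)] <- NA
  x
}

x <- matrix(c(rep(1:5, 9), 3, 3, 3, 4), nrow = 7, ncol = 7, byrow = TRUE)
set.seed(42)   # assumption: fixed seed for a reproducible illustration
out <- nmar_safe(x)
out
```

With no candidate rows at all, the sketch returns the input unchanged instead of failing.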
Cheers,
Josh
On Sun, Jun 5, 2011 at 11:17 PM, Blaz Simcic <blazsimcic at yahoo.com> wrote:
Hello!
I would like to sample 30% of cases (those with at least 1 value lower than 3 in the row), and among them I want to set all values lower than 3 (within the selected cases) as NA (NMAR, not missing at random). I managed to sample cases, but I don't know how to set values (lower than 3) as NA.
R code:
x <- matrix(
  c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,3,3,3,4),
  nrow = 7, ncol = 7, byrow = TRUE)  #### matrix
pMiss <- 30     #### percent of missing values
N <- dim(x)[1]  #### number of cases
#### I want to sample all cases with at least 1 value lower than 3, so I have to find candidates
candidate <- which(x[,1]<3 | x[,2]<3 | x[,3]<3 | x[,4]<3 | x[,5]<3 | x[,6]<3 | x[,7]<3)
idMiss <- sample(candidate, N * pMiss / 100)  #### I sampled cases
Now I'd like to set all values lower than 3 among the sampled cases as NA.
Any suggestion?
Thanks,
Blaž
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/