(no subject)

Hi:

To quote one of the sages of this list: 'Loops? We don't need no
steenking loops!!'.

Here's one way to do what you were asking, using a two-pass approach.
Generate some random data, then use the sample() function to get 20 indices,
which are used to set NAs in the original vector. In the first pass, replace
each missing value by the value that precedes it (an ifelse() call handles the
first-position case). In the second pass, replace any NAs that remain with the
vector's mean; these are a missing first value and any missing value whose
predecessor was itself missing.

# Generate 100 random Poisson(10) values
x <- rpois(100, 10)
# Get the indices to set to NA
midx <- sample(length(x), 20)
# Replace x[midx] with NA
x[midx] <- NA
# If first value of x is NA, keep NA, else replace missing value
# by previous value
x[midx] <- x[ifelse(midx == 1L, NA, midx - 1)]
# Replace remaining NAs with the vector's mean
x[is.na(x)] <- mean(x, na.rm = TRUE)
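
In the two-pass code above, a missing value whose predecessor is also missing
ends up with the mean rather than the last observed value. If a run of
consecutive NAs should instead all take the last observed value, as the
original question asks, one possible sketch is the following. The helper name
carry_forward() and its first_fill argument are illustrative assumptions, not
part of base R or of the code above:

```r
# Hypothetical helper (an assumption for illustration): carry the last
# observed value forward over every NA, including runs of NAs.
carry_forward <- function(x, first_fill = NA) {
  # For each position, the index of the most recent non-missing value
  # at or before it (0 when no such value exists yet).
  donor <- cummax(ifelse(is.na(x), 0L, seq_along(x)))
  has_donor <- donor > 0
  out <- x
  out[has_donor] <- x[donor[has_donor]]
  # Leading NAs have no donor; fill them with the supplied value.
  out[!has_donor] <- first_fill
  out
}

x <- c(NA, 3, NA, NA, 5)
carry_forward(x, first_fill = 10)
# 10 3 3 3 5
```

For the simulated Poisson(10) data the population mean is 10, so
first_fill = 10 would match the rule in the question; with real data you
would pass the population mean of your own variable.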

To do all of this at once, wrap it up into a function and then
use the raply() function in plyr or the replicate() function in base R to
run it and put the result into a 1000 x 100 matrix:

hdimp <- function() {
  x <- rpois(100, 10)
  midx <- sample(length(x), 20)
  x[midx] <- NA
  x[midx] <- x[ifelse(midx == 1L, NA, midx - 1)]
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

library(plyr)
u <- raply(1000, hdimp)

An alternative is to use the replicate() function:

v <- t(replicate(1000, hdimp()))

The latter approach is about 20% faster in my tests.
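
The t() in the replicate() call is there because replicate() binds each
result as a column, so the untransposed result is 100 x 1000. A small sketch
of that, plus one way to time it yourself, is below. The function
one_sample() is a hypothetical stand-in for hdimp(), and the seed is
arbitrary:

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical stand-in for hdimp(): any function returning a length-100 vector.
one_sample <- function() rpois(100, 10)

raw <- replicate(1000, one_sample())
dim(raw)  # 100 1000: one column per replicate
v <- t(raw)
dim(v)    # 1000 100: one row per replicate

# Rough timing; results vary by machine and R version.
system.time(t(replicate(1000, one_sample())))
```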

HTH,
Dennis
I'm using the survey api. I am taking 1000 samples of size 100 and
replacing 20 of those values with missing values. I'm trying to use
sequential hot deck imputation, and thus I am trying to figure out how
to replace missing values with the value before it. Another thing I have
to keep in mind is what to do if there are two missing values side by side:
how do I replace both of those values with the value before? Also, what if
the first of the sample of 100 is a missing value? I will replace that with
the mean of the population. I'm pretty sure I have to write a loop, but if
anyone can help me figure out how to write this I would appreciate it greatly.
Thank you

Nick Manginelli

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.