Replace missing value within group with non-missing value

Note that
    which(anyLogicalVector)[1]
always has length 1, because of the subscript [1] (it is NA when no element
is TRUE), so the test in the 'if' statement always succeeds and the 'if' may
as well be omitted.
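
A quick check at the prompt illustrates this. The vectors below are made up purely for illustration:

```r
# Subscripting with [1] always gives a vector of length 1;
# when which() finds no TRUE values the result is NA.
noneTrue <- c(FALSE, FALSE, NA)   # made-up example input
someTrue <- c(FALSE, TRUE, TRUE)  # made-up example input
idxNone <- which(noneTrue)[1]
idxSome <- which(someTrue)[1]
length(idxNone)   # 1, even though nothing was TRUE
is.na(idxNone)    # TRUE
idxSome           # 2, the position of the first TRUE
```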

There are 3 cases the above code does not detect or deal with:
  (a) nrow(x)==0
  (b) all(is.na(x$mth))
  (c) length(which(!is.na(x$mth))) > 1
Case (a) causes the function to stop in the way you saw:
  > f <- function(x) { # the function passed to lapply
  +    idx <- which(!is.na(x$mth))[1]
  +    if (length(idx) > 0)
  +       x$mth <- x$mth[idx]
  +    x
  + }
  > f(data.frame(mth=integer()))
  Error in `$<-.data.frame`(`*tmp*`, "mth", value = NA_integer_) : 
    replacement has 1 rows, data has 0
but (b) and (c) run without any complaint, even though they may indicate
errors in your data and cause surprises down the line:
  >  f(data.frame(mth=c(NA,NA)))
    mth
  1  NA
  2  NA
  >  f(data.frame(mth=c(NA,2,3)))
   mth
  1   2
  2   2
  3   2

You could have your code check whether there is exactly one non-missing
value for mth in each non-empty group and warn if that assumption is not true
for some group (while still returning some reasonable result).  The following
does that:
f2 <- function (x)  {
    idx <- !is.na(x$mth) # logical vector with length nrow(x)
    nNotNA <- sum(idx)
    if (nNotNA > 1) {
        warning("more than one non-missing mth value in group, using the first")
        idx[cumsum(idx) > 1] <- FALSE
    }
    else if (nrow(x) > 0 && nNotNA == 0) {
        warning("no non-missing values in group, all mth values will be NA")
        idx[1] <- TRUE
    }
    x$mth <- x$mth[idx]
    x
}

The warning messages do not say where in 'sp' the problem arose.  You could
change your lapply call so that the group number is in the warning:
   lapply(seq_along(sp), function(i) {
      x <- sp[[i]]
      ... same code as in f2, but add the group number, i,  to the end of warnings ...
           warning("more than one ... in group number ", i)
      ...
   })
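
Spelled out, that might look something like the sketch below. I am assuming 'sp' is a list of data frames as returned by split(); the data frame 'd' and its 'id' column are made up here just so the example runs on its own.

```r
# Hypothetical data: 'd' and the column 'id' are invented for illustration.
# 'sp' is assumed to be a list of data frames, one per group.
d <- data.frame(id = c(1, 1, 2, 2, 3, 3), mth = c(NA, 5, NA, NA, 7, 8))
sp <- split(d, d$id)

res <- lapply(seq_along(sp), function(i) {
    x <- sp[[i]]
    idx <- !is.na(x$mth)   # logical vector with length nrow(x)
    nNotNA <- sum(idx)
    if (nNotNA > 1) {
        warning("more than one non-missing mth value in group number ", i,
                ", using the first")
        idx[cumsum(idx) > 1] <- FALSE   # keep only the first non-missing value
    }
    else if (nrow(x) > 0 && nNotNA == 0) {
        warning("no non-missing mth values in group number ", i,
                ", all mth values will be NA")
        idx[1] <- TRUE                  # select the first (NA) value
    }
    x$mth <- x$mth[idx]    # the single selected value is recycled over the group
    x
})
names(res) <- names(sp)    # lapply over seq_along() drops the group names
```

With this made-up data, group 1 is filled with 5 silently, group 2 triggers the "no non-missing" warning, and group 3 triggers the "more than one" warning and is filled with 7. If the group labels are more useful than the positions, names(sp)[i] could go in the message instead of i.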

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com