evaluating NAs in a dataframe

6 messages · Wade Wall, Peter Ehlers, Philipp Pagel +2 more

#
On 2010-12-08 12:10, Wade Wall wrote:
You don't say what you want to have occur when x is NA. (I don't know
what 'evaluate NA' means.)

But why not just use something like:

  for(....){
    if(!is.na(x[i])){
      .... your stuff, preferably replacing '&&' with '&' ....
    } else {....}
  }
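
A minimal runnable sketch of the guarded loop outlined above. The vector x, the threshold of 10 and the labels are made up for illustration; they are not from the original post.

```r
# Hypothetical data: one value is missing
x <- c(5, NA, 42)
out <- character(length(x))

for (i in seq_along(x)) {
  if (!is.na(x[i])) {
    # x[i] is known here, so the comparison is TRUE or FALSE, never NA
    out[i] <- if (x[i] < 10) "small" else "large"
  } else {
    # handle the missing value explicitly
    out[i] <- "missing"
  }
}

out
```

What goes in the else branch depends on what should happen for a missing value; the "missing" label is only a placeholder.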

Peter Ehlers
#
Hi!
Sounds like you are looking for is.na:
[1] FALSE  TRUE FALSE
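
The call that produced the output above is not shown in the archive. As a minimal sketch, a hypothetical vector with one missing value in the second position gives the same result:

```r
# Hypothetical vector; not the data from the original post
x <- c(3, NA, 7)

na_flags <- is.na(x)       # logical vector, TRUE where x is missing
na_pos   <- which(na_flags)  # positions of the missing values
na_count <- sum(na_flags)    # number of missing values

na_flags
```

Note that a comparison such as x == NA does not work for this purpose, because any comparison with NA is itself NA.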
[...]
First of all, you don't need a loop here. Example:

# make up some data
foo <- data.frame(a=sample(1:20, 20, replace=TRUE))
# assign to classes
foo$class <- cut(foo$a, breaks=c(-1, 7, 13, 20), labels=c('small', 'medium', 'large'))

This also works in the presence of NAs - but of course the class will
be NA in those cases which, at least in my opinion, is the correct
value.
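
To illustrate the point about NAs, here is a small sketch with a fixed, made-up vector instead of random data. The extra column class2 and the label "unknown" are hypothetical, in case a separate label for missing values is wanted.

```r
# Hypothetical data with one missing value
foo <- data.frame(a = c(3, NA, 10, 20))

# Same breaks and labels as in the example above
foo$class <- cut(foo$a, breaks = c(-1, 7, 13, 20),
                 labels = c("small", "medium", "large"))

# The class of the NA row is NA; the other rows are classified as usual
foo

# Optional: turn NA into its own label
foo$class2 <- as.character(foo$class)
foo$class2[is.na(foo$class2)] <- "unknown"
```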

cu
	Philipp
#
Wade,

As you have discovered, you need to test for NA first, and to do that you need to use is.na(). Something like this should work:

# Class must exist before its elements can be assigned inside the loop
Class <- character(nrow(demo))
for (i in seq_len(nrow(demo))) {
  if (is.na(demo$Area[i])) Class[i] <- "Sna" else
  if (demo$Area[i] < 10) Class[i] <- "S01"   else
  if (demo$Area[i] < 25) Class[i] <- "S02"   else
  if (demo$Area[i] < 50) Class[i] <- "S03"   else
  if (demo$Area[i] < 100) Class[i] <- "S04"  else
  if (demo$Area[i] < 200) Class[i] <- "S05"  else
  if (demo$Area[i] < 400) Class[i] <- "S06"  else
  if (demo$Area[i] < 800) Class[i] <- "S07"  else
  if (demo$Area[i] < 1600) Class[i] <- "S08" else
  if (demo$Area[i] < 3200) Class[i] <- "S09" else
  Class[i] <- "S10"
}
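
The reason the NA test has to come first: a comparison with NA returns NA, and if() stops with an error when its condition is NA. A minimal sketch with a single made-up value (the variable names are hypothetical):

```r
area <- NA_real_   # hypothetical missing area

# Without the NA test, if() fails with
# "missing value where TRUE/FALSE needed"; tryCatch captures the message
unguarded <- tryCatch(
  if (area < 10) "S01" else "S02",
  error = function(e) conditionMessage(e)
)

# With the NA test first, the comparison is never reached for a missing value
guarded <- if (is.na(area)) "Sna" else if (area < 10) "S01" else "S02"

unguarded
guarded
```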

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA
#
On Dec 8, 2010, at 3:10 PM, Wade Wall wrote:

            
That looks really, really painful. Why not use the function
findInterval and then do a lookup in a character vector? Then you can
throw away that loopy construct completely.

 > demo  <- data.frame(Area = runif(10, 0, 100))
 > demo$catarea <- findInterval(demo$Area, c(0,25,50,75,100))
 > demo
         Area catarea
1  71.440401       3
2   8.438097       1
3  45.492178       2
4  50.669996       3
5  15.444114       1
6  33.954948       2
7  19.738747       1
8  56.485654       3
9  29.218921       2
10 74.204611       3
 > demo$catname <- c("S01","S02", "S03","S04")[demo$catarea]
 > demo
         Area catarea catname
1  71.440401       3     S03
2   8.438097       1     S01
3  45.492178       2     S02
4  50.669996       3     S03
5  15.444114       1     S01
6  33.954948       2     S02
7  19.738747       1     S01
8  56.485654       3     S03
9  29.218921       2     S02
10 74.204611       3     S03
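
As a sketch only, the same idea can be applied to the ten size classes from the loop earlier in the thread, with missing values handled as well. The breaks are taken from that loop; the area vector is made up, and the assumption is that NA should map to "Sna" as it does there.

```r
# Upper bounds of classes S01..S09, taken from the loop posted earlier
breaks <- c(10, 25, 50, 100, 200, 400, 800, 1600, 3200)
labels <- sprintf("S%02d", 1:10)   # "S01" ... "S10"

# Hypothetical areas, including a missing value
area <- c(5, 10, 30, NA, 5000)

# findInterval returns 0 below the first break, so add 1 to index labels;
# an NA input gives an NA index and therefore an NA label
cls <- labels[findInterval(area, breaks) + 1]

# Give missing areas their own class
cls[is.na(cls)] <- "Sna"
cls
```

A value equal to a break (10 here) falls into the upper class, which matches the strict "<" comparisons in the loop.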