
Subscripting problem with is.na()

4 messages · G.Maubach at weinwolf.de, Ista Zahn, Rui Barradas +1 more

#
Hi All,

I would like to recode my NAs to 0. Using a single vector everything is 
fine.

But if I use a data.frame things go wrong:

-- cut --

var1 <- c(1:3, NA, 5:7, NA, 9:10)
var2 <- c(1:3, NA, 5:7, NA, 9:10)
ds_test <-
  data.frame(var1, var2)

test <- var1
test[is.na(test)] <- 0
test  # NA recoded OK

# First try
ds_test[is.na(ds_test$var1)] <- 0  # duplicate subscripts WRONG

# Second try
ds_test[is.na("var1")] <- 0 
ds_test$var1  # not recoded WRONG

# Third try: to me the most intuitive approach
is.na(ds_test["var1"]) <- 0  # attempt to select less than one element in integerOneIndex WRONG

# Fourth try
ds_test[is.na(var1)] <- 0  # duplicate subscripts for columns WRONG

-- cut --
 
How can I do it correctly?

Where could I have found something about it?

Kind regards

Georg
#
Suggestion: figure out the correct extraction syntax first. Once you do that,
replacement will be easy.

See ?Extract for all the messy details.
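
To illustrate the "extraction first" advice, here is a minimal sketch using the data from the original post. The helper names na_rows, na_cells and na_block are made up for this example. The idea is that any expression that extracts the right cells can be turned into a replacement by appending `<- value`.

```r
# Data from the original post
var1 <- c(1:3, NA, 5:7, NA, 9:10)
var2 <- c(1:3, NA, 5:7, NA, 9:10)
ds_test <- data.frame(var1, var2)

# Logical vector with one entry per row: TRUE where var1 is NA
na_rows <- is.na(ds_test$var1)

# Row index before the comma, column index after it
na_cells <- ds_test[na_rows, "var1"]  # the NA cells of var1 only
na_block <- ds_test[na_rows, ]        # the same rows, all columns

print(which(na_rows))
print(na_cells)
print(na_block)
```

Once na_cells returns exactly the cells you want to change, the same index expression can be used on the left-hand side of an assignment.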

Best,
Ista
On Jun 23, 2016 10:00 AM, <G.Maubach at weinwolf.de> wrote:

            

  
  
#
Hello,

You could do

ds_test[is.na(ds_test$var1), ] <- 0  # note the comma

or, more generally,

ds_test[] <- lapply(ds_test, function(x) {x[is.na(x)] <- 0; x})
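
The two suggestions above are not equivalent when the columns differ, which the data in the original post hides because var1 and var2 are identical. The sketch below uses a small made-up data frame (ds_demo, hypothetical) to show the difference. The last variant, indexing with the logical matrix returned by is.na(), is an additional idiom not mentioned in this thread.

```r
# Hypothetical data: the NAs sit in different rows of the two columns
ds_demo <- data.frame(var1 = c(1, NA, 3), var2 = c(NA, 20, 30))

# Row-wise: every column is set to 0 in rows where var1 is NA
by_row <- ds_demo
by_row[is.na(by_row$var1), ] <- 0

# Cell-wise: only the NA cells of each column are set to 0
by_cell <- ds_demo
by_cell[] <- lapply(by_cell, function(x) { x[is.na(x)] <- 0; x })

# Cell-wise alternative using a logical matrix as the index
by_matrix <- ds_demo
by_matrix[is.na(by_matrix)] <- 0

print(by_row)     # var2 loses its 20 in row 2 and keeps its NA in row 1
print(by_cell)    # only the NA cells have changed
print(by_matrix)
```

Which form is appropriate depends on whether a missing var1 should blank out the whole record or only the missing cell.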

Hope this helps,

Rui Barradas

Quoting G.Maubach at weinwolf.de:
#
Dear Georg,

You need to learn a bit more about the subsetting methods, depending on 
the object structure you're trying to subset.

More specifically, when you run this: ds_test[is.na(ds_test$var1)]
you get this error: "Error in `[.data.frame`(ds_test, 
is.na(ds_test$var1)) : undefined columns selected"

This means that R treats the single index as a column selector (a 
data.frame is a list of columns), and a logical vector of length 10 
does not match the two columns. But you're actually trying to select 
rows.
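
A short sketch of that behaviour, using the data from the original post. The names idx and msg are made up for this example, and the error is caught with tryCatch() so the script keeps running. The exact wording of the message may depend on the R version and locale.

```r
# Data from the original post
var1 <- c(1:3, NA, 5:7, NA, 9:10)
var2 <- c(1:3, NA, 5:7, NA, 9:10)
ds_test <- data.frame(var1, var2)

# With a single index and no comma, the index picks columns
print(names(ds_test["var1"]))          # selected by name
print(names(ds_test[c(TRUE, FALSE)]))  # selected by a logical of length ncol

# A logical vector with one entry per ROW does not fit the two columns
idx <- is.na(ds_test$var1)
msg <- tryCatch(
  {
    ds_test[idx]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(length(idx))
print(msg)
```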

Using a single bracket '[' with a comma on a data.frame works the same 
as for matrices: you specify rows before the comma and columns after 
it, like this:
ds_test[is.na(ds_test$var1), ] ## notice the last comma
## works on all columns because you didn't specify any after the comma:
ds_test[is.na(ds_test$var1), ] <- 0

If you want it only for "var1", then you need to specify the column:
ds_test[is.na(ds_test$var1), "var1"] <- 0
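
As a check, the following sketch applies the column-specific replacement to the data from the original post and compares it with an equivalent form that indexes the column vector directly. The copies ds_a and ds_b are made up for this example.

```r
# Data from the original post
var1 <- c(1:3, NA, 5:7, NA, 9:10)
var2 <- c(1:3, NA, 5:7, NA, 9:10)
ds_test <- data.frame(var1, var2)

# Row and column index on the data.frame
ds_a <- ds_test
ds_a[is.na(ds_a$var1), "var1"] <- 0

# Equivalent: extract the column, then index it like a plain vector
ds_b <- ds_test
ds_b$var1[is.na(ds_b$var1)] <- 0

print(ds_a$var1)            # the NAs in var1 are replaced by 0
print(sum(is.na(ds_a$var2)))  # var2 is untouched and still has its NAs
print(all(ds_a$var1 == ds_b$var1))
```

The second form mirrors the single-vector code from the original post, which may be why it feels more natural to some readers.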

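It's the same problem with your 2nd and 4th tries: both lack the comma. 
In addition, the 2nd try's is.na("var1") tests the character string 
itself, which is not NA, and the 4th uses the standalone vector var1 
instead of the column in ds_test. Your 3rd try does not change ds_test 
at all.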
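
It can help to evaluate the index expressions from the 2nd and 4th tries on their own before using them in an assignment. A minimal sketch with the data from the original post follows; the names idx2 and idx4 are made up for this example.

```r
# Data from the original post
var1 <- c(1:3, NA, 5:7, NA, 9:10)
var2 <- c(1:3, NA, 5:7, NA, 9:10)
ds_test <- data.frame(var1, var2)

# 2nd try: is.na() is applied to the string "var1", not to the column
idx2 <- is.na("var1")
print(idx2)
print(ncol(ds_test[idx2]))  # a single FALSE selects no columns

# 4th try: the index comes from the standalone vector var1; it only
# matches the column here because ds_test was built from that vector
idx4 <- is.na(var1)
print(identical(idx4, is.na(ds_test$var1)))
```

If var1 in the workspace were later modified or removed, the 4th form would silently use the wrong index or fail, so referring to ds_test$var1 is the safer habit.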

HTH,
Ivan

--
Ivan Calandra, PhD
Scientific Mediator
University of Reims Champagne-Ardenne
GEGENAA - EA 3795
CREA - 2 esplanade Roland Garros
51100 Reims, France
+33(0)3 26 77 36 89
ivan.calandra at univ-reims.fr
--
https://www.researchgate.net/profile/Ivan_Calandra
https://publons.com/author/705639/

On 23/06/2016 at 15:57, G.Maubach at weinwolf.de wrote: