Subscripting problem with is.na()

Hi Bert,

many thanks for all your help and your comments. I learn a lot this way.

At first sight my question was about is.na(), but the actual task looks like this:

I have two variables in my customer data that signal whether the customer account was closed by master data management or by sales. Say these variables are closed_mdm and closed_sls. They contain NA if the customer account is still open, or a closing code from "01" to "08" that gives the reason if the account was closed.

For my analysis I need a variable that combines closed_mdm and closed_sls, so that I can easily filter on the accounts that are closed, no matter what the reason was or who closed the account.

As I always encounter problems when dealing with ifelse statements and NA, I decided to merge these two variables into one variable containing 0 = not closed and 1 = closed. In my context this seems, at least to me, a reasonable approach.

Replacement of missing values and merging the variables is the easiest way for me.

-- cut --

cust_id <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20)
closed_mdm <- c("01", NA, NA, NA, "08", "07", NA, NA, "05", NA, NA, NA, "04", NA, NA, NA, NA, NA, NA, NA)
closed_sls <- c(NA, "08", NA, NA, "08", "07", NA, NA, NA, NA, "03", NA, NA, NA, "05", NA, NA, NA, NA, NA)

# 1st try
ds_temp1 <- data.frame(cust_id, closed_mdm, closed_sls)
ds_temp1

ds_temp1$closed <- closed_mdm | closed_sls  # WRONG

# 2nd try
closed_mdm_fac1 <- as.factor(closed_mdm)
closed_sls_fac1 <- as.factor(closed_sls)

ds_temp2 <- data.frame(cust_id, closed_mdm_fac1, closed_sls_fac1)
ds_temp2

ds_temp2$closed <- ds_temp2$closed_mdm_fac1 | ds_temp2$closed_sls_fac1  # WRONG

# 3rd try
closed_mdm_num1 <- as.numeric(closed_mdm)  # OK
closed_sls_num1 <- as.numeric(closed_sls)  # OK

ds_temp3 <- data.frame(cust_id, closed_mdm_num1, closed_sls_num1)
ds_temp3

ds_temp3$closed <- ds_temp3$closed_mdm_num1 | ds_temp3$closed_sls_num1  # WRONG

# 4th try
ds_temp4 <- ds_temp3
ds_temp4

# Does not run: NA is not allowed in subscripts
ds_temp4[is.na(ds_temp4$closed_mdm_num1), ds_temp4$closed_mdm_num1] <- 0
ds_temp4[is.na(ds_temp4$closed_sls_num1), ds_temp4$closed_sls_num1] <- 0

# 5th try
# NA = still open -> 0, any closing code -> 1
ds_temp4$closed_mdm_num1 <- ifelse(is.na(ds_temp4$closed_mdm_num1), 0, 1)
ds_temp4$closed_sls_num1 <- ifelse(is.na(ds_temp4$closed_sls_num1), 0, 1)
ds_temp4

ds_temp4$closed <- ifelse(ds_temp4$closed_mdm_num1 == 1 | ds_temp4$closed_sls_num1 == 1, 1, 0)
ds_temp4

-- cut --
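
A note on the 4th try: the second subscript there is the column's values rather than its name, and those values contain NA, which is presumably what triggers the error. Below is a minimal sketch of the same replacement with the column selected by name. It uses only the first six customers from the data above, and ds_fix is a hypothetical name introduced for illustration.

```r
# First six customers from the example data above
cust_id <- 1:6
closed_mdm_num1 <- as.numeric(c("01", NA, NA, NA, "08", "07"))
closed_sls_num1 <- as.numeric(c(NA, "08", NA, NA, "08", "07"))

# ds_fix is a hypothetical name, not part of the original attempts
ds_fix <- data.frame(cust_id, closed_mdm_num1, closed_sls_num1)

# Rows: logical vector from is.na(); column: selected by its name
ds_fix[is.na(ds_fix$closed_mdm_num1), "closed_mdm_num1"] <- 0
ds_fix[is.na(ds_fix$closed_sls_num1), "closed_sls_num1"] <- 0

# Assumes every closing code is non-zero ("01" to "08"), so > 0 means closed
ds_fix$closed <- as.integer(ds_fix$closed_mdm_num1 > 0 | ds_fix$closed_sls_num1 > 0)
ds_fix
```

This keeps the replace-then-merge idea, but it relies on 0 never being a valid closing code.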

Is there a better way to do it?
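
One possible shortcut, offered as a sketch rather than a definitive answer: since NA means "still open", is.na() can be applied to both columns directly, without replacing the missing values first. The example again uses the first six customers from the data above, and ds_direct is a hypothetical name.

```r
# First six customers from the example data above
cust_id <- 1:6
closed_mdm <- c("01", NA, NA, NA, "08", "07")
closed_sls <- c(NA, "08", NA, NA, "08", "07")

# ds_direct is a hypothetical name used only for this sketch
ds_direct <- data.frame(cust_id, closed_mdm, closed_sls)

# is.na() itself never returns NA, so no ifelse() is needed here:
# 1 = closed by either department, 0 = still open
ds_direct$closed <- as.integer(!is.na(ds_direct$closed_mdm) | !is.na(ds_direct$closed_sls))
ds_direct

# Filter on closed accounts, regardless of reason or department
closed_only <- subset(ds_direct, closed == 1)
closed_only
```

The original closing codes stay untouched in closed_mdm and closed_sls, so the reason for closing remains available.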

Kind regards

Georg