-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of PIKAL Petr
Sent: Thursday, April 28, 2016 2:32 PM
To: G.Maubach at weinwolf.de
Cc: r-help at r-project.org
Subject: Re: [R] Antwort: RE: Interdependencies of variable types, logical
expressions and NA
Hi
your initial ds
'data.frame': 2 obs. of 3 variables:
$ var1: num 1 1
$ var2: logi TRUE FALSE
$ var3: logi NA NA
first result
'data.frame': 2 obs. of 6 variables:
$ var1 : num 1 1
$ var2 : logi TRUE FALSE
$ var3 : logi NA NA
$ value_and_logical: logi TRUE TRUE
$ logical_and_na : logi TRUE NA
$ value_and_na : logi TRUE TRUE
1 is treated as TRUE, therefore OR gives TRUE TRUE in the first case, TRUE NA
in the second and TRUE TRUE in the third.
Changing to factor changes var2 to NA (I am not sure why)
'data.frame': 2 obs. of 3 variables:
$ var1: Factor w/ 2 levels "NOT ok","OK": 2 2
$ var2: Factor w/ 2 levels "NOT ok","OK": NA NA
$ var3: Factor w/ 2 levels "NOT ok","OK": NA NA
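A minimal sketch of why var2 turns into NA, assuming the conversion used factor(..., levels = c(0, 1)) as in the quoted script: factor() matches the character form of each value against the levels, so TRUE and FALSE become "TRUE" and "FALSE", which match neither "0" nor "1". The names x_num, x_lgl, f_num, f_lgl and f_fix are illustrative only.

```r
# factor() compares as.character(x) with the (character form of the) levels
x_num <- c(1, 1)
x_lgl <- c(TRUE, FALSE)

as.character(x_num)   # "1" "1"        -> matches level 1
as.character(x_lgl)   # "TRUE" "FALSE" -> matches neither 0 nor 1

f_num <- factor(x_num, levels = c(0, 1), labels = c("NOT ok", "OK"))
f_lgl <- factor(x_lgl, levels = c(0, 1), labels = c("NOT ok", "OK"))

# One possible fix (an assumption about the intent): use logical levels
f_fix <- factor(x_lgl, levels = c(FALSE, TRUE), labels = c("NOT ok", "OK"))

print(f_num)   # both elements are "OK"
print(f_lgl)   # both elements are NA
print(f_fix)   # "OK", then "NOT ok"
```

With logical levels, a logical NA would still become NA in the factor, which is usually what is wanted.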
And this results in a warning:
ds$value_and_logical <- ifelse(ds$var1 | ds$var2, TRUE, FALSE)
Warning message:
In Ops.factor(ds$var1, ds$var2) : '|' not meaningful for factors
ds$logical_and_na <- ifelse(ds$var2 | ds$var3, TRUE, FALSE)
Warning message:
In Ops.factor(ds$var2, ds$var3) : '|' not meaningful for factors
ds$value_and_na <- ifelse(ds$var1 | ds$var3, TRUE, FALSE)
Warning message:
In Ops.factor(ds$var1, ds$var3) : '|' not meaningful for factors
'data.frame': 2 obs. of 6 variables:
$ var1 : Factor w/ 2 levels "NOT ok","OK": 2 2
$ var2 : Factor w/ 2 levels "NOT ok","OK": NA NA
$ var3 : Factor w/ 2 levels "NOT ok","OK": NA NA
$ value_and_logical: logi NA NA
$ logical_and_na : logi NA NA
$ value_and_na : logi NA NA
So the | operation is not valid for factor variables and results in NA values.
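A minimal sketch of one way around this, assuming the factors carry the labels "NOT ok" and "OK" and were built with logical levels where the input is logical: compare each factor with its label first, which yields a logical vector, and apply | to those vectors. The helper name is_ok is hypothetical.

```r
# Hypothetical helper: turn an "OK"/"NOT ok" factor into a logical vector.
# == is defined for factors; | is not.
is_ok <- function(f) f == "OK"

var1 <- factor(c(1, 1),        levels = c(0, 1),        labels = c("NOT ok", "OK"))
var2 <- factor(c(TRUE, FALSE), levels = c(FALSE, TRUE), labels = c("NOT ok", "OK"))
var3 <- factor(c(NA, NA),      levels = c(FALSE, TRUE), labels = c("NOT ok", "OK"))

value_and_logical <- is_ok(var1) | is_ok(var2)
logical_and_na    <- is_ok(var2) | is_ok(var3)
value_and_na      <- is_ok(var1) | is_ok(var3)

print(data.frame(value_and_logical, logical_and_na, value_and_na))
```

Under these assumptions the three columns follow the same OR rules as the purely logical version of the data.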
Cheers
Petr
-----Original Message-----
From: G.Maubach at weinwolf.de [mailto:G.Maubach at weinwolf.de]
Sent: Thursday, April 28, 2016 12:00 PM
To: PIKAL Petr <petr.pikal at precheza.cz>
Subject: Antwort: RE: [R] Interdependencies of variable types, logical
expressions and NA
Hi Petr,
many thanks for your reply.
Yes, it's interesting. I did not understand what the truth table wanted to
say due to 4 columns instead of 3. But now I got it.
The other thing is that logical expressions with NA work differently on
different types of variables, as my example code shows:
-- cut --
# Truth table for logicals and NA
var2 <- c(TRUE, FALSE)
var3 <- c(NA, NA)
var1 <- c(1, 1)
ds <- data.frame(var1, var2, var3)
ds
ds$value_and_logical <- ifelse(ds$var1 | ds$var2, TRUE, FALSE)
ds$logical_and_na <- ifelse(ds$var2 | ds$var3, TRUE, FALSE)
ds$value_and_na <- ifelse(ds$var1 | ds$var3, TRUE, FALSE)
print(ds)
ds$var1 <- factor(ds$var1, levels = c(0, 1), labels = c("NOT ok", "OK"))
ds$var2 <- factor(ds$var2, levels = c(0, 1), labels = c("NOT ok", "OK"))
ds$var3 <- factor(ds$var3, levels = c(0, 1), labels = c("NOT ok", "OK"))
ds$value_and_logical <- ifelse(ds$var1 | ds$var2, TRUE, FALSE)
ds$logical_and_na <- ifelse(ds$var2 | ds$var3, TRUE, FALSE)
ds$value_and_na <- ifelse(ds$var1 | ds$var3, TRUE, FALSE)
print(ds)
-- cut --
Additionally the warning message that this script issues was not displayed
in my production code, but only in this test code.
Also: Is "<NA>" the same as "NA"?
Kind regards
Georg
From: PIKAL Petr <petr.pikal at precheza.cz>
To: "G.Maubach at weinwolf.de" <G.Maubach at weinwolf.de>,
"r-help at r-project.org" <r-help at r-project.org>,
Date: 28.04.2016 10:02
Subject: RE: [R] Interdependencies of variable types, logical
expressions and NA
Sorry
these
T&NA = T (you can decide that regardless value in NA the result must be T)
F&NA = NA (you cannot decide hence NA)
should be
T | NA = T (you can decide that regardless value in NA the result must be
T)
F | NA = NA (you cannot decide hence NA)
Cheers
Petr
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of PIKAL
Sent: Thursday, April 28, 2016 9:42 AM
To: G.Maubach at weinwolf.de; r-help at r-project.org
Subject: Re: [R] Interdependencies of variable types, logical
expressions and NA
Hi
Your script is not reproducible.
Creating Check_U_0__Kd_1_2011 from Umsatz_2011 and Kunde01_2011
Error in ifelse(Kunden01[[Umsatz]] == 0 & Kunden01[[Kunde]] == 1, 1, 0)
object 'Kunden01' not found
This is interesting
x <- c(NA, FALSE, TRUE)
names(x) <- as.character(x)
outer(x, x, "&") ## AND table
       <NA> FALSE  TRUE
<NA>     NA FALSE    NA
FALSE FALSE FALSE FALSE
TRUE     NA FALSE  TRUE
I am not sure, but the logic for AND is to return TRUE only when both
expressions are TRUE.
so
T&T = T
F&F = F
T&NA = NA (you cannot decide hence NA)
F&NA = F (you can decide that regardless of NA the result must be F)
outer(x, x, "|") ## OR table
      <NA> FALSE TRUE
<NA>    NA    NA TRUE
FALSE   NA FALSE TRUE
TRUE  TRUE  TRUE TRUE
OTOH the logic for the OR table is that if one of the expressions is TRUE,
the result must be TRUE
T | T = T
F | F = F
T&NA = T (you can decide that regardless value in NA the result must be T)
F&NA = NA (you cannot decide hence NA)
And I believe that all your results can be explained by this logic.
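The AND and OR rules for NA can be checked directly at the console. A minimal sketch (the names and_rules, or_rules, cond and res are illustrative):

```r
# AND: FALSE decides the result on its own, TRUE does not
and_rules <- c("T & NA" = TRUE & NA, "F & NA" = FALSE & NA)

# OR: TRUE decides the result on its own, FALSE does not
or_rules <- c("T | NA" = TRUE | NA, "F | NA" = FALSE | NA)

print(and_rules)
print(or_rules)

# ifelse() passes an undecided (NA) condition through as NA
cond <- c(TRUE, FALSE, NA)
res  <- ifelse(cond, 1, 0)
print(res)
```

The last step matters for the scripts in this thread: whenever the condition inside ifelse() is NA, neither the "yes" nor the "no" value is chosen.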
Cheers
Petr
-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of
G.Maubach at weinwolf.de
Sent: Thursday, April 28, 2016 9:08 AM
To: r-help at r-project.org
Subject: [R] Interdependencies of variable types, logical expressions
and NA
Hi All,
my script tries to do the following on factors:
## Check for case 3: Umsatz = 0 & Kunde = 1
## (Umsatz = revenue, Kunde = customer)
for (year in 2011:2015) {
  Umsatz <- paste0("Umsatz_", year)
  Kunde  <- paste0("Kunde01_", year)
  Check  <- paste0("Check_U_0__Kd_1_", year)

  cat('Creating', Check, 'from', Umsatz, "and", Kunde, '\n')

  Kunden01[[ Check ]] <- ifelse(Kunden01[[ Umsatz ]] == 0 &
                                Kunden01[[ Kunde ]] == 1,
                                1, 0
                                )
  Kunden01[[ Check ]] <- factor(Kunden01[[ Check ]],
                                levels = c(1, 0),
                                labels = c("Check 0", "OK")
                                )
}
Creating Check_U_0__Kd_1_2011 from Umsatz_2011 and Kunde01_2011
Creating Check_U_0__Kd_1_2012 from Umsatz_2012 and Kunde01_2012
Creating Check_U_0__Kd_1_2013 from Umsatz_2013 and Kunde01_2013
Creating Check_U_0__Kd_1_2014 from Umsatz_2014 and Kunde01_2014
Creating Check_U_0__Kd_1_2015 from Umsatz_2015 and Kunde01_2015
table(Kunden01$Check_U_0__Kd_1_2011, useNA = "ifany")
table(Kunden01$Check_U_0__Kd_1_2012, useNA = "ifany")
table(Kunden01$Check_U_0__Kd_1_2013, useNA = "ifany")
table(Kunden01$Check_U_0__Kd_1_2014, useNA = "ifany")
table(Kunden01$Check_U_0__Kd_1_2015, useNA = "ifany")
Kunden01$Check_U_0__Kd_1_all <-
  ifelse(Kunden01$Check_U_0__Kd_1_2011 == 1 |
         Kunden01$Check_U_0__Kd_1_2012 == 1 |
         Kunden01$Check_U_0__Kd_1_2013 == 1 |
         Kunden01$Check_U_0__Kd_1_2014 == 1 |
         Kunden01$Check_U_0__Kd_1_2015 == 1,
         1, 0)
table(Kunden01$Check_U_0__Kd_1_all, useNA = "ifany")
   0 <NA>
   7   23
(Note: I made the values up, but the proportions match the real-world data.)
I had expected to get back a factor or at least a numeric variable
containing 0, 1 and NA; instead, 1 is not included.
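A minimal sketch of a likely cause, under the assumption that the yearly Check_* columns are the factors created in the loop above (labels "Check 0" and "OK"): comparing such a factor with 1 compares against the labels, not the underlying codes, so the comparison is never TRUE. The vectors chk_a and chk_b are made up for illustration.

```r
# Made-up stand-ins for two Check_U_0__Kd_1_<year> columns
chk_a <- factor(c(1, 0, NA),  levels = c(1, 0), labels = c("Check 0", "OK"))
chk_b <- factor(c(0, NA, NA), levels = c(1, 0), labels = c("Check 0", "OK"))

# No label equals "1", so each comparison is FALSE (or NA for missing values)
by_code <- chk_a == 1 | chk_b == 1

# Comparing against the label gives the intended logical vector
by_label <- chk_a == "Check 0" | chk_b == "Check 0"

flag_code  <- ifelse(by_code, 1, 0)    # can only contain 0 and NA
flag_label <- ifelse(by_label, 1, 0)   # can contain 1, 0 and NA

print(flag_code)
print(flag_label)
```

If this assumption holds, comparing against "Check 0" (or building the combined check before converting to factors) should bring the 1 values back.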
I searched the web for information on the treatment of logical [...]
evaluates to NA, but NA & FALSE evaluates to FALSE. See the examples
below.
## construct truth tables :
x <- c(NA, FALSE, TRUE)
names(x) <- as.character(x)
outer(x, x, "&") ## AND table
outer(x, x, "|") ## OR table
Note: Not very useful. How should it be read?
3.
http://www.ats.ucla.edu/stat/r/faq/missing.htm
Good explanation for NA in general and in analysis, but no information
about NA in logical expressions.
Then I made some tests with different data types and variables with
-- cut --
# 2016-04-27-001_truth_table_for_logicals_and_NA.R
# Test 1
var2 <- c(TRUE, FALSE)
var3 <- c(NA, NA)
var1 <- c(1, 1)
ds <- data.frame(var1, var2, var3)
ds
ds$value_and_logical <- ifelse(ds$var1 | ds$var2, TRUE, FALSE)
ds$logical_and_na <- ifelse(ds$var2 | ds$var3, TRUE, FALSE)
ds$value_and_na <- ifelse(ds$var1 | ds$var3, TRUE, FALSE)
print(ds)
# Output
#   var1  var2 var3 value_and_logical logical_and_na value_and_na
# 1    1  TRUE   NA              TRUE           TRUE         TRUE
# 2    1 FALSE   NA              TRUE             NA         TRUE
# Test 2
ds$var1 <- factor(ds$var1, levels = c(0, 1), labels = c("NOT ok", "OK"))
ds$var2 <- factor(ds$var2, levels = c(0, 1), labels = c("NOT ok", "OK"))
ds$var3 <- factor(ds$var3, levels = c(0, 1), labels = c("NOT ok", "OK"))
ds$value_and_logical <- ifelse(ds$var1 | ds$var2, TRUE, FALSE)
ds$logical_and_na <- ifelse(ds$var2 | ds$var3, TRUE, FALSE)
ds$value_and_na <- ifelse(ds$var1 | ds$var3, TRUE, FALSE)
# Output (abbrev.)
# Warning message:
# In Ops.factor(ds$var1, ds$var3) : '|' not meaningful for factors
print(ds)
# Output
#   var1 var2 var3 value_and_logical logical_and_na value_and_na
# 1   OK <NA> <NA>                NA             NA           NA
# 2   OK <NA> <NA>                NA             NA           NA
-- cut --
I had expected to get the same result in Test 2 as in Test 1.
Where can I find information and documentation about NA handling in
logical expressions on different variable types?
Kind regards
Georg