Finicky factor comparison operators

4 messages · johnmark, R. Michael Weylandt, David Winsemius

#
This warning occurs because the "==" comparison operator doesn't allow
comparison of ordered and normal factors:

df[df$close_quarter == as.Date("2011-02-01"),]
Warning message:
In `[.data.frame`(df, df$close_quarter == as.Date("2011-02-01"),  :
  Incompatible methods ("Ops.ordered", "Ops.Date") for "=="

Why should this be a problem -- Isn't this being overly cautious?  Can
anyone think of a case where coercing the ordered factor to a normal factor
for comparisons of == would do the wrong thing? 

Perhaps this is a question for the developer's section. 

Cheers -john mark agosta



--
View this message in context: http://r.789695.n4.nabble.com/Finicky-factor-comparison-operators-tp4400377p4400377.html
Sent from the R help mailing list archive at Nabble.com.
#
It's not a matter of unordered & ordered factors, but of ordered factors
and Dates (as the warning says).

I can see at least one ambiguity -- should comparison be made from the
level or the internal code -- so the warning makes sense to me (though
an error might make even more sense). Generally, for factors that
correspond to non-Date quantities, the comparison likely isn't well
defined.
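
To make the ambiguity concrete, here is a minimal sketch on made-up data (the close_quarter values are an assumption; the thread never shows the real column). It prints nothing by itself; it just builds the two things an ordered factor could be compared on, plus one explicit way to compare by label:

```r
# Hypothetical data standing in for the poster's column.
close_quarter <- factor(c("2011-05-01", "2011-02-01", "2011-08-01"),
                        ordered = TRUE)

# Internal integer codes: positions within the sorted level set.
codes <- as.integer(close_quarter)

# Level labels: the text that is printed.
labels <- as.character(close_quarter)

# Converting the labels to Date makes the intent explicit,
# so no method conflict arises.
is_feb <- as.Date(labels) == as.Date("2011-02-01")
```

Here codes and labels carry different information, which is why an implicit choice between them would be a guess.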

How would you resolve this comparison in general?

Michael
On Sat, Feb 18, 2012 at 3:08 PM, johnmark <johnmark.agosta at gmail.com> wrote:
#
Michael -

Thanks for your insight.  I think I see where you're going with this.  

To make '==' comparisons for subsetting against an ordered factor, I've had
to create a lookup table for all possible values I'd ever want to compare
against (all dates covered by the quarters in question, in this case) that
maps into the ordered factor's values.  This is wrapped by a function that
returns an ordered factor, which allows me to write:

opps$close_quarter == which.quarter.end("2010-10-20")
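
For readers following along, a lookup of this kind might look roughly like the sketch below. It is not the function from the post: the name which_quarter_end_sketch, the quarter-end dates, and the sample column are all assumptions made for illustration.

```r
# Hypothetical quarter-end dates, in ascending order.
quarter_ends   <- as.Date(c("2010-10-31", "2011-01-31", "2011-04-30"))
quarter_levels <- format(quarter_ends)

# Map any date to the first quarter end on or after it, returned as an
# ordered factor that shares the full level set.
which_quarter_end_sketch <- function(d) {
  d <- as.numeric(as.Date(d))
  # With left.open = TRUE the intervals are (end[i], end[i + 1]],
  # so adding 1 gives the index of the enclosing quarter end.
  idx <- findInterval(d, as.numeric(quarter_ends), left.open = TRUE) + 1L
  factor(quarter_levels[idx], levels = quarter_levels, ordered = TRUE)
}

# Hypothetical column built on the same levels.
close_quarter <- factor(c("2010-10-31", "2011-01-31", "2010-10-31"),
                        levels = quarter_levels, ordered = TRUE)

hits <- close_quarter == which_quarter_end_sketch("2010-10-20")
```

Because both sides carry identical levels, "==" is accepted. Dates past the last quarter end come back as NA in this sketch.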

Otherwise if I try to create an ordered factor from the constant just for
the purposes of comparison, the error tells me that ordered factors from
different sources cannot be compared:

opps$close_quarter == factor("2007-10-20", ordered=T)
Error in Ops.factor(factor("2007-10-30", ordered = T), quarter.factors[1,
2]) : 
  level sets of factors are different

That makes sense, since internally factors are integers -- "enums" in other
terms. 
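
A small sketch of that error, and of one way around it, using made-up values (the close_quarter column here is an assumption). A constant turned into a factor in isolation has only one level, so the level sets differ; reusing the column's levels avoids that:

```r
# Hypothetical column with two quarter ends.
close_quarter <- factor(c("2007-07-31", "2007-10-31"), ordered = TRUE)

# Built in isolation, the constant has a single level.
lonely <- factor("2007-10-31", ordered = TRUE)
msg <- tryCatch(close_quarter == lonely,
                error = function(e) conditionMessage(e))

# Built with the column's levels, the comparison is accepted.
matched <- factor("2007-10-31", levels = levels(close_quarter),
                  ordered = TRUE)
hits <- close_quarter == matched
```

After this runs, msg holds the error text from the first comparison and hits is a logical vector for the second.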

But what I want to avoid -- and what I don't see as necessary -- is explicitly
coercing the terms to a common representation that mimics their print form:

as.character("2007-10-20") == as.character(factor("2007-10-20", ordered=T))

I don't think there should be confusion since the conversion to print form
is "obvious" -- but it does conflict with the conversion rules for creating
vectors by c():

c("2011-10-20", factor("2007-10-20", ordered=T))
[1] "2011-10-20" "1"

where the factor is converted to its internal "enum" representation, then to
a character. 
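
Side by side, as a minimal sketch with the same constants: the first call reproduces the conversion above, and the second keeps the label by converting explicitly first.

```r
f <- factor("2007-10-20", ordered = TRUE)

# The factor ends up as its integer code, then as text.
mixed <- c("2011-10-20", f)

# Converting to character first keeps the label.
safe <- c("2011-10-20", as.character(f))
```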

Having given some more thought to what motivated the original question,
one could use "which()" to invert the factor's levels vector:

which("2008-04-30" == levels(quarter.factors[,2]))
[1] 3
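
The same inversion as a self-contained sketch. The quarters vector is a hypothetical stand-in for quarter.factors[,2], whose contents the thread does not show. match() gives the same position as which(), and the position can then be compared against the integer codes:

```r
# Hypothetical stand-in for quarter.factors[, 2].
quarters <- factor(c("2007-10-31", "2008-01-31", "2008-04-30", "2008-01-31"),
                   ordered = TRUE)

# Position of the label within the level set, found two ways.
pos_which <- which("2008-04-30" == levels(quarters))
pos_match <- match("2008-04-30", levels(quarters))

# Rows whose internal code equals that position.
rows <- which(as.integer(quarters) == pos_match)
```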

It's still not clear to me what exactly are the implicit conversion rules for
factors.

Cheers -jm

#
On Feb 20, 2012, at 1:45 AM, johnmark wrote:

            
Actually it is telling you that you cannot compare ordered factors  
which have different levels. That makes perfect sense for the same  
reasons that you are not allowed to compare Dates to ordered factors.  
If the factors from different sources had the same levels you should  
have succeeded.

 > z <- factor(LETTERS[3:1], ordered = TRUE)
 > z3 <- factor(LETTERS[1:3] , ordered=TRUE)
 > z[2] == z3[2]
[1] TRUE

The c() result is just an example of the need to use as.character when
converting data out of the factor class.

In your last case you are comparing a character to a character value
and getting the expected result (since levels(quarter.factors) is NOT
a factor). You should also succeed when testing equality between
ordered factor and character types. You have still not provided an
example for testing, so this may suffice.

 > z <- factor(LETTERS[3:1], ordered = TRUE)
 > z == "A"
[1] FALSE FALSE  TRUE

You should be able to assemble a list of valid candidate (character)  
values with levels(fac). Or if you want them in factor representation  
then use unique(fac).
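
A short sketch of the difference between the two, on a made-up factor with one unused level (fac and its values are assumptions for illustration):

```r
# "C" is a declared level that never occurs in the data.
fac <- factor(c("B", "A", "B"), levels = c("A", "B", "C"), ordered = TRUE)

# Character vector of every level, used or not.
candidates_chr <- levels(fac)

# Ordered factor holding only the values that occur, in order of
# first appearance.
candidates_fac <- unique(fac)
```

So levels(fac) is the safer source when the comparison value may be a level that has not been observed yet.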