
[bug] droplevels() also drop object attributes (comment…)

1 message · Suharto Anggono

* Be careful with the "contrasts" attribute. If the number of levels is reduced, the original contrasts matrix is no longer valid, because it has one row per original level.
Example case:
x <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b", "c"))
contrasts(x) <- contr.treatment(levels(x), contrasts=FALSE)[, -2, drop=FALSE]
droplevels(x)
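
To see whether the contrasts matrix still matches the levels after the call, one can compare its row count with nlevels(). This is only a sketch: 'contrasts_consistent' is a hypothetical helper, not a base R function, and the result for the dropped factor depends on whether the installed R version keeps the "contrasts" attribute.

```r
# Hypothetical helper (not part of base R): TRUE when the factor carries no
# contrasts matrix, or one whose row count equals the number of levels.
contrasts_consistent <- function(f) {
  cm <- attr(f, "contrasts")
  !is.matrix(cm) || nrow(cm) == nlevels(f)
}

x <- factor(c("a", "a", "b", "b", "b"), levels = c("a", "b", "c"))
contrasts(x) <- contr.treatment(levels(x), contrasts = FALSE)[, -2, drop = FALSE]

y <- droplevels(x)

nlevels(x)               # 3
nlevels(y)               # 2
contrasts_consistent(x)  # TRUE: 3 rows, 3 levels
contrasts_consistent(y)  # version-dependent: FALSE if a 3-row matrix is kept
```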

* If function 'factor' is changed, make sure that as.factor(x) and factor(x) are the same for 'x' where is.integer(x) is TRUE. Currently, as.factor(<integer>) is treated specially.
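
A quick check along these lines could be used as a regression test. It is only a sketch: the variable names are illustrative, and the named-vector comparison is reported without assuming its result, since that may differ between R versions.

```r
# Plain integer vector: both routes are expected to give the same factor.
x <- c(3L, 1L, 2L, 3L)
f1 <- factor(x)
f2 <- as.factor(x)
same_plain <- identical(f1, f2)
same_plain

# Named integer vector: report the comparison without assuming the outcome.
xn <- c(a = 3L, b = 1L, c = 2L)
same_named <- identical(factor(xn), as.factor(xn))
same_named
```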

* It is possible that names(x) is not attr(x, "names"). This happens, for example, when 'x' is a "POSIXlt" object.
Look at this example, which works in R 3.3.2.
x <- as.POSIXlt("2017-01-01", tz="UTC")
factor(x, levels=x)
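
The difference between the two accessors can be inspected directly. This is a sketch; the factor() call is wrapped in try() because its behavior outside R 3.3.2 is not assumed here.

```r
x <- as.POSIXlt("2017-01-01", tz = "UTC")

# names() dispatches to the POSIXlt method, so there are no names here.
names(x)

# attr() reads the attribute of the underlying list: its component names,
# such as "sec" and "min".
attr(x, "names")

# Guarded call: report what comes back instead of assuming success.
res <- try(factor(x, levels = x), silent = TRUE)
class(res)
```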


By the way, in NEWS, under "CHANGES IN R 3.4.0", "SIGNIFICANT USER-VISIBLE CHANGES", there is the entry "factor() now uses order() to sort its levels". This is incorrect: the code of function 'factor' in R 3.4.0 (https://svn.r-project.org/R/tags/R-3-4-0/src/library/base/R/factor.R) still uses 'sort.list', not 'order'.

--------------------------------
    >> Hi,

    >> Just reporting a small bug... not really a big deal, but I
    >> don't think that is intended: droplevels() also drops all
    >> object's attributes.

    > Yes.  The help page for droplevels (or the simple
    > definition of 'droplevels.factor') clearly indicates that
    > the method for factors is really just a call to factor(x,
    > exclude = *)

    > and that _is_ quite an important base function whose
    > semantics should not be changed lightly. Still, let's
    > continue :

    > Looking a bit, I see that the current behavior of factor()
    > {and hence droplevels} has been unchanged in this respect
    > for the whole history of R, well, at least for more than
    > 17 years (R 1.0.1, April 2000).

    > I'd agree there _is_ a bug, at least in the documentation
    > which does *not* mention that currently, all attributes
    > are dropped but "names", "levels" (and "class").

    > OTOH, factor() would only need a small change to make it
    > preserve all attributes (but "class" and "levels" which
    > are set explicitly).

    > I'm sure this will break some checks in some packages.  Is
    > it worth it?
    > because 'drop=TRUE' calls factor(..) and that would also
    > preserve the "dim" attribute.  I would think that the
    > changed behavior _is_ better, and is also according to
    > documentation, because the help page for [.factor explains
    > that 'drop = TRUE' drops levels, but _not_ that it
    > transforms a factor matrix into a factor (vector).

    > Martin

I'm finally coming back to this.
It still seems to make sense to change the behavior of factor(),
and hence of droplevels(), here, and I plan to commit this change
within a day.
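
To see which attributes survive on a given R installation, a check along these lines may help. It is a sketch only: the attribute name "note" is made up for illustration, and the output depends on the R version in use.

```r
x <- factor(c("a", "a", "b"), levels = c("a", "b", "c"))
attr(x, "note") <- "hypothetical custom attribute"  # illustrative only

y <- droplevels(x)

names(attributes(x))              # includes "levels", "class" and "note"
names(attributes(y))              # "levels" and "class"; others are version-dependent
"note" %in% names(attributes(y))  # TRUE only if custom attributes are preserved
```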

Martin Maechler
ETH Zurich