The data.frame method deliberately skips non-atomic columns before
invoking is.na(x) so I think it is fair to assume this behaviour is
intentional and assumed.
It is not so clear to me that there is a sensible answer for list columns.
(List columns seem to collide with the expectation that, within each
variable, every observation will be of the same type.)
Consider your list L defined instead as
L <- list(NULL, NA, c(NA, NA))
It seems like every observation could have a claim to be 'missing' here.
Concretely, if a data.frame had a list column representing the lat-lon
of an observation, we might only be able to represent a missing value
as c(NA, NA).
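As a rough sketch of the competing claims to 'missing' (the helper name all_na is hypothetical and not part of R), the same list can be tested under three different rules. Only the first line uses R's own definition; the other two are assumptions about what a user might mean.

```r
L <- list(NULL, NA, c(NA, NA))

# R's own rule for lists: TRUE only for a length-one atomic element that is NA.
is.na(L)                          # FALSE  TRUE FALSE

# Hypothetical rule: an element is missing when it is non-empty and all NA.
# This is the rule that would flag a lat-lon pair stored as c(NA, NA).
all_na <- function(x) length(x) > 0L && all(is.na(x))
vapply(L, all_na, logical(1))     # FALSE  TRUE  TRUE

# Another possible rule: only NULL elements count as missing.
vapply(L, is.null, logical(1))    #  TRUE FALSE FALSE
```

Each rule flags a different subset, which is why a single answer for list columns is hard to pin down.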
On Fri, 13 Aug 2021 at 17:27, Iñaki Ucar <iucar at fedoraproject.org> wrote:
On Thu, 12 Aug 2021 at 22:20, Gabriel Becker <gabembecker at gmail.com> wrote:
Hi Toby,
This definitely appears intentional; the first expression of
stats:::na.omit.default is
    if (!is.atomic(object))
        return(object)
I don't follow your point. This only means that the *default* method
is not intended for non-atomic cases, but it doesn't mean that a method
for lists shouldn't exist.
So it is explicitly just returning the object in non-atomic cases, which
includes lists. I was not involved in this decision (obviously) but my
guess is that it is due to the fact that what constitutes
"being complete" is unclear in the list case. What should
na.omit(list(5, NA, c(NA, 5)))
return? Just the first element, or the first and the last? It seems, at
least to me, unclear.
is.na(list(5, NA, c(NA, 5)))
[1] FALSE TRUE FALSE
Following Toby's argument, it's clear to me: the first and the last.
Iñaki
A small change to the documentation to add "atomic
(in the sense of is.atomic returning \code{TRUE})" in front of "vectors",
or similar, wherever the supported types of objects are described seems
justified, though, imho, as the current documentation is either ambiguous
or technically incorrect, depending on what we take "vector" to mean.
Best,
~G
On Wed, Aug 11, 2021 at 10:16 PM Toby Hocking <tdhock5 at gmail.com> wrote:
Also, the na.omit method for a data.frame with a list column seems to be
inconsistent with is.na,
L <- list(NULL, NA, 0)
str(f <- data.frame(I(L)))
'data.frame': 3 obs. of 1 variable:
$ L:List of 3
..$ : NULL
..$ : logi NA
..$ : num 0
..- attr(*, "class")= chr "AsIs"
is.na(f)
L
[1,] FALSE
[2,] TRUE
[3,] FALSE
na.omit(f)
L
1
2 NA
3 0
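A minimal sketch of the same comparison in terms of row counts, assuming the behaviour of the R version discussed in this thread (the variable name keep is only illustrative):

```r
L <- list(NULL, NA, 0)
f <- data.frame(I(L))

# na.omit() on the data frame skips the non-atomic column, so no row is dropped.
nrow(na.omit(f))                  # 3

# Filtering rows with is.na() instead drops the row holding the length-one NA.
keep <- !is.na(f)[, 1]
nrow(f[keep, , drop = FALSE])     # 2
```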
On Wed, Aug 11, 2021 at 9:58 PM Toby Hocking <tdhock5 at gmail.com> wrote:
na.omit is documented as "na.omit returns the object with incomplete cases
removed." and "At present these will handle vectors," so I expected that
when it is used on a list, it should return the same thing as if we subset
via is.na; however I observed the following,
L <- list(NULL, NA, 0)
str(L[!is.na(L)])
List of 2
$ : NULL
$ : num 0
str(na.omit(L))
List of 3
$ : NULL
$ : logi NA
$ : num 0
Should na.omit be fixed so that it returns a result that is consistent
with is.na? I assume that is.na is the canonical definition of what
should be considered a missing value in R.
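For illustration only, here is one way a list-aware variant could follow is.na. The function na_omit_list is hypothetical and not part of R; it records the dropped positions in an "na.action" attribute in the same style the atomic default method uses. Whether this is the right definition of missing for lists is exactly what is in question.

```r
L <- list(NULL, NA, 0)

# Subsetting via is.na() drops only the length-one NA element.
kept <- L[!is.na(L)]
length(kept)                      # 2

# na.omit() falls through to the default method, which returns
# non-atomic objects unchanged.
identical(na.omit(L), L)          # TRUE

# Hypothetical helper (an assumption, not an R function): drop the
# elements that is.na() flags and record which positions were removed.
na_omit_list <- function(object, ...) {
  omit <- which(is.na(object))
  if (length(omit) == 0L) return(object)
  result <- object[-omit]
  attr(omit, "class") <- "omit"
  attr(result, "na.action") <- omit
  result
}

length(na_omit_list(L))           # 2
```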