Documentation for is.atomic and is.recursive

11 messages · Hadley Wickham, Duncan Murdoch, Tony Plate +2 more

#
The documentation for is.atomic and is.recursive is inconsistent with
their behavior in R 2.9.1 Windows.

? is.atomic

    'is.atomic' returns 'TRUE' if 'x' is an atomic vector (or 'NULL')
    and 'FALSE' otherwise.
    ...
    'is.atomic' is true for the atomic vector types ('"logical"',
    '"integer"', '"numeric"', '"complex"', '"character"' and '"raw"')
    and 'NULL'.

This description implies that is.atomic(x) implies is.vector(x)
(assuming that an "atomic vector type" is a subset of a "vector
type"). But in fact that is not true for values with class
attributes:

is.atomic(factor(3)) => TRUE
is.vector(factor(3)) => FALSE

is.atomic(table(3)) => TRUE
is.vector(table(3)) => FALSE

It appears, then, that is.atomic requires only that unclass(x) be an
atomic vector, not that x be an atomic vector.

There is also another case where is.atomic(x) != is.vector(unclass(x)):

is.atomic(NULL) => TRUE
is.vector(NULL) => FALSE

It would be useful to make the documentation consistent with the
implementation. (Presumably by updating the documentation, not
changing the behavior.)

The documentation continues:

    'is.recursive' returns 'TRUE' if 'x' has a recursive (list-like)
    structure and 'FALSE' otherwise.
    ...
    Most types of language objects are regarded as recursive: those
    which are not are the atomic vector types, 'NULL' and symbols (as
    given by 'as.name').

But is.recursive(as.name('foo')) == is.recursive(quote(foo)) == FALSE.

Again, it would be useful to make the documentation consistent with
the implementation.

To summarize all this in a table of the most common datatypes:

outerl <-
  function(f, a, b)
    structure(outer(a,b,Vectorize(f)),
              dimnames=list(a,b))

outerl(function(x,f)(match.fun(f))(x),
       list(3,factor(c("a","b")),NULL,function()3,as.name("foo"),environment()),
       list("class","mode","storage.mode","is.vector","is.atomic","is.recursive"))

              class         mode          storage.mode  is.vector is.atomic is.recursive
3             "numeric"     "numeric"     "double"      "TRUE"    "TRUE"    "FALSE"       <<< OK
1:2           "factor"      "numeric"     "integer"     "FALSE"   "TRUE"    "FALSE"       <<< inconsistent
NULL          "NULL"        "NULL"        "NULL"        "FALSE"   "TRUE"    "FALSE"       <<< inconsistent
function ()   "function"    "function"    "function"    "FALSE"   "FALSE"   "TRUE"        <<< OK
foo           "name"        "name"        "symbol"      "FALSE"   "FALSE"   "FALSE"       <<< inconsistent
<environment> "environment" "environment" "environment" "FALSE"   "FALSE"   "TRUE"        <<< OK

Thanks,

           -s
#
On Wed, Sep 2, 2009 at 2:39 PM, Stavros Macrakis<macrakis at alum.mit.edu> wrote:

            
Sorry, this *is* consistent with the behavior.  But if we read "the
atomic vector types, 'NULL' and symbols" as a list of mutually
exclusive categories, then is.atomic(NULL)==FALSE is inconsistent.

              -s
#
On Wed, Sep 2, 2009 at 1:54 PM, Stavros Macrakis<macrakis at alum.mit.edu> wrote:
And the sentence could be more clearly written as:

Most types of language objects are regarded as recursive, except for
atomic vector types, 'NULL' and symbols (as given by 'as.name').

Hadley
#
On 9/2/2009 2:39 PM, Stavros Macrakis wrote:
I don't see is.vector mentioned there.  The description of is.vector on 
its own man page implies the behaviour below; I think the description of 
is.atomic that you quote above is also consistent with the behaviour.

One could argue that in R's pre-history we should have had is.atomic 
imply is.vector, but that's not how things are documented, and I think 
we're pretty much stuck with the definitions we've got on low level 
functions like those.

That's what it says should happen.  Symbols such as as.name('foo') are 
not recursive.
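
The symbol case can be checked directly. The following is a minimal sketch; the object names sym and cl are made up for illustration, and the commented results are what these particular calls return in R.

```r
# A symbol is a language object, but it is not recursive.
sym <- as.name("foo")
is.language(sym)             # TRUE
is.recursive(sym)            # FALSE

# A call, by contrast, is a recursive language object.
cl <- quote(f(x, y))
is.language(cl)              # TRUE
is.recursive(cl)             # TRUE

# Lists and functions are recursive without being language objects.
is.recursive(list(1, "a"))   # TRUE
is.recursive(function() 3)   # TRUE
is.language(list(1, "a"))    # FALSE
```

This is consistent with reading the help text as "everything is recursive except atomic vectors, NULL and symbols".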

Duncan Murdoch
#
Let us stipulate that the current wording can be construed to be correct.

I would nonetheless claim that the documentation as currently written
is at best ambiguous and confusing, and would benefit from improved
wording.

What would be lost by that?

I explicitly said in my mail that I was not suggesting that past
design decisions (wise or unwise) be revisited; only that they be
documented more clearly.

               -s
On Wed, Sep 2, 2009 at 3:37 PM, Duncan Murdoch<murdoch at stats.uwo.ca> wrote:
#
On 02/09/2009 4:10 PM, Stavros Macrakis wrote:
I'd rather just state that the current wording is correct, without the 
weasel words.
A claim that documentation would benefit from improved wording is a 
tautology.  A claim that the documentation is ambiguous requires more 
evidence than you've offered.  You have demonstrated that someone could 
be confused when reading it, but that isn't necessarily our responsibility.

Duncan Murdoch
#
Duncan Murdoch wrote:
I suspect the confusion comes from the mistaken, but very 
understandable, interpretation that the phrase "'x' is a vector" in the 
documentation should somehow equate with the R function call 
'is.vector(x)'.  Similar potentially misleading wording appears in the 
"Description" for ?is.vector:
   "|is.vector| returns |TRUE| if |x| is a vector (of mode logical, 
integer, real, complex, character, raw or list if not specified) or 
expression and |FALSE|".
However, the Details for ?is.vector do clarify, saying:
  "|is.vector| returns |FALSE| if |x| has any attributes except names."
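
The attribute rule quoted from the Details can be seen in a small sketch. The attribute name "units" is an arbitrary choice for illustration and has no special meaning to R.

```r
x <- c(a = 1, b = 2)        # names are the one attribute is.vector tolerates
is.vector(x)                # TRUE
is.atomic(x)                # TRUE

attr(x, "units") <- "cm"    # any other attribute makes is.vector FALSE
is.vector(x)                # FALSE
is.atomic(x)                # TRUE: is.atomic does not look at attributes

f <- factor(c("a", "b"))    # carries class and levels attributes
is.vector(f)                # FALSE
is.atomic(f)                # TRUE
```

So "x is a vector" in the prose sense and is.vector(x) can disagree whenever x carries attributes other than names.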

Here are some concrete suggestions for improving the documentation:
(1) In the Details or Note section for ?is.atomic, add the paragraph:
"is.atomic is unaffected by the presence of attributes on 'x', unlike 
is.vector (which would probably be better named is.bare.vector).  
is.atomic(x) merely tests whether the data mode of x is an atomic vector 
type, and ignores whether or not x has attributes.  The behavior of 
is.vector(x) is quite different -- it tests whether x is a bare vector.  
is.vector can be TRUE for lists and expressions as well as atomic vector 
types, but if there are attributes other than names on x, is.vector(x) 
returns FALSE."

(2) In the Description section for ?is.vector, change
  "|is.vector| returns |TRUE| if |x| is a vector (of mode logical, 
integer, real, complex, character, raw or list if not specified) or 
expression"
to
  "|is.vector| returns |TRUE| if |x| is a bare vector (of mode logical, 
integer, real, complex, character, raw or list if not specified, with no 
attributes other than names), or expression with no attributes other 
than names"

-- Tony Plate
#
On 02/09/2009 6:50 PM, Tony Plate wrote:
Thanks for the suggestions.  I don't like your first one for a number of 
reasons:  too wordy, not quite right in its description of is.vector, 
editorializing about some other function's name, etc.  I'd rather leave 
the ?is.atomic page alone.

I agree that the ?is.vector page should be improved, but your suggestion 
is ambiguous:  does the "no attributes" part apply only to expressions?
I've just spent a few minutes, and come up with the following rewrite.
I'd shorten the Description entry to

   \code{is.vector} returns \code{TRUE} if \code{x} is a vector of the
   specified mode having no attributes other than names.  It returns
   \code{FALSE} otherwise.

and lengthen the Details to

   If \code{mode = "any"}, \code{is.vector} returns \code{TRUE} for
   modes logical, integer, real, complex, character, raw, list or
   expression. It returns \code{FALSE} if \code{x} has any attributes
   except names.  (This is incompatible with S.)  On the other hand,
   \code{as.vector} removes \emph{all} attributes including names for
   results of atomic mode.
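
The contrast between is.vector and as.vector in that wording can be illustrated with a short sketch. The variable names are hypothetical, and the list case is included only to show that the name-stripping applies to atomic results.

```r
x <- c(a = 1, b = 2)
is.vector(x)               # TRUE: names alone do not disqualify x
names(as.vector(x))        # NULL: as.vector drops names for an atomic result

l <- list(a = 1, b = 2)
names(as.vector(l))        # "a" "b": a list result keeps its names
```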

It's not perfect, because it is ambiguous what "This" refers to, but I 
don't actually know how much of the preceding description is 
incompatible with S, or even whether that claim makes sense:  Are all 
versions of S consistent in this?  I think it's probably inappropriate 
to be trying to document S here...  Nevertheless, I've committed this 
change.

It's not easy to write good documentation.

Duncan Murdoch
#
On Wed, Sep 2, 2009 at 5:30 PM, Duncan Murdoch<murdoch at stats.uwo.ca> wrote:
...
Of course not.  I forgot.  This is r-devel: the user is always wrong,
the developer is always right.

           -s
#
On 03/09/2009 5:36 PM, Stavros Macrakis wrote:
Yes, you want r-whinge.  That's down the hall.

Duncan Murdoch
#
hw> On Wed, Sep 2, 2009 at 1:54 PM, Stavros
hw> Macrakis<macrakis at alum.mit.edu> wrote:
>> On Wed, Sep 2, 2009 at 2:39 PM, Stavros
>> Macrakis<macrakis at alum.mit.edu> wrote:
>> 
    >>> Most types of language objects are regarded as
    >>> recursive: those which are not are the atomic
    >>> vector types, 'NULL' and symbols (as given by
    >>> 'as.name').
    >>> 
    >>> But is.recursive(as.name('foo')) ==
    >>> is.recursive(quote(foo)) == FALSE.
    >> 
    >> Sorry, this *is* consistent with the behavior. But if we
    >> read "the atomic vector types, 'NULL' and symbols" as a
    >> list of mutually exclusive categories, then
    >> is.atomic(NULL)==FALSE is inconsistent.

    hw> And the sentence could be more clearly written as:

    hw> Most types of language objects are regarded as
    hw> recursive, except for atomic vector types, 'NULL' and
    hw> symbols (as given by 'as.name').

Yes, that's shorter and more elegant.
But before amending that, why
"language objects" instead of just "R objects" or "objects"?
  
In the context of S and R when I'd hear  "language objects",
I'd think of the results of
    expression() , formula(), substitute(), quote()
i.e., objects for which  is.language() was true.

So, I'm proposing

  Most types of objects are regarded as recursive, except for
  atomic vector types, \code{NULL} and symbols (as given by
  \code{\link{as.name}}).

--
Martin Maechler