identical() versus sapply()

30 messages · Paulson, Ariel, Bert Gunter, Jeff Newmiller +9 more

#
Sorry if this has been answered elsewhere, but I can't find any discussion of it.

Wondering why the following situation occurs (duplicated on 3.2.2 CentOS6 and 3.0.1 Win2k, so I don't think it is a bug):
[1] TRUE
[1] FALSE FALSE
[1]  TRUE FALSE
[1] FALSE FALSE

I have been unable to find anything different about the versions of "1" that identical() is not finding identical.
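The calls that produced the four results above did not survive in the archive. The sketch below is an assumed reconstruction, based on what later messages say about as.numeric() and as(); the exact original expressions and the names r1 to r4 are illustrative, not the poster's.

```r
# Assumed reconstruction; the original calls are missing from the archive.
r1 <- identical(1, 1)                                       # TRUE
r2 <- sapply(1:2, identical, 1)                             # FALSE FALSE
r3 <- sapply(1:2, function(i) identical(as.numeric(i), 1))  # TRUE FALSE
# The thread reports FALSE FALSE for the fourth case on R 3.0.1 and 3.2.2;
# the result of as() here may differ in other R versions.
r4 <- sapply(1:2, function(i) identical(methods::as(i, "numeric"), 1))
print(list(r1, r2, r3, r4))
```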

Thanks,
Ariel
#
I highly recommend making friends with the str function. Try

str( 1 )
str( 1:2 )

for the clue you need, and then

sapply( 1:2, identical, 1L )
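
For readers following along, this is roughly what those three calls show. It is a minimal sketch; the variable name res is only there so the result can be inspected.

```r
str(1)     # num 1            (a double)
str(1:2)   # int [1:2] 1 2    (an integer vector)

# Comparing against the integer literal 1L matches the first element.
res <- sapply(1:2, identical, 1L)
res        # TRUE FALSE
```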
#
On 09/04/16 16:24, Jeff Newmiller wrote:
Interesting.  But to me counter-intuitive.  Since R makes no distinction 
between scalars and vectors of length 1 (or more accurately I think, 
since in R there is *no such thing as a scalar*, only a vector of length 
1) I don't see why "1" should be treated in a manner that is 
categorically different from the way in which "1:2" is treated.

Can you, or someone else with deep insight into R and its rationale, 
explain the basis for this difference in treatment?
cheers,

Rolf
#
On 09/04/2016 6:27 AM, Rolf Turner wrote:
It's not the fact that one is a vector, it's just that the : function 
returns an integer result when given whole number arguments.  The 
literal 1 is stored in floating point, the result of 1:2 is integer.

I think the rationale is that sequences of whole numbers are often used 
as integers (e.g. in "for (i in 1:2)"), whereas constants are often used 
in floating point expressions, so this reduces the number of conversions 
that are needed.  If you really mean your 1 to be stored as an integer, 
write it as 1L.  If you want it as floating point, write it as 1.

Duncan Murdoch
2 days later
#
Ok, I see the difference between 1 and 1:2, I'll just leave it as one of those "only in R" things.

But it seems then, that as.numeric() should guarantee a FALSE outcome, yet it does not. 

To build on what Rolf pointed out, I would really love for someone to explain this one:
num 1
int [1:2] 1 2
num [1:2] 1 2
int [1:2] 1 2

Which doubly makes no sense.  1) Either the class is "numeric" or it isn't; I did not call as.integer() here.  2) method of recasting should not affect final class.
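
The calls behind the four str() lines above were also lost from the archive. The sketch below is an assumed reconstruction, consistent with the rest of the thread; x_base and x_s4 are illustrative names.

```r
str(1)                    # num 1
str(1:2)                  # int [1:2] 1 2

x_base <- as.numeric(1:2)
str(x_base)               # num [1:2] 1 2

# The thread reports "int [1:2] 1 2" for this call on R 3.2.2;
# other R versions may convert to double instead.
x_s4 <- methods::as(1:2, "numeric")
str(x_s4)
```

Note that is.numeric() is TRUE for both integer and double vectors, so "numeric" in that sense does not pin down the storage type.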

Thanks,
Ariel


#
Indeed!

Slightly simplified to emphasize your point:
[1] "integer"
[1] "numeric"

whereas in ?as it says:

"Methods are pre-defined for coercing any object to one of the basic
datatypes. For example, as(x, "numeric") uses the existing as.numeric
function. "

I suspect this is related to my ignorance of S4 classes (i.e. as() )
and how they relate to S3 classes, but I certainly don't get it
either.

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Mon, Apr 11, 2016 at 9:30 AM, Paulson, Ariel <apa at stowers.org> wrote:
#
Hypothesis regarding the thought process: integer is a perfect subset of numeric, so why split hairs?
#
Hi Jeff,


We are splitting hairs because R is splitting hairs, and causing us problems.  Integer and numeric are different R classes with different properties, mathematical relationships notwithstanding.  For instance, the counterintuitive result:
[1] FALSE


Unfortunately the reply-to chain doesn't extend far enough -- here is the original problem:
[1] TRUE
[1] FALSE FALSE
[1]  TRUE FALSE
[1] FALSE FALSE

These are the results of R's hair-splitting!

Ariel
#
Use all.equal instead of identical if you want to gloss over
integer/numeric class differences and minor floating point differences (and
a host of others).
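
A minimal sketch of the difference, using base R only. all.equal() returns a character description rather than FALSE when the values differ, so it is wrapped in isTRUE() here; the variable names are illustrative.

```r
strict <- identical(1L, 1)                # FALSE: integer versus double
loose  <- isTRUE(all.equal(1L, 1))        # TRUE:  type difference ignored
near   <- isTRUE(all.equal(1, 1 + 1e-10)) # TRUE:  within default tolerance

# An sapply-friendly version of the original comparison.
res <- sapply(1:2, function(i) isTRUE(all.equal(i, 1)))
res                                       # TRUE FALSE
```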

Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Mon, Apr 11, 2016 at 5:25 PM, Paulson, Ariel <apa at stowers.org> wrote:
#
On 11/04/2016 8:25 PM, Paulson, Ariel wrote:
The issue here is that R has grown.  The as() function is newer than the 
as.numeric() function, it's part of the methods package.  It is a much 
more complicated thing, and there are cases where they differ.

In this case, the problem is that is(1L, "numeric") evaluates to TRUE, 
and nobody has written a coerce method that specifically converts 
"integer" to "numeric".  So the as() function defaults to doing nothing.
It takes a while to do nothing, approximately 360 times longer than 
as.numeric() takes to actually do the conversion:

 > microbenchmark(as.numeric(1L), as(1L, "numeric"))
Unit: nanoseconds
               expr   min    lq      mean  median       uq     max neval
     as.numeric(1L)   133   210    516.92   273.5    409.5    9444   100
  as(1L, "numeric") 51464 64501 119294.31 99768.5 138321.0 1313669   100
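
For readers without the microbenchmark package, a rough comparison can be sketched in base R with system.time(). This is only an approximation of the benchmark above: the repetition count n is an arbitrary assumption, and the timings depend on the machine and the R version.

```r
n <- 1e4  # arbitrary number of repetitions (assumption)

# Elapsed seconds for n calls of each conversion.
t_base <- system.time(for (k in seq_len(n)) as.numeric(1L))[["elapsed"]]
t_s4   <- system.time(for (k in seq_len(n)) methods::as(1L, "numeric"))[["elapsed"]]

timings <- c(as.numeric = t_base, as = t_s4)
print(timings)
```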

R performance is not always simple and easy to predict, but I think 
anyone who had experience with R would never use as(x, "numeric").  So 
this just isn't a problem worth fixing.

Now, you might object that the documentation claims they are equivalent, 
but it certainly doesn't.  The documentation aims to be accurate, not 
necessarily clear.

Duncan Murdoch
#
Perfect!


Thanks,

Ariel
#
Hi Duncan,

That explains it, thanks!

I rarely use as(), but had thought in this case, replacing identical(x, y) with identical(x, as(y,class(x))) could be an sapply-friendly way to iron out class differences -- then noticed the inexplicable result.  But now I know about all.equal().
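
If an exact comparison is still wanted, one option is to coerce both sides with as.numeric() rather than as(). The helper below is hypothetical (it is not from the thread) and assumes plain numeric vectors, since as.numeric() drops attributes such as names.

```r
# Hypothetical helper: compare two values exactly after converting to double.
identical_as_double <- function(x, y) {
  identical(as.numeric(x), as.numeric(y))
}

res <- sapply(1:2, identical_as_double, 1)
res   # TRUE FALSE
```

Unlike all.equal(), this applies no tolerance, so values that differ by floating point rounding are still reported as different.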

Thanks,
Ariel
#
"The documentation aims to be accurate, not necessarily clear."

!!!

I hope that is not the case! Accurate documentation that is confusing
is not very useful. I understand that it is challenging to write docs
that are both clear and accurate; but I hope that is always the goal.

Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Apr 11, 2016 at 6:09 PM, Duncan Murdoch
<murdoch.duncan at gmail.com> wrote:
#
On 11/04/2016 10:18 PM, Bert Gunter wrote:
I don't think it is ever intentionally confusing, but it is often 
concise to the point of obscurity.  Words are chosen carefully, and 
explanations are not repeated.  It takes an effort to read it.  It will 
be clear to careful readers, but not to all readers.

I was thinking of the statement quoted earlier, 'as(x, "numeric") uses 
the existing as.numeric function'.  That is different than saying 'as(x, 
"numeric") is the same as as.numeric(x)'.

Duncan Murdoch

#
On 12/04/16 13:09, Duncan Murdoch wrote:
<SNIP>
<SNIP>

Fortune nomination!

cheers,

Rolf
#
On 12/04/16 14:45, Duncan Murdoch wrote:
IMHO this is so *obviously* confusing and misleading --- even though it 
is technically correct --- that whoever wrote it was either 
intentionally trying to be confusing or is unbelievably obtuse and/or 
out of touch with reality.

It is not (again IMHO) clear even to *very* careful readers.

To my mind this documentation fails even the fortune(350) test.

cheers,

Rolf
#
I have lost the link, but someone here had a lovely essay on R documentation which pointed out that one had to have "faith" that everything was in the documentation.
#
Thank you Rolf.  fortune(350) was the link I was trying to remember.

I believe! I believe in the documentation. 

It can be incredibly difficult to document something and unless one has an editor to read and 'try' to interpret the results the original writer may not realise just how opaque the explanation is.

John Kane
Kingston ON Canada
#
On 11/04/2016 11:34 PM, Rolf Turner wrote:
I generally agree that that particular sentence falls pretty far out on 
the obscurity end of the spectrum, but it's much easier to criticize the 
documentation than it is to write it.  I notice that none of the critics 
in this thread have offered improvements on what is there.

I haven't looked up who wrote it (it wasn't me, though I'm sure I've 
written equally obscure sentences), but I do not believe it was 
intentionally confusing, nor is the author obtuse or out of touch with 
reality.  I think that insulting authors is not a way to encourage them 
to change.  That's reality.

Duncan Murdoch
#
This issue is as old as documented things. With software it is
particularly nasty, especially when we want the software to function
across many platforms.

Duncan has pointed out that critics need to step up to do something.
I would put documentation failures at the top of my list of
time-wasters, and have been bitten by some particularly weak offerings
(not in R) in the last 2 weeks. So ....

Proposal: That the R community consider establishing a "test and
document" group to parallel R-core to focus on the documentation.
An experiment to test the waters is suggested below.

The needs:
- tools that let the difficulties with documentation be visualized along
with proposed changes and the discussion accessed by the wider
community, while keeping a well-defined process for committing accepted
changes.
- a process for the above. Right now a lot happens by discussion in the
lists and someone in R-core committing the result. If it is
well-organized, it is not well-understood by the wider R user community.
- tools for managing and providing access to tests

At the risk of opening another can of worms, documentation is an area
where such an effort could benefit from paid help. It's an area where
there's low reward for high effort, particularly for volunteers.
Moreover, like many volunteers, I'm happy to do some work, but I need
ways to contribute in small bites (bytes?), and it is difficult to find
suitable tasks to take on.

Is it worth an experiment to customize something like Dokuwiki (which I
believe was the platform for the apparently defunct R wiki) to allow a
segment of R documentation to be reviewed, discussed and changes
proposed? It could show how we might get to a better process for
managing R documentation.

Cheers, JN
#
Short comment inline
On 12/04/2016 12:45, John Kane wrote:
I do not think anyone who has written documentation would disagree.
Would one way forward here be for the OP to suggest, with the benefit of all 
the comments, how things might be enhanced so that he would not have been 
baffled?
#
On 12/04/2016 9:21 AM, ProfJCNash wrote:
The idea of having non-core people write and test documentation appeals 
to me.   The mechanism (Dokuwiki or whatever) makes no difference to me; 
it should be up to the participants to decide on what works.

The difficulty will be "calibration":  those people need to make changes 
that core members agree are improvements, or they won't be incorporated.

I'd suggest that you start very slowly.  First choose *one* help page 
that you think needs improvement, and explain why to one of the authors 
of that page, and what sort of improvements you propose to make.  Then 
get  the author to agree with the proposal, do it, and get the same 
author to agree to the final version and commit it.

I'll volunteer to participate in the approval and committing stage, but 
at first only for pages that I authored.  If it turns out to be an 
efficient way to improve docs, then I'd consider other pages too.

Duncan Murdoch
#
FWIW:

1. I agree that this is an idea worth considering. Especially now that
R has become so widely used among practitioners who are neither
especially software literate nor interested in poring over R manuals
(as I did when I first learned R). They have explicit tasks to do and
just want to get to them as directly as possible.

2. A partial reply to the (fair) criticism of those who criticize docs
without offering improvements is that one may not know what
improvement to offer precisely because the docs do not make it clear.
This proposal or something similar addresses this issue. The experts
could adjudicate.

3. I agree: writing good docs is hard. Having a mechanism like this
would also help non-native English writers of software (or challenged
native writers like me!) .

4. I also think John is right, that if the right mechanism were found
so that small efforts could be accumulated, a lot of us would
participate. A wiki sounds about right, but I bow to those with
greater wisdom and experience here.

5. The danger here is that this would suck a lot of time from R core.
That's unacceptable. Presumably a wiki (self-correcting?) would help
avoid this.

Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Tue, Apr 12, 2016 at 6:21 AM, ProfJCNash <profjcnash at gmail.com> wrote:
#
Thanks Duncan, for the offer to experiment.

Can you suggest a couple of your pages that you think might need
improvement? We might as well start with something you'd like looked at.

Then I'll ask if there are interested people and see what can be done
about getting a framework set up to work on one of those documents.

JN
On 16-04-12 10:52 AM, Duncan Murdoch wrote:
#
I am very interested in such a distributed documentation editing
project, and have some thoughts on how to make it workable for both
volunteers and core members who would need to review.

I'm willing to lead or colead such a project, if someone stepping up
would be a useful first step, and I'm also willing to host a wiki,
although I think something like GitHub is probably the best place.
I've been contemplating for a while how I can get more involved in the
main R efforts, and have contributed to the documentation before, in
tiny ways. I think those of us who have participated in R-help for a
while have an idea of the main stumbling blocks in the documentation
(besides, of course, getting people to read it in the first place).

I don't think R-help is the right place to continue discussion; should
this be moved to R-devel, or somewhere else entirely?

Sarah
On Tue, Apr 12, 2016 at 11:06 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote: