all.equal.list() sometimes fails with unnamed and named components (PR#674) - R-devel

Tue, Oct 3, 2000 9:26 AM #


Probably not. Lists do have orderings: they are not sets but generic
vectors.

That's not what current versions of S-PLUS give, as one might hope.

I think that both the names and components should match exactly (the
components recursively).  Unfortunately the named-component extraction
is partial matching (at least, sometimes) so the ordering of the names
always matters.  (There's an S/R difference here I keep forgetting to 
write down. I think it is 

x <- list(aa=1, bb=2)
x["a"]

which gives in S
$aa:
[1] 1
and in R
$"NA"
NULL
so S always partial matches, but R does not always.)

Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Kurt Hornik

Tue, Oct 3, 2000 11:25 PM #

Prof Brian Ripley writes:


From: Peter Dalgaard BSA <p.dalgaard@biostat.ku.dk>
Date: 03 Oct 2000 18:05:50 +0200

Kurt Hornik <Kurt.Hornik@ci.tuwien.ac.at> writes:

Maybe we should change this as follows: if either of the two lists has
names, work through the named components.  Warn about the ones not
present in both.  Compare the ones present in both.  Then get rid of all
named components and compare what is left in positional order.

As I said, I am not sure that this is really what we want.

Comments?
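
As a rough illustration of the proposal above (named components first,
then the unnamed remainder in positional order), here is a minimal
sketch.  The function name all_equal_names_first and the message wording
are made up for this example and are not part of R; the proposed warning
about names not present in both lists is reported as an ordinary message
line rather than via warning().

```r
# Hypothetical helper, not part of R: names-first list comparison.
all_equal_names_first <- function(target, current, ...) {
  has_name <- function(n) !is.na(n) & nzchar(n)
  nt <- names(target);  if (is.null(nt)) nt <- rep("", length(target))
  nc <- names(current); if (is.null(nc)) nc <- rep("", length(current))
  msg <- character(0)

  named_t <- nt[has_name(nt)]
  named_c <- nc[has_name(nc)]

  # Names that occur in only one of the two lists.
  only_one <- c(setdiff(named_t, named_c), setdiff(named_c, named_t))
  if (length(only_one))
    msg <- c(msg, paste0("Components not in both: ",
                         paste(only_one, collapse = ", ")))

  # Compare the components whose names occur in both lists.
  for (nm in intersect(named_t, named_c)) {
    mi <- all.equal(target[[nm]], current[[nm]], ...)
    if (!isTRUE(mi)) msg <- c(msg, paste0("Component ", nm, ": ", mi))
  }

  # Drop the named components and compare the rest positionally.
  rest_t <- unname(target[!has_name(nt)])
  rest_c <- unname(current[!has_name(nc)])
  mi <- all.equal(rest_t, rest_c, ...)
  if (!isTRUE(mi)) msg <- c(msg, paste0("Unnamed components: ", mi))

  if (length(msg)) msg else TRUE
}

all_equal_names_first(list(a = 1, b = 2, 3, 4), list(3, b = 2, 4, a = 1))  # TRUE
```

Under this reading the reordered example compares equal, while swapping
the two unnamed components (3 and 4) is still reported as a difference.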

I think you might be right, and also that this is an easy thing to
implement. Then we'd have

all.equal(list(a=1,b=2,3,4),list(3,b=2,4,a=1)) == TRUE

Right?

Probably not. Lists do have orderings: they are not sets but generic
vectors.

However, BigBrother has

all.equal(list(a=1,b=2,3,4),list(3,b=2,4,a=1))

[1] "Names: 2 string mismatches"
attr(, "continue"):
[1] T

all.equal(list(a=1,b=2,3,4), list(a=1,b=2,4,3))

[1] T

That's not what current versions of S-PLUS give, as one might hope.

..which does look like a "compatible bug"

Hmm. Maybe one wants positional matching in any case? But then, what
is the named-component extraction about??

I think that both the names and components should match exactly (the
components recursively).  Unfortunately the named-component extraction
is partial matching (at least, sometimes) so the ordering of the names
always matters.  (There's an S/R difference here I keep forgetting to 
write down. I think it is

x <- list(aa=1, bb=2)
x["a"]

which gives in S
$aa:
[1] 1
and in R
$"NA"
NULL
so S always partial matches, but R does not always.)

More precisely, we have

R> x[["a"]]
[1] 1
R> x["a"] 
$"NA"
NULL

Does this make sense at all?  Comparing it to

R> x <- list(aa=1, bb=2, "NA"=3)
R> x["NA"]
$"NA"
[1] 3

I would think that the x["a"] incorrectly indicates that the list has a
named component "NA" with value NULL ...
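
For readers trying this today, the sketch below runs the same
extractions under a recent R version (an assumption; the transcripts
above are from 2000).  In current R the `[[` operator has an `exact`
argument that defaults to TRUE, so x[["a"]] no longer partially matches
unless asked to, while `$` still does.

```r
x <- list(aa = 1, bb = 2)

r_single  <- x["a"]                    # `[` matches names exactly
r_double  <- x[["a"]]                  # exact matching by default
r_partial <- x[["a", exact = FALSE]]   # partial matching on request
r_dollar  <- x$a                       # `$` partially matches on lists

length(r_single)        # 1: a single NULL component
is.na(names(r_single))  # TRUE: its name is NA
is.null(r_double)       # TRUE
r_partial               # 1
r_dollar                # 1
```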

What should we do about all.equal.list()?  Should we deal with the named
components first and strip them off, or always go the positional route?
Your comment that lists are generic vectors would indicate that the
second approach is more appropriate ...

-k

Brian Ripley

Tue, Oct 3, 2000 11:27 PM #

On Wed, 4 Oct 2000, Kurt Hornik wrote:

Yes, so would I. I think it is a bug in R.

See the first two lines I've left here. I think it should be positional
matching *and* the names should match too. That is, all.equal.list would
fail if one list had names and the other did not.
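
For comparison, the examples from this thread can be run in a current R
session (assumed here; this is not the 2000 code).  The sketch only
checks whether each result is TRUE, because the exact wording of the
difference messages is not something to rely on.

```r
a <- list(a = 1, b = 2, 3, 4)

# Same components, different order and name positions.
res_reordered <- all.equal(a, list(3, b = 2, 4, a = 1))

# Same names, but the two unnamed components are swapped.
res_swapped <- all.equal(a, list(a = 1, b = 2, 4, 3))

# Equal values, names present on one side only.
res_names_one_side <- all.equal(list(1, 2), list(a = 1, b = 2))

isTRUE(res_reordered)       # FALSE
isTRUE(res_swapped)         # FALSE
isTRUE(res_names_one_side)  # FALSE
```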



Kurt Hornik

Tue, Oct 3, 2000 11:43 PM #

Agreed.  Should not be too hard to spot ...

Well should it really FAIL?

What would be the precedence of the comparisons?  E.g.,

* compare the lengths.  If different, only retain the components from 1
to the smaller length.

* Go through these.  If they have names, compare the names.  Then
compare the values.

[If both components have the same name but differ in value, the output
would display the name and not the position, right?  I mean something
like
  msg <- c(msg, paste("Component ", nc[i], ": ", mi, sep=""))
rather than
  msg <- c(msg, paste("Component ", i, ": ", mi, sep=""))
???]
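
The steps above could be sketched roughly as follows.  This is only an
illustration: the function name all_equal_positional and the message
texts are invented here, and it is not the code of all.equal.list().

```r
# Hypothetical helper, not part of R: positional comparison that also
# checks names and labels differences by name where the names agree.
all_equal_positional <- function(target, current, ...) {
  msg <- character(0)

  # Step 1: compare lengths, keep only the common leading part.
  n <- min(length(target), length(current))
  if (length(target) != length(current))
    msg <- c(msg, paste0("Length mismatch: comparing first ", n,
                         " components"))

  nt <- names(target)
  nc <- names(current)

  # Step 2: go through the components, names first, then values.
  for (i in seq_len(n)) {
    ti <- if (is.null(nt)) "" else nt[i]
    ci <- if (is.null(nc)) "" else nc[i]
    same_name <- identical(ti, ci)
    if (!same_name)
      msg <- c(msg, paste0("Component ", i, ": names differ"))

    mi <- all.equal(target[[i]], current[[i]], ...)
    if (!isTRUE(mi)) {
      # Use the name as label only if both sides agree on a real name.
      label <- if (same_name && !is.na(ti) && nzchar(ti)) ti else i
      msg <- c(msg, paste0("Component ", label, ": ", mi))
    }
  }

  if (length(msg)) msg else TRUE
}

all_equal_positional(list(a = 1, b = 2), list(a = 1, b = 3))
```

With matching names the difference in the second component is reported
under the label b rather than under position 2.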

-k

Kurt Hornik

Fri, Oct 6, 2000 5:07 AM #

The more I think about it, the less I am sure about what we really want.

In discussions with Fritz yesterday we noticed that quite often lists
are in fact used as hash tables; typically this is what is meant if all
components of the list have names.  Fritz suggested an option
	hash.order = FALSE
to all.equal.list() through which one could control the interpretation.
This option could also be passed on recursively.
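
A minimal sketch of what such an option might do, under stated
assumptions: the message does not spell out the semantics of hash.order,
so the example uses a differently named stand-in argument, ignore_order,
and assumes that "hash table" means a list whose components all have
distinct, non-empty names.  Both function names are hypothetical.

```r
# Hypothetical helper: sort fully named lists by name, recursively.
sort_by_name <- function(x) {
  if (!is.list(x)) return(x)
  x[] <- lapply(x, sort_by_name)
  nm <- names(x)
  fully_named <- !is.null(nm) && !anyNA(nm) && all(nzchar(nm)) &&
    !anyDuplicated(nm)
  if (fully_named) x[order(nm)] else x
}

# Hypothetical wrapper: ignore_order stands in for the suggested option.
all_equal_hash <- function(target, current, ignore_order = FALSE, ...) {
  if (ignore_order) {
    target  <- sort_by_name(target)
    current <- sort_by_name(current)
  }
  all.equal(target, current, ...)
}

x <- list(a = 1, b = list(u = 1, v = 2))
y <- list(b = list(v = 2, u = 1), a = 1)

isTRUE(all_equal_hash(x, y))                       # FALSE
isTRUE(all_equal_hash(x, y, ignore_order = TRUE))  # TRUE
```

Lists with any unnamed or duplicated names are left in their original
order, so they are still compared positionally.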

I am not sure about this.  There is a difference between `typically' and
always.  Maybe we should have hash tables which can be implemented as
lists with class "hashtable".  This would be clean, but a substantial
change (e.g., most modelling functions in fact return hash tables in the
above sense) and I am not sure whether this is worth the effort.

We can fix the problems in the bug report by deleting the code in
all.equal.list() which compares the components.  That has the effect
that named components are referred to as positional even if their names
agree, and we should decide whether we want this or not.

-k

Paul Gilbert

Fri, Oct 6, 2000 7:35 AM #

Apologies if I've missed something, as I have not been following this discussion
very closely. However, this effect does not seem very good to me. I often build
lists by appending named components, and delete named components by setting
them to NULL. I don't think it should be necessary to construct objects with the
named components in the same order for the results to be equal.
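
A small illustration of that usage pattern, assuming a recent R version:
two lists end up with the same named components in a different order,
and indexing one by the names of the other lines them up before
comparing.

```r
# Build x by appending named components.
x <- list()
x$a <- 1
x$b <- 2

# Build y in a different order, with a temporary component removed again.
y <- list()
y$b <- 2
y$tmp <- 0
y$a <- 1
y$tmp <- NULL    # setting a component to NULL deletes it

names(x)                          # "a" "b"
names(y)                          # "b" "a"
isTRUE(all.equal(x, y))           # FALSE: compared in positional order
setequal(names(x), names(y))      # TRUE: same set of names
isTRUE(all.equal(x, y[names(x)])) # TRUE once y is reordered to match x
```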

Of course, I start from the position that all lists should be named lists and
name truncation should be banned - and then realize this is unrealistic given
the current state of affairs. Perhaps we need a new object called "goodlist"
(rather than "hashtable").

One of the issues here is the difference between what interactive users want for
shortcuts and what package writers want for consistency. If list objects are
always accessed by functions (constructors, etc.), rather than having their
elements tweaked by users, then one tends to lean very heavily toward the
hashtable approach.

My 2 cents worth,
Paul Gilbert

