all.equal.list() sometimes fails with unnamed and named components (PR#674)

6 messages · Brian Ripley, Kurt Hornik, Paul Gilbert

#
Probably not. Lists do have orderings: they are not sets but generic
vectors.
That's not what current versions of S-PLUS give, as one might hope.
I think that both the names and components should match exactly (the
components recursively).  Unfortunately the named-component extraction
is partial matching (at least, sometimes) so the ordering of the names
always matters.  (There's an S/R difference here I keep forgetting to 
write down. I think it is 

x <- list(aa=1, bb=2)
x["a"]

which gives in S
$aa:
[1] 1
and in R
$"NA"
NULL
so S always partial matches, but R does not always.)
#
More precisely, we have

R> x[["a"]]
[1] 1
R> x["a"] 
$"NA"
NULL

Does this make sense at all?  Comparing it to

R> x <- list(aa=1, bb=2, "NA"=3)
R> x["NA"]
$"NA"
[1] 3

I would think that the x["a"] incorrectly indicates that the list has a
named component "NA" with value NULL ...
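
For readers trying this today, the three extraction forms can be compared side by side. This is only a sketch of what I understand current R to do, not what the R of this thread did: `[[` now matches names exactly unless `exact = FALSE` is given, so the `x[["a"]]` result quoted above is no longer the default. The variable names `sub` and `y` are mine.

```r
x <- list(aa = 1, bb = 2)

# `[` with a character index does not partial-match: the result is a
# list of length one whose single component is NULL and whose name is
# a missing value (not the string "NA").
sub <- x["a"]
length(sub)         # 1
is.null(sub[[1]])   # TRUE
is.na(names(sub))   # TRUE

# `[[` matches exactly by default in current R ...
x[["a"]]                  # NULL
# ... and only partial-matches on request, while `$` always may.
x[["a", exact = FALSE]]   # 1
x$a                       # 1

# A component genuinely named "NA" is a different thing:
y <- list(aa = 1, bb = 2, "NA" = 3)
names(y["NA"])      # "NA", an ordinary string
```

Testing `is.na(names(sub))` is one way to tell the "no such component" result apart from a real component called "NA".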

What should we do about all.equal.list()?  Should we deal with the named
components first and strip them off, or always go the positional route?
Your comment that lists are generic vectors would indicate that the
second approach is more appropriate ...

-k
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
On Wed, 4 Oct 2000, Kurt Hornik wrote:

            
Yes, so would I. I think it is a bug in R.
See the first two lines I've left here. I think it should be positional
matching *and* the names should match too. That is, all.equal.list would
fail if one list had names and the other did not.
#
Agreed.  Should not be too hard to spot ...
Well, should it really FAIL?

What would be the precedence of the comparisons?  E.g.,

* compare the lengths.  If different, only retain the components from 1
to the smaller length.

* Go through these.  If they have names, compare the names.  Then
compare the values.

[If both components have the same name but differ in value, the output
would display the name and not the position, right?  I mean something
like
  msg <- c(msg, paste("Component ", nc[i], ": ", mi, sep=""))
rather than
  msg <- c(msg, paste("Component ", i, ": ", mi, sep=""))
???]
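
The order of comparisons outlined above could be sketched roughly as follows. This is a hypothetical stand-alone function (the name `all_equal_list_sketch` and the message wording are mine), not the code of all.equal.list(): it compares lengths, walks the common positions, compares names and then values, and labels a value mismatch by name when both lists agree on a non-empty name, by position otherwise.

```r
all_equal_list_sketch <- function(target, current) {
  msg <- NULL
  nt <- length(target)
  ncur <- length(current)
  if (nt != ncur)
    msg <- c(msg, paste("Lengths (", nt, ", ", ncur, ") differ", sep = ""))
  n <- min(nt, ncur)               # only the common positions are compared
  namt <- names(target)
  namc <- names(current)
  for (i in seq_len(n)) {
    # treat a list without names as having empty names throughout
    ti <- if (is.null(namt)) "" else namt[i]
    ci <- if (is.null(namc)) "" else namc[i]
    same_name <- identical(ti, ci)
    if (!same_name)
      msg <- c(msg, paste("Component ", i, ": names differ", sep = ""))
    # refer to the component by name only if both sides agree on one
    label <- if (same_name && nzchar(ti)) ti else i
    mi <- all.equal(target[[i]], current[[i]])
    if (!isTRUE(mi))
      msg <- c(msg, paste("Component ", label, ": ", mi, sep = ""))
  }
  if (is.null(msg)) TRUE else msg
}

all_equal_list_sketch(list(a = 1, b = 2), list(a = 1, b = 3))
all_equal_list_sketch(list(1, 2), list(a = 1, b = 2))
```

Whether a pure name mismatch should count as a failure or only as a note is exactly the open question here; the sketch simply reports it.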

-k
2 days later
#
The more I think about it, the less I am sure about what we really want.

In discussions with Fritz yesterday we noticed that quite often lists
are in fact used as hash tables; typically this is what is meant if all
components of the list have names.  Fritz suggested an option
	hash.order = FALSE
to all.equal.list() through which one could control the interpretation.
This option could also be passed on recursively.

I am not sure about this.  There is a difference between `typically' and
always.  Maybe we should have hash tables which can be implemented as
lists with class "hashtable".  This would be clean, but a substantial
change (e.g., most modelling functions in fact return hash tables in the
above sense) and I am not sure whether this is worth the effort.
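
To make the hash-table reading concrete, here is a minimal sketch under my own assumptions. The wrapper name `all_equal_hash_sketch` and its argument `ignore.order` are hypothetical; `ignore.order` stands in for the proposed option, whose exact meaning is not spelled out in this thread. When requested, and when both lists are fully and uniquely named, the second list is reordered by exact name before the usual comparison.

```r
all_equal_hash_sketch <- function(target, current, ignore.order = FALSE) {
  # TRUE if every component has a non-empty, unique name
  fully_named <- function(x) {
    nm <- names(x)
    length(x) == 0 ||
      (!is.null(nm) && all(nzchar(nm)) && !anyDuplicated(nm))
  }
  if (ignore.order && fully_named(target) && fully_named(current)) {
    if (!setequal(names(target), names(current)))
      return("Names differ")
    # character indexing with `[` is exact, so this only reorders
    current <- current[names(target)]
  }
  all.equal(target, current)
}

all_equal_hash_sketch(list(a = 1, b = 2), list(b = 2, a = 1),
                      ignore.order = TRUE)
```

Lists that are not fully named fall through to the ordinary comparison, which keeps the "generic vector" interpretation as the default.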

We can fix the problems in the bug report by deleting the code in
all.equal.list() which compares the components.  That has the effect
that named components are referred to as positional even if their names
agree, and we should decide whether we want this or not.

-k
#
Apologies if I've missed something, as I have not been following this discussion
very closely. However, this effect does not seem very good to me. I often build
lists by appending named components, and delete named components by setting
them to NULL. I don't think it should be necessary to construct objects with the
named components in the same order for the results to be equal.
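
A small illustration of the situation described above, with made-up component names: two lists end up with the same named contents but in a different order, simply because of how they were built.

```r
a <- list()
a$x <- 1
a$y <- 2

b <- list()
b$y <- 2
b$tmp <- 0      # a temporary component ...
b$x <- 1
b$tmp <- NULL   # ... deleted again by setting it to NULL

names(a)        # "x" "y"
names(b)        # "y" "x"
identical(a, b) # FALSE: same contents, different order

# Sorting both lists by name before comparing removes the dependence
# on construction order (this assumes all names are present and unique).
identical(a[order(names(a))], b[order(names(b))])   # TRUE
```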

Of course, I start from the position that all lists should be named lists and
name truncation should be banned - and then realize this is unrealistic given
the current state of affairs. Perhaps we need a new object called "goodlist"
(rather than "hashtable").

One of the issues here is the difference between what interactive users want for
shortcuts and what package writers want for consistency. If list objects are
always accessed by functions (constructors, etc.), rather than having their
elements tweaked by users, then one tends to lean very heavily toward the
hashtable approach.

My 2 cents worth,
Paul Gilbert

