unlist preserve common class?

6 messages · Gabriel Becker, Duncan Murdoch, Spencer Graves

#
Consider:


 > str(unlist(list(Sys.Date())))
  num 19334


 > str(unlist(list(factor('a'))))
  Factor w/ 1 level "a": 1


	  I naively expected "unlist(list(Sys.Date()))" to return an 
object of class 'Date'.  After some thought, I decided to ask this 
list whether the core R language might benefit from a change so that 
"unlist(list(Sys.Date()))" is of class 'Date', at least as an option.


	  Comments?
	  Thanks,
	  Spencer Graves


 > sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.7.1

Matrix products: default
LAPACK: 
/Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_4.2.2 tools_4.2.2
#
Hi Spencer,

My 2c.

According to the docs, factors are special-cased. Other S3 'classes' could
be special-cased, such as Date in your example, I suppose, but it is not
clear how what you're describing could be implemented for the general case.

Suppose I define an S3 "class" called my_awesome_class, and have a list of
3 of them in it, and no other guarantees are provided. What should, or even
could, R do in the case of unlist(list_of_awesomes)?

There is no guarantee that I as an S3 developer have provided a c method
for my class such that we could say the unlist call above is equivalent
(roughly) to do.call(c, list_of_awesomes), nor that I provided any other
particular "mash this set of my_awesome_class objects into one". Nor is it
even guaranteed that the concept of combining my_awesome_class objects is
even coherent, or would produce a new my_awesome_class object when
performed if it is.

That said, your example was of length one, we could special case (the
default method of) unlist so that for x *not a list*, we're guaranteed that

identical(unlist(list(x)), x) == TRUE

This would simplify certain code, such as the one from your motivating
example, but at the cost of making the output of unlist across inputs less
consistent and less easy to reason about and predict. In other words the
answer to the question "what class is unlist(list_of_awesomes)? " would
become "it depends on how many of them are in the list"... That wouldn't be
a good thing on balance, imho.
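
A minimal sketch of the point about c methods, assuming a toy constructor `awesome()` and a toy `c.my_awesome_class` method (both hypothetical, written only for illustration). It suggests that do.call(c, ...) can only preserve the class when the class author has supplied a suitable c method, while unlist drops the class either way:

```r
# Hypothetical constructor for the S3 "class" discussed above.
awesome <- function(v) structure(v, class = "my_awesome_class")
list_of_awesomes <- list(awesome(1), awesome(2), awesome(3))

# Without a c method, c() falls back to its default behaviour,
# which drops the class attribute.
no_method <- do.call(c, list_of_awesomes)
class(no_method)    # "numeric"

# A hypothetical c method that the class author might (or might not) provide.
c.my_awesome_class <- function(...) {
  awesome(unlist(lapply(list(...), unclass)))
}
with_method <- do.call(c, list_of_awesomes)
class(with_method)  # "my_awesome_class"

# unlist() does not consult the c method, so the class is lost regardless.
class(unlist(list_of_awesomes))  # "numeric"
```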

Best,
~G

On Thu, Dec 8, 2022 at 5:44 PM Spencer Graves <spencer.graves at effectivedefense.org> wrote:
#
Hi, Gabriel:
On 12/8/22 8:20 PM, Gabriel Becker wrote:

What about adding another argument, e.g.,


unlist(x, recursive = TRUE, use.names = TRUE, attributeFunction=NULL)


	  Then assign the results of the current "unlist(x, ...)" 
to, say, "ux", and follow that by



if (!is.null(attributeFunction)) attributes(ux) <- attributeFunction(x)


return(ux)


	  An alternative could be to have a default attributeFunction, that 
computes the attributes of each component of x and keeps only the ones 
that are shared by all components of x.  This would be the same as the 
current behavior for factors IF each component had the same factor 
levels and would drop attributes that are different between components. 
For S4 classes, if the attributes were not ALL identical, then all the 
attributes would be dropped, as with the current behavior.  This should 
not be a problem for S3 generics, because they should always check to 
make sure all the required attributes are available.
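
The proposal above could be prototyped outside base R roughly as follows. This is only a sketch under stated assumptions: `unlist_attr()` and `shared_attributes()` are hypothetical names, unlist() has no `attributeFunction` argument today, and the "shared attributes" default deliberately ignores names, dim and dimnames because those depend on the length of each component.

```r
# Hypothetical wrapper around unlist(); not part of base R.
unlist_attr <- function(x, recursive = TRUE, use.names = TRUE,
                        attributeFunction = NULL) {
  ux <- unlist(x, recursive = recursive, use.names = use.names)
  # Note: replacing all attributes also discards any names unlist() created.
  if (!is.null(attributeFunction)) attributes(ux) <- attributeFunction(x)
  ux
}

# Hypothetical default: keep only attributes that are identical across
# every component of x (names, dim and dimnames are skipped).
shared_attributes <- function(x) {
  attrs <- lapply(x, attributes)
  first <- attrs[[1]]
  nms <- setdiff(names(first), c("names", "dim", "dimnames"))
  keep <- vapply(nms, function(nm) {
    all(vapply(attrs, function(a) identical(a[[nm]], first[[nm]]), logical(1)))
  }, logical(1))
  first[nms[keep]]
}

d <- as.Date("2022-12-08")  # fixed date so the example is reproducible
res <- unlist_attr(list(d, d + 1), attributeFunction = shared_attributes)
class(res)   # "Date"
format(res)  # "2022-12-08" "2022-12-09"
```

The sketch only inspects the top-level components of x, so it is not meaningful for nested lists.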

	  My example was of length one to provide a minimal, self-contained 
example.  It was motivated by a more complicated case, where it took 
me a couple of hours to understand why my code wasn't working as I expected ;-)


	  Thanks for your reply.


	  Spencer Graves


#
Hi Spencer,

Another, potentially somewhat less disruptive/more general option would be
to add a stop.at.object or stop.at.nonlist (or alternatively list.only)
argument, which would basically translate to "collapse the list structure
to flat, but don't try to combine the leaf elements within the list". You
could then do whatever you wanted to said now-flat list as a second call.

i.e.,

flatlist <- unlist(structured_list, list.only = TRUE)
final_res <- cool_combiner_fun(flatlist)

I had to do something similar years ago when I was implementing xpath for
arbitrary R objects, because you can, e.g., always get x[1] out of x
infinitely many times, so I defined "Stopping functions". The fully general
case would be to do the same here, and accept, e.g., stopping.cond, but
that is probably too complex for unlist and might simply belong as a
completely separate function.
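
As a rough illustration of the "separate function" route, the two-step idea could look like the sketch below. `flatten_list()` is a hypothetical helper, not an existing base R function, and its stopping rule (stop at anything that is not a plain, classless list) is just one possible choice:

```r
# Hypothetical helper: collapse nested list structure into a flat list
# without combining the leaf elements.
flatten_list <- function(x) {
  # Stopping condition: atomic vectors and classed objects (Date,
  # data.frame, ...) are treated as leaves and returned untouched.
  if (!is.list(x) || is.object(x)) return(list(x))
  if (length(x) == 0L) return(list())
  do.call(c, lapply(x, flatten_list))
}

d <- as.Date("2022-12-08")
structured_list <- list(a = list(d, list(d + 1)), b = d + 2)

flatlist <- flatten_list(structured_list)
length(flatlist)                  # 3
sapply(flatlist, class)           # every leaf is still a "Date"

# Second step: combine the leaves however the class allows.
final_res <- do.call(c, flatlist)
class(final_res)                  # "Date"
```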

Best,
~G

On Thu, Dec 8, 2022 at 8:21 PM Spencer Graves <spencer.graves at effectivedefense.org> wrote:
#
On 08/12/2022 9:20 p.m., Gabriel Becker wrote:

For the non-recursive case of unlist, do.call(c, list_of_awesomes) is a 
pretty reasonable expectation.  Wouldn't the simplest change be to make 
no change to unlist, but suggest this alternative in the documentation?
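
For a flat list of Date objects, the difference between the two calls can be seen in a small example (a fixed date is used here instead of Sys.Date() so that the result is reproducible):

```r
d <- as.Date("2022-12-08")
list_of_dates <- list(d, d + 1)

flat <- unlist(list_of_dates)          # class attribute is dropped
combined <- do.call(c, list_of_dates)  # dispatches to the Date method of c()

class(flat)                    # "numeric"
class(combined)                # "Date"
identical(unlist(list(d)), d)  # FALSE
```

This only covers one level of list structure; nested lists would need to be flattened first.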

Duncan Murdoch
#
On 12/9/22 4:33 AM, Duncan Murdoch wrote:

Hi, Duncan and Gabriel:


	  That's ultimately what I did.  My real problem was more like:


(todaytomorrow <- list(d0=Sys.Date()+0:1, d1=Sys.Date()+2:3))

$d0
[1] "2022-12-09" "2022-12-10"

$d1
[1] "2022-12-11" "2022-12-12"


	  I wanted the maximum of the minima.  So I naively did:


(tt2 <- sapply(todaytomorrow, min))


    d0    d1
19335 19337


	  So I next tried:


 > (tt3 <- as.Date(tt2))

Error in as.Date.numeric(tt2) : 'origin' must be supplied


	  I believe that the default "origin" for "as.Date.numeric" should be 
"1970-01-01".  I implemented that several years ago in Ecfun:


 > (tt4 <- Ecfun::as.Date1970(tt2))

           d0           d1
"2022-12-09" "2022-12-11"


	  However, before getting here, I first misdiagnosed the problem with 
"tt2", believing that "min", not "sapply", was stripping the attributes.


	  After fixing that problem, I came to Duncan's solution:


 > (tt4 <- lapply(todaytomorrow, min))

$d0
[1] "2022-12-09"

$d1
[1] "2022-12-11"


(maximin <- do.call('max', tt4))
	
[1] "2022-12-11"


	  Conclusion:  It would help to document Duncan's solution using 
"do.call" and avoiding "unlist" and "sapply".  I brought it to the 
attention of this group, because I wondered if you might want to change 
the language -- or at least the documentation, as Duncan suggested.
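
The steps above can be condensed into a reproducible sketch. The fixed start date and the variable names `minima` and `tt2_dates` are introduced here only for illustration; the last line shows that supplying `origin` explicitly also recovers the dates from the sapply result:

```r
d <- as.Date("2022-12-09")  # fixed date instead of Sys.Date()
todaytomorrow <- list(d0 = d + 0:1, d1 = d + 2:3)

# lapply() keeps each minimum as a Date; sapply() simplifies the result
# into a plain numeric vector and the class is lost.
minima <- lapply(todaytomorrow, min)
tt2 <- sapply(todaytomorrow, min)
class(minima$d0)  # "Date"
class(tt2)        # "numeric"

# Maximum of the minima, still a Date.
maximin <- do.call(max, minima)
format(maximin)   # "2022-12-11"

# Recovering dates from the numeric result by giving the origin explicitly.
tt2_dates <- as.Date(tt2, origin = "1970-01-01")
```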


	  Thanks,
	  Spencer Graves