Extracting elements out of list in list in list

9 messages · Bert Gunter, Brian Diggs, Rainer M Krug +2 more

#
Hi

Consider the following variable:

--8<---------------cut here---------------start------------->8---
x1 <- list(
  A = 11,
  B = 21,
  C = 31
)

x2 <- list(
  A = 12,
  B = 22,
  C = 32
)

x3 <- list(
  A = 13,
  B = 23,
  C = 33
)

x4 <- list(
  A = 14,
  B = 24,
  C = 34
)

y1 <- list(
  x1 = x1,
  x2 = x2
)

y2 <- list(
  x3 = x3,
  x4 = x4
)

x <- list(
  f1 = y1,
  f2 = y2
)
--8<---------------cut here---------------end--------------->8---


To extract all fields named "A" from y1, I can do

,----
| > sapply(y1, "[[", "A")
| x1 x2 
| 11 12
`----

But how can I do the same for x?

I could put an sapply into an sapply, but this would be less than
elegant.

Is there an easier way of doing this?

Thanks,

Rainer
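
For reference, the nested call mentioned in the question might look like the minimal sketch below. It rebuilds the example list in compact form; the variable name `res` is only illustrative and not part of the original post.

```r
# Compact rebuild of the example list from the post
x <- list(
  f1 = list(x1 = list(A = 11, B = 21, C = 31),
            x2 = list(A = 12, B = 22, C = 32)),
  f2 = list(x3 = list(A = 13, B = 23, C = 33),
            x4 = list(A = 14, B = 24, C = 34))
)

# Inner sapply pulls "A" from each x-list; the outer lapply walks f1 and f2.
# lapply + unlist is used for the outer step because a second sapply would
# simplify to a matrix whose row names come from the first branch only.
res <- unlist(lapply(x, function(y) sapply(y, "[[", "A")))
res
# f1.x1 f1.x2 f2.x3 f2.x4
#    11    12    13    14
```

This only handles exactly two levels of nesting above the "A" fields, which is why a recursive or unlist-based solution is asked for.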
#
This approach may not be as fancy as what you are looking for.

 > xl <- unlist(x)
 > xl[grep("A", names(xl))]
f1.x1.A f1.x2.A f2.x3.A f2.x4.A
      11      12      13      14
 >

I hope this helps.

Chel Hee Lee
On 01/16/2015 04:40 AM, Rainer M Krug wrote:
#
Chel Hee's approach is both simpler and almost surely more efficient,
but I wanted to show another that walks the tree (i.e. the list)
directly using recursion at the R level to pull out the desired
components. This is in keeping with R's "functional" programming
paradigm and avoids the use of regular expressions to extract the
desired components from the unlist() version.

extr <- function(x,nm){
  if(is.recursive(x)){
    wh <- names(x) %in% nm
    c(x[wh],lapply(x[!wh],extr,nm=nm) )
  } else NULL
}

## The return value contains a bunch of NULLs; so use unlist() to remove them
> unlist(extr(x, "A"))
f1.x1.A f1.x2.A f2.x3.A f2.x4.A
     11      12      13      14


I would welcome any possibly "slicker" versions of the above.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll
On Fri, Jan 16, 2015 at 7:23 AM, Chel Hee Lee <chl948 at mail.usask.ca> wrote:
#
On 1/16/15 9:34 AM, Bert Gunter wrote:
I don't know if you would consider this "slicker" or not, but it does 
not give a lot of NULLs that have to be filtered out. It does so by 
checking if the component of the list is itself a list before 
recursively calling extr on it, and by using unlist internally.

extr <- function(x, nm) {
     if(is.list(x)) {
         sublists <- sapply(x, is.list)
         c(unlist(x[nm]), unlist(sapply(x[sublists], extr, nm)))
     } else {
         message("Argument is not a list")
         NULL
     }
}

Running it on x gives

 > extr(x, "A")
[1] 11 12 13 14
2 days later
#
Chel Hee Lee <chl948 at mail.usask.ca> writes:
As long as it works and it is efficient, it is OK.
The unlist might be a problem as I am working with quite large lists.
The grep has one problem, as it would also return fields which contain
an "A", e.g. "Alpha". I am sure this could be fixed with a regular
expression.
Thanks,

Rainer.
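
The regular-expression fix hinted at above could be sketched as follows. The small list with an "Alpha" field is made up for illustration, and the pattern assumes that element names contain no dots of their own and no regex metacharacters.

```r
# Small example with a field name that merely starts with "A"
x <- list(f1 = list(x1 = list(A = 11, Alpha = 21),
                    x2 = list(A = 12, Alpha = 22)))
xl <- unlist(x)

# Unanchored pattern: also picks up "Alpha"
names(xl)[grep("A", names(xl))]
# [1] "f1.x1.A"     "f1.x1.Alpha" "f1.x2.A"     "f1.x2.Alpha"

# Anchored pattern: the last path component must be exactly "A"
sel <- xl[grepl("(^|\\.)A$", names(xl))]
sel
# f1.x1.A f1.x2.A
#      11      12
```

The anchor `(^|\\.)` requires "A" to start right after a dot (or at the start of the name), and `$` requires it to end the name, so only the final component is compared.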

  
    
#
Bert Gunter <gunter.berton at gene.com> writes:
I am not sure about the efficiency: if the lists are large, they need to
be copied and "un-listed", which both require memory allocations and
processing time. So I would actually guess that yours (and Brian's) is
more efficient, as it only works on the names of the original list.
NULLs can be dealt with. What I like in this is that the name of the
value reflects the path to the value in the list.
Thanks - this looks neat,

Rainer

  
    
#
Brian Diggs <brian.s.diggs at gmail.com> writes:
It indeed eliminates the NULLs, but calling unlist in the
recursion could be inefficient again.
But benchmarks would be needed to see if this is the case.

But I am really surprised that there is nothing available to do this easily.

I might look into this when I have some time.

Thanks everybody,

Rainer
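
A rough benchmark along the lines asked for above might be set up as in the sketch below. The two recursive functions are copied from the earlier posts and renamed (`extr_bert` and `extr_brian` are hypothetical names used only so both can coexist); the test list `big` is synthetic, and its size is an arbitrary assumption. No timings are quoted here because they depend on the machine and R version.

```r
# Copies of the two recursive functions from the thread, renamed here
# (extr_bert, extr_brian are hypothetical names) so they can coexist.
extr_bert <- function(x, nm) {
  if (is.recursive(x)) {
    wh <- names(x) %in% nm
    c(x[wh], lapply(x[!wh], extr_bert, nm = nm))
  } else NULL
}

extr_brian <- function(x, nm) {
  if (is.list(x)) {
    sublists <- sapply(x, is.list)
    c(unlist(x[nm]), unlist(sapply(x[sublists], extr_brian, nm)))
  } else {
    message("Argument is not a list")
    NULL
  }
}

# Synthetic test data (an assumption): 200 groups of 50 leaf lists each,
# all levels named, mirroring the shape of the example in the thread.
n_groups <- 200
n_per <- 50
big <- lapply(seq_len(n_groups), function(g) {
  leaves <- lapply(seq_len(n_per), function(k) {
    v <- (g - 1) * n_per + k
    list(A = v, B = v + 0.1, C = v + 0.2)
  })
  setNames(leaves, paste0("x", seq_len(n_per)))
})
names(big) <- paste0("f", seq_len(n_groups))

# Time each approach; absolute numbers depend on machine and R version.
t_unlist <- system.time({
  xl <- unlist(big)
  r_unlist <- xl[grepl("(^|\\.)A$", names(xl))]
})
t_bert  <- system.time(r_bert  <- unlist(extr_bert(big, "A")))
t_brian <- system.time(r_brian <- extr_brian(big, "A"))

rbind(unlist = t_unlist, bert = t_bert, brian = t_brian)[, 1:3]

# Sanity check: all three return the same values in the same order.
stopifnot(all.equal(unname(r_unlist), unname(r_bert)),
          all.equal(unname(r_bert), unname(r_brian)))
```

A single `system.time` call is noisy for short runs; repeating each call several times, or increasing `n_groups`, would give a steadier comparison.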

  
    
#
Hi,

Here is a solution which is restricted to lists with identically shaped 
"branches" (like your example). The idea is to transform the list to an 
array and make use of the fact that unlist(x, use.names=FALSE) is much 
much faster for large lists than unlist(x).

# function which transforms a list whose skeleton is appropriate, i.e.
# at all levels of the list, the elements have the same skeleton
# NOTE that no check is implemented for this (should be added)
# NOTE that it also works if the final node is not a scalar but a
# matrix or array
list2array <- function(x) {
     recfn <- function(xx, dims, nms) {
         if (is.recursive(xx)) {
             dims <- c(dims, length(xx))
             nms <- c(nms, list(names(xx)))
             recfn(xx[[1]], dims, nms)
         } else {
             dims <- c(dim(xx), rev(dims))
             nms <- c(dimnames(xx), rev(nms))
             return(list(dims, nms))
         }
     }
     temp <- recfn(x, integer(), list())
     # return
     array(unlist(x, use.names=FALSE),
           temp[[1]],
           temp[[2]])
}

# create a list which is a collection of
# moderately large matrices
dimdat <- c(1e3, 5e2)
datgen <- function() array(rnorm(prod(dimdat)),
                            dimdat,
                            lapply(dimdat, function(i) letters[1:i]))
exlist <- list(
     f1=list(x1=list(A=datgen(), B=datgen()),
             x2=list(A=datgen(), B=datgen())),
     f2=list(x1=list(A=datgen(), B=datgen()),
             x2=list(A=datgen(), B=datgen()))
     )

# transform the list to an array
system.time(exarray <- list2array(exlist))

# check if an arbitrary subview is identical
# to the original list element
identical(exarray[,,"B", "x2", "f1"], exlist$f1$x2$B)

# compare the time for unlist(x)
system.time(unlist(exlist))


HTH,
   Denes
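
The comments in list2array note that a check for identically shaped branches should be added. One way such a check could look is sketched below. `skeleton` and `has_uniform_skeleton` are hypothetical helper names, not part of the posted code. The sketch compares shape only (list lengths and leaf dimensions), ignores names, and tests with is.list rather than is.recursive.

```r
# Shape of an object: nested lists of child shapes, or dim/length at a leaf.
skeleton <- function(x) {
  if (is.list(x)) lapply(unname(x), skeleton)
  else list(dim = dim(x), len = length(x))
}

# TRUE if, at every level, all sibling elements have the same shape.
has_uniform_skeleton <- function(x) {
  if (!is.list(x) || length(x) == 0L) return(TRUE)
  sk <- lapply(unname(x), skeleton)
  all(vapply(sk[-1], identical, logical(1), sk[[1]])) &&
    all(vapply(x, has_uniform_skeleton, logical(1)))
}

ok <- list(f1 = list(x1 = list(A = 11, B = 21), x2 = list(A = 12, B = 22)),
           f2 = list(x3 = list(A = 13, B = 23), x4 = list(A = 14, B = 24)))
ragged <- list(f1 = list(x1 = list(A = 11, B = 21)),
               f2 = list(x3 = list(A = 13)))

has_uniform_skeleton(ok)      # TRUE
has_uniform_skeleton(ragged)  # FALSE
```

A guard such as `stopifnot(has_uniform_skeleton(x))` at the top of list2array would be one possible place to use it. The check recomputes sub-skeletons at each level, which is acceptable for a sketch but not tuned for very deep lists.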
#
Aha! I hadn't thought about it.  I really like the approach presented 
by Bert Gunter in the previous post.  It is a good lesson.

I made my previous code a little bit better by building a function that 
pulls out only the desired component.  This time, the names of the 
sublists are changed as below (i.e. the names 'A', 'B', 'C' in the 
example given by Rainer M Krug are changed to 'A', 'Aha', 'alpha' here):

 > x1 <- list(A = 11, Aha = 21, alpha = 31)
 > x2 <- list(A = 12, Aha = 22, alpha = 32)
 > x3 <- list(A = 13, Aha = 23, alpha = 33)
 > x4 <- list(A = 14, Aha = 24, alpha = 34)
 > y1 <- list(x1 = x1, x2 = x2)
 > y2 <- list(x3 = x3, x4 = x4)
 > x <- list(f1 = y1, f2 = y2)
 >
 > extr.1 <- function(x, name){
+   xl <- unlist(x)
+   depth <- sum(unlist(strsplit(names(xl)[1], split="")) == ".") + 1
+   xl[grep(paste("^",name,"$", sep=""),
+           unlist(strsplit(names(xl), ".", fixed=TRUE)))/depth]
+ }
 > extr.1(x=x, name="alpha")
f1.x1.alpha f1.x2.alpha f2.x3.alpha f2.x4.alpha
          31          32          33          34
 > extr.1(x=x, name="A")
f1.x1.A f1.x2.A f2.x3.A f2.x4.A
      11      12      13      14
 > extr.1(x=x, name="a")
named numeric(0)
 > extr.1(x=x, name="Aha")
f1.x1.Aha f1.x2.Aha f2.x3.Aha f2.x4.Aha
        21        22        23        24
 >

Hm.... this function 'extr.1()' seems to be (much) slower than the 
function 'extr()'.

Chel Hee Lee
On 01/16/2015 11:34 AM, Bert Gunter wrote: