
Filling Lists or Arrays of variable dimensions

10 messages · William Dunlap, Chris Campbell, David Winsemius +1 more

#
Following problem:

Say you have a bunch of parameters and want to produce results for all combinations of those:

height<-c("high","low")
width<-c("slim","wide")

What I used to do was something like this:

l<-list()
for(h in height){
	l[[h]]<-list()
	for(w in width){
		l[[h]][[w]] <- doSomething()
	}
}

Now those parameters aren't always the same. Their number can change and the number of entries can change, and I'd like to have one piece of code that can handle all configurations.

I thought I could use expand.grid() to get all combinations and then iterate over the rows, but the problem is that I cannot set the values in the list as above.

grid<-expand.grid(height,width)
l[[as.character(grid[1,])]] <-1
Error in `[[<-`(`*tmp*`, as.character(grid[1, ]), value = 1) : 
  no such index at level 1

This only works if the "path" already exists, and I'm not sure how to build it in this scenario.

I then went on and built an array instead of lists of lists, but that doesn't help either because I can't index the array with what I have in the grid's row, or at least I don't know how.

Any ideas?

I'd prefer to keep the named lists since all my other code is built around them.
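
On the array attempt described above: a minimal sketch, assuming a list-mode array with dimnames is acceptable, of how a grid row can address a cell. The names params, res and idx are illustrative only and do not come from the original post. R accepts a character matrix with one column per dimension as an index into an array that has dimnames.

```r
height <- c("high", "low")
width <- c("slim", "wide")
params <- list(height = height, width = width)   # hypothetical container

# A list-mode array: every cell can hold an arbitrary object.
res <- array(vector("list", prod(lengths(params))),
             dim = unname(lengths(params)),
             dimnames = params)

# Character columns are needed so that each row converts to names.
grid <- expand.grid(params, stringsAsFactors = FALSE)

for (i in seq_len(nrow(grid))) {
  idx <- as.matrix(grid[i, ])                   # 1 x 2 character matrix
  res[idx] <- list(paste(idx, collapse = "/"))  # stand-in for doSomething()
}

res[["high", "slim"]]
```

The same loop works unchanged for any number of parameter vectors, because the number of columns in idx always matches the number of array dimensions.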
#
Dear Jessica

aggregate() is a function that applies another function to every combination of several grouping variables.
   
tempData <- data.frame(height = rnorm(20, 100, 10),  
    width = rnorm(20, 50, 5),  
    par1 = rnorm(20))  
   
tempData$htfac <- cut(tempData$height, c(0, 100, 200))   
tempData$wdfac <- cut(tempData$width, c(0, 50, 100))   
   
doSomething <- function(x) { mean(x) }  
   
out <- aggregate(tempData["par1"], tempData[c("htfac", "wdfac")], doSomething)  

# out is a data frame, which is itself a named list of columns.
# Use as.list() to drop the data.frame class:
as.list(out)
$htfac   
[1] (0,100]   (100,200] (0,100]   (100,200]    
Levels: (0,100] (100,200]   
   
$wdfac   
[1] (0,50]   (0,50]   (50,100] (50,100]   
Levels: (0,50] (50,100]   
   
$par1    
[1] -1.0449563 -0.3782483 -0.9319105  0.8837459    
    


I believe you are seeing an error similar to this one:
Error in `[[<-`(`*tmp*`, i, value = value) :   
  recursive indexing failed at level 2  
   
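This is because double square brackets, given an index vector of length greater than one, index recursively, one level per element, so they address a single element whose parent levels must already exist; grid[1, ] has two elements.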
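
A small sketch of that behaviour, using a made-up list rather than the classifier results from the original post. The names l, empty and outcome are illustrative.

```r
l <- list(high = list(slim = "a", wide = "b"))

# A length-2 index walks down one level per element.
same <- identical(l[[c("high", "slim")]], l[["high"]][["slim"]])

# Assignment works when the parent level already exists.
l[[c("high", "wide")]] <- "B"

# On an empty list the parent level is missing, so the assignment fails.
empty <- list()
outcome <- tryCatch({
  empty[[c("high", "slim")]] <- 1
  "assigned"
}, error = function(e) "error")

same
outcome
```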

Happy Christmas

Chris


Chris Campbell
Tel. +44 (0) 1249 705 450 | Mobile. +44 (0) 7929 628 349
mailto:ccampbell at mango-solutions.com | http://www.mango-solutions.com
Mango Solutions
2 Methuen Park
Chippenham
Wiltshire 
SN14 OGB
UK

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jessica Streicher
Sent: 20 December 2012 12:46
To: R help
Subject: [R] Filling Lists or Arrays of variable dimensions

#
Aggregate is highly confusing (and I would have appreciated it if you had used my example instead; I can't get it to do anything sensible on my stuff).

And this doesn't seem to be what I asked for anyway. It may be a named list, but it is not named and structured the way I want at all.

happy Christmas too
On 20.12.2012, at 15:48, Chris Campbell wrote:

            
#
Arranging data as a list of lists of lists of lists [...] of scalar values generally
will lead to slow and hard-to-read R code, mainly because R is designed to
work on long vectors of simple data.  If you were to start over, consider constructing
a data.frame with one column for each attribute.  Then tools like aggregate and
the plyr functions would be useful.

However, your immediate problem may be solved by creating your 'grid' object
as a data.frame of character, not factor, columns because as.character works differently
on lists of scalar factors and lists of scalar characters.  Usually as.<mode>(x), when
x is a list of length-1 items, gives the same result as as.<mode>(unlist(x)), but not when
x is a list of length-1 factors:

  > height<-c("high", "low")
  > width<-c("slim", "wide")
  > gridF <- expand.grid(height, width, stringsAsFactors=FALSE)
  > gridT <- expand.grid(height, width, stringsAsFactors=TRUE)
  > as.character(gridF[1,])
  [1] "high" "slim"
  > as.character(gridT[1,])
  [1] "1" "1"
  > as.character(unlist(gridT[1,])) # another workaround
  [1] "high" "slim"

Your example was not self-contained so I changed the call to doSomething() to paste(h,w,sep="/"):

  height<-c("high", "low")
  width<-c("slim", "wide")

  l <- list()
  for(h in height){
          l[[h]] <- list()
          for(w in width){
                  l[[h]][[w]] <- paste(h, w, sep="/") # doSomething()
          }
  }

  grid <- expand.grid(height, width, stringsAsFactors=FALSE)
  as.character(grid[1,])
  # [1] "high" "slim", not the [1] "1" "1" you get with stringsAsFactors=TRUE
  l[[ as.character(grid[1, ]) ]]
  # [1] "high/slim"
  l[[ as.character(grid[1, ]) ]] <- 1
  l[[ as.character(grid[1, ]) ]]
  # [1] 1

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
I must have been unclear at some point, sorry.

William, that's interesting, but it doesn't really help with the main problem, which is how to do this without having initialized the list in the loop beforehand.

Or, alternatively, how to initialize it without the loop, because the loop can only be written for a specific set of parameter vectors. Those change, and I don't want to write another loop construct for every new version.

I want to say: hey, I have these vectors here with these values (my parameters), could you build me that nested list structure (tree, whatever) from them? The function should give me that structure for whatever I pass in, without me having to change the code.
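
One way such a function could look is sketched below: a recursive builder that takes a named list of parameter vectors and a function to call at each leaf. The names buildNested, params and f are hypothetical and not from the thread, and the leaf function here is only a stand-in for doSomething().

```r
# Build a nested, named list with one level per parameter vector.
# f is called at every leaf with the path of parameter values leading to it.
buildNested <- function(params, f, path = character(0)) {
  if (length(params) == 0) {
    return(f(path))
  }
  level <- params[[1]]
  out <- lapply(level, function(v) buildNested(params[-1], f, c(path, v)))
  names(out) <- level
  out
}

params <- list(height = c("high", "low"), width = c("slim", "wide"))
l <- buildNested(params, function(p) paste(p, collapse = "/"))
l[["high"]][["slim"]]

# Adding a third parameter needs no change to the builder.
params3 <- c(params, list(colour = c("red", "green", "blue")))
l3 <- buildNested(params3, function(p) paste(p, collapse = "/"))
l3[["low"]][["wide"]][["blue"]]
```

Because the leaf function receives the whole path, it can return any object, including a list of cross-validation results.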

-------------- Clarification -----------------

First: I am not computing statistics over the parameters. I'm computing results from other data, and the computation is affected by the parameters.

I am computing classifiers for different sets of parameters for those classifiers. So the result of doSomething() isn't a simple value. It's usually a list of 6 lists (from cross-validation), each of which holds the classifier object, some statistics of the classifier (e.g. what was misclassified), and the subsets of data used.
That doesn't really fit in a data.frame, hence the use of lists. I want the nested lists because they help me find things in the object browser faster, and because all my other code is already geared towards them. If I had the time I might still go for the flat structure that everyone keeps telling me to use (I got a few mails off the list), but I really haven't the time.

If there's no good way I'll just keep things as they are now.
On 20.12.2012, at 18:37, William Dunlap wrote:

            
#
Jessica

This is super ugly but assigns as you requested in the original post:



height <- c("high", "low")
width <- c("slim", "wide")

doSomething <- function() {
    cat("done...\n\n")
    return(list(a = list(1), b = list(2), c = list(3)))
}

tempList <- list()

for (h in height) {
    tempList[[h]] <- list()
    for (w in width) {
        tempList[[h]][[w]] <- NA
    }
}

gridDF <- expand.grid(height, width, stringsAsFactors = FALSE)

assignToGridValueInList <- function(x) {
    tempList[[as.character(x["Var1"])]][[as.character(x["Var2"])]] <<- doSomething()
}

lapply(split(gridDF, gridDF), assignToGridValueInList)



It is probably also possible to create the output object dynamically, but it is not a good idea. Adding a new element to an object can force R to copy that object, so for very complicated structures you may run into speed and memory issues. Assigning to an element that already exists is cleaner, and will normally be less wasteful than growing the object each time.
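
A sketch of the pre-allocate-then-assign idea that does not hard-code the number of parameters. It assumes the parameters are kept in a named list; params, skeleton and path are illustrative names, and paste() stands in for doSomething().

```r
params <- list(height = c("high", "low"), width = c("slim", "wide"))

# Build the empty skeleton from the innermost parameter outwards,
# with NA as a placeholder at every leaf.
skeleton <- Reduce(
  function(inner, level) setNames(rep(list(inner), length(level)), level),
  rev(params),
  NA
)

# Character columns are needed so that each row converts to names.
grid <- expand.grid(params, stringsAsFactors = FALSE)

# Every path now exists, so recursive [[<- can fill the leaves.
for (i in seq_len(nrow(grid))) {
  path <- unlist(grid[i, ], use.names = FALSE)
  skeleton[[path]] <- paste(path, collapse = "/")
}

skeleton[["low"]][["wide"]]
```

A plain for loop is used instead of lapply() with <<- so that the assignment happens in the current environment and nothing is printed as a side effect.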

Best wishes

Chris



-----Original Message-----
From: Jessica Streicher [mailto:j.streicher at micromata.de] 
Sent: 20 December 2012 18:01
To: William Dunlap
Cc: Chris Campbell; R help
Subject: Re: [R] Filling Lists or Arrays of variable dimensions

#
If you really want to use the nested lists and use things like
   lst[[ c("high", "north", "unmarried") ]] <- value
when lst[["high"]] or lst[["high"]][["north"]] does not exist,
then I think you will have to make a class that does this.

Another advantage of writing a class is that it could be rewritten
to implement a more efficient data structure and you would not
have to rewrite any other code.
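
A minimal sketch of what such a class might look like, using S3. The class name autolist and the helper assignPath are invented for illustration. The sketch assumes that intermediate elements along a path are either missing or already lists, and it does no error checking.

```r
# Set value at path inside a plain nested list, creating missing branches.
assignPath <- function(lst, path, value) {
  key <- path[1]
  if (length(path) == 1) {
    lst[[key]] <- value
  } else {
    child <- lst[[key]]          # NULL when the branch does not exist yet
    if (is.null(child)) child <- list()
    lst[[key]] <- assignPath(child, path[-1], value)
  }
  lst
}

# S3 method: [[<- on an "autolist" creates intermediate levels on demand.
`[[<-.autolist` <- function(x, i, value) {
  structure(assignPath(unclass(x), i, value), class = "autolist")
}

lst <- structure(list(), class = "autolist")
lst[[c("high", "north", "unmarried")]] <- 42
lst[[c("high", "south", "married")]] <- 7

# Reading uses the default recursive [[.
lst[[c("high", "north", "unmarried")]]
```

Only the top-level object carries the class; the inner levels are plain lists, so existing code that reads them with [[ or $ should keep working.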

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
Also note that a column of a data.frame can be a list of complicated things.
E.g.,
[[1]]

Call:
lm(formula = mpg ~ wt, data = mtcars, subset = gear == d$gear[i] & 
    am == d$am[i])

Coefficients:
(Intercept)           wt  
     42.563       -8.046  

The standard printout of the data.frame doesn't look nice, but you can still get
the information.
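
The code that produced the printout above is not shown in the thread. The following is only a guess at the general shape, based on the lm() call visible in the output: a data frame d with one row per parameter combination and a list column holding one fitted model per row. The column name fit is an assumption.

```r
# One row per (gear, am) combination that occurs in mtcars.
d <- unique(mtcars[c("gear", "am")])
rownames(d) <- NULL

# A list column: each entry is a complete lm object.
d$fit <- lapply(seq_len(nrow(d)), function(i) {
  lm(mpg ~ wt, data = mtcars, subset = gear == d$gear[i] & am == d$am[i])
})

# Look up the model for one parameter combination.
d$fit[[which(d$gear == 4 & d$am == 1)]]
```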

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
On Dec 20, 2012, at 10:01 AM, Jessica Streicher wrote:

            
Hasn't it become abundantly clear that this would have progressed further had you posted a complete example?
#
@David: In my mind it was quite complete enough.

@William: Thanks, I didn't know you could do that with data frames. If I ever have to do something similar again I might try this.
On 20.12.2012, at 22:39, David Winsemius wrote: