subsetting a list of dataframes

8 messages · Lara Poplarski, Jorge Ivan Velez, William Dunlap +3 more

#
  shouldKeep <- sapply(listOfDataFrames, function(df)nrow(df)>1)
  listOfDataFrames[shouldKeep]
or, compressed to get rid of the intermediate variable:
  listOfDataFrames[sapply(listOfDataFrames, function(df)nrow(df)>1)]

If you are writing production code and there is any chance that
listOfDataFrames might be an empty list, use vapply instead (which
requires that you supply a prototype for FUN's return value). For an
empty list sapply returns list(), which is not a valid subscript,
while vapply still returns a logical vector:
  listOfDataFrames[vapply(listOfDataFrames, function(df)nrow(df)>1,
FUN.VALUE=FALSE)]
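
The empty-list case can be checked directly. The following is only a small sketch; the names emptyList, viaSapply, viaVapply, kept and sapplyFails are made up for illustration and do not come from the thread.

```r
# Hypothetical empty input, for illustration only
emptyList <- list()

viaSapply <- sapply(emptyList, function(df) nrow(df) > 1)
viaVapply <- vapply(emptyList, function(df) nrow(df) > 1, FUN.VALUE = FALSE)

class(viaSapply)   # "list"
class(viaVapply)   # "logical"

# The vapply result is a valid (zero-length) logical subscript
kept <- emptyList[viaVapply]
length(kept)       # 0

# The sapply result is a list, so using it as a subscript is expected to fail;
# try() captures the error instead of stopping the script
sapplyFails <- inherits(try(emptyList[viaSapply], silent = TRUE), "try-error")
```

With a non-empty list both calls give the same logical vector, so the difference only shows up in the edge case.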


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
On 18/05/11 08:24, Lara Poplarski wrote:
L.new <- L[sapply(L,nrow) > 1]

     cheers,

         Rolf Turner
#
Have a look at lapply(). Something like:

entries.with.nrows=lapply(data,function(x)dim(x)[1]>1)

should give you a vector with the elements of the list that you seek marked with TRUE.

This vector can then be used to extract a subset from your list by:

data.reduced=data[entries.with.nrows]

Or similar....


HTH
Jannis

--- Lara Poplarski <larapoplarski at gmail.com> wrote on Tue, 17.5.2011:
#
On May 17, 2011, at 7:13 PM, Lara Poplarski wrote:

            
Read the lapply(...) call as:

"For every element in the object named `data`, send that element to a
function that returns TRUE if its first dimension is greater than one,
returns FALSE if its first dimension is one or zero, and returns nothing
(actually a logical vector with zero elements) if it doesn't have a
"dim" attribute, and finally return the ordered collection of those
values as a list which is assigned the name 'entries.with.nrows'."
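
The three cases in that reading can be tried out on a small list. This is a sketch under assumptions: the list contents and the element names three, one and nodim are invented here, and only the lapply() call itself is taken from the thread.

```r
# Hypothetical list: two data frames plus a plain vector with no "dim" attribute
data <- list(three = data.frame(x = 1:3),
             one   = data.frame(x = 1),
             nodim = 1:5)

entries.with.nrows <- lapply(data, function(x) dim(x)[1] > 1)

entries.with.nrows$three   # TRUE
entries.with.nrows$one     # FALSE
entries.with.nrows$nodim   # logical(0), because dim(1:5) is NULL
class(entries.with.nrows)  # "list"
```

Note that the result is a list, not a logical vector, which matters when it is used as a subscript.
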
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
#
Note that the above suggestion does not work in R 2.13.0:
  > listOfDataFrames <- list(three=data.frame(x=11:13,y=101:103),
                             one=data.frame(x=1,y=2),
                             five=data.frame(x=1:5,y=11:15))
  > listOfDataFrames[lapply(listOfDataFrames,function(x)nrow(x)>1)]
  Error in listOfDataFrames[lapply(listOfDataFrames, function(x) nrow(x)
  invalid subscript type 'list'

lapply(...) always returns a list and lists are not acceptable as
subscripts.  Instead, make the subscript one of the following:
  as.logical(lapply(...))
  sapply(...) # and hope that FUN always returns TRUE or FALSE and length(list)>0
  vapply(..., FUN.VALUE=FALSE)

It may be a bit quicker to do the >1 outside of the loop, as in
  as.integer(lapply(listOfDataFrames, FUN=nrow)) > 1
or
  vapply(listOfDataFrames, FUN=nrow, FUN.VALUE=0L) > 1
but you need a pretty long list to notice.
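
As a rough check, the alternatives can be run side by side on the example list, using the thread's original condition of more than one row. The helper moreThanOneRow and the names keepA to keepD are hypothetical, introduced only for this sketch.

```r
listOfDataFrames <- list(three = data.frame(x = 11:13, y = 101:103),
                         one   = data.frame(x = 1, y = 2),
                         five  = data.frame(x = 1:5, y = 11:15))

# Hypothetical helper, not from the thread
moreThanOneRow <- function(df) nrow(df) > 1

keepA <- as.logical(lapply(listOfDataFrames, moreThanOneRow))
keepB <- sapply(listOfDataFrames, moreThanOneRow)
keepC <- vapply(listOfDataFrames, moreThanOneRow, FUN.VALUE = FALSE)
# Comparison done once, outside the loop over the list
keepD <- vapply(listOfDataFrames, nrow, FUN.VALUE = 0L) > 1

names(listOfDataFrames[keepA])  # "three" "five"
names(listOfDataFrames[keepD])  # "three" "five"
```

All four subscripts select the same elements here; they differ only in how they behave on empty lists or when FUN returns something other than a single TRUE or FALSE.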

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com