Message-ID: <422C8F3A.1070501@stat.wisc.edu>
Date: 2005-03-07T17:28:26Z
From: Douglas Bates
Subject: Faster way of binding multiple rows of data than rbind?
In-Reply-To: <BAY101-F17BAD384690B07F80C4364E85F0@phx.gbl>

Ken Termiso wrote:
> Hi all,
> 
> I have a vector that contains the row numbers of data taken from several 
> filtering operations performed on a large data frame (20,000rows x 
> 500cols).
> 
> In order to output this subset of data, I've been looping through the 
> vector containing the row numbers (keepRows).
> 
> output <- data.frame(row.names = rownames(bigMatrix))
> 
> for(i in keepRows)
> {
>     output <- rbind(output, bigMatrix[i, ])
> }
> 
> 
> As you may guess, doing all of these rbinds takes a LOT of time, so I'm 
> wondering if there's a workaround where I can maybe use an intermediate 
> matrix-like object to store the loop output, and then coerce it back to 
> a data frame after the loop is complete??

The indexing operations in R are very flexible.  You can extract all of 
the rows in a single operation, with no loop and no rbind, as

output <- bigMatrix[keepRows, ]
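
A small sketch of the same idea on made-up data, in case it helps to see 
it run. The data frame contents, the "row1", "row2", ... row names, the 
values in keepRows and the name looped are all invented for illustration; 
only bigMatrix, keepRows and output come from the question. It also 
builds the subset with an rbind loop and checks that both routes give the 
same result.

```r
# Hypothetical stand-in for the large data frame (20 rows x 5 columns here).
set.seed(1)
bigMatrix <- data.frame(matrix(rnorm(20 * 5), nrow = 20, ncol = 5))
rownames(bigMatrix) <- paste0("row", seq_len(nrow(bigMatrix)))

# Row numbers kept by the filtering steps (made up for this example).
keepRows <- c(3, 7, 8, 15)

# A single indexing call selects every requested row at once.
output <- bigMatrix[keepRows, ]

# Loop-based version for comparison: grow the result one row at a time.
looped <- NULL
for (i in keepRows) {
  looped <- rbind(looped, bigMatrix[i, ])
}

# Both approaches should give the same rows, in the same order.
stopifnot(isTRUE(all.equal(output, looped)))
```

The loop copies the accumulated result on every pass, which is why it 
slows down as the number of kept rows grows; the indexed version does the 
selection in one step. The same call also accepts a logical vector or a 
character vector of row names in place of keepRows.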