Message-ID: <Pine.A41.4.44.0304090824170.51664-100000@homer27.u.washington.edu>
Date: 2003-04-09T15:29:36Z
From: Thomas Lumley
Subject: Reading in multiple files
In-Reply-To: <58CAB2332C0DD511BC7900A0C9EA316D013D1A1F@dasmtyjqf009.amedd.army.mil>
On Wed, 9 Apr 2003, Bliese, Paul D MAJ WRAIR-Wash DC wrote:
> I apologize if this is a FAQ -- I kind of recall seeing something along
> these lines before, but I couldn't find the message when I searched the
> archives.
>
> Problem:
> 1. I have hundreds of small files in a subdirectory ("c:\\temp") and I would
> like to combine the files into a single data frame.
> 2. Individually, it is easy to read each file
> >DATA<-read.csv("c:\\temp\\file1a.csv",header=T)
> 3. It is also fairly easy to add new files to the data frame one at a time:
> >DATA<-rbind(DATA,read.csv("c:\\temp\\file1b.csv",header=T))
>
> What is tedious about this solution is that we have to change the file name
> in step 3 every time.
>
> Is there a way to have R identify all the files in a directory and create
> one big data frame?
You can get the file list with
all.the.files <- list.files("C:/temp",full=TRUE)
where full=TRUE (matched by partial matching to full.names=TRUE) asks for the
directory to be prepended to each file name, which will be useful if this
isn't your working directory. You could also add pattern="\\.csv$" to
ensure that you only get .csv files.
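A minimal sketch of the pattern filter, using a temporary directory (so it
is self-contained to run) in place of "C:/temp":

```r
# Create a throwaway directory with one .csv and one non-csv file.
tmp <- tempfile()
dir.create(tmp)
write.csv(data.frame(x = 1:2), file.path(tmp, "file1a.csv"), row.names = FALSE)
writeLines("not a csv", file.path(tmp, "notes.txt"))

# pattern is a regular expression; "\\.csv$" keeps only names
# ending in ".csv", and full.names=TRUE prepends the directory.
csv.files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
basename(csv.files)   # "file1a.csv" only; notes.txt is excluded
```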
Then you could read them all in
all.the.data <- lapply( all.the.files, read.csv, header=TRUE)
and then rbind them into a data frame
DATA <- do.call("rbind", all.the.data)
In one line this would be
DATA <- do.call("rbind", lapply( list.files("C:/temp",full=TRUE),
read.csv, header=TRUE))
It should be faster to use do.call("rbind", ...) rather than rbind-ing
inside a loop, but I don't know if it actually is.
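For comparison, the explicit-loop version being referred to would look
something like this. It is a sketch with a temporary directory standing in
for "C:/temp"; note that it recopies the growing DATA on every iteration,
which is why do.call("rbind", ...) is expected to be faster:

```r
# Set up a throwaway directory with two small .csv files.
tmp <- tempfile()
dir.create(tmp)
for (f in c("a.csv", "b.csv"))
  write.csv(data.frame(x = 1:2, src = f), file.path(tmp, f), row.names = FALSE)

files <- list.files(tmp, full.names = TRUE)

# Grow the data frame one file at a time; rbind(NULL, df) returns df,
# so no special case is needed for the first file.
DATA <- NULL
for (f in files)
  DATA <- rbind(DATA, read.csv(f))
nrow(DATA)   # 4 rows: two per file
```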
-thomas