how to read a group of files into one dataset?

Hi Jie,
you have to combine the individual data.frames. Depending on the
structure of your inputs and the shape you want for the resulting data.frame
(neither of which you specified), either ?merge or ?rbind should help.
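
As a minimal sketch of the difference between the two functions: rbind() stacks data frames that share the same columns, while merge() joins them on a common key column. The data frames a, b and errs below are made up for illustration and do not come from the original files.

```r
# Two small, made-up data frames with identical columns
a <- data.frame(id = 1:2, count = c(47, 36))
b <- data.frame(id = 3:4, count = c(53, 41))

# rbind() appends the rows of b below the rows of a
stacked <- rbind(a, b)
dim(stacked)

# merge() instead matches rows through a shared key column (here "id")
errs   <- data.frame(id = 1:2, err = c(0.1, 0.2))
joined <- merge(a, errs, by = "id")
names(joined)
```

If every file holds the same variables for different observations, rbind() is probably the one to use; merge() fits the case where each file contributes different variables for the same observations.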

cheers

On 25.08.2011 10:17, Jie TANG wrote:
For example, I have files named
 "ma01.dat","ma02.dat","ma03.dat","ma04.dat", and I want to read the data in
these files into one data.frame.

flnm<-paste("obs",101:114,"_err.dat",sep="")
newdata<-read.table(flnm,skip=2)
data<-(flnm,skip=2)
but the data only contains data from flnm[1].
I also tried the following:
for (i in 1:9) {
data<-read.table(flnm[i],skip=2)
}

but that failed. How should I modify my script?

Is there any advice?

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Eik Vettorazzi

Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790
inputDataPath  <- "/home/.../bla/"   # Directory containing the data files
szPattern      <- "\\.dat$"          # Regular expression matching the file extension

# Get the names of all matching files in the specified directory
file2process <- list.files(inputDataPath, pattern = szPattern)

# Get the number of files to be processed
iFileCnt     <- length(file2process)
dbMatrix     <- list()       # Empty list that will hold the full file paths
for (i in seq_len(iFileCnt))
    {
       dataFile       <- sprintf("%s%s", inputDataPath, file2process[i])
       dbMatrix[[i]]  <- dataFile
    }
# Read every file; the result is a list with one data frame per file
ldb <- lapply(dbMatrix, read.table, header = TRUE)

The local database ldb is a list of data frames; each element holds the
contents of one data file.

 # Get the i-th data frame from the list (local database) as a matrix
 Mat <- as.matrix(ldb[[i]])

I hope this helps!
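
A shorter variant of the same idea, as a sketch only: list.files() can return full paths itself through full.names = TRUE, which makes the loop unnecessary, and do.call(rbind, ...) then stacks the list into one data frame. The temporary directory exampleDir and the two small files written below are assumptions made so that the example runs on its own; in practice the directory would be the real data location.

```r
# Create a throwaway directory with two small example files (illustration only)
exampleDir <- file.path(tempdir(), "dat_example")
dir.create(exampleDir, showWarnings = FALSE)
write.table(data.frame(id = 1:3, count = c(47, 36, 53)),
            file.path(exampleDir, "ma01.dat"), row.names = FALSE)
write.table(data.frame(id = 4:6, count = c(12, 25, 31)),
            file.path(exampleDir, "ma02.dat"), row.names = FALSE)

# full.names = TRUE returns paths that can be passed straight to read.table()
paths <- list.files(exampleDir, pattern = "\\.dat$", full.names = TRUE)

# Read each file into a list of data frames, then stack them row-wise
ldb   <- lapply(paths, read.table, header = TRUE)
bigdf <- do.call(rbind, ldb)
dim(bigdf)
```

This assumes all files share the same column names; rbind() stops with an error if they do not.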

Hi:

Similar in vein to the other respondents, you could try something like this:

# Your file names (assuming they are in your startup directory -
# see list.files() for a more general approach, as mentioned previously)
flnm <- paste("obs",101:114,"_err.dat",sep="")
The following assumes that each file named in flnm has the same set of
variables and the same number of columns.

# Method 1:  base R code

  newdata <- lapply(flnm, read.table, skip = 2)
  bigdf <- do.call(rbind, newdata)

# Method 2: Use the plyr package

library('plyr')
bdf <- ldply(mlply(files, read.csv, header = TRUE), rbind)

bigdf and bdf should have the same number of rows; bdf will have one
more column than bigdf because the first column of bdf is an indicator
of the initial data frame it came from, with a numerical rather than a
character index.

The inner call, mlply, is analogous to the lapply() function from
method 1, and the outer call, ldply, has a similar effect to
do.call().
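
If plyr is not available, a comparable source indicator can be added in base R. The following is a sketch, not part of the approach above; the temporary files and the column name src are hypothetical, chosen only to make the example self-contained.

```r
# Write two small example CSV files to a temporary directory (illustration only)
exampleDir <- file.path(tempdir(), "csv_example")
dir.create(exampleDir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, count = c(47, 36)),
          file.path(exampleDir, "file_01.csv"), row.names = FALSE)
write.csv(data.frame(id = 1:2, count = c(53, 41)),
          file.path(exampleDir, "file_02.csv"), row.names = FALSE)

files <- list.files(exampleDir, pattern = "^file", full.names = TRUE)

# Read each file and tag its rows with the index of the file they came from
tagged <- lapply(seq_along(files), function(i) {
  df <- read.csv(files[i], header = TRUE)
  cbind(src = i, df)
})

# Stack the tagged data frames; src plays the role of the indicator column
bdf <- do.call(rbind, tagged)
table(bdf$src)
```

Replacing src = i with src = basename(files[i]) would store the file name instead of a numeric index.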

Here's an example. I have ten files named file_01.csv - file_10.csv in
my startup directory; each has 20 rows and 2 columns, with the same
column names in each.
files <- list.files(pattern = '^file')
files
 [1] "file_01.csv" "file_02.csv" "file_03.csv" "file_04.csv" "file_05.csv"
 [6] "file_06.csv" "file_07.csv" "file_08.csv" "file_09.csv" "file_10.csv"

### Method 1:
filelist <- lapply(files, read.csv, header = TRUE)
bigdf <- do.call(rbind, filelist)
dim(bigdf)
[1] 200   2
# Show this is right by returning the numbers of rows and cols
# in each list component of filelist
sapply(filelist, nrow)
[1] 20 20 20 20 20 20 20 20 20 20
sapply(filelist, ncol)
[1] 2 2 2 2 2 2 2 2 2 2

# Method 2:
library('plyr')
bdf <- ldply(mlply(files, read.csv, header = TRUE), rbind)
dim(bdf)
[1] 200   3
head(bdf, 3)
  X1 id count
1  1  1    47
2  1  2    36
3  1  3    53
head(bigdf, 3)
  id count
1  1    47
2  2    36
3  3    53
table(bdf$X1)

 1  2  3  4  5  6  7  8  9 10
20 20 20 20 20 20 20 20 20 20

HTH,
Dennis

# Method 2: Use the plyr package

library('plyr')
bdf <- ldply(mlply(files, read.csv, header = TRUE), rbind)
Or just

bdf <- ldply(files, read.csv, header = TRUE)

Hadley
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/