how to read a group of files into one dataset?
5 messages · Jie TANG, Eik Vettorazzi, Mohammed Ouassou +2 more
Hi Jie,
you have to combine the data.frames read from the individual files. Depending on the structure of your inputs and on how you want the resulting data.frame to look (neither of which you specified), either ?merge or ?rbind should help.
cheers

On 25.08.2011 10:17, Jie TANG wrote:
For example, I have files with names such as
"ma01.dat","ma02.dat","ma03.dat","ma04.dat", and I want to read the data in
these files into one data.frame.
flnm<-paste("obs",101:114,"_err.dat",sep="")
newdata<-read.table(flnm,skip=2)
data<-(flnm,skip=2)
but the result only contains the data from flnm[1].
I also tried the following:
for (i in 1:9) {
data<-read.table(flnm[i],skip=2)
}
but that failed too. How should I modify my script?
Is there any advice?
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Eik Vettorazzi Department of Medical Biometry and Epidemiology University Medical Center Hamburg-Eppendorf Martinistr. 52 20246 Hamburg T ++49/40/7410-58243 F ++49/40/7410-57790
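As a rough illustration of the two functions Eik mentions, the sketch below uses small made-up data frames (the column names id, count and site are assumptions, not taken from the original files). rbind() stacks rows of data frames that share the same columns, while merge() joins columns on a shared key.

```r
# Two toy data frames with identical columns (hypothetical data)
a <- data.frame(id = 1:2, count = c(47, 36))
b <- data.frame(id = 3:4, count = c(53, 41))

# rbind(): stack the rows; the column names must match
stacked <- rbind(a, b)

# A toy data frame that shares only the key column "id"
extra <- data.frame(id = c(1L, 3L), site = c("north", "south"))

# merge(): join on the shared key; by default only ids present in both are kept
joined <- merge(stacked, extra, by = "id")

dim(stacked)
dim(joined)
```

If every file holds the same variables for different observations, rbind() is probably the one to reach for; merge() fits the case where the files hold different variables for the same observations.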
inputDataPath <- "/home/.../bla/"  # Directory containing the data files
szPattern <- "\\.dat$"  # Regular expression matching the file extension
# Get the names of all matching files in the specified directory
file2process <- list.files(inputDataPath, pattern = szPattern)
# Get the number of files to be processed
iFileCnt <- length(file2process)
dbMatrix <- list()  # Empty list that will hold the full file paths
for (i in seq_len(iFileCnt))
{
  dbMatrix[[i]] <- sprintf("%s%s", inputDataPath, file2process[i])
}
# Read every file; header = TRUE assumes each file starts with a line of column names
ldb <- lapply(dbMatrix, read.table, header = TRUE)
The local database ldb is a list of data frames; each element holds the contents of one data
file.
# Get the i-th data set from the list (local database) as a matrix
Mat <- as.matrix(ldb[[i]])
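The loop in the original question overwrites its result on every pass, so only one file's data survives. A minimal sketch of a loop that keeps every file, assuming all files share the same columns and each has two leading lines to skip, might look like the following. The temporary directory, the file contents and the names pieces and alldata are made up for the demonstration.

```r
# Hypothetical setup: write three small files, each with two leading
# lines to skip, into a temporary directory
tmp <- tempfile("obs_demo")
dir.create(tmp)
flnm <- file.path(tmp, paste("obs", 101:103, "_err.dat", sep = ""))
for (f in flnm) {
  writeLines(c("station report", "units: none", "1 10", "2 20"), f)
}

# Pre-allocate a list, then store each file's data frame in its own slot
pieces <- vector("list", length(flnm))
for (i in seq_along(flnm)) {
  pieces[[i]] <- read.table(flnm[i], skip = 2)
}

# Stack the per-file data frames into one data frame
alldata <- do.call(rbind, pieces)
dim(alldata)
```

Using seq_along(flnm) instead of a fixed range such as 1:9 keeps the loop in step with the actual number of files.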
I hope this helps!
On Thu, 2011-08-25 at 11:43 +0200, Eik Vettorazzi wrote:
Hi Jie,
you have to combine the data.frames read from the individual files. Depending on the structure of your inputs and on how you want the resulting data.frame to look (neither of which you specified), either ?merge or ?rbind should help.
cheers
Hi: Similar in vein to the other respondents, you could try something like this:
On Thu, Aug 25, 2011 at 1:17 AM, Jie TANG <totangjie at gmail.com> wrote:
For example, I have files with names such as "ma01.dat","ma02.dat","ma03.dat","ma04.dat", and I want to read the data in these files into one data.frame.
# Your file names (assuming they are in your startup directory -
# see list.files() for a more general approach, as mentioned previously)
flnm <- paste("obs",101:114,"_err.dat",sep="")
The following assumes that every file named in flnm has the same set of
variables and the same number of columns.
# Method 1: base R code
newdata <- lapply(flnm, read.table, skip = 2)
bigdf <- do.call(rbind, newdata)
# Method 2: Use the plyr package
library('plyr')
bdf <- ldply(mlply(files, read.csv, header = TRUE), rbind)
bigdf and bdf should have the same number of rows; bdf will have one
more column than bigdf because the first column of bdf is an indicator
of the initial data frame it came from, with a numerical rather than a
character index.
The inner call, mlply, is analogous to the lapply() function from
method 1, and the outer call, ldply, has a similar effect to
do.call().
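For readers without plyr, a similar indicator column can be built in base R. The sketch below is one possible approach, not the method used in this thread: it tags each data frame with its file name (a character label rather than the numeric index plyr produces) before stacking. The temporary files, the column name source and the object names tagged and bdf2 are assumptions made for the example.

```r
# Hypothetical setup: three small CSV files with the same two columns
tmp <- tempfile("csv_demo")
dir.create(tmp)
files <- file.path(tmp, sprintf("file_%02d.csv", 1:3))
for (f in files) {
  write.csv(data.frame(id = 1:2, count = c(5, 7)), f, row.names = FALSE)
}

# Read each file into its own data frame
filelist <- lapply(files, read.csv, header = TRUE)

# Prepend a column naming the file each data frame came from
tagged <- Map(function(df, f) cbind(source = basename(f), df), filelist, files)

# Stack the tagged data frames
bdf2 <- do.call(rbind, tagged)
table(bdf2$source)
```

A character label survives reordering or dropping files, which a positional index does not.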
Here's an example. I have ten files named file_01.csv - file_10.csv in
my startup directory; each has 20 rows and 2 columns, with the same
column names in each.
files <- list.files(pattern = '^file')
files
 [1] "file_01.csv" "file_02.csv" "file_03.csv" "file_04.csv" "file_05.csv"
 [6] "file_06.csv" "file_07.csv" "file_08.csv" "file_09.csv" "file_10.csv"
### Method 1:
filelist <- lapply(files, read.csv, header = TRUE)
bigdf <- do.call(rbind, filelist)
dim(bigdf)
[1] 200   2
# Show this is right by returning the numbers of rows and cols
# in each list component of filelist
sapply(filelist, nrow)
 [1] 20 20 20 20 20 20 20 20 20 20
sapply(filelist, ncol)
 [1] 2 2 2 2 2 2 2 2 2 2
# Method 2:
library('plyr')
bdf <- ldply(mlply(files, read.csv, header = TRUE), rbind)
dim(bdf)
[1] 200   3
head(bdf, 3)
  X1 id count
1  1  1    47
2  1  2    36
3  1  3    53
head(bigdf, 3)
  id count
1  1    47
2  2    36
3  3    53
table(bdf$X1)
 1  2  3  4  5  6  7  8  9 10
20 20 20 20 20 20 20 20 20 20

HTH,
Dennis
# Method 2: Use the plyr package
library('plyr')
bdf <- ldply(mlply(files, read.csv, header = TRUE), rbind)
Or just
bdf <- ldply(files, read.csv, header = TRUE)
Hadley
Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/