Reading data with 'awk' - basics?
4 messages · Brian Smith, Gabor Grothendieck, Brian Ripley
On Mon, Oct 17, 2011 at 9:23 AM, Brian Smith <bsmith030465 at gmail.com> wrote:
Hi,
I have a large file from which I need a subset of rows. Instead of reading
it all into memory, I use the awk command to get the relevant rows. However,
I'm doing it pretty inefficiently, as I write the subset to disk before
reading it into R. Is there a way that I can read it into an R object
without writing to disk? For example, this is what I do currently:
## write test sample file
mat1 <- matrix(sample(1:100,16),8,2)
fname1 <- 'temp1.txt'
fname2 <- 'temp2.txt'
write.table(mat1,fname1,sep='\t',row.names=F,col.names=F)
## Read a subset of rows, write to file, and read from file
system(paste("awk '(NR > 1 && NR < 4) {print $0}' ",fname1," > ",fname2,sep=''))
mat2 <- read.table(fname2,sep='\t')
print(mat2)
#####
Is there a way that I can skip writing to disk?
See: http://tolstoy.newcastle.edu.au/R/e5/help/08/09/2129.html
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
On Mon, 17 Oct 2011, Brian Smith wrote:
Is there a way that I can skip writing to disk?
Use a pipe() connection.
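A minimal sketch of what that might look like, reusing the test file and awk filter from the question. It assumes awk is on the PATH (a Unix-like system); the variable name cmd and the use of shQuote() are illustrative additions, not part of the original post.

```r
## write the same kind of test file as in the question
mat1 <- matrix(sample(1:100, 16), 8, 2)
fname1 <- 'temp1.txt'
write.table(mat1, fname1, sep = '\t', row.names = FALSE, col.names = FALSE)

## build the awk command; shQuote() protects the file name from the shell
cmd <- paste("awk '(NR > 1 && NR < 4) {print $0}'", shQuote(fname1))

## read.table() opens the pipe, reads awk's output, and closes the pipe again,
## so no intermediate file is written
mat2 <- read.table(pipe(cmd), sep = '\t')
print(mat2)
```

Passing the unopened pipe() connection straight to read.table() lets read.table() handle opening and closing it, so the subset never touches the disk.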
thanks!
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595