Dear all,
I'm trying to read in a whole directory of files which have two variable parts to the file name: year and month. E.g. comp198604.asc represents April of 1986 - 'comp' is fixed in each case. Years range from 1986 to 1995 and months from 1 to 12.
Just to be clear, there are 12 files associated with each year: e.g. comp198601, comp198602, ... comp198612 through to comp199501, comp199502 ... comp199512.
I am trying to automate the reading in of these files, but am struggling to find an adequate way of achieving this. The closest I've got is by doing:
year <- 1986:1995
month <- sprintf("%02d", 1:12) # formats numbers to 2 digits (for maintaining leading zeros in file names)
filelist <- paste("C:\\Documents and Settings\\Data\\comp",year,month,".asc", sep="")
filelist
[1] "C:\\Documents and Settings\\Data\\comp198601.asc"
[2] "C:\\Documents and Settings\\Data\\comp198702.asc"
[3] "C:\\Documents and Settings\\Data\\comp198803.asc"
[4] "C:\\Documents and Settings\\Data\\comp198904.asc"
[5] "C:\\Documents and Settings\\Data\\comp199005.asc"
[6] "C:\\Documents and Settings\\Data\\comp199106.asc"
[7] "C:\\Documents and Settings\\Data\\comp199207.asc"
[8] "C:\\Documents and Settings\\Data\\comp199308.asc"
[9] "C:\\Documents and Settings\\Data\\comp199409.asc"
[10] "C:\\Documents and Settings\\Data\\comp199510.asc"
[11] "C:\\Documents and Settings\\Data\\comp198611.asc"
[12] "C:\\Documents and Settings\\Data\\comp198712.asc"
I need 1986 to remain fixed whilst it cycles through 01 to 12, before it moves onto 1987 and cycles again. There should be 120 outputs in total (10 years each with 12 months), but at present it's only reaching 12 outputs.
I'd be grateful to learn what I'm doing wrong here so that I can solve this.
Many thanks as ever,
Steve
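A minimal sketch of why only 12 names come out, assuming nothing beyond base R: paste() recycles its shorter arguments to the length of the longest one, so 10 years and 12 months are paired element by element rather than crossed. Repeating the vectors explicitly with rep() gives all 120 combinations in the order asked for. The directory prefix is left out here to keep the example short.

```r
year  <- 1986:1995
month <- sprintf("%02d", 1:12)

# paste() recycles to the longest argument: 10 years vs 12 months gives 12 strings
recycled <- paste("comp", year, month, ".asc", sep = "")
length(recycled)

# Repeat each year 12 times and the whole month sequence 10 times,
# so both vectors have length 120 and line up year by year
filelist <- paste("comp",
                  rep(year, each = length(month)),
                  rep(month, times = length(year)),
                  ".asc", sep = "")
length(filelist)
head(filelist, 3)
```

With this ordering 1986 stays fixed while the months run from 01 to 12, then 1987 follows.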
Reading in files with variable parts to names
9 messages · Baptiste Auguie, Rowe, Brian Lee Yung (Portfolio Analytics), Dieter Menne +4 more
Hi, If your directory contains only files you want to load anyway, then list.files() is your friend,
list.files(pattern = "comp") # or pattern =".asc" for example
If you do need to create the names manually, then you could create the combinations with expand.grid, as in,
do.call(paste, as.list(expand.grid(x = seq(1950,1960), y = 1:10))) # you'll want to tweak paste to suit your needs
HTH, baptiste
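As a rough illustration of the list.files() route: the pattern argument is a regular expression, and full.names = TRUE returns paths that can be passed straight to a reader. The temporary directory and the dummy file names below are assumptions made up for the demo, not the real data.

```r
# Hypothetical setup: a temporary directory holding a few empty files
data_dir <- file.path(tempdir(), "comp_demo")
dir.create(data_dir, showWarnings = FALSE)
file.create(file.path(data_dir, c("comp198601.asc", "comp198602.asc", "notes.txt")))

# Match "comp", exactly six digits, then ".asc"; anything else is ignored
found <- list.files(data_dir, pattern = "^comp[0-9]{6}\\.asc$", full.names = TRUE)
basename(found)
```

list.files() returns names in alphabetical order, which for zero-padded yyyymm names is also chronological order.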
Try this to generate your year/month combinations:
expand.grid(year=1986:1995, month=1:12)
Obviously you'll have to format the months.
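One possible way to do the formatting, as a sketch: expand.grid() varies its first argument fastest, so putting month first keeps each year fixed while the months cycle. sprintf() then pads the month to two digits. The column names and the bare "comp" prefix (no directory) are choices made for this example.

```r
# Month is listed first so that it varies fastest and year varies slowest
grid <- expand.grid(month = 1:12, year = 1986:1995)

# %04d and %02d zero-pad the integers; sprintf() is vectorised over the rows
filelist <- sprintf("comp%04d%02d.asc", grid$year, grid$month)
length(filelist)
head(filelist, 3)
```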
gr = expand.grid(as.character(1986:1995), sprintf("%02d", 1:12),
                 stringsAsFactors = FALSE)
filelist = paste(gr[,1], gr[,2], ".asc", sep="")
Dear all, Thanks for the help in the previous posts. I've considered each one and have nearly managed to get it working. The structure of the filelist being produced is correct, except for a single space which I can't seem to eradicate! This is my amended code, followed by the first twelve rows of the output (it really goes up to 120 rows).
filelist <- paste("C:\\Documents and Settings\\Data\\comp_runoff_hd_",do.call(paste, expand.grid(year = sprintf("%04d", seq(1986,1995)), month = sprintf("%02d",1:12))),".asc", sep="")
filelist
[1] "C:\\Documents and Settings\\Data\\comp1986 01.asc"
[2] "C:\\Documents and Settings\\Data\\comp1987 01.asc"
[3] "C:\\Documents and Settings\\Data\\comp1988 01.asc"
[4] "C:\\Documents and Settings\\Data\\comp1989 01.asc"
[5] "C:\\Documents and Settings\\Data\\comp1990 01.asc"
[6] "C:\\Documents and Settings\\Data\\comp1991 01.asc"
[7] "C:\\Documents and Settings\\Data\\comp1992 01.asc"
[8] "C:\\Documents and Settings\\Data\\comp1993 01.asc"
[9] "C:\\Documents and Settings\\Data\\comp1994 01.asc"
[10] "C:\\Documents and Settings\\Data\\comp1995 01.asc"
[11] "C:\\Documents and Settings\\Data\\comp1986 02.asc"
[12] "C:\\Documents and Settings\\Data\\comp1987 02.asc"
I've tried inserting sep="" after the 'month=sprintf("%02d",1:12)' but this doesn't appear to solve the problem - in fact it doesn't change the output at all...!
Any help would be much appreciated,
Steve
That's because do.call wants a list of arguments.
What about this one:
do.call( sprintf, append( list("C:\\Documents and Settings\\Data\\comp_runoff_hd_%04d%02d.asc"), expand.grid( seq(1986,1995), 1:12) ) )
Romain
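For what it's worth, here is a small sketch of where the space comes from, assuming the same expand.grid() call as in the question: the inner paste() runs with its default sep = " ", and the sep = "" on the outer paste() does not reach it. Because do.call() takes all arguments as one list, sep = "" has to be added to that list.

```r
grid <- expand.grid(year  = sprintf("%04d", 1986:1995),
                    month = sprintf("%02d", 1:12),
                    stringsAsFactors = FALSE)

# Default separator of paste() is a single space, hence "1986 01"
with_space <- do.call(paste, grid)

# c() on a data frame gives a plain list, so sep = "" becomes one more argument
no_space <- do.call(paste, c(grid, sep = ""))

with_space[1]
no_space[1]
```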
Does this give you what you want (just did it in two steps);
x <- expand.grid(year = sprintf("%04d", seq(1986, 1995)), month = sprintf("%02d", 1:12))
filelist <- paste("C:\\Documents and Settings\\Data\\comp_runoff_hd_", paste(x$year, x$month, sep=''), '.asc', sep='')
filelist
[1] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198601.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198701.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198801.asc"
[4] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198901.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199001.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199101.asc"
[7] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199201.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199301.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199401.asc"
[10] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199501.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198602.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198702.asc"
[13] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198802.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198902.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199002.asc"
[16] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199102.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199202.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199302.asc"
[19] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199402.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199502.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198603.asc"
[22] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198703.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198803.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198903.asc"
[25] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199003.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199103.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199203.asc"
[28] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199303.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199403.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199503.asc"
[31] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198604.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198704.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198804.asc"
[34] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198904.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199004.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199104.asc"
[37] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199204.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199304.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199404.asc"
[40] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199504.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198605.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198705.asc"
[43] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198805.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_198905.asc" "C:\\Documents and Settings\\Data\\comp_runoff_hd_199005.asc"
Thanks, that's great - just what I was looking for.
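Once the vector of file names exists, the reading step itself might look roughly like the sketch below. It assumes the .asc files are plain whitespace-delimited numeric tables that read.table() can parse, which the thread does not confirm; files with a header block would need skip = or a dedicated reader. The temporary directory and the two dummy files are made up for the demo.

```r
# Hypothetical demo data: two small whitespace-delimited files in a temp directory
data_dir <- file.path(tempdir(), "comp_read_demo")
dir.create(data_dir, showWarnings = FALSE)
filelist <- file.path(data_dir, sprintf("comp1986%02d.asc", 1:2))
for (f in filelist) {
  write.table(matrix(1:4, nrow = 2), f, row.names = FALSE, col.names = FALSE)
}

# Keep only the files that are actually present, then read each into a list element
existing <- filelist[file.exists(filelist)]
tables <- lapply(existing, read.table)
names(tables) <- basename(existing)
names(tables)
```

Keeping the results in a named list means each month can be pulled out by its file name, e.g. tables[["comp198601.asc"]].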