Reading in files with variable parts to names

9 messages · Baptiste Auguie, Rowe, Brian Lee Yung (Portfolio Analytics), Dieter Menne +4 more

#
Dear all,

I'm trying to read in a whole directory of files which have two variable parts to the file name: year and month. E.g. comp198604.asc represents April of 1986 - 'comp' is fixed in each case. Years range from 1986 to 1995 and months from 1 to 12.

Just to be clear, there are 12 files associated with each year: e.g. comp198601, comp198602, ... comp198612  through to comp199501, comp199502 ... comp199512.

I am trying to automate the reading in of these files, but am struggling to find an adequate way of achieving this. The closest I've got is by doing:



year <- 1986:1995
month <- sprintf("%02d", 1:12)  # formats numbers to 2 digits (for maintaining leading zeros in file names)

filelist <- paste("C:\\Documents and Settings\\Data\\comp",year,month,".asc", sep="")

filelist

 [1] "C:\\Documents and Settings\\Data\\comp198601.asc"
 [2] "C:\\Documents and Settings\\Data\\comp198702.asc"
 [3] "C:\\Documents and Settings\\Data\\comp198803.asc"
 [4] "C:\\Documents and Settings\\Data\\comp198904.asc"
 [5] "C:\\Documents and Settings\\Data\\comp199005.asc"
 [6] "C:\\Documents and Settings\\Data\\comp199106.asc"
 [7] "C:\\Documents and Settings\\Data\\comp199207.asc"
 [8] "C:\\Documents and Settings\\Data\\comp199308.asc"
 [9] "C:\\Documents and Settings\\Data\\comp199409.asc"
[10] "C:\\Documents and Settings\\Data\\comp199510.asc"
[11] "C:\\Documents and Settings\\Data\\comp198611.asc"
[12] "C:\\Documents and Settings\\Data\\comp198712.asc"


I need 1986 to remain fixed whilst it cycles through 01 to 12, before it moves onto 1987 and cycles again. There should be 120 outputs in total (10 years each with 12 months), but at present it's only reaching 12 outputs.

I'd be grateful to learn what I'm doing wrong here so that I can solve this.

Many thanks as ever,

Steve


#
Hi,

If your directory contains only files you want to load anyway, then
list.files() is your friend.
If you do need to create the names manually, then you could create the
combinations with expand.grid.
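
A minimal sketch of both suggestions, since the code from this reply is not in the archive. It assumes a hypothetical directory variable `datadir` (a temporary directory here, with two dummy files, so the sketch runs anywhere); the names `found`, `grid` and `filelist` are illustrative only.

```r
# Hypothetical data directory; a temporary one keeps the sketch self-contained.
datadir <- file.path(tempdir(), "comp_demo")
dir.create(datadir, showWarnings = FALSE)
invisible(file.create(file.path(datadir, c("comp198601.asc", "comp198602.asc"))))

# Option 1: let R list whichever matching files are actually present.
found <- list.files(datadir, pattern = "^comp[0-9]{6}\\.asc$", full.names = TRUE)

# Option 2: build every year/month combination by hand.
# The first argument of expand.grid varies fastest, so putting month first
# makes the months cycle within each year.
grid <- expand.grid(month = sprintf("%02d", 1:12), year = 1986:1995,
                    stringsAsFactors = FALSE)
filelist <- file.path(datadir,
                      paste("comp", grid$year, grid$month, ".asc", sep = ""))
```

list.files() has the advantage of never producing a name for a file that does not exist; the expand.grid route is useful when the expected set of names must be checked against what is on disk.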
HTH,

baptiste
On 26 Mar 2009, at 18:40, Steve Murray wrote:

            
_____________________________

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
#
Try this to generate your year/month combinations:
Obviously you'll have to format the months.
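
The code for this reply is missing from the archive. One possible way to generate the combinations, sketched under the assumption that the goal is 120 names with the month cycling inside each year: paste() lines its arguments up in parallel and recycles the shorter one rather than crossing them, which is why the original attempt stopped at 12 names. Repeating the vectors explicitly gives every pair. The names `yy`, `mm` and `stem` are illustrative.

```r
year  <- 1986:1995
month <- 1:12

# Repeat each year 12 times and the whole month sequence 10 times,
# so the two vectors line up as every year/month pair.
yy <- rep(year, each = length(month))
mm <- rep(month, times = length(year))

# Format the months to two digits to keep the leading zero.
stem <- paste("comp", yy, sprintf("%02d", mm), ".asc", sep = "")
```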


#
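Steve Murray <smurray444 <at> hotmail.com> writes:
> April of 1986 - 'comp' is fixed in each case. Years range between
> comp198601, comp198602, ...
# build all year/month combinations (the first column varies fastest)
gr = expand.grid(as.character(1986:1995), sprintf("%02d", 1:12),
  stringsAsFactors = FALSE)
# paste year and month columns together; prepend the path and "comp" as needed
filelist = paste(gr[,1], gr[,2], ".asc", sep="")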
#
Dear all,

Thanks for the help in the previous posts. I've considered each one and have nearly managed to get it working. The structure of the filelist being produced is correct, except for a single space which I can't seem to eradicate! This is my amended code, followed by the first twelve rows of the output (it really goes up to 120 rows).
 [1] "C:\\Documents and Settings\\Data\\comp1986 01.asc"
 [2] "C:\\Documents and Settings\\Data\\comp1987 01.asc"
 [3] "C:\\Documents and Settings\\Data\\comp1988 01.asc"
 [4] "C:\\Documents and Settings\\Data\\comp1989 01.asc"
 [5] "C:\\Documents and Settings\\Data\\comp1990 01.asc"
 [6] "C:\\Documents and Settings\\Data\\comp1991 01.asc"
 [7] "C:\\Documents and Settings\\Data\\comp1992 01.asc"
 [8] "C:\\Documents and Settings\\Data\\comp1993 01.asc"
 [9] "C:\\Documents and Settings\\Data\\comp1994 01.asc"
[10] "C:\\Documents and Settings\\Data\\comp1995 01.asc"
[11] "C:\\Documents and Settings\\Data\\comp1986 02.asc"
[12] "C:\\Documents and Settings\\Data\\comp1987 02.asc"


I've tried inserting sep="" after the 'month=sprintf("%02d",1:12)' but this doesn't appear to solve the problem - in fact it doesn't change the output at all...!

Any help would be much appreciated,

Steve
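
The amended code is not shown in the archive, so the following is only a guess at the cause: paste() joins its arguments with a single space unless sep = "" is passed to paste() itself, and placing sep anywhere else leaves that default in force. A small sketch of the difference, with illustrative names:

```r
grid <- expand.grid(year = 1986:1995, month = sprintf("%02d", 1:12),
                    stringsAsFactors = FALSE)

# The default separator is a space, which ends up inside each name.
with_space <- paste(grid$year, grid$month)

# sep = "" has to be an argument of paste() itself.
no_space <- paste(grid$year, grid$month, sep = "")

# With do.call, sep goes into the argument list alongside the columns.
also_ok <- do.call(paste, c(grid, sep = ""))
```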
#
That's because do.call wants a list:

what about this one:

 > do.call( sprintf, append( list("C:\\Documents and Settings\\Data\\comp_runoff_hd_%04d%02d.asc"),
     expand.grid( seq(1986,1995), 1:12) ) )

Romain
#
Does this give you what you want (just did it in two steps);
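
The code for this reply did not survive in the archive. A two-step version along these lines (a sketch, not necessarily what was posted; `grid`, `fmt` and `by_year` are illustrative names) gives a listing in the order shown below, with the year varying fastest:

```r
# Step 1: all year/month pairs; the first argument varies fastest.
grid <- expand.grid(year = 1986:1995, month = 1:12)

# Step 2: fill the pairs into a template, padding the month to two digits.
fmt <- "C:\\Documents and Settings\\Data\\comp_runoff_hd_%04d%02d.asc"
filelist <- sprintf(fmt, grid$year, grid$month)

# Optional: reorder so that the months cycle within each year.
by_year <- filelist[order(grid$year, grid$month)]
```

The reordering step is only needed if the files must be processed in calendar order; the set of 120 names is the same either way.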
[1] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198601.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198701.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198801.asc"
  [4] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198901.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199001.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199101.asc"
  [7] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199201.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199301.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199401.asc"
 [10] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199501.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198602.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198702.asc"
 [13] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198802.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198902.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199002.asc"
 [16] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199102.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199202.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199302.asc"
 [19] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199402.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199502.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198603.asc"
 [22] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198703.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198803.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198903.asc"
 [25] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199003.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199103.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199203.asc"
 [28] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199303.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199403.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199503.asc"
 [31] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198604.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198704.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198804.asc"
 [34] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198904.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199004.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199104.asc"
 [37] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199204.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199304.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199404.asc"
 [40] "C:\\Documents and Settings\\Data\\comp_runoff_hd_199504.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198605.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198705.asc"
 [43] "C:\\Documents and Settings\\Data\\comp_runoff_hd_198805.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_198905.asc"
"C:\\Documents and Settings\\Data\\comp_runoff_hd_199005.asc"
On Fri, Mar 27, 2009 at 9:56 AM, Steve Murray <smurray444 at hotmail.com> wrote: