Running R Script on a Sequence of Files

14 messages · Chris Poliquin, Barry Rowlingson, Bert Gunter +7 more

#
Hi,

I have about 900 files that I need to run the same R script on.  I  
looked over the R Data Import/Export Manual and  couldn't come up with  
a way to read in a sequence of files.

The files all have unique names and are in the same directory.  What I  
want to do is:
1) Create a list of the file names in the directory (this is really  
what I need help with)
2) For each item in the list...
	a) open the file with read.table
	b) perform some analysis
	c) append some results to an array or save them to another file
3) Next File

My initial instinct is to use Python to rename all the files with  
numbers 1:900 and then read them all, but the file names contain some  
information that I would like to keep intact and having to keep a  
separate database of original names and numbers seems inefficient.  Is  
there a way to have R read all the files in a directory one at a time?

- Chris
#
2008/12/5 Chris Poliquin <poliquin at sas.upenn.edu>:
I can't believe the two 'solutions' already posted. It's easy:

 ?list.files

Barry
#
R has quite a few functions to get and manipulate filenames to facilitate
exactly what you want to do. See ?files and especially the links at the end
to the file name manipulation functions.

e.g. dir("pathname") lists all file names in the directory "pathname."
?list.files gives details.
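
A minimal sketch of the calls mentioned above. The scratch directory and the file names are made up for illustration, so the example can be run anywhere without touching real data.

```r
# Hypothetical setup: a scratch directory with three made-up files.
data_dir <- file.path(tempdir(), "listing_demo")
dir.create(data_dir, showWarnings = FALSE)
file.create(file.path(data_dir, c("site_A.dat", "site_B.dat", "notes.txt")))

# All entries in the directory (dir() is an alias for list.files()).
all_names <- list.files(data_dir)

# Only names ending in .dat, returned with the directory prepended so
# they can be passed straight to read.table().
dat_paths <- list.files(data_dir, pattern = "\\.dat$", full.names = TRUE)

# basename() recovers the original file name from a full path, so the
# information carried in the names is not lost.
dat_names <- basename(dat_paths)
```

Using full.names = TRUE avoids having to change the working directory before reading the files.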

-- Bert Gunter

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Chris Poliquin
Sent: Friday, December 05, 2008 10:02 AM
To: r-help at r-project.org
Subject: [R] Running R Script on a Sequence of Files

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Use dir to get the names and then lapply over them with a
custom anonymous function where L is a list of the returned
values:

# assumes file names are those in
# current directory that end in .dat
filenames <- dir(pattern = "\\.dat$")

L <- lapply(filenames, function(x) {
  DF <- read.table(x)  # add whatever other read.table arguments your files need
  somefunction(DF)
})

Now L is a list of the 900 returned values.  Alternatively, you could use
a loop.
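
For comparison, the loop variant might look like the sketch below. The scratch directory, the generated files, and the stand-in analysis (the mean of a column named y) are all assumptions for illustration, not part of the original question.

```r
# Hypothetical setup: write three small made-up tables to a scratch directory.
data_dir <- file.path(tempdir(), "loop_demo")
dir.create(data_dir, showWarnings = FALSE)
for (id in c("a", "b", "c")) {
  write.table(data.frame(x = 1:4, y = c(2, 4, 6, 8)),
              file = file.path(data_dir, paste0("run_", id, ".dat")),
              row.names = FALSE)
}

filenames <- list.files(data_dir, pattern = "\\.dat$", full.names = TRUE)

# Pre-allocate one result slot per file, named by the original file name
# so the information carried in the names is kept.
results <- numeric(length(filenames))
names(results) <- basename(filenames)

for (i in seq_along(filenames)) {
  DF <- read.table(filenames[i], header = TRUE)
  # Stand-in analysis (an assumption): the mean of column y.
  results[i] <- mean(DF$y)
}

# Step (c) of the original question: save the collected results to a file.
out_file <- file.path(tempdir(), "loop_demo_results.csv")
write.csv(data.frame(file = names(results), mean_y = results),
          out_file, row.names = FALSE)
```

Naming the result vector by file name removes the need for a separate lookup table of names and numbers.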
On Fri, Dec 5, 2008 at 1:01 PM, Chris Poliquin <poliquin at sas.upenn.edu> wrote:
#
Me neither.
That's what I would use, too. If the OP is on a UNIX platform,
running the R script in a loop in the shell is an alternative.
Something like this (Bourne shell syntax):

for datafile in *.csv ; do
	Rscript analyze.R "$datafile"
done

The R script (analyze.R) can use commandArgs() to read the filename
argument.
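
A minimal sketch of what such an analyze.R could contain. The helper name analyze_file, the assumption that the files are comma-separated with a header row, and the stand-in analysis are all hypothetical.

```r
# analyze.R: sketch of a script meant to be called as
#   Rscript analyze.R somefile.csv
# trailingOnly = TRUE drops R's own arguments and keeps only what
# follows the script name.
args <- commandArgs(trailingOnly = TRUE)

analyze_file <- function(path) {
  # Assumption: comma-separated files with a header row.
  DF <- read.csv(path)
  # Stand-in analysis: number of rows and columns.
  c(rows = nrow(DF), cols = ncol(DF))
}

if (length(args) >= 1) {
  res <- analyze_file(args[1])
  # Prefix the output with the file name so results from many runs
  # can be told apart when collected from the shell.
  cat(basename(args[1]), res[["rows"]], res[["cols"]], "\n")
}
```

With this approach each file is processed in a fresh R session, so results have to be collected from the printed output or from files the script writes.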

cu
	Philipp
#
Is there a way to list only the files in a given directory without
passing pattern="..." to list.files()?
On Fri, Dec 5, 2008 at 5:10 PM, Kyle. <ambertk at gmail.com> wrote:
#
"This is almost a macro problem. It could be done in SAS language using
the WPS product (660 USD) I think. ..." 

OUCH! Why do it the complicated way??? Check out ?dir, ?list.files, and
then ?lapply for a simple start.

Don't give up so soon! When it comes to R there is no need to punt - you
can always keep possession of the ball ... :-)

Cheers,
Jagat

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Ajay ohri
Sent: Friday, December 05, 2008 12:59 PM
To: Chris Poliquin
Cc: r-help at r-project.org
Subject: Re: [R] Running R Script on a Sequence of Files

This is almost a macro problem. It could be done in SAS language using
the WPS product (660 USD) I think.
It is a familiar problem and I would be quite interested in the result.

Is there any concept of macros in R, or a package to do the same?

Regards,

Ajay

On Fri, Dec 5, 2008 at 11:31 PM, Chris Poliquin
<poliquin at sas.upenn.edu> wrote:

            
#
Try this:

dir()[!file.info(dir())$isdir]
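
The one-liner above works for the current working directory. For some other directory, file.info() needs paths it can resolve, so a sketch along these lines may help; the scratch directory and its contents are made up for illustration.

```r
# Hypothetical setup: a scratch directory holding one file and one subdirectory.
data_dir <- file.path(tempdir(), "files_only_demo")
dir.create(file.path(data_dir, "subdir"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(data_dir, "data1.dat"))

# full.names = TRUE makes the paths resolvable from any working directory.
entries <- list.files(data_dir, full.names = TRUE)

# Keep only the entries that are not directories.
files_only <- entries[!file.info(entries)$isdir]
```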


On Fri, Dec 5, 2008 at 2:30 PM, Gustavo Carvalho
<gustavo.bio+R at gmail.com> wrote:
#
It seems that you have 900 files with the same parameters in each file (I
might be reading more between the lines here than you implied). If this is
the case, why not import each of the files into a common database and then
connect to that database from R using the ODBC connectivity options?  If that
is practical, you could then code a series of subsetting operations to select
the data you need for a specific analysis, write reports, and then iteratively
select the next set of records.

I may be suggesting a very simple solution, so forgive me if this
trivializes your problem too greatly.

Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

Steve_Friedman at nps.gov
Office (305) 224 - 4282
Fax     (305) 224 - 4147


                                                                           
1 day later
#
Thanks a lot!

On Fri, Dec 5, 2008 at 5:54 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote: