
loading multiple CSV files into a single data frame

6 messages · victor jimenez, Gabor Grothendieck, oliver +2 more

#
On Thu, May 3, 2012 at 2:07 PM, victor jimenez <betabandido at gmail.com> wrote:
If your csv files all have the same columns and represent time series,
then read.zoo in the zoo package can read multiple csv files at once
with a single read.zoo command, producing a single zoo object.

library(zoo)
?read.zoo
vignette("zoo-read")

Also see the other zoo vignettes and help files.
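
As a rough illustration of the read.zoo suggestion, here is a minimal sketch. It assumes the zoo package is installed and that each file holds a date column followed by one value column; the file names, column names and numbers are invented for the demo and do not come from the original question.

```r
# Assumption: zoo is installed. File names and values below are made up.
if (requireNamespace("zoo", quietly = TRUE)) {
  demo_dir <- tempfile("zoo-demo")
  dir.create(demo_dir)
  files <- file.path(demo_dir, c("a.csv", "b.csv"))
  dates <- c("2012-05-01", "2012-05-02", "2012-05-03")
  write.csv(data.frame(date = dates, value = c(1, 2, 3)),
            files[1], row.names = FALSE)
  write.csv(data.frame(date = dates, value = c(10, 20, 30)),
            files[2], row.names = FALSE)

  # A character vector of file names is read file by file and the
  # resulting series are merged on their time index.
  z <- zoo::read.zoo(files, header = TRUE, sep = ",", format = "%Y-%m-%d")
  print(z)
}
```

The result is one zoo object with one column per input file, so this only fits data that really is indexed by time.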
#
On Thu, May 03, 2012 at 11:40:42PM +0200, victor jimenez wrote:
[...]

Maybe things will be clearer if you provide an example
of the tree and some example data, e.g. as a *.zip file.

As I understand your question, some of your variables' values
are stored in the csv files, and the values of other variables
are given by the directory structure.

So you need to convert the structure of your directory tree
into values in your data frame.

You need a data frame that contains all the values that are of
interest to you.
Some of them are loaded from the csv files and others are picked
from the directory structure.

You just have to fill the data from the csv files into the data frame,
and set the values/variables that are implicitly given by the directory
structure when importing.

Maybe just read in the csv-files and add the missing values.

So if the variable describing the caching mechanism is
encoded as part of the path to the file, e.g. "direct-mapped",
then just set the cache value to "direct-mapped".
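
For a single file, the idea above might look roughly like the sketch below. The helper name read_with_cache, the column name CACHE, the PERF column and the directory layout are all assumptions made for illustration; adapt them to the real tree.

```r
# Hypothetical helper: read one csv file and record the name of its
# parent directory in a new column called CACHE.
read_with_cache <- function(path) {
  df <- read.csv(path)
  df$CACHE <- basename(dirname(path))
  df
}

# Invented example layout: <tmp>/direct-mapped/data.csv
root <- tempfile("cache-demo")
dir.create(file.path(root, "direct-mapped"), recursive = TRUE)
path <- file.path(root, "direct-mapped", "data.csv")
write.csv(data.frame(PERF = c(0.5, 0.7)), path, row.names = FALSE)

one <- read_with_cache(path)
print(one)
```

Every row read from the file gets the same CACHE value, because the scalar taken from the path is recycled over the rows.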


Ciao,
   Oliver

P.S.: In my understanding this would be a question for r-users rather than
      r-devel, because r-devel seems to be more focussed on internals and
      package stuff, while your problem is rather a user problem
      (any R user needs some kind of "programming" to get things done).
#
Victor,

I understand you as follows:

The first two columns of the desired combined data frame are the last two
levels of the pathname to the csv file.

The columns in all the data.csv files are the same; namely, there is only
one column, and it is named PERF.

If so, the following should work (on unix):

do.call(rbind, lapply(Sys.glob('results/*/*/data.csv'), function(path) {
  within(read.csv(path), {
    SIZE  <- basename(dirname(path))
    ASSOC <- basename(dirname(dirname(path)))
  })
}))
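
To try this without the real data, here is a self-contained variant. It is only a sketch: the tree is built in a temporary directory, the ASSOC and SIZE directory names and the PERF values are invented, and it swaps Sys.glob for list.files and builds each piece with data.frame so that the two path-derived columns come first.

```r
# Invented demo tree: results/<ASSOC>/<SIZE>/data.csv, one PERF column each.
root <- tempfile("results-demo")
for (assoc in c("direct-mapped", "2-way")) {
  for (size in c("16KB", "32KB")) {
    d <- file.path(root, "results", assoc, size)
    dir.create(d, recursive = TRUE)
    write.csv(data.frame(PERF = c(1, 2)),
              file.path(d, "data.csv"), row.names = FALSE)
  }
}

# list.files() walks the tree recursively instead of globbing.
paths <- list.files(file.path(root, "results"), pattern = "^data\\.csv$",
                    recursive = TRUE, full.names = TRUE)

combined <- do.call(rbind, lapply(paths, function(path) {
  data.frame(ASSOC = basename(dirname(dirname(path))),
             SIZE  = basename(dirname(path)),
             read.csv(path))
}))
print(combined)
```
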
On 5/3/12 4:40 PM, "victor jimenez" <betabandido at gmail.com> wrote:

            
#
On May 3, 2012, at 5:40 PM, victor jimenez wrote:

            
You don't need to touch the CSV files; simply add the values at load time. This is all easily doable in one line ;)
  A B V1 V2       V3
1 1 2  1  a data.csv
2 3 4  1  a data.csv
3 1 2  1  b data.csv
4 3 4  1  b data.csv
5 1 2  2  a data.csv
6 3 4  2  a data.csv
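
The command that produced this output is not preserved in this copy of the message. Purely as a guess at the general approach, the sketch below splits each relative path into its components and appends them as columns V1, V2, ... at load time. The temporary tree and the helper variables (root, files, parts, all_data) are assumptions for illustration; the A and B values are copied from the rows shown above.

```r
# Invented tree mirroring the directory names visible in the output.
root <- tempfile("tree-demo")
for (d in c("1/a", "1/b", "2/a", "2/b")) {
  dir.create(file.path(root, d), recursive = TRUE)
  write.csv(data.frame(A = c(1, 3), B = c(2, 4)),
            file.path(root, d, "data.csv"), row.names = FALSE)
}

# Paths relative to root, such as "1/a/data.csv".
files <- list.files(root, pattern = "\\.csv$", recursive = TRUE)

all_data <- do.call(rbind, lapply(files, function(f) {
  d <- read.csv(file.path(root, f))
  parts <- strsplit(f, "/", fixed = TRUE)[[1]]
  # One new column per path component; each value is recycled over the rows.
  d[paste0("V", seq_along(parts))] <- as.list(parts)
  d
}))
print(all_data)
```

The body of the lapply call could be folded into a single expression, but spelling it out keeps the path handling visible.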