Unexpected behaviour of write.csv - read.csv

9 messages · Duncan Murdoch, Rainer M Krug, Ivan Calandra +4 more

#

Hi

Assuming the following:
'data.frame':	10 obs. of  2 variables:
 $ a: int  1 2 3 4 5 6 7 8 9 10
 $ b: num  0.692 0.325 0.634 0.16 0.873 ...
'data.frame':	10 obs. of  3 variables:
 $ X: int  1 2 3 4 5 6 7 8 9 10
 $ a: int  1 2 3 4 5 6 7 8 9 10
 $ b: num  0.692 0.325 0.634 0.16 0.873 ...
Using the two functions write.csv and read.csv, I would assume that the
resulting data.frame x2 would be identical to x, but it has an additional
column X, which contains the row names of x.
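
The commands that produced the two str() listings above did not survive in the archive. A minimal sketch that gives the same shape of output is shown here; the use of runif() for column b and the temporary file path are assumptions, not the original code.

```r
# Assumption: b was filled with random uniform numbers; the original call is not shown.
x <- data.frame(a = 1:10, b = runif(10))
str(x)

# Assumption: a temporary directory is used instead of the working directory.
csv_file <- file.path(tempdir(), "x.csv")

write.csv(x, csv_file)     # default row.names = TRUE writes the row names as a first column
x2 <- read.csv(csv_file)   # that unnamed first column comes back as a data column named "X"
str(x2)
```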

I know read.table and write.table which work as expected, but I would
like to use a csv for data exchange reasons.

I know that I can use
write.csv(x, "x.csv", row.names=FALSE)

and it would work, but shouldn't that be the default behaviour?

And if this is not compliant with csv files, shouldn't the function
read.csv convert the first column into the row names?
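
For reference, the row.names=FALSE variant mentioned above can be checked with a short sketch like the following. The data and file path are made up for illustration, and all.equal() is used rather than identical() because the numbers pass through a text representation.

```r
# Illustrative data; not the original x from this message.
x <- data.frame(a = 1:10, b = runif(10))
csv_file <- file.path(tempdir(), "x_norownames.csv")

write.csv(x, csv_file, row.names = FALSE)  # no row-name column is written
x2 <- read.csv(csv_file)

names(x2)          # only the data columns
all.equal(x, x2)   # compares with a numeric tolerance
```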

Cheers,

Rainer

-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Natural Sciences Building
Office Suite 2039
Stellenbosch University
Main Campus, Merriman Avenue
Stellenbosch
South Africa

Tel:        +33 - (0)9 53 10 27 44
Cell:       +27 - (0)8 39 47 90 42
Fax (SA):   +27 - (0)8 65 16 27 82
Fax (D) :   +49 - (0)3 21 21 25 22 44
Fax (FR):   +33 - (0)9 58 10 27 44
email:      Rainer at krugs.de

Skype:      RMkrug
#
On 11-01-13 6:26 AM, Rainer M Krug wrote:
I don't think so.  The CSV format is an export format which holds less 
information than a dataframe.  By exporting the dataframe to CSV and 
importing the result, you are discarding information and you should 
expect to get something different.

If you want to save a dataframe to disk and read it back unchanged, you 
should use save() and load().
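
A minimal sketch of that save()/load() round trip, assuming an illustrative data frame and a temporary file:

```r
# Illustrative data and file path (assumptions, not from the thread).
x <- data.frame(a = 1:10, b = runif(10))
original <- x
rdata_file <- file.path(tempdir(), "x.RData")

save(x, file = rdata_file)  # stores the object together with its name
rm(x)
load(rdata_file)            # recreates an object called "x" in the calling environment

identical(x, original)
```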

Duncan Murdoch
#
On 01/13/2011 02:56 PM, Duncan Murdoch wrote:
OK - I can follow this logic - and I think I can accept it.
And now my question from a previous thread (write.table equivalent for
lists?) comes up again:

Using save() and load() definitely works, but it is highly unsafe: because
it even keeps the names of the objects, and more than one can be saved, I
cannot easily assign the saved object to a new name, and I have problems
using the saved object if I have forgotten what the variable name was.
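
One possible way around the naming problem, sketched here under the assumption that the file holds a single object, is to load into a scratch environment and look up the restored name. The names tmp_env, saved_names and my_copy are made up for this example.

```r
# Illustrative data and file path (assumptions).
x <- data.frame(a = 1:10, b = runif(10))
rdata_file <- file.path(tempdir(), "x_named.RData")
save(x, file = rdata_file)

tmp_env <- new.env()
saved_names <- load(rdata_file, envir = tmp_env)  # load() returns the names it restored
saved_names

# Fetch the first restored object under a name of our own choosing.
my_copy <- get(saved_names[1], envir = tmp_env)
```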

So I would like to expand my previous question: what are the proper
functions to store R objects? One could argue that all write...
functions are export functions - therefore keeping the data, but not
necessarily column names, rownames, attributes, ...

So what can I really do to save an R object for later usage in R?

Rainer
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

#
Hi,

I thought this was already clear from the replies to your previous post:
- save/load
- saveObject/loadObject from R.utils
- dput/dget (I don't remember who proposed it sorry)

There might be more possibilities, but that should do what you're 
looking for. And you should already know how each of them works and 
therefore the pros and cons.
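
As a rough illustration of the dput/dget option from the list above (the data and file path are assumptions), the object is written as R source text and can be read back under any name:

```r
# Illustrative data and file path (assumptions).
x <- data.frame(a = 1:10, b = runif(10))
dput_file <- file.path(tempdir(), "x.R")

dput(x, file = dput_file)  # writes a text representation of the object
y <- dget(dput_file)       # evaluates it back; the result can get any name

all.equal(x, y)            # tolerance-based, since numbers pass through text
```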

HTH,
Ivan

On 1/13/2011 15:08, Rainer M Krug wrote:
#
Add serialize()/unserialize() from base.
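
A small sketch of what that could look like, first in memory and then through a binary file connection; the data and the file name are illustrative assumptions:

```r
# Illustrative data (assumption).
x <- data.frame(a = 1:10, b = runif(10))

# In memory: connection = NULL returns a raw vector.
raw_bytes <- serialize(x, connection = NULL)
y <- unserialize(raw_bytes)
identical(x, y)

# Through a file: open the connection in binary mode for writing, then reading.
ser_file <- file.path(tempdir(), "x.ser")
con <- file(ser_file, "wb")
serialize(x, con)
close(con)

con <- file(ser_file, "rb")
z <- unserialize(con)
close(con)
```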

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
On Thu, 13 Jan 2011, Duncan Murdoch wrote:

            
You need to read it with read.csv("x.csv", row.names=1)

Nothing in the csv format lets R know that the first column is the row 
names (in the format used by read.table, having a header that is one 
column short does).  Now R could guess that a .csv file with an empty 
string for the first column name is meant to be the row names, but 
that would be merely a guess based on one (barely documented for 
spreadsheets) convention.
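
To make the difference concrete, here is a hedged sketch with made-up data: read.csv needs row.names=1 to be told about the first column, while the write.table/read.table pair works because the header is one field short.

```r
# Illustrative data and file paths (assumptions).
x <- data.frame(a = 1:10, b = runif(10))

csv_file <- file.path(tempdir(), "x_rownames.csv")
write.csv(x, csv_file)                     # header starts with an empty name
x2 <- read.csv(csv_file, row.names = 1)    # first column explicitly used as row names
names(x2)

txt_file <- file.path(tempdir(), "x_table.txt")
write.table(x, txt_file)                   # header has one field fewer than the data rows
x3 <- read.table(txt_file)                 # so the first column is taken as row names
names(x3)
```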
Or one of the other serialization options such as serialize() and 
.saveRDS().  R's own admin uses .saveRDS() for such purposes.
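
The .saveRDS() mentioned here was an internal function at the time of this thread. The sketch below assumes a later R release, where saveRDS() and readRDS() are documented functions in base; the data and file path are illustrative.

```r
# Illustrative data and file path (assumptions).
x <- data.frame(a = 1:10, b = runif(10))
rds_file <- file.path(tempdir(), "x.rds")

saveRDS(x, rds_file)            # stores a single object, without its name
restored <- readRDS(rds_file)   # the caller chooses the name on reading

identical(x, restored)
```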

  
    
#
On Thu, Jan 13, 2011 at 1:06 PM, Prof Brian Ripley
<ripley at stats.ox.ac.uk> wrote:
read.csv / read.table already use heuristics to determine the column
types so adding this to the heuristic seems not to be a departure from
the established philosophy.
#
Another option to consider is instead of save/load to use save/attach.  You save the data, but then instead of loading it back into the global environment you use the attach function to attach it in a new environment (position 2 on the search list by default).  It will be attached with the same name as it had when you saved it, but it will not overwrite something by the same name in the global environment.  You can still use it as it is, or assign it to a new name in the global environment.  When you are through using it you can then detach it.
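
A minimal sketch of the save/attach approach described above. The data, the file path and the search-list label "saved_x" are assumptions chosen for this example.

```r
# Illustrative data and file path (assumptions).
x <- data.frame(a = 1:10, b = runif(10))
original <- x
rdata_file <- file.path(tempdir(), "x_attach.RData")
save(x, file = rdata_file)

# An object of the same name in the workspace is left untouched by attach().
x <- "a different object with the same name"
attach(rdata_file, name = "saved_x")  # "saved_x" is an arbitrary label
class(x)                              # still "character"

# Pull the saved object out of the attached environment under a new name.
restored <- get("x", envir = as.environment("saved_x"))

detach("saved_x", character.only = TRUE)
```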
#
On 01/13/2011 07:06 PM, Prof Brian Ripley wrote:
Thanks - that makes sense.
OK - accepted - assuming things which are only barely documented is the
first step towards incompatibilities - and that is the last thing I would
like to have.

Just for clarification, it might be useful to state this in the help
page - or did I miss it there? - as this is an important point and
difference between write.table and write.csv.
They look exactly like what I was looking for, but it says in the help page:

####################################
Details:
     Since these are internal, the file format is subject to change
     without notice.  The current format is that of 'serialize',
     compressed as if by 'gzip' if 'compress = FALSE'.
####################################

This sounds frightening - unless the existing format is kept and can
still be read even if the default format changes.

Cheers,

Rainer