csv version of data in an R object
On Sat, Apr 21, 2012 at 3:28 PM, Max Kuhn <mxkuhn at gmail.com> wrote:
For a package, I need to write a csv version of a data set to an R object. Right now, I use:

    out <- capture.output(
             write.table(x,
                         sep = ",",
                         na = "?",
                         file = "",
                         quote = FALSE,
                         row.names = FALSE,
                         col.names = FALSE))

To me, this is fairly slow; 131 seconds for a data frame with 8100 rows and 1400 columns.

The data will be in a data frame; I know write.table() would be faster with a matrix. I was looking into converting the data frame to a character matrix using as.matrix() or, better yet, format() prior to the call above. However, I'm not sure what an appropriate value of 'digits' should be so that the character version of numeric data has acceptable fidelity.

I also tried using a text connection and sink() as shown in ?textConnection but there was no speedup.
You could try a loop over each row, and use 'paste' to join each element in a row by commas. Then use 'paste' again to join everything you've got (a vector of rows) by a '\n' character.

Something like:

    paste(apply(x,1,paste,collapse=","),collapse="\n") # untested

You probably also want to stick a final \n on it.

Is it faster? I don't know!

Barry
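A minimal sketch of the paste idea above, with two changes that are my assumptions rather than anything proposed in the thread: it pastes column by column instead of row by row (so apply() never has to coerce the data frame to a matrix), and it replaces missing values with "?" to match the na = "?" argument in the original write.table() call. The helper name csv_lines() is hypothetical. Numbers go through as.character(), which keeps up to 15 significant digits; whether that is acceptable fidelity depends on the data. I have not timed it against the capture.output() version.

```r
# Hypothetical helper (not part of base R or any package):
# build one comma-separated string per row of a data frame.
csv_lines <- function(x, na = "?") {
  cols <- lapply(x, function(col) {
    # as.character() gives factor labels for factors and up to
    # 15 significant digits for doubles
    out <- as.character(col)
    out[is.na(col)] <- na      # mimic write.table(na = "?")
    out
  })
  # paste() is vectorised, so this joins the columns element-wise,
  # producing one string per row
  do.call(paste, c(unname(cols), sep = ","))
}

# Small example data frame with missing values
x <- data.frame(a = c(1.5, NA, 3),
                b = c("u", "v", NA),
                stringsAsFactors = TRUE)

lines <- csv_lines(x)          # one element per row, like capture.output()

# Single string with a trailing newline, as suggested above
out <- paste0(paste(lines, collapse = "\n"), "\n")
```

The character vector in lines has the same shape as the result of capture.output(write.table(...)), so it can be dropped in wherever that object was used; out is the single-string form with the final \n attached.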