
embedding data frame in R code?

6 messages · ivo welch, R. Michael Weylandt, David Winsemius +3 more

#
I would like to insert a few modest size data frames directly into my
R code.  a short illustration of what I want is

d <- read.csv(  __END__, row.names=1  )
 , "col1", "col2"
"row1",1,2
"row2",3,4
__END__

right now, the data sits in external files.  I could put each column
into its own vector and then combine into a data frame, but this seems
ugly.  is there a better way to embed data frames?  I searched for the
answer via google, but could not find it.  it wasn't obvious in the
data import/export guide.

regards,

/iaw
----
Ivo Welch (ivo.welch at gmail.com)
#
I'm not sure I entirely understand the question, but the closest thing
I can think of to a data frame literal, excepting dput(), would be
this:

d <- read.csv(textConnection("
a, b
1, cow
2, dog
3, cat"), header = TRUE)

and you probably want closeAllConnections() immediately following to
avoid a warning.

Best,
Michael
#
On Aug 2, 2012, at 5:57 PM, ivo welch wrote:

> it wasn't obvious in the data import/export guide.

It's not really.
  d <- read.csv(  text=' "col1", "col2"
  "row1",1,2
  "row2",3,4'
  , row.names=1  )

  d
############
      col1 col2
row1    1    2
row2    3    4
#############
  dput(d)

##########
structure(list(col1 = c(1L, 3L), col2 = c(2L, 4L)), .Names = c("col1",
"col2"), class = "data.frame", row.names = c("row1", "row2"))
#
# So you can assign the 'structure' to a name and make a duplicate:

  d2 <- structure(list(col1 = c(1L, 3L), col2 = c(2L, 4L)),
                  .Names = c("col1", "col2"), class = "data.frame",
                  row.names = c("row1", "row2"))
  d2
#-----------
      col1 col2
row1    1    2
row2    3    4


Besides dput, there is also the dump function. The import-export
manual does mention it, but not in a context which would be very
helpful. The source() function will run text through the
parse-eval-print loop.

  source(textConnection(' d3 <- structure(list(col1 = c(1L, 3L), col2  
= c(2L, 4L)), .Names = c("col1", "col2"), class = "data.frame",  
row.names = c("row1", "row2"))' )  )

 > d3
      col1 col2
row1    1    2
row2    3    4
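
For completeness, here is a minimal sketch of the dump()/source() round trip mentioned above. It assumes a temporary file is an acceptable stand-in for wherever the dumped code would be kept; the names d, d_orig, and tmp are only illustrative.

```r
# Build the small example data frame used in this thread.
d <- data.frame(col1 = c(1L, 3L), col2 = c(2L, 4L),
                row.names = c("row1", "row2"))
d_orig <- d

# dump() writes R code that recreates the named object(s).
tmp <- tempfile(fileext = ".R")   # illustrative location only
dump("d", file = tmp)

# Remove the object, then recreate it by sourcing the dumped code.
rm(d)
source(tmp, local = TRUE)
unlink(tmp)

# Compare the restored object with the original.
print(all.equal(d, d_orig))
```

The text that dump() writes is ordinary R code, so it can also be pasted straight into a script instead of being sourced from a file.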
David Winsemius, MD
Alameda, CA, USA
#
No you do not want to close all connections.  You should close each
connection that you open, but not others (they may be used by other
functions like sink() or capture.output()).  Use something like:

   readTableFromText <-function (text, ...) {
       tc <- textConnection(text, open = "r")
       on.exit(close(tc))
       read.table(tc, ...)
   }

as in

   > readTableFromText(c("10 ant", "20 bear", "30 cougar"), row.names=NULL)
     V1     V2
   1 10    ant
   2 20   bear
   3 30 cougar
   > showConnections() # no open connections
        description class mode text isopen can read can write
   >

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
On 3 August 2012 at 03:15, William Dunlap wrote:
| > and you probably want closeAllConnections() immediately following
|
| [readTableFromText() example snipped]

Nice example, but you no longer need this. 

I forgot which version changed this (R 2.15.0 maybe?) but now the much
simpler direct use works without the need to close the connection:

  R> tab <- read.table(textConnection("a b\n1 2\n3 4"), header=TRUE)
  R> tab
    a b
  1 1 2
  2 3 4
  R> showConnections()
       description class mode text isopen can read can write
  R> 


Dirk
#
This is only a small variation on what others have already proposed,
but using text= has the advantage over an explicit textConnection
that the closing of the connection is handled for you.  Separating the
data from the read.csv code has the advantage that if you want to
re-read it, it's easy to repeat the single read.csv line.  This latter
point is particularly helpful if you are working interactively, but
even if not, it looks better IMHO than squishing the data into the
read.csv call.

Lines <- ' "col1","col2"
"row1",1,2
"row2",3,4 '

DF <- read.csv(text = Lines)

Actually if the data frame is small enough the compactness of the
following may outweigh any trepidation you have:

DF <- data.frame(col1 = c(1, 3), col2 = c(2, 4), row.names = c("row1", "row2"))
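
One difference worth knowing when choosing between the two forms: read.csv() guesses column types from the text, while data.frame() keeps whatever type the supplied vectors have. Here is a small sketch comparing them on the same toy data (from_text and literal are illustrative names, and the header is written without quotes or padding for simplicity):

```r
# Embedded CSV text: the header has one field fewer than the data rows,
# so read.csv() uses the first column as row names.
Lines <- 'col1,col2
row1,1,2
row2,3,4'

from_text <- read.csv(text = Lines)

# The same data written as a data.frame() literal.
literal <- data.frame(col1 = c(1, 3), col2 = c(2, 4),
                      row.names = c("row1", "row2"))

# Values, names, and row names agree up to numeric tolerance.
print(all.equal(from_text, literal))

# Storage types differ: compare the column classes of each.
print(sapply(from_text, class))
print(sapply(literal, class))
```

If integer columns matter, writing c(1L, 3L) in the literal keeps the two forms consistent.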