Removing "row.names"

2 messages · David James, Kurt Hornik

#
Data frames were originally meant to be used in modeling functions.
The opening paragraph in Chapter 3 (Data for Models) in the White Book
says:
 
  "This chapter describes the general structure for data that
  will be used throughout the book.  In particular, it introduces the
  data frame, a class of objects to represent the data typically encountered
  in fitting models."

However, data.frames may not be quite appropriate for representing
other types of tabular data (certainly a data.frame does not capture
the essence of, say, a "relational" table in the SQL sense, which doesn't
have the concept of row names).  Several manifestations of this problem are
the coercion of character data to factors "at the drop of a hat" (as someone
wrote here or in s-news), the row.names issue now being discussed, problems
with including general objects in the "cells" of the data.frame, etc.
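
For concreteness, a minimal sketch of these three symptoms, under a few
assumptions: the column names and values are made up for illustration, and
stringsAsFactors = TRUE is passed explicitly because recent versions of R
no longer coerce character columns by default.

```r
# Illustration only: made-up columns, not data from this thread.
# stringsAsFactors = TRUE reproduces the coercion discussed above;
# it is set explicitly since newer R versions default to FALSE.
df <- data.frame(id = c("a", "b"), x = c(1.5, 2.5), stringsAsFactors = TRUE)

class(df$id)    # "factor": the character column was coerced
rownames(df)    # "1" "2": row names exist even though none were supplied

# General objects in "cells" need a list column wrapped in I()
df$obj <- I(list(1:3, "text"))
```

None of this is surprising for model fitting, but it is awkward when the
data.frame stands in for a plain table of records.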

I think that the concept of a data.frame to represent data for fitting
models is fine, but we may (certainly I) have abused this concept.  We need 
other classes of tabular data objects in addition (not as a replacement) to 
data.frames, together with coercion methods and perhaps other utilities.


David A. James
Statistics Research, Room 2C-253            Phone:  (908) 582-3082       
Bell Labs, Lucent Technologies              Fax:    (908) 582-3340
Murray Hill, NJ 09794-0636

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
Thomas had said that yes, it would be nice to have something with fewer
restrictions for modeling, but that it was uneconomical at least to
introduce a new class that data.frame would then inherit from.

I interpret your comment as suggesting that we introduce a new class for
holding tabular data?  Do you have specific ideas on this?

-k