
merge.data.frame can coerce character vectors to factor in some circumstances (PR#1608)

I do use data frames for storing character data, fully aware that I'm
stretching their intended use.  Data frames came about in the context
of modeling software (see "Statistical Models in S," the "White book"
by Chambers and Hastie, eds).  Originally, the primary use of data
frames was for holding the data given to the model fitting functions,
and thus the classes of objects that LM, GLM, GAM, Tree, etc.,
required were simple ones (numeric vectors and factors -- note that
character vectors are not well suited for fitting models).  Very soon after,
people began to include other types of objects (Terry Therneau's
censored/survival classes, among others, come to mind).  So the
behaviour of the data.frame class has evolved into what we are
currently using, and some of its apparent "idiosyncrasies" make
perfect sense in light of its original intended purpose.

It has been argued before that we may need other more general
container classes to hold other "tabular" data (e.g., contingency
tables, data from relational databases) that don't require the
restrictions that data frames have traditionally imposed.  Of course,
it is not obvious to me that introducing yet another set of classes
is necessarily a good thing --- a lot of care and thought would have
to be put into the effort to ensure that any new container classes (or
any other type, for that matter) are well designed and have a clear
purpose, just as data frames were well designed for the purpose
of holding data for fitting models.
David Kane wrote:

  
    
Message-ID: <20020529093243.B24223@jessie.research.bell-labs.com>
In-Reply-To: <15604.52739.590128.620397@gargle.gargle.HOWL>; from a296180@agate.fmr.com on Wed, May 29, 2002 at 08:48:03AM -0400