Changing the generic of as.data.frame
On Mon, 22 May 2006, Prof Brian Ripley wrote:
On Mon, 22 May 2006, Bill Dunlap wrote:
On Mon, 22 May 2006, Prof Brian Ripley wrote:
The other motivation was to allow the option to not convert character vectors to factors, which needed an additional argument to as.data.frame.character. So data.frame now has an argument 'charToFactor' controlled by a global option (which also controls the default of as.is in read.table). More experience will be needed as to whether it is safe to work with the global option set to FALSE, so that aspect should be regarded as experimental until 2.4.0 is released or it is withdrawn.
Splus's data.frame() and as.data.frame() have had the 'stringsAsFactors'
argument since version 6.0 (2001). Its default value comes from
options("stringsAsFactors"). read.table() and a few other
data.frame-oriented functions have the same argument.
It looks like stringsAsFactors has the same functionality as your
new charToFactor. Would it be feasible to change its name to stringsAsFactors?
It would, but then I think we would want to ensure it did precisely the
same thing. Is there a description of what exactly
options("stringsAsFactors") affects? (?options suggests it is data.frame,
read.table and importData, and nothing else).
I just noticed that data.frame accepts a vector of logicals for
stringsAsFactors: one element per ... argument. This is not in
the help file.
Splus> data.frame
function(..., row.names = NULL, check.rows = F, check.names = T,
         na.strings = "NA", dup.row.names = F,
         stringsAsFactors = default.stringsAsFactors())
{
        dots <- match.call(expand.dots = F)$...
        n <- length(dots) - 1
        ...
        stringsAsFactors <- rep(stringsAsFactors, len = n)
        for(i in seq(length = n)) {
                xi <- data.frameAux(eval(as.name(paste("..", i, sep = ""))),
                        na.strings = na.strings,
                        stringsAsFactors = stringsAsFactors[i])
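
The per-argument recycling in that excerpt can be sketched in R roughly as follows. This is only an illustration: make_frame() is a hypothetical helper, not part of R or S-PLUS, and it assumes every argument is a named vector of the same length.

```r
# Hypothetical helper (not an R or S-PLUS function): build a data frame
# from named vectors, with one stringsAsFactors flag per argument.
make_frame <- function(..., stringsAsFactors = TRUE) {
  cols <- list(...)
  n <- length(cols)
  # Recycle the flag so that there is one element per argument.
  flags <- rep(stringsAsFactors, length.out = n)
  for (i in seq_len(n)) {
    if (is.character(cols[[i]]) && flags[i]) {
      cols[[i]] <- factor(cols[[i]])
    }
  }
  # Conversion has already been done above, so switch it off here.
  data.frame(cols, stringsAsFactors = FALSE)
}

# 'id' stays character, 'grp' becomes a factor.
df <- make_frame(id = c("a", "b"), grp = c("x", "y"),
                 stringsAsFactors = c(FALSE, TRUE))
```

A scalar flag is recycled to every argument, so the usual single TRUE or FALSE behaves as before.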
Splus's importData(), which outputs a data.frame from various other
file formats or database connections, also uses stringsAsFactors.
It also affects as.data.frame() and the character method for data.frameAux()
(which is not expected to be called directly -- it is a support function for
data.frame() and as.data.frame()).
In the bigdata library bdFrame() uses it in the same way that data.frame()
does.
There may be a few stray functions that pass their ... arguments
to data.frame, but I cannot think of any now.
Its default value should always be default.stringsAsFactors(),
but I see that bdFrame() uses just FALSE. default.stringsAsFactors()
looks at options("stringsAsFactors") and maps NULL and TRUE to TRUE.
Splus> default.stringsAsFactors
function()
{
        val <- .Options$stringsAsFactors
        if(is.null(val))
                val <- T
        if(!is.logical(val) || is.na(val) || length(val) != 1)
                stop("options('stringsAsFactors') not set to T or F")
        val
}
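
For comparison, a rough R equivalent of that lookup might look like the sketch below. The function name and the option name "sketch.stringsAsFactors" are made up for illustration, so that the example does not depend on how any particular R version treats a real option. The length check is placed before is.na() so that a vector-valued option reaches the intended error message.

```r
# Sketch only: the function name and the option "sketch.stringsAsFactors"
# are hypothetical, chosen so the example does not touch real R options.
default_strings_as_factors <- function() {
  val <- getOption("sketch.stringsAsFactors")
  # An unset option is treated as TRUE.
  if (is.null(val)) {
    val <- TRUE
  }
  # Check the length first so is.na() only ever sees a single value.
  if (!is.logical(val) || length(val) != 1 || is.na(val)) {
    stop("options('sketch.stringsAsFactors') not set to TRUE or FALSE")
  }
  val
}

options(sketch.stringsAsFactors = FALSE)
default_strings_as_factors()   # FALSE
```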
I believe that Terry Therneau has been using Splus with
options(stringsAsFactors=FALSE) for quite a while and hasn't reported
any problems.
Splus> help(data.frame)
Construct a Data Frame Object
USAGE:
data.frame(..., row.names, check.rows=F, check.names=T,
na.strings="NA", dup.row.names=F, stringsAsFactors=<<see below>>)
data.frameAux(x, ...)
as.data.frame(x, row.names=NULL, stringsAsFactors=<<see below>>, ...)
is.data.frame(x)
...
stringsAsFactors
a logical flag; if TRUE then convert character arguments
to factors whose levels are the unique strings in the
argument. This may save time and space if there are many
repeated values in the strings and may make the
statistical modelling functions easier to use. The
default is TRUE, unless one sets options(stringsAsFactors=FALSE).
...
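
As a small illustration of the flag described in that help text, here is how the same argument behaves in data.frame() in current R. The data are made up for the example.

```r
# Made-up data: a character vector with repeated values.
x <- c("a", "b", "a")

# With the flag on, the column becomes a factor whose levels are the
# unique strings in the argument.
as_factor <- data.frame(name = x, stringsAsFactors = TRUE)
levels(as_factor$name)      # "a" "b"

# With the flag off, the column stays character.
as_char <- data.frame(name = x, stringsAsFactors = FALSE)
is.character(as_char$name)  # TRUE
```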
----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146
"All statements in this message represent the opinions of the author and do
not necessarily reflect Insightful Corporation policy or position."
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595