read.spss defaults

2 messages · Robert Baer, Thomas Lumley

Wed, Feb 25, 2004 9:00 AM

The read.spss parameter defaults are:
   use.value.labels=TRUE,
   to.data.frame=FALSE

Is there some reasoning other than historical for this choice?  In most
instances, it seems that the opposite defaults
(use.value.labels=FALSE, to.data.frame=TRUE) would better preserve any
existing structure of the underlying SPSS dataset as it is imported into R.
I feel especially strongly that to.data.frame=TRUE is the desirable
default, given the central role of data frames in R.

Of course, I guess a user could always write a wrapper function, but the
instances where you wouldn't find the wrapper function useful seem minimal.
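
For illustration, a minimal sketch of such a wrapper, assuming the foreign
package (which provides read.spss) is installed. The name read_spss_df is
hypothetical and not part of any package; it simply flips the two defaults
and passes everything else through.

```r
library(foreign)  # provides read.spss

# Hypothetical wrapper (the name is made up for this example):
# same as read.spss, but with the two defaults discussed above reversed.
read_spss_df <- function(file,
                         use.value.labels = FALSE,
                         to.data.frame = TRUE,
                         ...) {
  read.spss(file,
            use.value.labels = use.value.labels,
            to.data.frame = to.data.frame,
            ...)
}
```

Either argument can still be overridden per call, e.g.
read_spss_df("mydata.sav", use.value.labels = TRUE), where "mydata.sav" is a
placeholder file name.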

Any insights?

Rob Baer

Thomas Lumley

Wed, Feb 25, 2004 10:09 AM

On Wed, 25 Feb 2004, Robert W. Baer, Ph.D. wrote:

I think the reason for to.data.frame=FALSE is that for a large dataset the
conversion to data frame takes a lot longer than the reading.

In particular, if you want to use just a subset of variables it will be
quicker to subset before you construct the data frame.
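
A rough sketch of that approach, assuming the foreign package is installed.
The helper name read_spss_subset and its vars argument are hypothetical,
chosen only for this example: read the file as a list, keep the wanted
variables, and only then build the data frame.

```r
library(foreign)  # provides read.spss

# Hypothetical helper (not part of foreign): read as a list first,
# subset the list, and only then convert to a data frame.
read_spss_subset <- function(file, vars, ...) {
  raw <- read.spss(file, to.data.frame = FALSE, ...)  # a list of variables

  # Fail early if a requested variable is not in the file.
  missing_vars <- setdiff(vars, names(raw))
  if (length(missing_vars) > 0) {
    stop("variables not found: ", paste(missing_vars, collapse = ", "))
  }

  # Subsetting the list drops list-level attributes such as variable.labels.
  as.data.frame(raw[vars])
}
```

For example, read_spss_subset("mydata.sav", c("AGE", "SEX")) would build a
two-column data frame; the file and variable names here are placeholders.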


	-thomas