Message-ID: <Pine.A41.4.58.0402251008190.32606@homer01.u.washington.edu>
Date: 2004-02-25T18:09:37Z
From: Thomas Lumley
Subject: read.spss defaults
In-Reply-To: <003f01c3fbc0$daa75d80$2e80010a@BigBaer>
On Wed, 25 Feb 2004, Robert W. Baer, Ph.D. wrote:
> The read.spss parameter defaults are:
> use.value.labels=TRUE,
> to.data.frame=FALSE,
>
> Is there some reasoning other than historical for this choice? In most
> instances, it seems that the opposite default choice
> (use.value.labels=FALSE, to.data.frame=TRUE) would better preserve any
> existing structure of the underlying SPSS dataset as it is imported into R.
> I feel especially strongly about the to.data.frame=TRUE being the desirable
> default given the central role of data frames in R.
>
I think the reason for to.data.frame=FALSE is that for a large dataset the
conversion to a data frame takes a lot longer than reading the file.
In particular, if you want to use just a subset of the variables, it will be
quicker to subset the list that read.spss returns before you construct the
data frame.
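
A minimal sketch of that subset-first approach, assuming the foreign
package is installed. The helper name subset_then_convert, the file name
"survey.sav" and the variable names AGE and INCOME are all hypothetical
placeholders, not anything from a real dataset.

```r
# Hypothetical helper: keep only the named variables from the list that
# read.spss() returns when to.data.frame = FALSE, then build the data frame
# from that smaller list.
subset_then_convert <- function(var_list, keep) {
  missing_vars <- setdiff(keep, names(var_list))
  if (length(missing_vars) > 0) {
    stop("variables not found: ", paste(missing_vars, collapse = ", "))
  }
  as.data.frame(var_list[keep])
}

# Placeholder file and variable names; the block is skipped if the file
# or the foreign package is not available.
spss_file <- "survey.sav"
if (file.exists(spss_file) && requireNamespace("foreign", quietly = TRUE)) {
  raw <- foreign::read.spss(spss_file, to.data.frame = FALSE)
  small <- subset_then_convert(raw, c("AGE", "INCOME"))
  str(small)
}
```

Note that subsetting the list with [ keeps the component names but not
list-level attributes such as variable.labels, so copy those over first if
you need them.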
-thomas