read.spss: option "to.data.frame" and string variables
Dear R-users,
I am using R version 2.10.1 and package foreign version 0.8-39 under Windows.
When reading .sav files (PASW Statistics 18.0.1) that contain string variables, these are automatically converted to factors when the option "to.data.frame = TRUE" is used (see example below).
It's clear to me why this happens: it is the default behaviour of the call to as.data.frame. But it is not always what one wants, and a user may not even be aware of it.
So maybe one of the following improvements could be made?
* Add a description of this behaviour in ?read.spss.
* Or (even better): add an extra argument, for example: read.spss("C:\\temp\\test.sav", to.data.frame = TRUE, stringsAsFactors = FALSE).
Just a suggestion.
Kind regards,
Heinrich.
# EXAMPLE:
Suppose there is a simple file "test.sav", containing one variable ("x") of type STRING with 3 values (a,b,c).
library(foreign)
test <- read.spss("C:\\temp\\test.sav")
test
$x
[1] "a " "b " "c "

attr(,"label.table")
attr(,"label.table")$x
NULL

attr(,"codepage")
[1] 1252
is.factor(test$x)
[1] FALSE
is.character(test$x)
[1] TRUE
# Ok, that's just fine. But things change when using option "to.data.frame = TRUE":
test <- read.spss("C:\\temp\\test.sav", to.data.frame = TRUE)
test
  x
1 a
2 b
3 c
is.factor(test$x)
[1] TRUE
is.character(test$x)
[1] FALSE
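# POSSIBLE WORKAROUND:
Until such an argument exists, the conversion can probably be avoided by hand. The following is only a sketch. To keep it runnable without a .sav file, it uses a hypothetical stand-in list (test_list) that mimics what read.spss returns for the example above without "to.data.frame = TRUE". With a real file, test_list would be the result of read.spss("C:\\temp\\test.sav").

```r
# Hypothetical stand-in for the list returned by read.spss("C:\\temp\\test.sav");
# it mirrors the example above so that no .sav file is needed.
test_list <- list(x = c("a ", "b ", "c "))

# Variant 1: read without to.data.frame = TRUE, then build the data frame
# yourself and tell as.data.frame to keep strings as character vectors.
test_df <- as.data.frame(test_list, stringsAsFactors = FALSE)
is.character(test_df$x)

# Variant 2: if the data frame already contains factors (as after
# to.data.frame = TRUE), convert the factor columns back to character.
test_fac <- as.data.frame(test_list, stringsAsFactors = TRUE)
is_fac <- vapply(test_fac, is.factor, logical(1))
test_fac[is_fac] <- lapply(test_fac[is_fac], as.character)
is.character(test_fac$x)
```

Note that variant 2 converts every factor column, so numeric variables that became factors through their SPSS value labels would be turned into character as well. Variant 1 only affects the string variables.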