read.spss in R 2.1.0 & make basic dataframe

Actually, most of this is me rather than Saikat.

I use use.value.labels=TRUE most of the time.  The main point of 
to.data.frame=FALSE is that it is quite a lot faster for large files, 
especially if you are going to use only a few of the variables. I think 
Brian Ripley spoke up in favour of it for this reason last time the issue 
was raised.
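
The "only a few of the variables" case might look roughly like the sketch below. With to.data.frame=FALSE, read.spss() returns a list, so one can pick out components before building a data frame. To keep the example runnable without an SPSS file, spss_list is a made-up stand-in for that list, and the variable names and values are illustrative only.

```r
# Stand-in for the list returned by read.spss(file, to.data.frame = FALSE);
# the variable names and values here are made up for illustration.
spss_list <- list(
  EXPOS01 = c(1, 1, 2, 1, 2),
  EXPOS02 = c(1, 1, 1, 1, 1),
  EXPOS03 = c(3, 2, 2, 4, 1)
)

# Convert only the variables that are actually needed.
wanted <- c("EXPOS01", "EXPOS03")
small_df <- as.data.frame(spss_list[wanted])
```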

The reason I made trim.factor.names=FALSE the default was backwards 
compatibility, but it probably makes sense to switch it at some point.

Incidentally, PSPP (the original source of the code) now has a version 
that reads long variable names from post-version 12 SPSS files. This 
confirms that the "unrecognised record type 7, subtype 13" message really 
is due to long variable names and so is harmless.  It also means that 
anyone who wants long variable names badly enough could work out a patch.

 	-thomas
The main problem you are experiencing is that edit() (more precisely the 
method edit.data.frame()) is a bit restricted - I think contributions 
are welcome.
Note that the coding must be done very carefully here (and is not trivial 
at all) in order to deal with different kinds of attributes, in particular 
names and factor stuff.

Uwe Ligges

Recent changes to read.spss() in the foreign package return a dataframe
containing additional attributes.  For example,

library(foreign)
TEMP <- read.spss(choose.files(), to.data.frame = TRUE, use.value.labels = FALSE)
str(TEMP)

`data.frame':   780 obs. of  8 variables:
 $ EXPOS01: atomic  1 1 2 1 2 3 2 4 2 1 ...
  ..- attr(*, "value.labels")= Named num  5 4 3 2 1
  .. ..- attr(*, "names")= chr  "Yes, experienced it with Extreme Impact" "Yes, experienced it with Moderate Impact" "Yes, experienced it with A Little Impact" "Yes, experienced it with No Impact" ...
 $ EXPOS02: atomic  1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "value.labels")= Named num  5 4 3 2 1
  .. ..- attr(*, "names")= chr  "Yes, experienced it with Extreme Impact" "Yes, experienced it with Moderate Impact" "Yes, experienced it with A Little Impact" "Yes, experienced it with No Impact" ...

Unfortunately, these changes may be ahead of their time (certainly ahead
of several functions).  For instance, edit() balks at the changes:

edit(TEMP)

Error in edit.data.frame(TEMP) : can only handle vector and factor elements

It used to be that the command "as.data.frame" or "data.frame" would
return a fairly basic data.frame and "fix" the problem.  However, this
does not work (obviously because TEMP is already a data.frame).  For
example,

TEMP <- as.data.frame(TEMP)
edit(TEMP)

Error in edit.data.frame(TEMP) : can only handle vector and factor elements

It is possible to use "as.matrix", and then "data.frame" the result of
"as.matrix", but this gets a bit cumbersome.

The question is:  Is there a simple command to strip additional
attribute characteristics from a data.frame and get a simple, easy to
use, uncomplicated data.frame?
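
One possible approach, offered as a minimal sketch rather than a built-in command: loop over the columns and drop the attributes from everything that is not a factor. The helper name strip_column_attributes is hypothetical (it is not part of foreign or base R), and the small TEMP below is a made-up stand-in for a read.spss() result. Note the assumption that every non-factor column is a plain vector; columns that rely on a class attribute (dates, for example) would lose it.

```r
# Hypothetical helper (not part of the foreign package): drop every
# attribute from the non-factor columns of a data frame.
strip_column_attributes <- function(df) {
  df[] <- lapply(df, function(column) {
    if (is.factor(column)) {
      return(column)            # factors need their levels and class
    }
    attributes(column) <- NULL  # removes value.labels, names, etc.
    column
  })
  df
}

# Small stand-in for the read.spss() result described above.
TEMP <- data.frame(EXPOS01 = c(1, 1, 2, 1, 2), EXPOS02 = c(1, 1, 1, 1, 1))
attr(TEMP$EXPOS01, "value.labels") <- c("Extreme Impact" = 5, "No Impact" = 2)

CLEAN <- strip_column_attributes(TEMP)
```

Assigning through df[] keeps the object a data.frame with its row and column names, so only the per-column attributes are touched.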

On a related note, do other users routinely use read.spss with the
defaults of "to.data.frame=F" and "use.value.labels=T"?  In my experience
I always end up using the non-default values, so it would be helpful to
change the defaults to "to.data.frame=T" and "use.value.labels=F".  It
would also probably make sense to change the default to
"trim.factor.names=T".  I am interested in others' perspectives.

Appreciate all the great work Saikat DebRoy has done...just trying to
improve an already useful function.

Paul

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html