(PR#9896) read.spss converts string variables with

The problem here is that the values in the data do not have trailing
blanks, and the corresponding values in the label table do.  That's an
issue about the specific SPSS file, not mentioned in this report.

Take a look at the data read with use.value.labels=FALSE:
[1] "CZE" "CZE" "CZE" "CZE" "CZE" "CZE" "CZE" "CZE" "CZE" "CZE"
attr(,"value.labels")
      United States            Uruguay             Turkey
         "USA     "         "URY     "         "TUR     "
            Tunisia           Thailand     Chinese Taipei
         "TUN     "         "THA     "         "TAP     "
             Sweden          Slovenia     Slovak Republic
         "SWE     "         "SVN     "         "SVK     "
...

There is another example of this in the test suite:
...
       WT58           DAYOFWK     VITAL10    FAMHXCVR        CHD
  Min.   :123.0   MISSING :130   ALIVE:179   NO  :  0   Min.   :0.0
  1st Qu.:156.0   SUNDAY  : 19   DEAD : 61   YES :  0   1st Qu.:0.0
  Median :171.0   TUESDAY : 19               NA's:240   Median :0.5
  Mean   :173.4   WEDNSDAY: 17                          Mean   :0.5
  3rd Qu.:187.0   SATURDAY: 16                          3rd Qu.:1.0
  Max.   :278.0   THURSDAY: 15                          Max.   :1.0
                  (Other) : 24

where the label.table attribute has

Browse[1]> vl[[12]]
        YES         NO
"Y       " "N       "

but the values are "Y" or "N".  And it has been that way since at least R 
1.6.2.
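
The mismatch can be reproduced without the SPSS file. The following is a minimal sketch using made-up stand-in vectors (the names `values` and `labels` are assumptions, not objects from read.spss): exact matching of unpadded values against blank-padded label values finds no match, which gives the all-NA column seen for FAMHXCVR above.

```r
# Hypothetical stand-in data; nothing here is read from the actual file.
values <- c("Y", "N", "Y")                   # data values, no trailing blanks
labels <- c(YES = "Y       ", NO = "N       ")  # label table, blank-padded

# factor() matches values against levels exactly, so "Y" != "Y       ".
f <- factor(values, levels = labels, labels = names(labels))

print(f)                       # every element is NA; levels are YES and NO
print(table(f, useNA = "ifany"))
```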

I think this must be a case that the original author of read.spss did not 
anticipate. It needs to be covered by a new argument to read.spss, since 
presumably trimming when matching might not always be wanted.
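
As a rough illustration of what optional trimming could look like, here is a sketch of a standalone helper. The function `apply_value_labels` and its `trim` argument are hypothetical names chosen for this example; they are not part of read.spss, and this is not a proposal for the actual implementation.

```r
# Hypothetical helper, not part of the foreign package.
# values: character vector of data values
# labels: named character vector, names = label text, elements = coded values
# trim:   if TRUE, strip trailing blanks from both sides before matching
apply_value_labels <- function(values, labels, trim = TRUE) {
  strip <- function(x) sub(" +$", "", x)   # remove trailing blanks only
  lev <- unname(labels)
  if (trim) {
    values <- strip(values)
    lev <- strip(lev)
  }
  factor(values, levels = lev, labels = names(labels))
}

labels <- c(YES = "Y       ", NO = "N       ")
trimmed   <- apply_value_labels(c("Y", "N", "Y"), labels)
untrimmed <- apply_value_labels(c("Y", "N", "Y"), labels, trim = FALSE)

print(trimmed)     # YES NO YES
print(untrimmed)   # all NA, as with exact matching
```

Leaving `trim` switchable keeps exact matching available for files where trailing blanks are significant. Note that if two label values differ only in trailing blanks, trimming makes them identical, which this sketch does not handle.
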
On Wed, 5 Sep 2007, Prof Brian Ripley wrote: