
Subsetting vectors/arrays using factors can be seen as misleading

Dear list,

Subsetting vectors or arrays with a factor can be misleading, and I
think it should be discouraged, at least by issuing a warning.
I could not find an earlier discussion of this; please point me to a
reference if I missed one.

The "extract" operator "[" accepts, among other things, either an
integer vector or a character vector as the index used to subset a
data structure.
For example:
a
1
a
1
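
The commands that produced this output are not preserved here. A
minimal sketch that prints the same thing, assuming a hypothetical
named vector "x" (its name, element names, and values are assumptions
for illustration only):

```r
# Hypothetical named vector; names and values are assumed, not from the post
x <- c(a = 1, b = 2, c = 3)

by_position <- x[1]    # integer index: picks the first element
by_name     <- x["a"]  # character index: picks the element named "a"

print(by_position)
print(by_name)

# Both forms select the same element here
same <- identical(by_position, by_name)
```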

Using a factor caused some confusion to someone here, and I have to
admit that it can indeed appear misleading:
[1] a
Levels: b a c
b
2
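
A sketch of how this output can arise, again using the hypothetical
vector "x" and an assumed factor "f" whose levels are ordered as in the
printout above. "[" uses the integer codes of the factor, not its
labels:

```r
# Hypothetical objects; only the level order "b a c" comes from the post
x <- c(a = 1, b = 2, c = 3)
f <- factor("a", levels = c("b", "a", "c"))

print(f)                 # the label shown is "a"

code <- as.integer(f)    # "a" is the second level, so the code is 2
picked <- x[f]           # "[" uses the code: this is x[2], the element named "b"

print(picked)
```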

The dual nature of a factor (a vector of integer codes with an
attached vector of levels) is not clear to many users, especially
since in other situations factors are treated by their labels rather
than by their codes.
Example:
[1] FALSE
[1] TRUE
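
The original commands are not preserved; one pair of comparisons that
would print these two results, assuming the same hypothetical factor
"f" as before, is sketched below. Comparison with "==" works on the
labels of the factor, whereas "[" works on its codes:

```r
# Hypothetical factor; only the level order "b a c" comes from the post
f <- factor("a", levels = c("b", "a", "c"))

cmp_code  <- f == 2     # compared against the label "a", not the code 2
cmp_label <- f == "a"   # the label matches

print(cmp_code)
print(cmp_label)
```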

This leads me to suggest that indexing with a factor should issue a
warning, so that the user explicitly wraps the factor in either
"as.integer" or "as.character".

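To illustrate the suggestion, a sketch of the two explicit forms,
followed by a user-level helper that warns when given a factor. The
helper "strict_index" is hypothetical and not part of R; it only
approximates what a warning inside "[" itself might look like:

```r
# Hypothetical objects, as in the earlier examples
x <- c(a = 1, b = 2, c = 3)
f <- factor("a", levels = c("b", "a", "c"))

by_code  <- x[as.integer(f)]    # explicit: use the code, selects element "b"
by_label <- x[as.character(f)]  # explicit: use the label, selects element "a"

# Hypothetical helper (not part of R): warn before indexing with a factor
strict_index <- function(x, i) {
  if (is.factor(i)) {
    warning("indexing with a factor uses its integer codes; ",
            "wrap it in as.integer() or as.character()")
  }
  x[i]
}

# Capture the warning instead of letting it print
result <- tryCatch(strict_index(x, f), warning = function(w) "warned")
```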

L.

PS: All examples above were run with
platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status         Under development (unstable)
major          2
minor          7.0
year           2008
month          03
day            12
svn rev        44742
language       R
version.string R version 2.7.0 Under development (unstable) (2008-03-12 r44742)