factor level issue after subsetting

5 messages · Schreiber, Stefan, Nordlund, Dan (DSHS/RDA), Justin Haynes +1 more

#
That is the nature of factors.  Once created, unused levels must be explicitly dropped:

plot(droplevels(dat.sub$treat),dat.sub$yield)
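
As a small self-contained illustration, droplevels() can also be applied to the whole subset rather than to a single column. This is only a sketch: the data frame below is made up, and only the column names treat and yield and the level 'cont' are taken from the thread.

```r
# Hypothetical data standing in for the poster's 'dat'
dat <- data.frame(
  treat = factor(c("cont", "a", "b", "cont", "a", "b")),
  yield = c(1.2, 2.3, 3.1, 1.0, 2.8, 3.4)
)

# Remove the control rows; the level 'cont' stays attached to the factor
dat.sub <- dat[dat$treat != "cont", ]
levels(dat.sub$treat)

# droplevels() on a data frame drops unused levels from every factor column
dat.sub <- droplevels(dat.sub)
levels(dat.sub$treat)
```

After this, plot(dat.sub$treat, dat.sub$yield) should no longer show an empty 'cont' group.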


Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204



A work around is exporting and importing the new subset. Then it's
#
First of all, the subsetting line is overly complicated.

dat.sub<-dat[dat$treat!='cont',]

will work just fine.  R does exactly what you're describing.  It knows
the levels of the factor.  Once you remove 'cont' from the data, that
doesn't mean that the level is removed from the factor:
'data.frame':	100 obs. of  2 variables:
 $ let: Factor w/ 5 levels "a","b","c","d",..: 1 5 1 4 3 5 2 2 1 3 ...
 $ num: num  0.224 -0.523 0.974 -0.268 -0.61 ...
'data.frame':	82 obs. of  2 variables:
 $ let: Factor w/ 5 levels "a","b","c","d",..: 5 4 3 5 2 2 3 3 5 3 ...
 $ num: num  -0.523 -0.268 -0.61 -1.383 -0.193 ...
[1] e d c b
Levels: a b c d e
[1] e d c b
Levels: e d c b
Factor w/ 4 levels "e","d","c","b": 1 2 3 1 4 4 3 3 1 3 ...
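
The commands that produced the output above did not survive in the archive. The following is a sketch that should give output of the same shape, assuming a made-up data frame with columns let and num. The object names d and d.sub and the seed are hypothetical, and the numbers will differ from those shown.

```r
set.seed(1)  # arbitrary seed, only for reproducibility of this sketch
d <- data.frame(
  let = factor(sample(letters[1:5], 100, replace = TRUE)),
  num = rnorm(100)
)
str(d)

# Remove all rows with level "a"
d.sub <- d[d$let != "a", ]
str(d.sub)           # the factor still reports 5 levels
unique(d.sub$let)    # "a" no longer occurs, but is still listed as a level

# Redefine the factor so that only values actually present become levels
d.sub$let <- factor(d.sub$let, levels = as.character(unique(d.sub$let)))
unique(d.sub$let)
str(d.sub$let)
```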
By redefining your factor you can eliminate the problem.  The other
option, if you don't want factors to begin with, is:

options(stringsAsFactors=FALSE)  # to set the global option

or

# to set the option locally for this single read.csv call
dat<-read.csv("~/MyFiles/data.csv",stringsAsFactors=FALSE)
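
To see the effect of the argument without a file on disk, here is a sketch using the text argument of read.csv with made-up inline data. The column names mirror the thread and the values are invented. Note that since R 4.0.0, stringsAsFactors = FALSE is already the default, so the explicit argument mainly matters on older versions.

```r
# Hypothetical inline data instead of a file on disk
csv <- "treat,yield\ncont,1.2\na,2.3\nb,3.1\n"

as.fac <- read.csv(text = csv, stringsAsFactors = TRUE)
as.chr <- read.csv(text = csv, stringsAsFactors = FALSE)

class(as.fac$treat)   # read in as a factor
class(as.chr$treat)   # read in as plain character strings

# With a character column, subsetting leaves no stale levels behind
kept <- as.chr[as.chr$treat != "cont", ]
unique(kept$treat)
```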


On Tue, Nov 1, 2011 at 2:28 PM, Schreiber, Stefan
<Stefan.Schreiber at ales.ualberta.ca> wrote: