Including only a subset of the levels of a factor XXXX

4 messages · Dan Abner, R. Michael Weylandt, David Winsemius +1 more

#
On Sep 1, 2011, at 2:59 PM, Dan Abner wrote:

            
Actually, it could be that it did succeed but your result simply
carries levels that are unpopulated. Try:

table(income)

If that looks correct, then do this:

income <- factor(income)  # will drop unused levels

If, on the other hand, you got the wrong values, then there was an
undesired coercion of either 'factor' class to 'numeric' or of
'numeric' to 'character'. I am fairly sure this will remove any
ambiguity:

income<-pp_income[as.character(pp_income)
                         %in% as.character(1:9)]
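
As a rough illustration of the steps above, here is a minimal sketch. The contents of pp_income are made up for the example (the original data were not posted); only the subsetting line is taken from this message.

```r
# Hypothetical data: income codes stored as a factor with levels "1".."12"
pp_income <- factor(c(1, 3, 9, 10, 12, 3, 1), levels = 1:12)

# Compare labels as character, which sidesteps factor/numeric coercion
income <- pp_income[as.character(pp_income)
                         %in% as.character(1:9)]

levels(income)   # still lists all 12 levels, including the unused ones
table(income)    # the unused levels show up with a count of 0

# Rebuild the factor to drop the unpopulated levels
income <- factor(income)
levels(income)   # only the levels that are actually present
```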

(You would still get the puzzling extra levels if you looked with  
levels(income).)
David Winsemius, MD
West Hartford, CT
#
On Sep 1, 2011, at 21:11, R. Michael Weylandt wrote:

            
...most expediently by using factor(), as others have pointed out. Or droplevels() for data frames.
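
For the data frame case, a small sketch may help. The data frame df and its columns are invented for illustration; droplevels() is the base R function mentioned above.

```r
# Hypothetical data frame; the column names are illustrative only
df <- data.frame(
  income = factor(c(1, 3, 9, 10, 12), levels = 1:12),
  age    = c(34, 51, 28, 45, 60)
)

# Keep only the rows whose income code is 1 through 9
kept <- df[as.character(df$income) %in% as.character(1:9), ]
nlevels(kept$income)      # subsetting rows keeps every original level

# Drop unused levels from all factor columns of the data frame at once
kept <- droplevels(kept)
nlevels(kept$income)      # now only the levels still in use
```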

We had the converse issue just the other day (Aug 30) when someone had problems with "showing zero frequencies in xtabs", which turned out to be caused by the tabulated data _not_ being factors, and hence not containing information about which values could have been there but weren't.
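
To make the converse case concrete, here is a small sketch with made-up responses on a 1 to 5 scale (not the data from the Aug 30 thread):

```r
# Hypothetical responses; nobody answered 4
x <- c(1, 2, 2, 3, 5, 5)

# A plain numeric vector carries no record that 4 was a possible value
tab_numeric <- table(x)
tab_numeric

# Declaring the full set of levels lets the zero frequency appear
f <- factor(x, levels = 1:5)
tab_factor <- table(f)
tab_factor

# xtabs() tabulates the factor the same way, including the empty level
xtabs(~ f)
```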

The subsetting operators keep all levels by design, so that things like tables and barplots stay consistent across subsets, but there are cases where you want the extra levels dropped. The default is as it is because it is easier to drop levels than to reinstate them. Neither is impossible, of course.
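
A minimal sketch of both directions, using an invented three-level factor. Note that reinstating levels requires that the full set still be known from somewhere, here from the original factor:

```r
# Hypothetical factor with three levels
full <- factor(c("low", "mid", "high", "mid"),
               levels = c("low", "mid", "high"))

# Subsetting keeps all three levels, even though "high" no longer occurs
sub <- full[full != "high"]
levels(sub)

# Dropping is a single call and needs no extra information
dropped <- droplevels(sub)
levels(dropped)

# Reinstating means supplying the full level set again
restored <- factor(dropped, levels = levels(full))
levels(restored)
```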