Including only a subset of the levels of a factor XXXX
4 messages · Dan Abner, R. Michael Weylandt, David Winsemius +1 more
On Sep 1, 2011, at 2:59 PM, Dan Abner wrote:
Hello everyone,

I have the following factor:

levels(pp_income)
[1] "" "1" "2" "3" "4" "5" "6" "7"
[9] "8" "9" "Renter"

I want to subset so that only values 1:9 are included. I have the following:

income<-pp_income[pp_income %in% c(1:9)]
levels(income)
[1] "" "1" "2" "3" "4" "5" "6" "7"
[9] "8" "9" "Renter"

Why is this not working
Actually it could be that it did succeed but you just have levels
attributes that are unpopulated in your result. Try:
table(income)
If that looks correct, then do this:
income <- factor(income) # will drop unused levels
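A minimal sketch of that check, assuming a small made-up stand-in for pp_income (the real data are not shown in the thread; only the level set is copied from the post):

```r
# Hypothetical stand-in for pp_income, with the same levels as in the post
pp_income <- factor(c("1", "3", "Renter", "", "9", "3"),
                    levels = c("", as.character(1:9), "Renter"))

income <- pp_income[pp_income %in% 1:9]

table(income)             # "" and "Renter" are still listed, but with count 0
income <- factor(income)  # rebuild the factor from the values actually present
levels(income)            # only the levels that occur in the subset remain
```

If table(income) shows zero counts for "" and "Renter", the subset itself worked and only the level attribute was carried over.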
and can someone please suggest a solution?
If on the other hand you got the wrong values then there was an
undesired coercion of either 'factor' class to 'numeric' or of
'numeric' to 'character'. I am fairly sure this will remove any
ambiguity:
income <- pp_income[as.character(pp_income) %in% as.character(1:9)]
(You would still get the puzzling extra levels if you looked with
levels(income).)
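To illustrate the coercion concern, here is a short sketch on made-up data (the vector below is an assumption, not the poster's data). It compares labels as character strings and also shows why converting a factor straight to a number is risky: that conversion returns the internal level codes, not the labels.

```r
# Hypothetical stand-in for pp_income, with the same levels as in the post
pp_income <- factor(c("1", "3", "Renter", "", "9", "3"),
                    levels = c("", as.character(1:9), "Renter"))

# Compare labels as character strings on both sides
keep <- as.character(pp_income) %in% as.character(1:9)
income <- pp_income[keep]

as.character(income)   # the labels that survived the filter
nlevels(income)        # the full original level set is still attached

# The pitfall: this gives internal codes (positions in levels()), not labels
codes <- as.integer(pp_income)
```

Here the label "1" has code 2 because "" is the first level, so filtering on the codes would select the wrong rows.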
Thank you!
Dan
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD West Hartford, CT
On Sep 1, 2011, at 21:11 , R. Michael Weylandt wrote:
Dropping all occurrences of a factor level does not drop that level. This actually turns out to be much more useful than it first might appear, but if you really need to get around it, it can be done.
...most expediently by using factor(), as others have pointed out. Or droplevels() for data frames.

We had the converse issue just the other day (Aug 30) when someone had problems with "showing zero frequencies in xtabs", which turned out to be caused by the tabulated data _not_ being factors, hence not containing information about which values could have been there but weren't.

Subsetting operators behave this way so that things like tables and barplots stay consistent across subsets, but there are cases where you want the extra levels dropped. The default is as it is because it is easier to drop levels than to reinstate them. Neither is impossible, of course.
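A small sketch of both directions, using a made-up data frame (the column names and values are hypothetical):

```r
# Hypothetical data frame; names and values are invented for illustration
d <- data.frame(income = factor(c("1", "2", "Renter", "2")),
                region = factor(c("a", "b", "a", "b")))

sub <- d[d$income != "Renter", ]
table(sub$income)        # "Renter" is kept with a zero count, so tables
                         # from different subsets have the same layout

sub2 <- droplevels(sub)  # drops unused levels in every factor column
levels(sub2$income)

# Reinstating a level means naming the full level set explicitly
restored <- factor(sub2$income, levels = c("1", "2", "Renter"))
levels(restored)
```

Dropping needs no extra information, whereas reinstating requires knowing which levels used to exist, which is one way to read the remark that dropping is the easier direction.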
Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com "Døden skal tape!" --- Nordahl Grieg