subsetting with condition
On Jun 1, 2011, at 7:00 PM, kristina p wrote:
Dear R Team, I am a new R user and I am currently trying to subset my data under a special condition. I have gone through several pages of the subsetting section here on the forum, but I was not able to find an answer. My data is as follows:

ID  NAME      MS  Pol. Party
1   John      x   F
2   Mary      s   S
3   Katie     x   O
4   Sarah     p   L
5   Martin    x   O
6   Angelika  x   F
7   Smith     x   O
....
Assume this is in a dataframe, 'pol', and that you have corrected the error in colnames, so that it is Pol_Party. The ave function is particularly useful when you need to have a vector that "lines up alongside" the other columns:

> pol[ave(seq_along(pol$ID), pol$Pol_Party, FUN=length) >= 3 , ]
  ID   NAME MS Pol_Party
3  3  Katie  x         O
5  5 Martin  x         O
7  7  Smith  x         O

(The use of seq_along ensures that you also get any duplicates of ID that are in qualifying Parties.)

Another way to generate the values would be to table()-ulate and pick out the names of qualifying Parties:

> tabl.party <- table(pol$Pol_Party)
> pol[ pol$Pol_Party %in% names(tabl.party)[tabl.party >= 3], ]
  ID   NAME MS Pol_Party
3  3  Katie  x         O
5  5 Martin  x         O
7  7  Smith  x         O
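For anyone who wants to run both approaches end to end, here is a minimal sketch. It assumes the seven rows shown in the question are the whole data set (the original post trails off with "....") and that the party column has been renamed to Pol_Party. The names party_size, by_ave and by_table are only illustrative and do not come from the thread.

```r
# Rebuild the seven example rows (assumption: these are the only rows;
# the party column is renamed to Pol_Party)
pol <- data.frame(
  ID        = 1:7,
  NAME      = c("John", "Mary", "Katie", "Sarah", "Martin", "Angelika", "Smith"),
  MS        = c("x", "s", "x", "p", "x", "x", "x"),
  Pol_Party = c("F", "S", "O", "L", "O", "F", "O"),
  stringsAsFactors = FALSE
)

# Method 1: ave() gives each row the size of its party,
# so the result has one value per row of pol
party_size <- ave(seq_along(pol$ID), pol$Pol_Party, FUN = length)
by_ave <- pol[party_size >= 3, ]

# Method 2: tabulate the parties, then keep rows whose party name qualifies
tabl.party <- table(pol$Pol_Party)
by_table <- pol[pol$Pol_Party %in% names(tabl.party)[tabl.party >= 3], ]

print(by_ave)

# Check that both methods select the same rows
identical(by_ave, by_table)
```

Storing the ave() result in its own variable first makes it easy to inspect the per-row counts before using them as a filter.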
I am interested in only those observations where there are at least three members of one political party. That is, I need to throw out all cases in the example above, except for members of party "O".
Both methods use logical indexing with the "[.data.frame" function.
Would really appreciate your help.
David Winsemius, MD West Hartford, CT