subsetting with condition

On Jun 1, 2011, at 7:00 PM, kristina p wrote:

            
Assume this is in a data frame, 'pol', and that you have corrected the
error in colnames, so that it is Pol_Party. The ave function is
particularly useful when you need a vector that "lines up
alongside" the other columns:

pol[ave(seq_along(pol$ID), pol$Pol_Party, FUN=length) >= 3 , ]
   ID   NAME MS Pol_Party
3  3  Katie  x         O
5  5 Martin  x         O
7  7  Smith  x         O

(The use of seq_along ensures that rows with duplicated IDs are still
returned if they are in any qualifying Parties.)
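
To see what ave() is producing here, a small sketch with made-up data may help. The data frame below is hypothetical: only rows 3, 5 and 7 are taken from the output above, and the remaining names and party codes are invented for illustration.

```r
# Hypothetical stand-in for 'pol'; only rows 3, 5 and 7 match the output
# shown above, the other names and party codes are invented.
pol <- data.frame(
  ID        = 1:7,
  NAME      = c("Ann", "Bob", "Katie", "Dave", "Martin", "Eve", "Smith"),
  MS        = "x",
  Pol_Party = c("R", "D", "O", "R", "O", "D", "O"),
  stringsAsFactors = FALSE
)

# ave() returns one value per row: here, the size of that row's party group.
grp_size <- ave(seq_along(pol$ID), pol$Pol_Party, FUN = length)

# The comparison gives a logical vector as long as the data frame has rows,
# so it can be used directly as a row index.
keep <- grp_size >= 3
pol[keep, ]
```

Because grp_size has one entry per row, it could also be stored as an extra column if the group sizes are wanted for later steps.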

Another way to generate the values would be to table()-ulate and pick  
out the names of qualifying Parties:

 > tabl.party <- table(pol$Pol_Party)
 > pol[ pol$Pol_Party %in% names(tabl.party)[tabl.party >= 3], ]
   ID   NAME MS Pol_Party
3  3  Katie  x         O
5  5 Martin  x         O
7  7  Smith  x         O
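
As a rough check that the two approaches select the same rows, the sketch below builds both logical indexes and compares them. The data frame is again hypothetical (only rows 3, 5 and 7 come from the output above), and the names big, by_table and by_ave are introduced here just for illustration.

```r
# Hypothetical stand-in for 'pol'; only rows 3, 5 and 7 match the output
# shown above, the other names and party codes are invented.
pol <- data.frame(
  ID        = 1:7,
  NAME      = c("Ann", "Bob", "Katie", "Dave", "Martin", "Eve", "Smith"),
  MS        = "x",
  Pol_Party = c("R", "D", "O", "R", "O", "D", "O"),
  stringsAsFactors = FALSE
)

# table() approach: count rows per party, then keep the names of the
# parties that reach the threshold.
tabl.party <- table(pol$Pol_Party)
big        <- names(tabl.party)[tabl.party >= 3]
by_table   <- pol$Pol_Party %in% big

# ave() approach: per-row group size compared against the same threshold.
by_ave <- ave(seq_along(pol$ID), pol$Pol_Party, FUN = length) >= 3

# Compare the two logical indexes element by element.
identical(by_table, by_ave)
```

The table() version makes the intermediate counts easy to inspect, while the ave() version skips the lookup step; which is preferable is largely a matter of taste.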

Both methods use logical indexing with the "[.data.frame" function,