help matching rows of a data frame
Hi!
2017-09-18 07:13 -0500, Therneau, Terry M., Ph.D. wrote:
This question likely has a 1 line answer, I'm just not seeing it. (2, 3, or 10 lines is fine too.) For a vector I can do group <- match(x, unique(x)) to get a vector that labels each element of x.
Actually, you get a vector of indices matching 'unique(x)', not a labelled vector.
x<-c("A","B","C","A","C","D")
group<-match(x, unique(x))
group
[1] 1 2 3 1 3 4
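To make the indices-versus-labels point concrete, here is a small sketch (the name 'recovered' is just mine for illustration): indexing 'unique(x)' by the group vector gives back the original values.

```r
x <- c("A", "B", "C", "A", "C", "D")
group <- match(x, unique(x))

# group holds positions within unique(x), so indexing unique(x)
# by group reconstructs the original vector
recovered <- unique(x)[group]
identical(recovered, x)
# [1] TRUE
```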
What is an equivalent if x is a data frame?
So you will generate an index where duplicated rows have the row index of the first occurrence, right? This could work:
x<-data.frame("X0"=c("A","B","C","C","D","A"), "X1"=c(1,2,1,1,3,1))
group<-rownames(x)
for (i in 1:(nrow(x)-1)) {
  for (j in (i+1):nrow(x)) {
    if (sum(as.numeric(x[i,]==x[j,]))==ncol(x)) {
      group[j]<-group[i] }
  }
}
group
[1] "1" "2" "3" "3" "5" "1"

HTH, Kimmo
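
P.S. A shorter, loop-free variant might also work. This is only a sketch: it pastes each row into a single string and matches on those strings. The name 'key' and the "\r" separator are my own choices, and the separator is assumed not to occur anywhere in the data. Numeric columns are compared through their printed form, so values that differ only beyond about 15 significant digits would be treated as equal.

```r
x <- data.frame(X0 = c("A", "B", "C", "C", "D", "A"),
                X1 = c(1, 2, 1, 1, 3, 1))

# Collapse each row into one string; "\r" is an arbitrary separator
# assumed not to appear in any column
key <- do.call(paste, c(x, sep = "\r"))

# Row index of the first occurrence of each row (same grouping as the loop,
# but as integers rather than row names)
group <- match(key, key)
group
# [1] 1 2 3 3 5 1

# Consecutive group numbers, analogous to match(x, unique(x)) for a vector
group2 <- match(key, unique(key))
group2
# [1] 1 2 3 3 4 1
```

The second form is probably closer to what the vector idiom gives you, since the group numbers run 1, 2, 3, ... without gaps.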