Identify row indices corresponding to each distinct row of a matrix

Thanks. It makes sense.

Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote on Nov 8, 2018 at 8:05:
The duplicated function returns TRUE only for rows that have already
appeared, so exactly one row from each set of identical rows (the first) is
not flagged in the output of duplicated. For the intended purpose of removing
duplicates this behavior is ideal. I have no idea what your intended purpose
is, since every row has duplicates elsewhere in the matrix. If you really
want every set identified this way then a loop/apply seems inevitable (most
opportunities for optimization come about by not visiting every combination).

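Cm <- as.matrix( C )
# indices of the first occurrence of each distinct row
D <- which( !duplicated( Cm, MARGIN=1 ) )
nCm <- nrow( Cm )
# for each distinct row, find every row of Cm that matches it in all columns
F <- lapply( D, function(d) {
   idxrep <- rep( d, nCm )
   which( 0 == unname( rowSums( Cm[idxrep,] != Cm ) ) )
} )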
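
The loop/apply idea above can be wrapped in a small function and tried on a fixed matrix. This is only a sketch: the function name `row_groups` and the test matrix `x` are illustrative assumptions, not part of the thread, and the groups are labelled by the index of their first row purely for readability.

```r
# Hypothetical helper (name is illustrative): for each distinct row of Cm,
# return the indices of all rows equal to it.
row_groups <- function(Cm) {
  Cm <- as.matrix(Cm)
  # first occurrence of each distinct row
  first <- which(!duplicated(Cm, MARGIN = 1))
  groups <- lapply(first, function(d) {
    # a row matches row d when no column differs
    same <- rowSums(Cm != Cm[rep(d, nrow(Cm)), , drop = FALSE]) == 0
    unname(which(same))
  })
  names(groups) <- first  # label each group by its first row index
  groups
}

# Small fixed example (not the randomly shuffled C from the thread)
x <- rbind(c(1, 9), c(2, 10), c(1, 9), c(2, 10), c(1, 9))
row_groups(x)
# $`1`
# [1] 1 3 5
#
# $`2`
# [1] 2 4
```

Each distinct row is compared against the whole matrix once, so the work grows with the number of distinct rows times the number of rows.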

On November 8, 2018 1:42:40 PM PST, li li <hannah.hlx at gmail.com> wrote:
Thanks for all the replies. I will try to use plain text in the future.
One question regarding using "which( ! duplicated( m, MARGIN=1 ) )".
This seems to return the first row index for each distinct row, but it
does not give all the row indices corresponding to each of the distinct
rows. For example, in my example below, rows 1, 13, and 15 are all (1,9).
Thanks.
 Hanna
A <- matrix(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16),8,2)
B <- rbind(A,A,A)
C <- as.data.frame(B[sample(nrow(B)),])
C
  V1 V2
1   1  9
2   2 10
3   3 11
4   5 13
5   7 15
6   6 14
7   4 12
8   3 11
9   8 16
10  5 13
11  7 15
12  2 10
13  1  9
14  8 16
15  1  9
16  3 11
17  7 15
18  4 12
19  2 10
20  6 14
21  4 12
22  8 16
23  5 13
24  6 14
T <- unique(C)
T
 V1 V2
1  1  9
2  2 10
3  3 11
4  5 13
5  7 15
6  6 14
7  4 12
9  8 16
i <- 1
which(C[,1]==T[i,1]& C[,2]==T[i,2])
[1]  1 13 15
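
The single-row lookup above can be extended to all distinct rows at once with split(), which accepts a list of grouping factors (a data frame is such a list). A minimal sketch, assuming the shuffled C exactly as printed above (rebuilt by hand so the example is reproducible); the variable name `groups` is illustrative.

```r
# Rebuild C exactly as printed in the thread, so no random sampling is needed.
V1 <- c(1, 2, 3, 5, 7, 6, 4, 3, 8, 5, 7, 2, 1, 8, 1, 3, 7, 4, 2, 6, 4, 8, 5, 6)
C <- data.frame(V1 = V1, V2 = V1 + 8)

# split() combines the columns of C into one grouping factor via interaction();
# drop = TRUE discards (V1, V2) combinations that never occur.
groups <- split(seq_len(nrow(C)), C, drop = TRUE)

groups[["1.9"]]   # 1 13 15, matching the which() result above
length(groups)    # 8 distinct rows
```

The group names are built by joining the column values with ".", so values that themselves contain "." could give ambiguous names; the sep argument of split() can be changed in that case.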

Bert Gunter <bgunter.4567 at gmail.com> wrote on Nov 8, 2018 at 10:43:

Yes -- much better than mine. I didn't know about the MARGIN argument of
duplicated().

-- Bert

On Wed, Nov 7, 2018 at 10:32 PM Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:

Perhaps

which( ! duplicated( m, MARGIN=1 ) )

? (untested)
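
Since the one-liner above is marked untested, here is a quick check on the matrix m from the quoted message (8 distinct rows, each appearing twice). It is only a sketch; the variable name `first` is an illustrative choice.

```r
m <- matrix(1:16, nrow = 8)[rep(1:8, 2), ]

# duplicated() with MARGIN = 1 flags each row that repeats an earlier row,
# so negating it keeps only the first occurrence of each distinct row.
first <- which(!duplicated(m, MARGIN = 1))
first                               # 1 2 3 4 5 6 7 8
length(first) == nrow(unique(m))    # TRUE
```

The result has one index per distinct row (rows 9 to 16 never appear), so it locates the distinct rows but not every position at which each one occurs.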

On November 7, 2018 9:20:57 PM PST, Bert Gunter <bgunter.4567 at gmail.com> wrote:
A mess -- due to your continued use of html formatting.

But something like this may do what you want (hard to tell with the
mess):

m <- matrix(1:16,nrow=8)[rep(1:8,2),]
m
     [,1] [,2]
[1,]    1    9
[2,]    2   10
[3,]    3   11
[4,]    4   12
[5,]    5   13
[6,]    6   14
[7,]    7   15
[8,]    8   16
[9,]    1    9
[10,]    2   10
[11,]    3   11
[12,]    4   12
[13,]    5   13
[14,]    6   14
[15,]    7   15
[16,]    8   16
vec <- apply(m,1,paste,collapse="-") ## converts rows into character vector
vec
[1] "1-9"  "2-10" "3-11" "4-12" "5-13" "6-14" "7-15" "8-16" "1-9"  "2-10" "3-11" "4-12" "5-13" "6-14"
[15] "7-15" "8-16"
## Then maybe:
tapply(seq_along(vec),vec, I)
$`1-9`
[1] 1 9

$`2-10`
[1]  2 10

$`3-11`
[1]  3 11

$`4-12`
[1]  4 12

$`5-13`
[1]  5 13

$`6-14`
[1]  6 14

$`7-15`
[1]  7 15

$`8-16`
[1]  8 16

## gives the row numbers for each unique row
There may well be slicker ways to do this -- if this is actually what you
want to do.

-- Bert
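
The same pasted-key idea can also give a group number for every row instead of a list, which is sometimes easier to attach back to the data. A minimal sketch using match(); the names `key` and `group_id` are illustrative assumptions, not from the thread.

```r
m <- matrix(1:16, nrow = 8)[rep(1:8, 2), ]

# One character key per row, as in the message above
key <- apply(m, 1, paste, collapse = "-")

# Number the groups in order of first appearance: each row gets the
# position of its key among the distinct keys.
group_id <- match(key, unique(key))
group_id               # 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
which(group_id == 1)   # 1 9
```

As with any pasted key, two different rows could collide if the values themselves contain the separator, so the separator should be chosen with the data in mind.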

On Wed, Nov 7, 2018 at 7:56 PM li li <hannah.hlx at gmail.com> wrote:

Hi all,
   I use the following example to illustrate my question. As you can see,
in matrix C some rows are repeated, and I would like to find the indices of
the rows corresponding to each of the distinct rows.
   For example, for the row c(1,9), I have used the "which" function to
identify the row indices corresponding to c(1,9). Using this approach, in
order to cover all distinct rows, I need to use a for loop.
   I am wondering whether there is an easier way where a for loop can be
avoided?
   Thanks very much!
      Hanna

> A <- matrix(c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16),8,2)
> B <- rbind(A,A,A)
> C <- as.data.frame(B[sample(nrow(B)),])
> C
   V1 V2
1   1  9
2   2 10
3   3 11
4   5 13
5   7 15
6   6 14
7   4 12
8   3 11
9   8 16
10  5 13
11  7 15
12  2 10
13  1  9
14  8 16
15  1  9
16  3 11
17  7 15
18  4 12
19  2 10
20  6 14
21  4 12
22  8 16
23  5 13
24  6 14
> T <- unique(C)
> T
  V1 V2
1  1  9
2  2 10
3  3 11
4  5 13
5  7 15
6  6 14
7  4 12
9  8 16
> i <- 1
> which(C[,1]==T[i,1]& C[,2]==T[i,2])
[1]  1 13 15

        [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible
code.

--
Sent from my phone. Please excuse my brevity.
