Ordering a matrix based on cluster no

Combine the two matrices into one data.frame and order them.
The example uses data.frames rather than matrices, but you can just use data.frame(x, y) to convert your matrices to a data.frame.

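# 5 x 5 block of values standing in for the gene matrix
bmat <- data.frame(matrix(1:25, 5))
# one name and one cluster label per row of bmat
smat <- data.frame(aa = LETTERS[1:5],
                   bb = c("a", "b", "c", "d", "e"))
df1 <- data.frame(smat, bmat)
# sort the rows by the cluster column (column 2), in decreasing order
orddata <- df1[order(df1[, 2], decreasing = TRUE), ]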
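
If the gene matrix should stay a matrix, the same order() idea can be applied to rows and columns together. This is only a minimal sketch, assuming the rows of the cluster table are already in the same order as the rows of the matrix; the names sim and clus and the toy values are made up for illustration.

```r
# Hypothetical toy data: a 4 x 4 symmetric matrix and a cluster table
# whose rows are assumed to line up with the rows of the matrix.
genes <- c("g1", "g2", "g3", "g4")
sim <- matrix(c(1.0, 0.2, 0.9, 0.1,
                0.2, 1.0, 0.3, 0.8,
                0.9, 0.3, 1.0, 0.4,
                0.1, 0.8, 0.4, 1.0),
              nrow = 4, dimnames = list(genes, genes))
clus <- data.frame(gene = genes, cluster = c(2, 1, 2, 1))

# order() returns the row positions sorted by cluster number
ord <- order(clus$cluster)

# Apply the same permutation to rows and columns so the matrix stays symmetric
sim_ord <- sim[ord, ord]
```

Indexing with [ord, ord] keeps each gene's row and column together, which sorting a combined data.frame by rows alone does not do.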

I hope this helps.

From: Aparna Sampath <aparna.sampath26 at gmail.com>
Subject: [R] Ordering a matrix based on cluster no
To: r-help at r-project.org
Received: Sunday, June 26, 2011, 9:42 AM
Hi All

I have a symmetric matrix of genes (a 100x100 matrix). I also have a matrix (100x2) of two columns, where column 1 has the gene names and column 2 has the cluster each gene belongs to (they are sorted and grouped by cluster number).

I would like to order the rows and columns of the 100x100 matrix so that the first n genes correspond to cluster 1, the next n genes correspond to cluster 2, and so on. The order of genes is taken from the sorted 100x2 matrix.

Can someone tell me how to do this in R?

I tried grep(), but I get a message saying that the length of the pattern is > 1, so only the first element will be used. But I want to check each gene in the 100x100 matrix for its cluster number and then group it.

I also tried order(), but it did not help either.

Thanks for the help! :)

Aparna

-- 
Aparna Sampath
Master of Science (Bioinformatics)
Nanyang Technological University
Mob no : +65 91601854


______________________________________________
R-help at r-project.org
mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained,
reproducible code.

Thanks for the help! But when I tried it, it did not work the way I want. :(
After combining the two matrices, they look like this:

                V1 V2               X TEL.AML1.C41 Hyperdip.50.C23
1     TEL.AML1.C41  1    TEL.AML1.C41    1.0000000      0.00000000
2  Hyperdip.50.C23  1 Hyperdip.50.C23    0.0000000      1.00000000
3       BCR.AB.LC1  1      BCR.AB.LC1    0.1212121      0.78125000
4  Hyperdip.50.C13  1 Hyperdip.50.C13    0.0000000      1.00000000
6       TEL.AML1.9  1        T.ALL.C5    0.0000000      0.03225807
7       TEL.AML1.8  1      TEL.AML1.9    1.0000000      0.00000000
8   Hyperdip.50.C7  1      TEL.AML1.8    1.0000000      0.00000000
9     TEL.AML1.C37  1  Hyperdip.50.C7    0.0000000      1.00000000
11    TEL.AML1.C47  1    TEL.AML1.C37    1.0000000      0.00000000
13  Hyperdip.50.11  1           MLL.6    0.0000000      0.03225807

When I do:  orddata1 <- df2_a[order(df2_a[, 1], decreasing = TRUE), ]

I get the result:

               V1 V2                   X TEL.AML1.C41 Hyperdip.50.C23
22   TEL.AML1.C49  1       TEL.AML1.2M.1            1      0.00000000
11   TEL.AML1.C47  1        TEL.AML1.C37            1      0.00000000
1    TEL.AML1.C41  1        TEL.AML1.C41            1      0.00000000
9    TEL.AML1.C37  1      Hyperdip.50.C7            0      1.00000000
6      TEL.AML1.9  1            T.ALL.C5            0      0.03225807
7      TEL.AML1.8  1          TEL.AML1.9            1      0.00000000
16  TEL.AML1.2M.4  1      Hyperdip.50.11            0      1.00000000
19  TEL.AML1.2M.1  1       TEL.AML1.2M.4            1      0.00000000
15 Hyperdip.50.R2  1           T.ALL.C10            0      0.00000000
20 Hyperdip.50.C9  1 BCR.ABL.Hyperdip.R5            0      1.00000000

The results are not right! I want it to look for the gene TEL.AML1.C49 in the second matrix and group it accordingly.

Aparna 

--
View this message in context: http://r.789695.n4.nabble.com/Ordering-a-matrix-based-on-cluster-no-tp3625956p3627017.html
Sent from the R help mailing list archive at Nabble.com.
Hi
First, a handy point: when supplying sample data it is a good idea to use dput(). See ?dput for an explanation. It makes it much easier to see what the data look like and to work with them. Sample data pasted into an e-mail can get badly mangled. Yours was not bad but still needed a bit of cleaning up. I think I'll suggest that this idea goes in the posting guidelines.
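
As a rough illustration of the dput() suggestion, here is a small sketch; the data frame df is hypothetical and only stands in for a few rows of the real data.

```r
# Hypothetical small data frame standing in for a few rows of the real data
df <- data.frame(gene = c("TEL.AML1.C41", "Hyperdip.50.C23"),
                 cluster = c(1L, 1L),
                 stringsAsFactors = FALSE)

# dput() prints R code that recreates the object;
# that printed text is what gets pasted into the e-mail
dput(df)

# A reader can rebuild the object by evaluating the pasted text
txt <- paste(capture.output(dput(df)), collapse = "\n")
df_copy <- eval(parse(text = txt))
all.equal(df, df_copy)
```

For a large object, something like dput(head(df, 10)) keeps the posted sample short.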

Next point: Do you really have two matrices? It looks from the examples you have supplied that you have two data.frames unless all the numerics in the two data sets actually are character values.  Try str(data.set) to see what they are.
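
A short sketch of that check, using made-up objects m, d and mixed rather than the real data:

```r
m <- matrix(1:6, nrow = 2)                  # an integer matrix
d <- data.frame(gene = c("a", "b"),
                cluster = c(1, 2))          # a data frame with mixed column types

str(m)   # describes a matrix: one type plus its dimensions
str(d)   # describes a data.frame and the type of each column

# Explicit checks give a plain TRUE/FALSE answer
is.matrix(m)
is.data.frame(d)

# A matrix holds a single type, so mixing names and numbers
# turns every value into a character string
mixed <- cbind(gene = c("a", "b"), cluster = c(1, 2))
is.character(mixed)
```

If the real object behaves like mixed, the cluster numbers are stored as text and would sort as text, not as numbers.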

On to more substantive matters. It looks like I misread part of the post. I had assumed that both the 100 x 100 matrix and the 100 x 2 matrix were already in the same order and just needed to be joined and sorted.

It looks to me like what you want to do is to merge (see ?merge for info) the two data sets based on the gene names (in merge they should have the same variable name to make life easy) and then apply the order command to sort by cluster.

Off the top of my head, and using the example names I used earlier, I think you want something like this, assuming a common name for the gene name variable.

merge(smat, bmat, by = "gene")  # untested

Then apply the order command.
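
To make the merge-then-order idea concrete, here is a minimal sketch on toy data. The column name gene, the cluster values and the 4 x 4 numbers are assumptions for illustration, not the poster's data.

```r
# Hypothetical data: the gene name is the shared key column
genes <- c("g1", "g2", "g3", "g4")
bmat <- data.frame(gene = genes,
                   g1 = c(1.0, 0.2, 0.9, 0.1),
                   g2 = c(0.2, 1.0, 0.3, 0.8),
                   g3 = c(0.9, 0.3, 1.0, 0.4),
                   g4 = c(0.1, 0.8, 0.4, 1.0),
                   stringsAsFactors = FALSE)
# Cluster table listed in a different row order from bmat
smat <- data.frame(gene = c("g2", "g4", "g1", "g3"),
                   cluster = c(1, 1, 2, 2),
                   stringsAsFactors = FALSE)

# merge() pairs rows by gene name, whatever their original positions
both <- merge(smat, bmat, by = "gene")

# Sort by cluster number, then by gene name within a cluster
both <- both[order(both$cluster, both$gene), ]

# Put the gene columns in the same order as the rows,
# so the square block stays symmetric
gene_ord <- as.character(both$gene)
sim_ord <- as.matrix(both[, gene_ord])
rownames(sim_ord) <- gene_ord
```

Because merge() matches on the gene name, it does not matter that the two inputs start out in different row orders.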

I hope I understood the problem this time

From: Aparna Sampath <aparna.sampath26 at gmail.com>
Subject: Re: [R] Ordering a matrix based on cluster no
To: r-help at r-project.org
Received: Monday, June 27, 2011, 2:03 AM
