Ordering a matrix based on cluster no
4 messages · John Kane, Aparna Sampath
Combine the two matrices into one data.frame and order them.
The example below uses data.frames rather than matrices, but you can use data.frame(x, y) to convert your matrices to a data.frame.
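# Example numeric data
bmat <- data.frame(matrix(1:25, 5))
# Example labels: 25 letters plus a repeating group code
smat <- data.frame(aa = LETTERS[1:25],
                   bb = rep(c("a", "b", "c", "d", "e"), 5))
# Combine, then sort the rows by the second column (bb), descending
df1 <- data.frame(smat, bmat)
orddata <- df1[order(df1[,2], decreasing=TRUE),]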
I hope this helps.
--- On Sun, 6/26/11, Aparna Sampath <aparna.sampath26 at gmail.com> wrote:
From: Aparna Sampath <aparna.sampath26 at gmail.com>
Subject: [R] Ordering a matrix based on cluster no
To: r-help at r-project.org
Received: Sunday, June 26, 2011, 9:42 AM
Hi All
I have a symmetric matrix of genes (100x100 matrix). I also have a matrix (100x2) of two columns where column 1 has the gene names and column 2 has the cluster it belongs to (they are sorted and grouped based on the cluster no). I would like to order the rows and columns of the 100x100 matrix such that the first n genes correspond to cluster 1 and next n genes correspond to cluster 2 and so on. The order of genes is taken from the sorted matrix (100x2).
Can someone tell me how to do this in R. I tried the grep() but I get a message saying that the length of pattern >1 so only first element will be compared. But I want to check for each gene in the 100x100 matrix for its cluster number and then group it. I also tried the order() but it did not help either.
Thanks for the help! :)
Aparna
--
Aparna Sampath
Master of Science (Bioinformatics)
Nanyang Technological University
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
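One possible way to do the reordering described in the question, sketched on made-up data rather than the poster's: if the symmetric matrix carries the gene names as row and column names, indexing it by the sorted gene-name column reorders both dimensions at once. The names `sim`, `clusters`, and the gene labels `g1` to `g6` are hypothetical, and the sketch assumes every gene name in the two-column table also appears in the matrix dimnames.

```r
# Hypothetical 6 x 6 symmetric matrix with gene names as dimnames
genes <- paste0("g", 1:6)
set.seed(1)
m <- matrix(runif(36), 6, 6)
sim <- (m + t(m)) / 2
diag(sim) <- 1
dimnames(sim) <- list(genes, genes)

# Hypothetical two-column table: gene name and cluster number,
# already sorted by cluster as in the question
clusters <- data.frame(gene = c("g3", "g5", "g1", "g6", "g2", "g4"),
                       cluster = c(1, 1, 2, 2, 3, 3),
                       stringsAsFactors = FALSE)

# Character indexing looks each gene up by name, so rows and columns
# both end up in the cluster order
ord <- as.character(clusters$gene)
sim_sorted <- sim[ord, ord]

# match() expresses the same permutation as integer positions
pos <- match(ord, rownames(sim))
same <- identical(sim_sorted, sim[pos, pos])
```

Unlike grep(), which takes a single pattern, name indexing and match() look up a whole vector of gene names in one call.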
Thanks for the help! But when I tried it, it did not work the way I want. :(
After combining the two matrices, they look like this:
    V1               V2  X                TEL.AML1.C41  Hyperdip.50.C23
1   TEL.AML1.C41     1   TEL.AML1.C41     1.0000000     0.00000000
2   Hyperdip.50.C23  1   Hyperdip.50.C23  0.0000000     1.00000000
3   BCR.AB.LC1       1   BCR.AB.LC1       0.1212121     0.78125000
4   Hyperdip.50.C13  1   Hyperdip.50.C13  0.0000000     1.00000000
6   TEL.AML1.9       1   T.ALL.C5         0.0000000     0.03225807
7   TEL.AML1.8       1   TEL.AML1.9       1.0000000     0.00000000
8   Hyperdip.50.C7   1   TEL.AML1.8       1.0000000     0.00000000
9   TEL.AML1.C37     1   Hyperdip.50.C7   0.0000000     1.00000000
11  TEL.AML1.C47     1   TEL.AML1.C37     1.0000000     0.00000000
13  Hyperdip.50.11   1   MLL.6            0.0000000     0.03225807
When I do: orddata1 <- df2_a[order(df2_a[,1],decreasing=T),]
I get the result:
    V1               V2  X                    TEL.AML1.C41  Hyperdip.50.C23
22  TEL.AML1.C49     1   TEL.AML1.2M.1        1             0.00000000
11  TEL.AML1.C47     1   TEL.AML1.C37         1             0.00000000
1   TEL.AML1.C41     1   TEL.AML1.C41         1             0.00000000
9   TEL.AML1.C37     1   Hyperdip.50.C7       0             1.00000000
6   TEL.AML1.9       1   T.ALL.C5             0             0.03225807
7   TEL.AML1.8       1   TEL.AML1.9           1             0.00000000
16  TEL.AML1.2M.4    1   Hyperdip.50.11       0             1.00000000
19  TEL.AML1.2M.1    1   TEL.AML1.2M.4        1             0.00000000
15  Hyperdip.50.R2   1   T.ALL.C10            0             0.00000000
20  Hyperdip.50.C9   1   BCR.ABL.Hyperdip.R5  0             1.00000000
The results are not right! I want it to look for the gene TEL.AML1.C49 in
the second matrix and group it accordingly.
Aparna
--
View this message in context: http://r.789695.n4.nabble.com/Ordering-a-matrix-based-on-cluster-no-tp3625956p3627017.html
Sent from the R help mailing list archive at Nabble.com.
Hi
First a handy point: when supplying sample data it is a good idea to use dput(). See ?dput for an explanation. It makes it much easier to see what the data looks like and to work with it. Sample data pasted into an e-mail can get badly mangled. Yours was not bad but still needed a bit of cleaning up. I think I'll suggest that this idea goes in the posting guidelines.
Next point: do you really have two matrices? It looks from the examples you have supplied that you have two data.frames, unless all the numerics in the two data sets actually are character values. Try str(data.set) to see what they are.
On to more substantive matters. It looks like I misread part of the post. I had assumed that both the 100 x 100 matrix and the 100 x 2 matrix were sorted already and just needed to be joined and sorted. It looks to me like what you want to do is to merge (?merge for info) the two data sets based on the gene names (in merge they should have the same variable name to make life easy) and then apply the order command to sort by cluster.
Off the top of my head, and using the example names I used earlier, I think you want something like this, assuming a common name for the gene-name column:
merge(smat, bmat, by = "gene") # untested
Then apply the order command.
I hope I understood the problem this time.
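A minimal sketch of the merge-then-order idea, run on small made-up data and not on the poster's data sets. The column names `gene` and `cluster`, the value columns `g1` and `g2`, and the contents of `smat` and `bmat` are all assumptions for illustration; the real data would need a shared gene-name column for merge() to join on.

```r
# Hypothetical stand-ins for the two data sets
# smat: gene names and cluster numbers; bmat: per-gene numeric values
smat <- data.frame(gene = c("g1", "g2", "g3", "g4"),
                   cluster = c(2, 1, 2, 1),
                   stringsAsFactors = FALSE)
bmat <- data.frame(gene = c("g4", "g3", "g2", "g1"),
                   g1 = c(0.1, 0.8, 0.2, 1.0),
                   g2 = c(0.9, 0.3, 1.0, 0.2),
                   stringsAsFactors = FALSE)

# Join on the shared "gene" column; the row order of the inputs
# does not matter because rows are matched by key
merged <- merge(smat, bmat, by = "gene")

# Sort by cluster, breaking ties by gene name
merged <- merged[order(merged$cluster, merged$gene), ]
rownames(merged) <- NULL
merged
```

Because merge() matches rows by the key column rather than by position, each gene keeps its own cluster number and values even when the two inputs are in different orders, which is what the plain data.frame(smat, bmat) combination cannot guarantee.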