Looking for a quick way to combine rows in a matrix - R-help

Mon, May 11, 2009 1:53 PM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090511/817cba6a/attachment-0001.pl>

jim holtman

Mon, May 11, 2009 5:40 PM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090511/78437b7e/attachment-0001.pl>

Johannes Hüsing

Tue, May 12, 2009 10:27 AM #

jim holtman wrote:

something like

paste(sort(strsplit(key, split="")[[1]]), collapse="")

might be more general.
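The expression above handles only the first element of key. A minimal sketch of the same idea applied to every row name follows; the example vector and the name canon are assumptions made for illustration, not taken from the original post.

```r
# Example row names (assumed; the original matrix was scrubbed from the archive)
key <- c("AA", "AT", "TA", "TT")

# Sort the letters inside each key so that "AT" and "TA" collapse to one label
canon <- vapply(
  strsplit(key, split = ""),
  function(ch) paste(sort(ch), collapse = ""),
  character(1)
)
canon
```

Note that sorting the letters treats any two keys with the same letters as equal, which is broader than matching a key only with its reverse.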

Rocko22

Tue, May 12, 2009 5:53 PM #

The first reply calculated the overall means by group (amino acids), which
does not work for a larger database.
I am quite new to R, and I worked on your question just to learn how to
manipulate data with R.
The following seems to work. The code could be made a lot more elegant and
straightforward, but it works:

Let's try with a matrix "b" that contains more rows than in your example:

b <- matrix(1:32, ncol=4)
rownames(b) <- rep(c("AA","AT","TA","TT"), 2)
key <- rownames(b)
key[key == "AT"] <- "TA"
rownames(b) <- key

for (i in 1:(nrow(b)-1)) {
   # sums the rows and replaces the used row by NA values
   if (rownames(b)[i] == "TA" && rownames(b)[i+1] == "TA") {
      b[i,] <- colSums(b[i:(i+1),])
      b[i+1,] <- NA
   }
}
b <- b[order(b[,1], na.last=NA),] # removes the rows with NA values

Of course, the rows are reordered, and that may not be wanted. The ordering
was just to remove the NA rows.
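If the original row order matters, the NA rows can probably be dropped without sorting at all. A small sketch, assuming a matrix in which two rows have already been blanked (rows 3 and 7 here are stand-ins for what the loop produces, and kept is a name chosen for illustration):

```r
# Stand-in for the matrix after the loop: rows 3 and 7 were used up and set to NA
b <- matrix(1:32, ncol = 4)
rownames(b) <- rep(c("AA", "TA", "TA", "TT"), 2)
b[c(3, 7), ] <- NA

# Keep only rows without NA values; the remaining rows stay in their original order
kept <- b[complete.cases(b), , drop = FALSE]
rownames(kept)
```

complete.cases() comes from the stats package, which is attached by default, and drop = FALSE keeps the result a matrix even if only one row survives.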

Rock :-D

View this message in context: http://www.nabble.com/Looking-for-a-quick-way-to-combine-rows-in-a-matrix-tp23491348p23513599.html
Sent from the R help mailing list archive at Nabble.com.

Jorge Ivan Velez

Tue, May 12, 2009 6:01 PM #

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090512/7c326ccc/attachment-0001.pl>

Rocko22

Wed, May 13, 2009 5:37 AM #

Hello,

I revised my code, and I hope it now works for any number of successive "TA"
rows:

b <- matrix(1:64, ncol=4)
rownames(b) <- rep(c("AA","AT","TA","TT"), each=4)
key <- rownames(b)
key[key == "AT"] <- "TA"
c <- b
rownames(c) <- key

for (i in 2:nrow(c)) {
   # sums the rows and replaces the used row by NA values
   if (rownames(c)[i] == "TA" && rownames(c)[i-1] == "TA") {
      c[i,] <- colSums(c[i:(i-1),])
      c[i-1,] <- NA
   }
}
c <- c[apply(c, 1, function(x) any(!is.na(x))),] # removes the rows with NA values
c
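The same merging of successive "TA" rows can likely be done without a loop. This is only a sketch of an alternative, not the method used above: it labels each run of adjacent "TA" rows with one group id and lets rowsum() add them up. The names continues, grp and merged are made up for the example.

```r
b <- matrix(1:64, ncol = 4)
key <- rep(c("AA", "AT", "TA", "TT"), each = 4)
key[key == "AT"] <- "TA"

# A row continues the previous group only if it and the row above are both "TA"
continues <- c(FALSE, key[-1] == "TA" & key[-length(key)] == "TA")

# Every row that does not continue a run starts a new group
grp <- cumsum(!continues)

# Sum the rows within each group, keeping the groups in order of appearance
merged <- rowsum(b, grp, reorder = FALSE)
rownames(merged) <- key[!continues]
merged
```

With this input the eight "TA" rows collapse into one row, while the "AA" and "TT" rows are left as they are.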

Rock


Chris Stubben

Wed, May 13, 2009 9:44 AM #

You can automate this step

## create a function to reverse a string -- see the strReverse function
## on the strsplit help page
reverse <- function(x) sapply(lapply(strsplit(x, NULL), rev), paste, collapse="")

key <- rownames(a)
# combine rownames with reverse (rownames)
n<-cbind(key, rev=reverse(key))
     key  rev 
[1,] "AA" "AA"
[2,] "AT" "TA"
[3,] "TA" "AT"
[4,] "TT" "TT"

# Now just sort the values within each row (apply returns column vectors, so I
# also use t()), and then run do.call on the first column
n<-t(apply(n,1, sort))

do.call(rbind, by(a, n[,1], colSums)) 
   V1 V2 V3 V4
AA  1  5  9 13
AT  5 13 21 29
TT  4  8 12 16
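For comparison, here is a shorter sketch that should give the same sums with pmin() and rowsum() in place of sort, by and do.call. The matrix a is an assumption: the original post was scrubbed, so it is reconstructed here from the output shown above. The name canon is also made up for the example.

```r
# Assumed input, reconstructed from the printed result (not from the original post)
a <- matrix(1:16, ncol = 4,
            dimnames = list(c("AA", "AT", "TA", "TT"), paste0("V", 1:4)))

reverse <- function(x) sapply(lapply(strsplit(x, NULL), rev), paste, collapse = "")

key <- rownames(a)

# For each key, keep whichever of the key and its reverse sorts first
canon <- pmin(key, reverse(key))

# Sum the rows that share the same canonical key
res <- rowsum(a, canon)
res
```

rowsum() returns a matrix with one row per distinct label, so no rbind step is needed.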


I often need to combine reverse-complement DNA strings, so you could do that
too:

# DNA complement
comp <-  function(x) chartr("ACGT", "TGCA", x)

n<-cbind(key, rev=reverse(comp(key)))  
 n<-t(apply(n,1, sort))
do.call(rbind, by(a, n[,1], colSums)) 
   V1 V2 V3 V4
AA  5 13 21 29   
AT  2  6 10 14
TA  3  7 11 15
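To see what the reverse-complement key looks like on longer strings, here is a small sketch. The k-mers are invented for illustration, and revcomp and canon are hypothetical names that do not appear elsewhere in this thread.

```r
reverse <- function(x) sapply(lapply(strsplit(x, NULL), rev), paste, collapse = "")
comp <- function(x) chartr("ACGT", "TGCA", x)

# Hypothetical helper: complement each base, then reverse the string
revcomp <- function(x) reverse(comp(x))

# Invented example strings
kmers <- c("ACG", "CGT", "AAT", "ATT", "GC")

# A string and its reverse complement get the same label (whichever sorts first)
canon <- pmin(kmers, revcomp(kmers))
canon
```

Strings that are their own reverse complement, such as "GC" or "AT", keep their label, which is why AT and TA stay separate in the output above.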


Chris Stubben
