
Looking for a quick way to combine rows in a matrix

7 messages · Crosby, Jacy R, jim holtman, Johannes Hüsing +3 more

#
jim holtman wrote:
something like

paste(sort(strsplit(key, split="")[[1]]), collapse="")

might be more general.
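The one-liner above handles a single key. A minimal sketch of the same idea applied to every row name at once could look like this; `canon_key` is a hypothetical helper name, not something from the thread:

```r
# Hypothetical helper: sort the letters of each key so that "AT" and "TA"
# (or any other permutation of the same letters) map to one canonical form.
canon_key <- function(key) {
  vapply(strsplit(key, split = ""),
         function(ch) paste(sort(ch), collapse = ""),
         character(1))
}

canon_key(c("AA", "AT", "TA", "TT"))
# "AA" "AT" "AT" "TT"
```

The result can then be used as a grouping vector in place of the hand-edited key.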
#
The first reply calculated the overall means by group (amino acids), so it
does not work for a larger data set.
I am quite new to R, and I worked on your question just to learn how to
manipulate data with R.
The following seems to work. The code could be made a lot more elegant and
straightforward, but it works:

Let's try with a matrix "b" that contains more rows than in your example:

b <- matrix(1:32, ncol=4)
rownames(b) <- rep(c("AA","AT","TA","TT"), 2)
key <- rownames(b)
key[key == "AT"] <- "TA"   # treat "AT" and "TA" as the same key
rownames(b) <- key

for (i in 1:(nrow(b)-1)) {
   # sum each pair of adjacent "TA" rows and replace the used row with NA values
   if (rownames(b)[i] == "TA" && rownames(b)[i+1] == "TA") {
      b[i,] <- colSums(b[i:(i+1),])
      b[i+1,] <- NA
   }
}
b <- b[order(b[,1], na.last=NA),] # removes the rows with NA values

Of course, the rows are reordered, and that may not be wanted. The ordering
was just a way to remove the NA rows.
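If the reordering is a problem, one possible alternative (a sketch, not part of the code as posted) is to drop the NA rows with complete.cases(), which keeps the remaining rows in their original order. The name `b_kept` is only for illustration:

```r
b <- matrix(1:32, ncol = 4)
rownames(b) <- rep(c("AA", "TA", "TA", "TT"), 2)  # keys after AT -> TA

for (i in 1:(nrow(b) - 1)) {
  if (rownames(b)[i] == "TA" && rownames(b)[i + 1] == "TA") {
    b[i, ] <- colSums(b[i:(i + 1), ])
    b[i + 1, ] <- NA
  }
}

# Keep only rows without NA values; the row order is left untouched
b_kept <- b[complete.cases(b), ]
rownames(b_kept)
# "AA" "TA" "TT" "AA" "TA" "TT"
```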

Rock :-D
#
Hello,

I revised my code, and it should now work for any number of successive "TA"
rows, I hope:

b <- matrix(1:64, ncol=4)
rownames(b) <- rep(c("AA","AT","TA","TT"), each=4)
key <- rownames(b)
key[key == "AT"] <- "TA"
c <- b   # note: this name masks base::c as a variable; calls to c() still work
rownames(c) <- key

for (i in 2:nrow(c)) {
   # add the running sum of a "TA" run to row i and replace the used row with NA values
   if (rownames(c)[i] == "TA" && rownames(c)[i-1] == "TA") {
      c[i,] <- colSums(c[(i-1):i,])
      c[i-1,] <- NA
   }
}
c <- c[apply(c, 1, function(x) any(!is.na(x))),] # removes the rows with NA values
c
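For comparison, here is a loop-free sketch of the same idea: give every run of successive "TA" rows one group id and let rowsum() do the summing. The names `is_ta`, `starts`, `grp` and `res` are made up for this example:

```r
b <- matrix(1:64, ncol = 4)
key <- rep(c("AA", "AT", "TA", "TT"), each = 4)
key[key == "AT"] <- "TA"

is_ta <- key == "TA"
# A new group starts at every non-"TA" row and at the first row of each "TA" run
starts <- !is_ta | c(TRUE, !is_ta[-length(is_ta)])
grp <- cumsum(starts)

# Sum the rows within each group, keeping the groups in order of appearance
res <- rowsum(b, grp, reorder = FALSE)
rownames(res) <- key[starts]
res
```

Only the "TA" runs are collapsed here; the "AA" and "TT" rows stay separate, as in the loop version.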

Rock
#
You can automate this step:
## create a function to reverse a string -- see the strsplit help page for
## the strReverse function
reverse <- function(x) sapply(lapply(strsplit(x, NULL), rev), paste, collapse="")

key <- rownames(a)
# combine rownames with reverse (rownames)
n<-cbind(key, rev=reverse(key))
     key  rev 
[1,] "AA" "AA"
[2,] "AT" "TA"
[3,] "TA" "AT"
[4,] "TT" "TT"

# Now just sort the values in the rows (apply returns column vectors, so I
# also use t()) and then run do.call on the first column
n<-t(apply(n,1, sort))

do.call(rbind, by(a, n[,1], colSums)) 
   V1 V2 V3 V4
AA  1  5  9 13
AT  5 13 21 29
TT  4  8 12 16
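A self-contained sketch of the same grouping, for anyone who wants to run it: the matrix `a` is not shown in this message, so the one below is an assumption reconstructed from the printed output, and `canon` is a made-up name. It uses pmin() to pick the alphabetically smaller of each key and its reverse, and rowsum() instead of by():

```r
# Assumed input, reconstructed from the output shown above
a <- matrix(1:16, ncol = 4,
            dimnames = list(c("AA", "AT", "TA", "TT"), paste0("V", 1:4)))

reverse <- function(x) sapply(lapply(strsplit(x, NULL), rev), paste, collapse = "")

key <- rownames(a)
# Canonical key: the smaller of the key and its reverse, e.g. "TA" -> "AT"
canon <- pmin(key, reverse(key))

res <- rowsum(a, canon)
res
```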


I often need to combine reverse complement DNA strings, so you could do that
too 

# DNA complement
comp <-  function(x) chartr("ACGT", "TGCA", x)

n<-cbind(key, rev=reverse(comp(key)))  
 n<-t(apply(n,1, sort))
do.call(rbind, by(a, n[,1], colSums)) 
   V1 V2 V3 V4
AA  5 13 21 29   
AT  2  6 10 14
TA  3  7 11 15
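The same trick should carry over to longer strings. A small sketch with a few made-up 3-mers (the `kmers` vector and the `revcomp` and `canon` names are hypothetical, just for illustration):

```r
comp <- function(x) chartr("ACGT", "TGCA", x)
reverse <- function(x) sapply(lapply(strsplit(x, NULL), rev), paste, collapse = "")
revcomp <- function(x) reverse(comp(x))

kmers <- c("AAC", "GTT", "ACG", "CGT")   # illustrative keys only
revcomp(kmers)
# "GTT" "AAC" "CGT" "ACG"

# Canonical key: the smaller of each string and its reverse complement
canon <- pmin(kmers, revcomp(kmers))
canon
# "AAC" "AAC" "ACG" "ACG"
```

Rows sharing a canonical key can then be summed as before.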


Chris Stubben
jholtman wrote: