Looking for a quick way to combine rows in a matrix
7 messages · Crosby, Jacy R, jim holtman, Johannes Hüsing +3 more
jim holtman wrote:
Try this:
key <- rownames(a)
key[key == "AT"] <- "TA"
do.call(rbind, by(a, key, colSums))
Something like sapply(strsplit(key, split=""), function(x) paste(sort(x), collapse="")) might be more general, since it sorts the letters within every key.
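The sorting idea above can be sketched as follows, using the matrix "a" from the original question. This is only an illustration: rowsum() is swapped in here for the by() plus colSums() combination used in the reply, and the name "combined" is made up for the example.

```r
# Example matrix from the original question
a <- matrix(1:16, nrow = 4)
rownames(a) <- c("AA", "AT", "TA", "TT")

# Sort the letters within each row name so that "TA" and "AT" share one key
key <- sapply(strsplit(rownames(a), split = ""),
              function(x) paste(sort(x), collapse = ""))

# rowsum() adds up the rows that share a key
# (an assumption of this sketch; the reply itself uses by() and colSums())
combined <- rowsum(a, key)
combined
```

Because every key is sorted, no genotype has to be named explicitly, so the same lines should also cover pairs such as "CG" and "GC".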
In the first reply, what was calculated was the overall sums by group
(genotypes). It does not work for a larger database.
I am quite new to R, and I worked on your question just to learn how
to manipulate data with R.
The following seems to work. The code could be made a lot more elegant and
straightforward, but it works:
Let's try with a matrix "b" that contains more rows than in your example:
b <- matrix(1:32, ncol = 4)
rownames(b) <- rep(c("AA", "AT", "TA", "TT"), 2)
key <- rownames(b)
key[key == "AT"] <- "TA"
rownames(b) <- key
for (i in 1:(nrow(b) - 1)) {
  if (rownames(b)[i] == "TA" && rownames(b)[i + 1] == "TA") {
    b[i, ] <- colSums(b[i:(i + 1), ])  # sums the two rows
    b[i + 1, ] <- NA                   # replaces the used row by NA values
  }
}
b <- b[order(b[, 1], na.last = NA), ]  # removes the rows with NA values
Of course, the rows are reordered, and that may not be wanted. The ordering
was just to remove the NA rows.
Rock :-D
View this message in context: http://www.nabble.com/Looking-for-a-quick-way-to-combine-rows-in-a-matrix-tp23491348p23513599.html Sent from the R help mailing list archive at Nabble.com.
Hello,
I revised my code, and I hope it now works for any number of successive "TA"
rows:
b <- matrix(1:64, ncol = 4)
rownames(b) <- rep(c("AA", "AT", "TA", "TT"), each = 4)
key <- rownames(b)
key[key == "AT"] <- "TA"
c <- b
rownames(c) <- key
for (i in 2:nrow(c)) {
  if (rownames(c)[i] == "TA" && rownames(c)[i - 1] == "TA") {
    c[i, ] <- colSums(c[i:(i - 1), ])  # sums the rows
    c[i - 1, ] <- NA                   # replaces the used row by NA values
  }
}
c <- c[apply(c, 1, function(x) any(!is.na(x))), ]  # removes the rows with NA values
c
Rock
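The loop above merges each run of successive "TA" rows into one row. A loop-free sketch of the same idea is given below, under the assumption that only adjacent "TA" rows should be merged and every other row kept as is. The names "new_run", "run_id" and "res" are made up for the illustration, and rowsum() with reorder = FALSE is a choice of this sketch, not something proposed in the thread.

```r
b <- matrix(1:64, ncol = 4)
rownames(b) <- rep(c("AA", "AT", "TA", "TT"), each = 4)
key <- rownames(b)
key[key == "AT"] <- "TA"

# A row starts a new run unless it and the row before it are both "TA"
n <- length(key)
new_run <- c(TRUE, !(key[-1] == "TA" & key[-n] == "TA"))
run_id <- cumsum(new_run)

# Sum the rows within each run, keeping the runs in their original order
res <- rowsum(b, run_id, reorder = FALSE)
rownames(res) <- key[new_run]
res
```

The result has nine rows here: the four "AA" rows, one "TA" row holding the sum of the eight successive "TA" rows, and the four "TT" rows, all in their original order.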
Rocko22 wrote:
In the first reply, what was calculated was the overall sums by group
(genotypes). It does not work for a larger database.
I am quite new to R, and I worked on your question just to learn
how to manipulate data with R.
The following seems to work. The code could be made a lot more elegant and
straightforward, but it works only when there are no more than two
successive "TA" rows:
Let's try with a matrix "b" that contains more rows than in your example:
b <- matrix(1:32, ncol = 4)
rownames(b) <- rep(c("AA", "AT", "TA", "TT"), 2)
key <- rownames(b)
key[key == "AT"] <- "TA"
rownames(b) <- key
for (i in 1:(nrow(b) - 1)) {
  if (rownames(b)[i] == "TA" && rownames(b)[i + 1] == "TA") {
    b[i, ] <- colSums(b[i:(i + 1), ])  # sums the two rows
    b[i + 1, ] <- NA                   # replaces the used row by NA values
  }
}
b <- b[order(b[, 1], na.last = NA), ]  # removes the rows with NA values
Of course, the rows are reordered, and that may not be wanted. The
ordering was just to remove the NA rows.
Rock :-D
You can automate this step
key[key == "AT"] <- "TA"
## create a function to reverse a string -- see the strsplit help page for this strReverse function
reverse <- function(x) sapply(lapply(strsplit(x, NULL), rev), paste, collapse = "")
key <- rownames(a)
# combine rownames with reverse (rownames)
n <- cbind(key, rev = reverse(key))
     key  rev
[1,] "AA" "AA"
[2,] "AT" "TA"
[3,] "TA" "AT"
[4,] "TT" "TT"
# Now just sort the values in the rows (apply returns column vectors, so I also use t())
# and then run do.call on the first column
n <- t(apply(n, 1, sort))
do.call(rbind, by(a, n[, 1], colSums))
   V1 V2 V3 V4
AA  1  5  9 13
AT  5 13 21 29
TT  4  8 12 16
I often need to combine reverse complement DNA strings, so you could do that
too
# DNA complement
comp <- function(x) chartr("ACGT", "TGCA", x)
n <- cbind(key, rev = reverse(comp(key)))
n <- t(apply(n, 1, sort))
do.call(rbind, by(a, n[, 1], colSums))
   V1 V2 V3 V4
AA  5 13 21 29
AT  2  6 10 14
TA  3  7 11 15
Chris Stubben
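As a side note to the message above, the cbind() and apply(..., sort) steps can probably be shortened with pmin(), which also accepts character vectors and keeps, element by element, whichever string sorts first. The sketch below is only an illustration: the names "rev_key", "rc_key", "by_reverse" and "by_revcomp" are made up, and rowsum() is swapped in for by() plus colSums().

```r
a <- matrix(1:16, nrow = 4)
rownames(a) <- c("AA", "AT", "TA", "TT")
key <- rownames(a)

# Same helpers as in the message above
reverse <- function(x) sapply(lapply(strsplit(x, NULL), rev), paste, collapse = "")
comp <- function(x) chartr("ACGT", "TGCA", x)

# Keep the alphabetically smaller of each key and its reverse
rev_key <- pmin(key, reverse(key))       # "AA" "AT" "AT" "TT"
# Keep the alphabetically smaller of each key and its reverse complement
rc_key <- pmin(key, reverse(comp(key)))  # "AA" "AT" "TA" "AA"

# Sum the rows that share a key (rowsum() is an assumption of this sketch)
by_reverse <- rowsum(a, rev_key)
by_revcomp <- rowsum(a, rc_key)
```

Both results should contain the same sums as the two tables in the message, without the V1..V4 column names that by() introduces.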
jholtman wrote:
Try this:
key <- rownames(a)
key[key == "AT"] <- "TA"
do.call(rbind, by(a, key, colSums))
   V2 V3 V4 V5
AA  1  5  9 13
TA  5 13 21 29
TT  4  8 12 16
On Mon, May 11, 2009 at 4:53 PM, Crosby, Jacy R <Jacy.R.Crosby at uth.tmc.edu> wrote:
I'm working with genotype data in a frequency table:
a=matrix(1:16, nrow=4)
rownames(a)=c("AA","AT","TA","TT")
a
   [,1] [,2] [,3] [,4]
AA    1    5    9   13
AT    2    6   10   14
TA    3    7   11   15
TT    4    8   12   16
'AT' and 'TA' are essentially the same, and I'd like to combine (add) the rows to reflect this. The final matrix should be:
   [,1] [,2] [,3] [,4]
AA    1    5    9   13
AT    5   13   21   29
TT    4    8   12   16
Is there a fast way to do this?
Thanks in advance!
Jacy Crosby
jacy.r.crosby at uth.tmc.edu