Paste every two columns together

8 messages · Kate Ignatius, Jim Lemon, JSHuang +4 more

#
I have genetic data as follows (simple example, actual data is much larger):

comb =

ID1 A A T G C T G C G T C G T A

ID2 G C T G C C T G C T G T T T

And I wish to get an output like this:

ID1 AA TG CT GC GT CG TA

ID2 GC TG CC TG CT GT TT

That is, paste every two columns together.

I have this code, but I get the error:

Error in seq.default(2, nchar(x), 2) : 'to' must be of length 1

conc <- function(x) {
  s <- seq(2, nchar(x), 2)
  paste0(x[s], x[s+1])
}

combn <- as.data.frame(lapply(comb, conc), stringsAsFactors=FALSE)

Thanks in advance!
#
Hi Kate,
Maybe you want:

seq(2,length(x),by=2)
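
A short note on why this helps, as a sketch on one row only. nchar() returns one character count per element, so when x has more than one element, seq() receives a vector for 'to' and stops with the error quoted above. length() returns a single number. The vector x below is just ID1's row typed in by hand, with the ID in position 1 (an assumption about the layout):

```r
# One row of the example data, ID first (typed in by hand for illustration)
x <- c("ID1", "A", "A", "T", "G", "C", "T", "G", "C", "G", "T", "C", "G", "T", "A")

nchar(x)    # one count per element (3 1 1 1 ...), not a single number
length(x)   # 15, a single number, which is what seq() needs for 'to'

s <- seq(2, length(x), by = 2)   # 2, 4, ..., 14: first allele of each pair
pairs <- paste0(x[s], x[s + 1])
pairs
# [1] "AA" "TG" "CT" "GC" "GT" "CG" "TA"
```

Note that lapply() on a data frame walks over columns, not rows, so conc() would still need to be applied row by row.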

Jim
On Thu, Jan 29, 2015 at 10:55 AM, Kate Ignatius <kate.ignatius at gmail.com> wrote:
#
Hi, 

  Here is my implementation:
+ odd <- x[1:length(x) %% 2 == 1]
+ even <- x[1:length(x) %%2 == 0]
+ paste0(odd,even)}
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r"
"s" "t" "u" "v" "w" "x"
[1] "ab" "cd" "ef" "gh" "ij" "kl" "mn" "op" "qr" "st" "uv" "wx"
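
The function header and the two calls seem to have been lost from the archived message above. A possible reconstruction follows; the name paste_pairs is hypothetical, and letters[1:24] as input is inferred from the printed output, so treat both as assumptions:

```r
# Hypothetical name; the body is taken from the "+" lines above
paste_pairs <- function(x) {
  odd  <- x[1:length(x) %% 2 == 1]   # elements 1, 3, 5, ...
  even <- x[1:length(x) %% 2 == 0]   # elements 2, 4, 6, ...
  paste0(odd, even)
}

letters[1:24]                # the 24 letters "a" to "x"
paste_pairs(letters[1:24])   # "ab" "cd" ... "wx"
```

This relies on ":" binding tighter than "%%" in R, so 1:length(x) %% 2 is read as (1:length(x)) %% 2.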



#
I am using just the first row of your data (i.e. ID1).

 > ID1 <- c("A", "A", "T", "G", "C", "T", "G", "C", "G", "T", "C", "G", "T", "A")
 > do.call(c,lapply(tapply(ID1, gl(7,2), c), paste, collapse=""))
     1    2    3    4    5    6    7
 "AA" "TG" "CT" "GC" "GT" "CG" "TA"

Is this what you are looking for?  I hope this helps.

Chel Hee Lee
On 01/28/2015 05:55 PM, Kate Ignatius wrote:
#
eek!

Chel Hee, anything that complicated should engender fear and trembling.

Much simpler and more efficient (if I understand correctly):

i <- seq.int(1L,length(ID1),by = 2L)
paste0(ID1[i],ID1[i+1])

That gives a vector of paired letters. If you want a single character
string, just collapse with a " " (space):

paste0(ID1[i],ID1[i+1],collapse= " ")
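
The same indexing can probably be stretched over the whole table at once. A minimal sketch, assuming the first column holds the ID and the remaining columns come in an even number; the toy data frame and the names alleles, first, pairs and combined are made up for illustration:

```r
# Toy version of the example table; data.frame() names the allele columns X1..X14
comb <- data.frame(
  ID = c("ID1", "ID2"),
  matrix(c("A", "A", "T", "G", "C", "T", "G", "C", "G", "T", "C", "G", "T", "A",
           "G", "C", "T", "G", "C", "C", "T", "G", "C", "T", "G", "T", "T", "T"),
         nrow = 2, byrow = TRUE),
  stringsAsFactors = FALSE
)

alleles <- as.matrix(comb[-1])                   # drop the ID column
first   <- seq.int(1L, ncol(alleles), by = 2L)   # columns 1, 3, 5, ...

# paste0() drops the matrix shape, so rebuild it with one row per ID
pairs <- matrix(paste0(alleles[, first], alleles[, first + 1L]),
                nrow = nrow(alleles))

combined <- data.frame(ID = comb$ID, pairs, stringsAsFactors = FALSE)
combined
```

Working on the matrix avoids a loop over rows, which may matter when the real data is much larger.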

Cheers,
Bert

Bert Gunter
On Wed, Jan 28, 2015 at 7:41 PM, Chel Hee Lee <chl948 at mail.usask.ca> wrote:
#
Hi Bert! Yes, you are VERY correct!!! Why am I making this simple thing
so complicated??? ;) Thank you so much for your nice lesson!

Chel Hee Lee
On 01/28/2015 09:59 PM, Bert Gunter wrote:
#
Kate, here's a solution that uses regular expressions, rather than vector manipulation:
[1] "ID1 AA TG CT GC GT CG TA"
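
The code for this message did not survive in the archive, only its output. One regular expression that gives the same result is sketched below; it is a guess, not necessarily what John posted. It assumes each row is a single space-separated string, the alleles are single letters from A, C, G, T, and the ID does not end in one of those letters:

```r
# One row as a single string (assumed input format)
line <- "ID1 A A T G C T G C G T C G T A"

# Remove the space inside each consecutive pair of single-letter alleles
out <- gsub("([ACGT]) ([ACGT])", "\\1\\2", line)
out
# [1] "ID1 AA TG CT GC GT CG TA"
```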

-John
#
Hi:

Don't know about performance, but this is fairly simple for operating
on atomic vectors:

x <- c("A", "A", "G", "T", "C", "G")
apply(embed(x, 2), 1, paste0, collapse = "")
[1] "AA" "GA" "TG" "CT" "GC"

Check the help page of embed() for details.
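
One caveat, going by the output above: embed() builds overlapping windows and puts the later element first in each row, so the result has five entries and "GA" stands for x[3] followed by x[2]. For the non-overlapping pairs asked for in the original question, one option is to keep every other row and swap the columns. A sketch using the same x (the names m, keep and pairs are only for illustration):

```r
x <- c("A", "A", "G", "T", "C", "G")

m <- embed(x, 2)                   # row i is c(x[i + 1], x[i])
keep <- seq(1, nrow(m), by = 2)    # rows 1, 3, 5: windows that do not overlap

pairs <- paste0(m[keep, 2], m[keep, 1])   # put the earlier element first
pairs
# [1] "AA" "GT" "CG"
```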

Dennis
On Wed, Jan 28, 2015 at 3:55 PM, Kate Ignatius <kate.ignatius at gmail.com> wrote: