
How to concatenate several rows according to a column?

6 messages · tgodoy, Jean V Adams, arun +2 more

#
Hi, I'm a new user of R and I am trying to concatenate several rows according
to the value in a column.

This is my data.frame. I want to concatenate its rows according to
column "b" and make a new data.frame with the information in the other
columns.
    a        b    c      d
1   E001234  TSA  IP234  like_domain
2   E001234  TSB  IP234  like_domain
3   E001234  TSC  IP234  like_domain
4   E001234  TSD  IP234  like_domain
5   E001235  TSA  IP235  two_domain
6   E001235  TSD  IP235  two_domain
7   E001235  TSS  IP235  two_domain
8   E001236  TSP  IP236  like_domain
9   E001236  TST  IP236  like_domain
10  E001237  TSV  IP237  simple_domain

I want my table to look like this:
   a        b                   c      d
1  E001234  TSA, TSB, TSC, TSD  IP234  like_domain
2  E001235  TSA, TSD, TSS       IP235  two_domain
3  E001236  TSP, TST            IP236  like_domain
4  E001237  TSV                 IP237  simple_domain

How can I do this in R? 

Thanks 




--
View this message in context: http://r.789695.n4.nabble.com/How-to-concatenate-a-several-rows-according-with-a-column-tp4639072.html
Sent from the R help mailing list archive at Nabble.com.
#
Hi,

You could also use ddply.
Though, the output format differs a bit from aggregate().

library(plyr)  # provides ddply() and .()
dat1<-read.table(text="
a        b    c      d
E001234  TSA  IP234  like_domain
E001234  TSB  IP234  like_domain
E001234  TSC  IP234  like_domain
E001234  TSD  IP234  like_domain
E001235  TSA  IP235  two_domain
E001235  TSD  IP235  two_domain
E001235  TSS  IP235  two_domain
E001236  TSP  IP236  like_domain
E001236  TST  IP236  like_domain
E001237  TSV  IP237  simple_domain
",sep="",header=TRUE,stringsAsFactors=FALSE)
dat2<-ddply(dat1,.(a,c,d), paste,sep=",")
dat2<-dat2[,c(1,5,2:3)]

colnames(dat2)<-colnames(dat1)
dat2
        a                             b     c             d
1 E001234 c("TSA", "TSB", "TSC", "TSD") IP234   like_domain
2 E001235        c("TSA", "TSD", "TSS") IP235    two_domain
3 E001236               c("TSP", "TST") IP236   like_domain
4 E001237                           TSV IP237 simple_domain
A.K.
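
The c("TSA", ...) strings in the output above appear to come from how paste() treats a data frame: each column is converted to one deparsed string instead of being joined element by element. Below is a minimal base-R sketch of that behaviour on a made-up two-row group. The names piece, deparsed and collapsed are illustrative only and do not come from the thread.

```r
# Hypothetical single group, similar to what ddply hands to the function.
piece <- data.frame(b = c("TSA", "TSB"), stringsAsFactors = FALSE)

# paste() on the whole data frame: one deparsed string per column.
deparsed <- paste(piece, sep = ",")

# paste() on the column itself with collapse: one joined string.
collapsed <- paste(piece$b, collapse = ", ")

print(deparsed)
print(collapsed)
```

Joining with collapse inside each group would avoid the need to clean up the deparsed text afterwards.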

----- Original Message -----
From: tgodoy <tingola_07 at hotmail.com>
To: r-help at r-project.org
Cc: 
Sent: Friday, August 3, 2012 1:00 PM
Subject: [R] How to concatenate a several rows according with a column ?

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Hi,
Try this:
#Reformatted the ddply results to match your desired output.
library(plyr)  # provides ddply() and .()
dat1<-read.table(text="
a        b    c      d
E001234  TSA  IP234  like_domain
E001234  TSB  IP234  like_domain
E001234  TSC  IP234  like_domain
E001234  TSD  IP234  like_domain
E001235  TSA  IP235  two_domain
E001235  TSD  IP235  two_domain
E001235  TSS  IP235  two_domain
E001236  TSP  IP236  like_domain
E001236  TST  IP236  like_domain
E001237  TSV  IP237  simple_domain
",sep="",header=TRUE,stringsAsFactors=FALSE)

dat2<-ddply(dat1,.(a,c,d), paste,sep=",")
dat2<-dat2[,c(1,5,2:3)]
colnames(dat2)<-colnames(dat1)
#Strip the leading c( , the quotes and the trailing ) left by paste(); the
#anchors keep a letter "c" inside ordinary values from being removed.
dat3<-data.frame(sapply(dat2,function(x) gsub("^c\\(|\\\"|\\)$","",x)))
dat3
#        a                  b     c             d
#1 E001234 TSA, TSB, TSC, TSD IP234   like_domain
#2 E001235      TSA, TSD, TSS IP235    two_domain
#3 E001236           TSP, TST IP236   like_domain
#4 E001237                TSV IP237 simple_domain

A.K.
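
The cleanup step depends on the regular expression passed to gsub(). As a hedged illustration, the sketch below compares a pattern that matches any letter "c" with one anchored to the leading c( and the trailing ). The third test value is hypothetical and was added only to show the difference. The names x, loose and strict are not from the thread.

```r
# Sample values; "simple_domain_c" is a made-up value containing a "c".
x <- c('c("TSA", "TSB")', "TSV", "simple_domain_c")

# Unanchored: removes every "c", "(", quote and ")" wherever it occurs.
loose <- gsub("c|\\(|\\\"|\\)", "", x)

# Anchored: removes only a leading c( , any quotes, and a trailing ).
strict <- gsub("^c\\(|\"|\\)$", "", x)

print(loose)
print(strict)
```

Both patterns give the same result on the data in this thread, because none of its values contains a lowercase "c".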


#
On Aug 3, 2012, at 10:00 AM, tgodoy wrote:

            
table1 <- read.table(text="
         a                    b         c             d
1      E001234      TSA    IP234   like_domain
2      E001234      TSB    IP234   like_domain
3      E001234      TSC    IP234   like_domain
4      E001234      TSD    IP234   like_domain
5      E001235      TSA    IP235   two_domain
6      E001235      TSD    IP235   two_domain
7      E001235      TSS    IP235   two_domain
8      E001236      TSP    IP236   like_domain
9      E001236      TST    IP236   like_domain
10    E001237      TSV    IP237   simple_domain",header=TRUE,stringsAsFactors=FALSE)

 > aggrdat <- with(table1, aggregate(b, list(a,c,d), FUN=paste, sep=",") )
 > names(aggrdat) <- names(table1)[c(2:4,1)]
 > aggrdat
         b     c             d                  a
1 E001234 IP234   like_domain TSA, TSB, TSC, TSD
2 E001236 IP236   like_domain           TSP, TST
3 E001237 IP237 simple_domain                TSV
4 E001235 IP235    two_domain      TSA, TSD, TSS

Swapping the column position is left as an exercise.
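
One detail worth noting about the aggregate() call: with a single vector argument, paste(x, sep=",") returns the vector unchanged, so the grouped values are stored as they are and not joined into one string. A sketch of a variant that uses collapse to produce one string per group follows. It rebuilds the thread's data with data.frame() so that it is self-contained. The names unchanged and joined are illustrative assumptions.

```r
# The data from the thread, rebuilt without read.table().
n <- c(4, 3, 2, 1)
table1 <- data.frame(
  a = rep(c("E001234", "E001235", "E001236", "E001237"), times = n),
  b = c("TSA", "TSB", "TSC", "TSD", "TSA", "TSD", "TSS", "TSP", "TST", "TSV"),
  c = rep(c("IP234", "IP235", "IP236", "IP237"), times = n),
  d = rep(c("like_domain", "two_domain", "like_domain", "simple_domain"), times = n),
  stringsAsFactors = FALSE
)

# sep only separates different arguments; one vector stays as it is.
unchanged <- paste(c("TSA", "TSB"), sep = ",")

# collapse joins the elements of each group into a single string.
joined <- aggregate(b ~ a + c + d, data = table1, FUN = paste, collapse = ", ")
print(joined)
```

With collapse, column b is an ordinary character column, which is usually easier to write to a file or to merge.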
#
Hello,

Inline.

On 04-08-2012 09:34, David Winsemius wrote:
It's a column order issue, not a names one. Using Jean's form of aggregate,


aggrdat <- aggregate(b ~ a + c + d, data = table1, paste, sep=",")
(table2 <- aggrdat[, c(1, 4, 2, 3)])

Hope this helps,

Rui Barradas
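
The reordering in the reply above selects columns by position. As a small sketch of an alternative, the columns can be selected by name, which does not depend on where aggregate() happens to place them. The aggrdat below is a hand-built stand-in with two rows and is not the real aggregate() result.

```r
# Stand-in for the aggregated result, with columns in the order a, c, d, b.
aggrdat <- data.frame(
  a = c("E001234", "E001237"),
  c = c("IP234", "IP237"),
  d = c("like_domain", "simple_domain"),
  b = c("TSA, TSB, TSC, TSD", "TSV"),
  stringsAsFactors = FALSE
)

# Select by name so that the result follows the original a, b, c, d layout.
table2 <- aggrdat[, c("a", "b", "c", "d")]
print(table2)
```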