Performing multiplication on duplicate values
Hi Jorge,

I think this will be much more useful when you have triplicates or more.

dat1 <- read.table(text="
Reference_Position Reference_Allele Variant_Allele Variant_Frequency AAC_Variant
16   G  A  91.833811  Val6Met,Met
52   G  A  93.969466  Val18Ile
64   G  T  94.155381  Val22Phe
73   C  A  94.293478  Gln25Lys
131  G  A  94.268168  Arg44Lys
64   G  A  92.947658  Ser48Asn
72   G  A  85.9468    Gln25Phe
62   C  A  92.6583    Arg42Lys
72   G  T  96.86688   Ser48Lys
72   G  C  94.8488    Arg42ln
",sep="",header=TRUE,stringsAsFactors=FALSE)
dat2 <- dat1[,c(1,4)][duplicated(dat1$Reference_Position)|duplicated(dat1$Reference_Position,fromLast=TRUE),]
dat3 <- do.call(rbind,lapply(split(dat2,dat2$Reference_Position),function(x) prod(x[,2])/100))
dat4 <- data.frame(Reference_Position=row.names(dat3),Value=dat3)
row.names(dat4) <- 1:nrow(dat4)
merge(dat1[,c(1,4)],dat4,by="Reference_Position",all=TRUE)
#   Reference_Position Variant_Frequency      Value
#1                  16          91.83381         NA
#2                  52          93.96947         NA
#3                  62          92.65830         NA
#4                  64          94.15538   87.51522
#5                  64          92.94766   87.51522
#6                  72          85.94680 7896.54044
#7                  72          96.86688 7896.54044
#8                  72          94.84880 7896.54044
#9                  73          94.29348         NA
#10                131          94.26817         NA

A.K.

----- Original Message -----
From: Jorge Dinis <jorgemdinis at gmail.com>
To: arun <smartpink111 at yahoo.com>
Cc:
Sent: Monday, October 29, 2012 1:17 PM
Subject: Re: [R] Performing multiplication on duplicate values

Thank you very much, I will give this a try!

JD
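For readers who want the logic of the code above spelled out, here is a small sketch of the two idioms it relies on: flagging every member of a duplicated group, and multiplying within groups. The vectors `pos` and `freq` are copied from the example data; the names `later`, `earlier`, `is_dup`, `groups` and `value` are made up for illustration and do not appear in the thread.

```r
# Reference_Position and Variant_Frequency from the ten-row example
pos  <- c(16, 52, 64, 73, 131, 64, 72, 62, 72, 72)
freq <- c(91.833811, 93.969466, 94.155381, 94.293478, 94.268168,
          92.947658, 85.9468, 92.6583, 96.86688, 94.8488)

# duplicated() on its own flags only the second and later occurrences,
# so the first row of each repeated position would be missed.
later   <- duplicated(pos)
# Scanning from the end flags every occurrence except the last one.
earlier <- duplicated(pos, fromLast = TRUE)
# Combining both flags every row whose position occurs more than once.
is_dup  <- later | earlier
pos[is_dup]

# split() groups the flagged frequencies by position; prod() multiplies
# all values in a group, however many there are. Dividing by 100
# mirrors what the posted code does.
groups <- split(freq[is_dup], pos[is_dup])
value  <- sapply(groups, prod) / 100
value
```

Because `prod()` takes the whole group, the same lines handle pairs, triplicates, or larger groups without change, which seems to be the point of using it instead of `x[1,2]*x[2,2]`.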
On Oct 29, 2012, at 12:15 PM, arun wrote:
Hi,

Maybe this helps:

dat1 <- read.table(text="
Reference_Position Reference_Allele Variant_Allele Variant_Frequency AAC_Variant
16   G  A  91.833811  Val6Met,Met
52   G  A  93.969466  Val18Ile
64   G  T  94.155381  Val22Phe
73   C  A  94.293478  Gln25Lys
131  G  A  94.268168  Arg44Lys
64   G  A  92.947658  Ser48Asn
72   G  A  85.9468    Gln25Phe
62   C  A  92.6583    Arg42Lys
72   G  T  96.86688   Ser48Lys
",sep="",header=TRUE,stringsAsFactors=FALSE)
dat2 <- dat1[,c(1,4)][duplicated(dat1$Reference_Position)|duplicated(dat1$Reference_Position,fromLast=TRUE),]
dat3 <- do.call(rbind,lapply(split(dat2,dat2$Reference_Position),function(x) (x[1,2]*x[2,2])/100))
dat4 <- data.frame(Reference_Position=row.names(dat3),Value=dat3)
row.names(dat4) <- 1:nrow(dat4)
merge(dat1[,c(1,4)],dat4,by="Reference_Position",all=TRUE)
#  Reference_Position Variant_Frequency    Value
#1                 16          91.83381       NA
#2                 52          93.96947       NA
#3                 62          92.65830       NA
#4                 64          94.15538 87.51522
#5                 64          92.94766 87.51522
#6                 72          85.94680 83.25398
#7                 72          96.86688 83.25398
#8                 73          94.29348       NA
#9                131          94.26817       NA

A.K.

----- Original Message -----
From: JDINIS <jorgemdinis at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Monday, October 29, 2012 12:27 PM
Subject: [R] Performing multiplication on duplicate values

Hello all,
thank you for your help. The task I need to perform is difficult to explain, so I apologize ahead of time for any confusion.

I have a data frame with these colnames(): Reference_Position, Reference_Allele, Variant_Allele, Variant_Frequency, AAC_Variant. If a value is duplicated in the "Reference_Position" column, I would like both of their "Variant_Frequency" values multiplied and inserted into a new data frame. Example below: reference position 64 is repeated, so multiply 94.155381 by 92.947658 and insert the value into "Value".

Because I do not know how to best explain this or phrase it as a logical question, here is a before and after. Also, if you could explain the code a bit, that would be appreciated as well: teach a person to fish!

Before
dat
[Reference_Position] [Reference_Allele] [Variant_Allele] [Variant_Frequency] [AAC_Variant]
                  16                  G                A           91.833811   Val6Met,Met
                  52                  G                A           93.969466      Val18Ile
                  64                  G                T           94.155381      Val22Phe
                  73                  C                A           94.293478      Gln25Lys
                 131                  G                A           94.268168      Arg44Lys
                  64                  G                A           92.947658      Ser48Asn

After
dat
[Reference_Position] [Variant_Frequency]   [Value]
                  16           91.833811        NA
                  52           93.969466        NA
                  64           94.155381 85.152215
                  73           94.293478        NA
                 131           94.268168        NA
                  64           92.947658 85.152215

--
View this message in context: http://r.789695.n4.nabble.com/Performing-multiplication-on-duplicate-values-tp4647772.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
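The "After" table in the original question keeps the rows in their original order, whereas merge() sorts its result by Reference_Position by default. As a possible alternative, which is not what was posted in the thread, base R's ave() can compute a per-group value in place, so the rows never move. This is only a sketch: the data frame `dat` rebuilds the six-row "Before" example, the helper names `n` and `p` are hypothetical, and dividing by 100 simply follows the replies.

```r
# The six-row "Before" example, reduced to the two columns that are used
dat <- data.frame(
  Reference_Position = c(16, 52, 64, 73, 131, 64),
  Variant_Frequency  = c(91.833811, 93.969466, 94.155381,
                         94.293478, 94.268168, 92.947658)
)

# ave() applies FUN within each Reference_Position group and returns a
# vector aligned with the original rows.
n <- ave(dat$Variant_Frequency, dat$Reference_Position, FUN = length)  # group size
p <- ave(dat$Variant_Frequency, dat$Reference_Position, FUN = prod)    # group product

# Positions that occur only once get NA, as in the requested output;
# repeated positions get the product divided by 100.
dat$Value <- ifelse(n > 1, p / 100, NA)
dat
```

Both rows for position 64 receive the same Value, 94.155381 * 92.947658 / 100, and with three or more rows per position `p` would be the product of all of them, as in the prod() version.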