Hi,
Maybe this helps:
dat1<-read.table(text="
Reference_Position Reference_Allele Variant_Allele Variant_Frequency AAC_Variant
                16                G              A         91.833811 Val6Met,Met
                52                G              A         93.969466 Val18Ile
                64                G              T         94.155381 Val22Phe
                73                C              A         94.293478 Gln25Lys
               131                G              A         94.268168 Arg44Lys
                64                G              A         92.947658 Ser48Asn
                72                G              A         85.9468   Gln25Phe
                62                C              A         92.6583   Arg42Lys
                72                G              T         96.86688  Ser48Lys
",sep="",header=TRUE,stringsAsFactors=FALSE)
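# Keep columns 1 and 4 for the rows whose Reference_Position occurs more than once
dat2<-dat1[,c(1,4)][duplicated(dat1$Reference_Position)|duplicated(dat1$Reference_Position,fromLast=TRUE),]
# Per duplicated position: multiply the first two frequencies and divide by 100
dat3<-do.call(rbind,lapply(split(dat2,dat2$Reference_Position),function(x) (x[1,2]*x[2,2])/100))
# Turn the one-column matrix into a data frame keyed by position
dat4<-data.frame(Reference_Position=row.names(dat3),Value=dat3)
row.names(dat4)<-1:nrow(dat4)
# Join back to the original columns; positions without a duplicate get NA
merge(dat1[,c(1,4)],dat4,by="Reference_Position",all=TRUE)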
#  Reference_Position Variant_Frequency    Value
#1                 16          91.83381       NA
#2                 52          93.96947       NA
#3                 62          92.65830       NA
#4                 64          94.15538 87.51522
#5                 64          92.94766 87.51522
#6                 72          85.94680 83.25398
#7                 72          96.86688 83.25398
#8                 73          94.29348       NA
#9                131          94.26817       NA
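
In case a position shows up more than twice, here is a rough sketch of the same idea with ave() and prod(), which handles groups of any size and keeps the original row order. Dividing by 100^(n-1) is only my assumption for how to rescale a product of n percentages (for n = 2 it matches the code above); the small dat1 below just repeats the two relevant columns so the snippet runs on its own.

```r
# Only the two columns needed, copied from the example data
dat1 <- data.frame(
  Reference_Position = c(16, 52, 64, 73, 131, 64, 72, 62, 72),
  Variant_Frequency  = c(91.833811, 93.969466, 94.155381, 94.293478,
                         94.268168, 92.947658, 85.9468, 92.6583, 96.86688)
)

# Number of rows sharing each Reference_Position
n <- ave(dat1$Variant_Frequency, dat1$Reference_Position, FUN = length)

# Product of the frequencies within each position.
# Assumption: rescale by 100^(n-1) so the result is still a percentage.
p <- ave(dat1$Variant_Frequency, dat1$Reference_Position,
         FUN = function(x) prod(x) / 100^(length(x) - 1))

# Positions that occur only once get NA
dat1$Value <- ifelse(n > 1, p, NA)
dat1
```

ave() returns one value per row, so no merge() is needed afterwards.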
A.K.
----- Original Message -----
From: JDINIS <jorgemdinis at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Monday, October 29, 2012 12:27 PM
Subject: [R] Performing multiplication on duplicate values
Hello all, thank you for your help. The task I need to perform is difficult
to explain, so I apologize ahead of time for any confusion.
I have a data frame with these colnames(): Reference_Position,
Reference_Allele, Variant_Allele, Variant_Frequency, AAC_Variant. If a
value is duplicated in the "Reference_Position" column, I would like both of
their "Variant_Frequency" values multiplied and inserted into a new data
frame.
Example below: reference position 64 is repeated, so multiply 94.155381 by
92.947658 and insert the value into "Value".
Because I do not know how best to explain this or phrase it as a logical
question, here is a before and after. Also, if you could explain the code a
bit, that would be appreciated as well: teach a person to fish!
Before