quick matching question
4 messages · Ben Ganzfried, Sarah Goslee, Marc Schwartz
Looks like a job for merge().
On Fri, Oct 28, 2011 at 10:49 AM, Ben Ganzfried <ben.ganzfried at gmail.com> wrote:
Hey,
I'm trying to match patient identifiers from two separate input files, and
then add information from one of the input files to the corresponding output
file. I'd greatly appreciate any help!
More specifically,
Input_File_1 has a column header "bcr_patient_barcode"
Input_File_2 has a column header "Barcode" and a column header "Batch"
I want my script to match the appropriate patient identifiers since
"bcr_patient_barcode" and "Barcode" are not in the same order. Then I want
to add the information from "Batch" to the corresponding patient.
My (incorrect) code is below:
#batch
tmp <- Input_File_2$Barcode
tmp1 <- Input_File_1$bcr_patient_barcode
for i in tmp
  for item in tmp1
    if (tmp == tmp1) {
      curated$batch <- Input_File_2$Batch
    }
Sarah Goslee http://www.functionaldiversity.org
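For comparison, the per-patient lookup that the loop in the question attempts can be written without a loop. This is only a minimal sketch using base R's match(), which is a different technique from the merge() approach suggested in this thread. The toy data values are invented, and `curated` is assumed here to be a data frame with one row per patient in the same order as Input_File_1.

```r
# Toy stand-ins for the two input files (values are made up for illustration)
Input_File_1 <- data.frame(bcr_patient_barcode = c("P3", "P1", "P2"),
                           stringsAsFactors = FALSE)
Input_File_2 <- data.frame(Barcode = c("P1", "P2", "P3"),
                           Batch   = c(10, 20, 30),
                           stringsAsFactors = FALSE)

# Assumption: curated holds one row per patient, in Input_File_1 order
curated <- Input_File_1

# For each bcr_patient_barcode, find its row position in Input_File_2$Barcode
# (NA when the barcode does not occur in Input_File_2)
idx <- match(curated$bcr_patient_barcode, Input_File_2$Barcode)

# Pull the Batch value from the matched rows; unmatched patients get NA
curated$batch <- Input_File_2$Batch[idx]

print(curated)
```

Unlike merge(), this keeps the rows of `curated` in their original order, which may matter if the output file has to line up with other data.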
See ?merge and then use something like:

  newDF <- merge(Input_File_2, Input_File_1, by.x = "Barcode", by.y = "bcr_patient_barcode")

Also, pay attention to the 'all', 'all.x' and 'all.y' arguments, which control whether only matching records are retained or whether non-matching records from one or both datasets are kept as well. merge() performs an "SQL-like" join operation.

HTH,

Marc Schwartz
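To make the effect of the 'all' arguments concrete, here is a small sketch of the merge() call above run on invented toy data. The barcodes and batch numbers are made up, and the result names (`inner`, `keep_patients`, `full`) are chosen only for illustration. Patient "P4" has no batch record and "P2" has a batch record but no patient entry.

```r
# Toy stand-ins for the two input files (values are made up for illustration)
Input_File_1 <- data.frame(bcr_patient_barcode = c("P3", "P1", "P4"),
                           stringsAsFactors = FALSE)
Input_File_2 <- data.frame(Barcode = c("P1", "P2", "P3"),
                           Batch   = c(10, 20, 30),
                           stringsAsFactors = FALSE)

# Default (all = FALSE): only barcodes present in both files are kept
inner <- merge(Input_File_2, Input_File_1,
               by.x = "Barcode", by.y = "bcr_patient_barcode")

# all.y = TRUE: keep every row of the second argument (Input_File_1);
# patients without a batch record get Batch = NA
keep_patients <- merge(Input_File_2, Input_File_1,
                       by.x = "Barcode", by.y = "bcr_patient_barcode",
                       all.y = TRUE)

# all = TRUE: keep non-matching rows from both files
full <- merge(Input_File_2, Input_File_1,
              by.x = "Barcode", by.y = "bcr_patient_barcode",
              all = TRUE)

print(nrow(inner))          # 2 rows: P1 and P3
print(nrow(keep_patients))  # 3 rows: P1, P3 and P4
print(nrow(full))           # 4 rows: P1, P2, P3 and P4
```

Two details to keep in mind: the key column in the result takes the name given in by.x ("Barcode" here), and merge() sorts the result on the key by default (sort = TRUE), so the row order need not match either input file.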