Hi everyone,

I am very new to R and I have a task to do. I appreciate any help. I have 3 data sets. Each data set has 4 columns. For example:

Class Comment Term Text
0     com1    aac  text1
2     com2    aax  text2
1     com3    vvx  text3

Now I need to compare the Class column across the 3 data sets and assign the most frequent class to each text. For example, if text1 is assigned to class 0 in data sets 1 and 2 but to class 2 in data set 3, then it should be assigned to class 0. If they are all the same, the class stays the same. Ideally I would keep the same format and just update the class. Is there any easy way to do this?

Thanks a lot.
Problem with comparing multiple data sets
3 messages · Mohammad Alimohammadi, John Kane, Jim Lemon
Hi Mohammad,

Welcome to the R-help list. There probably is a fairly easy way to do what you want, but I think we need a bit more background information on what you are trying to achieve. I know I'm not exactly clear on your decision rule(s). It would also be very useful to see some actual sample data in usable R format.

Have a look at these links http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example and http://adv-r.had.co.nz/Reproducibility.html for some hints on what you might want to include in your question. In particular, read up about dput() in those links and/or see ?dput. This is the generally preferred way to supply sample or illustrative data to the R-help list. It basically creates a perfect copy of the data as it exists on 'your' machine so that R-help readers see exactly what you do.

John Kane
Kingston ON Canada
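As an illustration of the dput() advice above, here is a minimal sketch. The data frame df1 is an assumption built from the example table in the question, and the names df1_text and df1_copy are hypothetical; the round trip through capture.output() is only there to show that the dput() text rebuilds the object.

```r
# Hypothetical sample data mirroring the table in the question
df1 <- data.frame(Class = c(0, 2, 1),
                  Comment = c("com1", "com2", "com3"),
                  Term = c("aac", "aax", "vvx"),
                  Text = c("text1", "text2", "text3"),
                  stringsAsFactors = FALSE)

# dput() prints R code that recreates the object;
# this printed text is what you would paste into a post
dput(df1)

# Round trip: capture the dput() text and rebuild the object from it
df1_text <- paste(capture.output(dput(df1)), collapse = "\n")
df1_copy <- eval(parse(text = df1_text))

# Compare the rebuilt copy with the original
all.equal(df1, df1_copy)
```

For a large data set, something like dput(head(df1, 20)) keeps the posted sample short.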
-----Original Message-----
From: mxalimohamma at ualr.edu
Sent: Fri, 22 May 2015 12:37:50 -0500
To: r-help at r-project.org
Subject: [R] Problem with comparing multiple data sets
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Hi Mohammad,
You know, I thought this would be fairly easy, but it wasn't really.
# three example data frames; df3 disagrees with df1 and df2 on Class
df1<-data.frame(Class=c(0,2,1),Comment=c("com1","com2","com3"),
 Term=c("aac","aax","vvx"),Text=c("text1","text2","text3"))
df2<-data.frame(Class=c(0,2,1),Comment=c("com1","com2","com3"),
 Term=c("aac","aax","vvx"),Text=c("text1","text2","text3"))
df3<-data.frame(Class=c(2,1,0),Comment=c("com1","com2","com3"),
 Term=c("aac","aax","vvx"),Text=c("text1","text2","text3"))
dflist<-list(df1,df2,df3)
dflist
# define a function that extracts the value from one field (field2)
# in the rows selected by a value in another field (field1)
extract_by_value<-function(x,field1,value1,field2) {
 return(x[x[,field1]==value1,field2])
}
# define another function that sets field2 to value2
# in every row where field1 equals value1
sub_value<-function(x,field1,value1,field2,value2) {
 x[x[,field1]==value1,field2]<-value2
 return(x)
}
conformity<-function(x,fieldname1,value1,fieldname2) {
 # get the most frequent value in fieldname2
 # for the desired value in fieldname1
 most_freq<-as.numeric(names(which.max(table(unlist(lapply(x,
  extract_by_value,fieldname1,value1,fieldname2))))))
 # now set all the values to the most frequent
 for(i in seq_along(x))
  x[[i]]<-sub_value(x[[i]],fieldname1,value1,fieldname2,most_freq)
 return(x)
}
# make the Class of "text1" agree across all three data frames
conformity(dflist,"Text","text1","Class")
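The call above only reconciles "text1". One possible way to cover every Text value in a single pass is sketched below. This is not part of the original reply: the helper names majority and majority_class are hypothetical, the data frames repeat the example values used above, and ties are assumed to go to the smallest class because which.max() returns the first maximum of the sorted table.

```r
# Example data repeated so the sketch is self-contained
make_df <- function(cls) {
  data.frame(Class = cls,
             Comment = c("com1", "com2", "com3"),
             Term = c("aac", "aax", "vvx"),
             Text = c("text1", "text2", "text3"),
             stringsAsFactors = FALSE)
}
dflist <- list(make_df(c(0, 2, 1)), make_df(c(0, 2, 1)), make_df(c(2, 1, 0)))

# Hypothetical helper: most frequent value in a vector, returned as text
# (on a tie, which.max() picks the first entry of the sorted table)
majority <- function(v) {
  counts <- table(v)
  names(counts)[which.max(counts)]
}

# Hypothetical helper: replace `field` with the majority value per `key`
majority_class <- function(dfs, key = "Text", field = "Class") {
  # stack the key and field columns of all data frames
  stacked <- do.call(rbind, lapply(dfs, function(d) d[, c(key, field)]))
  # one winning value per key, named by the key
  winners <- tapply(stacked[[field]], stacked[[key]], majority)
  # write the winning value back, matching rows by key
  lapply(dfs, function(d) {
    d[[field]] <- as.numeric(winners[as.character(d[[key]])])
    d
  })
}

result <- majority_class(dflist)
result[[3]]
```

Rows are matched by the Text value rather than by position, so the data frames do not need to be in the same row order, and the other columns are left as they were.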
Jim
On Sat, May 23, 2015 at 11:23 PM, John Kane <jrkrideau at inbox.com> wrote: