Hi all, I'm new to this forum and new to R. I have to build a tree classifier with a boolean response. When I build the tree with:
echoknn.tree <- tree(class ~ ., data=echoknn.train)
where "class" is a column of boolean values in my dataset (echoknn.train), the result is a tree whose leaf nodes are numbers in the range [0,1], which isn't what I expected. I'd like the classifier to return TRUE or FALSE. Can someone help me? Thanks. Fabio
-- View this message in context: http://r.789695.n4.nabble.com/Classifying-boolean-values-tp3579993p3579993.html Sent from the R help mailing list archive at Nabble.com.
Classifying boolean values
5 messages · Sarah Goslee, Uwe Ligges, Grifone
It's likely that class is numeric and you actually want factor (regression tree vs classification tree). str(echoknn.train) will show you. By saying, "I have to build a tree classifier" you make me think that this is a course assignment. If it is, you should perhaps talk to your instructor. If not, then a more detailed and reproducible example will usually get you a more informative answer, since it will allow people to actually run and debug your code. Sarah
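A quick way to check what Sarah describes, sketched in base R. The data frame `toy` below is a hypothetical stand-in for echoknn.train (its values are made up for illustration); the point is only how a non-factor response shows up in str().

```r
# Hypothetical stand-in for echoknn.train: the response is stored as logical
toy <- data.frame(
  epss  = c(4.7, 8.8, 13.2, 0, 10),
  class = c(TRUE, TRUE, FALSE, TRUE, FALSE)
)

str(toy)                     # the class column is listed as "logi"
print(is.factor(toy$class))  # FALSE
print(class(toy$class))      # "logical"
```

If the response is not a factor, the fitted tree is treated as a regression tree, which would explain leaf values in [0,1] rather than class labels.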
On Tue, Jun 7, 2011 at 11:47 AM, Grifone <fabio.podda at alice.it> wrote: [...]
Sarah Goslee http://www.functionaldiversity.org
Thanks, Sarah, for the response. With the command str(echoknn.train), the column "class" is shown as "logi" (which I take to mean a logical value). So, how can I handle this type of data? Thanks a lot. P.S. Yes, it is a course assignment, and I was hoping to solve this problem (which I consider just a beginner's problem) without asking my teacher.
1 day later
Convert it to a factor? Uwe Ligges
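A minimal sketch of that conversion in base R. The data frame `toy` is again a hypothetical stand-in for echoknn.train, and the explicit `levels` argument is an assumption about the desired level order (without it, factor() sorts the levels as "FALSE", "TRUE").

```r
# Hypothetical stand-in for echoknn.train with a logical response
toy <- data.frame(
  epss  = c(4.7, 8.8, 13.2, 0, 10),
  class = c(TRUE, TRUE, FALSE, TRUE, FALSE)
)

# Convert the logical column to a two-level factor
toy$class <- factor(toy$class, levels = c(TRUE, FALSE))

str(toy)                   # class is now a Factor w/ 2 levels
print(levels(toy$class))   # "TRUE" "FALSE"
print(table(toy$class))    # counts per level
```

With a factor response, a call such as tree(class ~ ., data = toy) should fit a classification tree instead of a regression tree.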
On 08.06.2011 10:44, Grifone wrote: [...]
Yes, it works! Thanks a lot! Now I have another question. When I use the tree to predict the class with the function "predict", the result is not a vector of TRUE or FALSE values (which is what I want for every row of my test set) but a sort of matrix with a weight for each of the two possible values. To make this clearer, I've copied the commands and the result below. I have two data frames, echoknn.train for growing the tree and echoknn.test for testing it; here is the str() output:
str(echoknn.test)
'data.frame': 32 obs. of 6 variables:
 $ age.at.heart.attack  : num 55 57 68 60 54 55 66 54 55 55 ...
 $ pericardical.effusion: int 0 0 0 0 0 1 0 0 0 1 ...
 $ fractional.shortening: num 0.26 0.16 0.26 0.33 0.14 ...
 $ epss                 : num 4 22 5 8 13 ...
 $ lvdd                 : num 3.42 5.75 4.31 5.25 4.49 ...
 $ wall.motion.index    : num 1 2.25 1 1 1.19 ...
str(echoknn.train)
'data.frame': 64 obs. of 7 variables:
 $ age.at.heart.attack  : num 70 65 51 62 63 46 63 70 79 59 ...
 $ pericardical.effusion: int 1 0 0 0 1 0 0 1 0 0 ...
 $ fractional.shortening: num 0.27 0.36 0.16 0.15 0.241 ...
 $ epss                 : num 4.7 8.8 13.2 0 10 ...
 $ lvdd                 : num 4.49 5.78 5.26 4.51 5.31 ...
 $ wall.motion.index    : num 2 1 1 1.41 1 ...
 $ class                : Factor w/ 2 levels "TRUE","FALSE": 1 1 1 1 1 1 1 1 1 1 ...
These are the commands:
echoknn.tree <- tree(class ~ ., data=echoknn.train)
predictedClass <- predict(echoknn.tree,echoknn.test)
but the predicted classes are:
predictedClass
     TRUE FALSE
3   1.000 0.000
5   1.000 0.000
6   1.000 0.000
8   0.875 0.125
10  1.000 0.000
16  1.000 0.000
19  1.000 0.000
26  1.000 0.000
28  1.000 0.000
30  1.000 0.000
39  1.000 0.000
41  1.000 0.000
44  1.000 0.000
59  1.000 0.000
60  1.000 0.000
62  1.000 0.000
65  1.000 0.000
72  0.600 0.400
76  1.000 0.000
79  1.000 0.000
80  1.000 0.000
83  1.000 0.000
96  1.000 0.000
114 0.875 0.125
115 0.875 0.125
117 1.000 0.000
119 0.600 0.400
120 1.000 0.000
122 1.000 0.000
125 1.000 0.000
129 0.875 0.125
131 1.000 0.000
Where am I going wrong? Thanks. Fabio
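The matrix above looks like per-class probabilities, which is what predict() returns by default for a classification tree from the tree package. A hedged sketch of two ways to get labels instead: picking the most probable column by hand in base R (using three rows copied from the output above), and, assuming the tree package is installed, asking predict() for labels with type = "class". The data frame `toy` and its columns x1 and x2 are simulated for illustration and are not part of the original data.

```r
# Three rows of class probabilities copied from the output above
prob <- matrix(c(1.000, 0.000,
                 0.875, 0.125,
                 0.600, 0.400),
               ncol = 2, byrow = TRUE,
               dimnames = list(c("3", "8", "72"), c("TRUE", "FALSE")))

# Base R: for each row, take the name of the column with the largest probability
predicted <- factor(colnames(prob)[apply(prob, 1, which.max)],
                    levels = colnames(prob))
print(predicted)

# With the tree package (only runs if it is installed)
if (requireNamespace("tree", quietly = TRUE)) {
  set.seed(1)
  # Simulated data, purely illustrative
  toy <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
  toy$class <- factor(toy$x1 > 0, levels = c(TRUE, FALSE))

  fit <- tree::tree(class ~ ., data = toy)
  print(head(predict(fit, toy)))                  # matrix of class probabilities
  print(head(predict(fit, toy, type = "class")))  # factor of class labels
}
```

Applied to the thread's objects, the second approach would read predict(echoknn.tree, echoknn.test, type = "class").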