Question about Cubist Model

Thu, Jan 12, 2017 2:37 PM

Dear All,
I am fine-tuning a Cubist model (see
https://cran.r-project.org/web/packages/Cubist/index.html)
and I am a bit puzzled by its output. On a dataset containing 275
cases, I get rules that are not mutually exclusive.
For example, in the output below, rules 2 and 3 together cover all 275
cases of the dataset, and rule 1 overlaps them partially.
Am I misunderstanding something?
Many thanks,

Lorenzo




Cubist [Release 2.07 GPL Edition]  Thu Jan 12 23:10:40 2017
---------------------------------

    Target attribute `outcome'

Read 275 cases (21 attributes) from undefined.data

Model:

  Rule 1: [204 cases, mean 0.5393324, range 0 to 2.285714, est err 0.2598495]

    if
        home_copub_after_all <= 0.7142857
        host_copub_after_all <= 1.833333
    then
        outcome = 0.1666667 + 0.9 home_copub_after_all
                  + 0.11 home_copub_before_all

  Rule 2: [259 cases, mean 0.7445303, range 0 to 3.166667, est err 0.1866440]

    if
        host_copub_after_all <= 1.833333
    then
        outcome = 0.0433333 + 0.75 home_copub_after_all
                  + 0.33 host_copub_after_all + 0.37 top_10_after_all

  Rule 3: [16 cases, mean 4.4285712, range 2.142857 to 8.857142, est err 1.0346190]

    if
        host_copub_after_all > 1.833333
    then
        outcome = 1.595 + 1.03 top_10_after_all + 0.45 home_copub_after_all


Evaluation on training data (275 cases):

    Average  |error|          0.2678023
    Relative |error|               0.38
    Correlation coefficient        0.94


    Attribute usage:
      Conds  Model

      100%    54%    host_copub_after_all
       43%   100%    home_copub_after_all
              57%    top_10_after_all
              43%    home_copub_before_all


Time: 0.0 secs
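
The case counts in the rule headers can be checked with a little arithmetic. The Python sketch below is not part of the Cubist output, and its variable names are illustrative. It sums the coverage reported for each rule and compares it with the 275 training cases. It also recomputes the attribute usage figures, assuming they are fractions of the total rule coverage weighted by case count; that assumption is not documented in this thread, but it reproduces the percentages printed above.

```python
# Cases covered by each rule, copied from the rule headers in the output.
coverage = {1: 204, 2: 259, 3: 16}
n_cases = 275

# Rules 2 and 3 have complementary conditions on host_copub_after_all,
# so together they account for every training case.
assert coverage[2] + coverage[3] == n_cases

# Total coverage is larger than the number of cases, so some cases
# must satisfy more than one rule.
total_coverage = sum(coverage.values())
overlap = total_coverage - n_cases


def usage(rules):
    """Percent of total rule coverage contributed by `rules` (assumed definition)."""
    return round(100 * sum(coverage[r] for r in rules) / total_coverage)


print("total coverage:", total_coverage, "cases counted twice:", overlap)
print("host_copub_after_all in models (rule 2):", usage([2]))
print("home_copub_after_all in conditions (rule 1):", usage([1]))
print("top_10_after_all in models (rules 2, 3):", usage([2, 3]))
```

The surplus equals the coverage of rule 1, which is what one would expect given that every case satisfying rule 1 also satisfies the single condition of rule 2.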

Mxkuhn

Thu, Jan 12, 2017 3:04 PM

It is doing the right thing. The rules are first derived from a regression tree and, in the process of pruning the rules, they can end up covering overlapping sets of cases. When rules overlap, the regression output is averaged across the active rules.

Thanks,

Max
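
To make the averaging concrete, here is a minimal Python sketch that encodes the three rules from the output above and averages the linear models of whichever rules are active for a case. It is an illustration only, not the Cubist implementation: the names `RULES`, `active_rules` and `predict` are hypothetical, the example cases are made up, and any further steps Cubist may apply (such as committees or nearest-neighbour adjustment) are not modelled.

```python
# Each rule is a (condition, linear model) pair taken from the printed output.
# Both functions take a dict mapping attribute names to numeric values.
RULES = [
    (lambda x: x["home_copub_after_all"] <= 0.7142857
               and x["host_copub_after_all"] <= 1.833333,
     lambda x: 0.1666667 + 0.9 * x["home_copub_after_all"]
               + 0.11 * x["home_copub_before_all"]),
    (lambda x: x["host_copub_after_all"] <= 1.833333,
     lambda x: 0.0433333 + 0.75 * x["home_copub_after_all"]
               + 0.33 * x["host_copub_after_all"]
               + 0.37 * x["top_10_after_all"]),
    (lambda x: x["host_copub_after_all"] > 1.833333,
     lambda x: 1.595 + 1.03 * x["top_10_after_all"]
               + 0.45 * x["home_copub_after_all"]),
]


def active_rules(case):
    """Return the 1-based numbers of the rules whose conditions hold."""
    return [i + 1 for i, (cond, _) in enumerate(RULES) if cond(case)]


def predict(case):
    """Average the outputs of all active rules, as described in the reply."""
    outputs = [model(case) for cond, model in RULES if cond(case)]
    if not outputs:
        raise ValueError("no rule covers this case")
    return sum(outputs) / len(outputs)


# Hypothetical case covered by both rule 1 and rule 2.
overlapping = {"home_copub_after_all": 0.5, "host_copub_after_all": 1.0,
               "home_copub_before_all": 1.0, "top_10_after_all": 0.0}
# Hypothetical case covered by rule 3 only.
single = {"home_copub_after_all": 1.0, "host_copub_after_all": 2.0,
          "home_copub_before_all": 0.0, "top_10_after_all": 1.0}

print(active_rules(overlapping), predict(overlapping))
print(active_rules(single), predict(single))
```

For the first case both rule 1 and rule 2 fire, so the prediction is the mean of their two linear models; for the second case only rule 3 fires and its model is used as is.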