Message-ID: <e339d4be159b6.49ba3a6b@wiscmail.wisc.edu>
Date: 2009-03-13T15:50:19Z
From: Erika Mudrak
Subject: using tree() for CART on a binary response variable
I am working on a project where we are trying to model levels of phosphorus (P) runoff on farms as a function of several variables, such as farm size in acres, slope, proximity to streams, number of animals, etc. We have grouped these explanatory variables into three classes (cheap, medium and expensive) according to how hard it is to obtain that information. We are most interested in building a model to identify "hot" farms, that is, farms with a P runoff level above a certain cutoff. For this reason we have converted the P runoff variable from its actual values into 1/0 according to the cutoff (1 = above cutoff, a hot farm; 0 = below cutoff, farm is ok). We are working with 42 observations.
Because the models will be used often by county office personnel, we would like to analyze the data with classification or regression trees, and essentially provide the county office with a decision tree to help identify farms to investigate. We would like to provide one tree for each of the variable classes (a cheap tree, a medium tree and an expensive tree).
First we ran the tree() command on the data with the 1/0 response, and it ran for all three variable classes, but I later realized that the results were regression trees, because the 1/0 response was being treated as numeric. We then ran the tree() command with the response variable wrapped in factor(), to convert the 1/0 data from numeric to factor type. This produces a classification tree for the cheap variables, but not for the medium set, which has only 4 explanatory variables. The error message says something about the tree being too large. Why would the tree be constructable as a regression tree but not as a classification tree? Is it appropriate to fit a regression tree to a binary response like this?
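To make the two calls concrete, here is a minimal sketch of what I mean. It uses simulated data rather than our farm data; the column names (acres, slope, stream_dist, animals), the made-up runoff values and the median cutoff are all placeholders I invented for illustration. It assumes the tree package is installed, and it is not meant to reproduce the error, only to show the numeric versus factor response.

```r
# Illustrative only: simulated data and hypothetical column names,
# not the actual farm data set.
library(tree)

set.seed(1)
n <- 42
farms <- data.frame(
  acres       = runif(n, 10, 500),
  slope       = runif(n, 0, 15),
  stream_dist = runif(n, 0, 2000),
  animals     = rpois(n, 80)
)

# Made-up P runoff and a made-up cutoff, just to obtain a 1/0 response
p_runoff  <- 0.01 * farms$acres + 0.5 * farms$slope + rnorm(n)
farms$hot <- as.integer(p_runoff > median(p_runoff))

# Numeric 1/0 response: tree() fits a regression tree
fit_reg <- tree(hot ~ acres + slope + stream_dist + animals, data = farms)

# Factor response: tree() fits a classification tree
farms$hot_f <- factor(farms$hot)
fit_cls <- tree(hot_f ~ acres + slope + stream_dist + animals, data = farms)

# The summaries indicate which kind of tree was fitted
summary(fit_reg)
summary(fit_cls)
```

The only difference between the two fits is the type of the response column, which is how we ended up with regression trees the first time without intending to.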
I appreciate any guidance.
Erika Mudrak
-------------------------------------------
Erika Mudrak
Graduate Student
Department of Botany
University of Wisconsin-Madison
430 Lincoln Dr
Madison WI, 53706
608-265-2191
mudrak at wisc.edu