Example Data Set(s) for nnet, rpart

2 messages · Ko-Kang Kevin Wang, Christian Schulz

#
Hi,

I'm giving a presentation on Neural Networks and Tree-Based Models in two 
weeks, and at the moment I'm looking for a data set to use in it.  What I 
would like is a good old data set, like the Iris data, that is already 
known to every statistician.

MASS4 uses the cpus data in Chapter 8.10 and the Cushing's syndrome in 
Chapter 12.4.  These two data sets plus the Iris data I have mentioned 
make three candidate data sets.  Does anyone have a recommendation as to 
which of them is best?

While I'm at it: is it technically correct to obtain (using residuals()) 
the residual sum of squares from the nnet() and rpart() models, and then 
say one is better than the other based on that statistic?
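
For concreteness, here is a minimal sketch of the comparison being asked 
about, using the cpus data from MASS.  The preprocessing (log10 of perf, 
scaled inputs), the nnet settings (size, decay, maxit) and the helper name 
rss() are assumptions made for illustration, not choices taken from MASS4.  
The sum of squares is computed from predict() so that both models are 
handled the same way.

```r
library(MASS)    # provides the cpus data
library(nnet)
library(rpart)

set.seed(1)      # nnet starts from random weights, so results vary by seed

# Assumed preprocessing: keep the six predictors and perf, model log10(perf)
d <- cpus[, 2:8]
d$perf <- log10(d$perf)
d[, 1:6] <- scale(d[, 1:6])   # put the inputs on a common scale for nnet

# Illustrative settings only (size, decay and maxit are not tuned)
fit_nn <- nnet(perf ~ ., data = d, size = 3, linout = TRUE,
               decay = 0.01, maxit = 500, trace = FALSE)
fit_rp <- rpart(perf ~ ., data = d)

# Hypothetical helper: residual sum of squares of a fit on a data frame
rss <- function(fit, data) {
  sum((data$perf - as.vector(predict(fit, data)))^2)
}

rss_in_sample <- c(nnet = rss(fit_nn, d), rpart = rss(fit_rp, d))
print(rss_in_sample)
```

Both numbers here are computed on the same data the models were fitted to, 
so the more flexible model will tend to look better regardless of how well 
it predicts new cases.  That is worth keeping in mind before ranking the 
two models on this statistic alone.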
#
I like the adult file, because it is "real-life" and has a lot of cases, 
which makes for a good split into train/test sets.  OK, you need more 
time to train and test!
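
A train/test split of the kind mentioned above might be sketched as 
follows.  To keep the example self-contained it uses the built-in iris 
data rather than the adult file; the 100/50 split and the nnet settings 
are arbitrary assumptions for illustration.

```r
library(nnet)
library(rpart)

set.seed(42)

# Assumed split: 100 randomly chosen cases for training, the other 50 for testing
train_idx <- sample(nrow(iris), 100)
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# Fit both classifiers on the training cases only
fit_rp <- rpart(Species ~ ., data = train, method = "class")
fit_nn <- nnet(Species ~ ., data = train, size = 2, decay = 0.01,
               maxit = 200, trace = FALSE)   # illustrative, untuned settings

# Predict class labels for the held-out cases
pred_rp <- predict(fit_rp, test, type = "class")
pred_nn <- predict(fit_nn, test, type = "class")

# Misclassification rate on the test set for each model
err <- c(rpart = mean(as.character(pred_rp) != as.character(test$Species)),
         nnet  = mean(as.character(pred_nn) != as.character(test$Species)))
print(err)
```

With only 50 test cases the error rates are coarse, which is one reason a 
larger file such as adult gives a more stable comparison.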

http://www.ics.uci.edu/~mlearn/MLSummary.html

Donated by Ron Kohavi
Predicting whether income exceeds $50K/yr based on census data
Documentation: On everything
48842 instances, 14 attributes (6 continuous and 8 nominal)
Missing attribute values
Originally listed as the "Census Income" Database. It was renamed because it
is cited as the "Adult" database

Regards, Christian


----- Original Message -----
From: "Ko-Kang Kevin Wang" <kwan022 at stat.auckland.ac.nz>
To: "R Help" <r-help at stat.math.ethz.ch>
Sent: Sunday, May 25, 2003 10:15 AM
Subject: [R] Example Data Set(s) for nnet, rpart
----