Assuming you have enough data, 1/4 to 1/2 of it is usually held out for validation. One reference is Picard, R.R. and Berk, K.N. (1990), "Data Splitting," The American Statistician, 44, 140-147.

hth,
b.

-----Original Message-----
From: Wensui Liu [mailto:liuwensui at gmail.com]
Sent: Thursday, November 11, 2004 10:20 PM
To: r-help at stat.math.ethz.ch
Subject: [R] an off-topic question -> model validation

Currently, I am working on a data mining project and plan to divide the data table into two parts, one for modeling and the other for validation, in order to compare several models. But I am not sure what percentage of the data I should use to build the models and what percentage I should keep to validate them. Is there any literature reference on this topic?

Thank you so much!
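For illustration only, the holdout split discussed above (keeping somewhere between 1/4 and 1/2 of the rows for validation) might be sketched as follows. This is a minimal sketch, not something from the thread: the function name `split_holdout`, the `validation_fraction` and `seed` parameters, and the 1000-row stand-in data are all assumptions, and it is written in plain Python with the standard library so that it stays self-contained.

```python
import random


def split_holdout(rows, validation_fraction=0.3, seed=42):
    """Randomly split rows into (training, validation) lists.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_validation = round(len(shuffled) * validation_fraction)
    # The first n_validation shuffled rows are held out; the rest are for modeling.
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical data: 1000 row indices standing in for the data table.
rows = range(1000)
train, validation = split_holdout(rows, validation_fraction=0.25)
print(len(train), len(validation))  # 750 250
```

Every candidate model would then be fitted on `train` only and compared on `validation`, so that the comparison uses rows none of the models saw during fitting.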
an off-topic question -> model validation
1 message · bogdan romocea