If no one has a better solution: split the data frame by class, take a sample of size X from each part, and put the parts back together.
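A rough sketch of that split / sample / recombine idea in R, in case it helps. The data frame name `df` and the toy data are made up for illustration; only the column names and the 1200 rows with a 0.3/0.7 split come from the question. Here X is assumed to be the number of 'low' rows, so all 'low' rows are kept and the same number of 'high' rows is drawn without replacement.

```r
# Toy data with the same columns as in the question (values are made up).
set.seed(1)  # only so the sketch is reproducible
df <- data.frame(
  age   = sample(10:20, 1200, replace = TRUE),
  sex   = sample(c("m", "f"), 1200, replace = TRUE),
  class = rep(c("low", "high"), times = c(360, 840))
)

# Split the row indices by class.
low_idx  <- which(df$class == "low")
high_idx <- which(df$class == "high")

# X = size of the smaller class; draw X 'high' rows without replacement.
# Indexing via sample.int() avoids the sample(x, n) surprise when x has length 1.
n <- length(low_idx)
high_keep <- high_idx[sample.int(length(high_idx), n)]

# Put the two parts back together.
balanced <- df[c(low_idx, high_keep), ]

# Check the new class distribution.
prop.table(table(balanced$class))
```

If the 'low' rows should be subsampled too, replace `low_idx` with a sample of it and use the same X for both classes.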
hgwelec wrote:
Dear members,

Consider the following data frame (first 4 rows shown):

age sex class
15  m   low
20  f   high
15  f   low
10  m   low

In my original data set I have 1200 rows and a class distribution of low=0.3 and high=0.7.

My question: how can I create a new data frame like the one shown above, but with the 'high' class subsampled so that the class distribution in the new data frame is low=0.5 and high=0.5? I tried looking at the sample function and its prob option, but none of the examples I have seen deal with an imbalanced class problem like the one shown above.

Thank you in advance
-- View this message in context: http://r.789695.n4.nabble.com/Subsampling-oversampling-from-a-data-frame-tp3965771p3965827.html Sent from the R help mailing list archive at Nabble.com.