Fitting multiple horizontal lines to data
On Nov 6, 2013, at 9:19 AM, Sashikanth Chandrasekaran wrote:
I am not trying to fit a horizontal line at every unique value of y. I am trying fit the y values with as few horizontal lines by trading off the number of horizontal lines with the error. The actual problem I am trying to solve is to smooth data in a time series. Here is a realistic example of y y=c(134.45,141.82,143.81,141.81,145,141.61,143.72,145.71,200,175,140,200,148.77,71.64,111.57,118.15,119.15,112.8,111.64,111.64,157.26,143.8,40.19,64.99,64.99,129.98,64.99,65,64.98,64.99) An example fit for y using multiple horizontal lines (may not be the best fit in terms of squared error or another error metric, but I have included the y value for concreteness)
The human brain searches for patterns and often finds them where there is no underlying mechanism. If you are asking for a regime-change method that is statistically based and will replicate your brain-driven pencil-and-paper methods you will probably be disappointed.
plot(y) abline(h=c(140,110,150,65) ) abline(v=c(13,20,22,30) ,col="red")
1. A horizontal line at approximately y=140 (to fit the first 13 values - 134.45 to 148.77) 2. A horizontal line at approximately y=110 (to fit the next 7 values - 71.64 to 111.64) 3. A horizontal line at approximately y=150 (to fit the next 2 values - 157.26 to 143.8) 4. A horizontal line at approximately y=65 (to fit the last 8 values - 40.19 to 64.99) -sashi.
If you want a method that is driven by the magnitude of the shift in adjacent values, then this would find some but not all of your proposed breakpoints:
which( abs(diff(y)) >55)
[1] 11 13 22 25 26 You could perhaps refine that set of candidates by requiring that the next value have some other defining feature but I was unable to come up with a simple rule-set that agreed with your candidates . There are packages that do segmented regression but hey are not generall set up to assume all regression coefficients are 0 and that you are only interested in
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
David Winsemius Alameda, CA, USA