An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20131106/f244a7be/attachment.pl>
Fitting multiple horizontal lines to data
5 messages · Sashikanth Chandrasekaran, David L Carlson, David Winsemius +2 more
The changepoint package might give you a way to do this. ------------------------------------- David L Carlson Department of Anthropology Texas A&M University College Station, TX 77840-4352 -----Original Message----- From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Sashikanth Chandrasekaran Sent: Wednesday, November 6, 2013 11:20 AM To: r-help at r-project.org Subject: [R] Fitting multiple horizontal lines to data I am not trying to fit a horizontal line at every unique value of y. I am trying fit the y values with as few horizontal lines by trading off the number of horizontal lines with the error. The actual problem I am trying to solve is to smooth data in a time series. Here is a realistic example of y y=c(134.45,141.82,143.81,141.81,145,141.61,143.72,145.71,200,175 ,140,200,148.77,71.64,111.57,118.15,119.15,112.8,111.64,111.64,1 57.26,143.8,40.19,64.99,64.99,129.98,64.99,65,64.98,64.99) An example fit for y using multiple horizontal lines (may not be the best fit in terms of squared error or another error metric, but I have included the y value for concreteness) 1. A horizontal line at approximately y=140 (to fit the first 13 values - 134.45 to 148.77) 2. A horizontal line at approximately y=110 (to fit the next 7 values - 71.64 to 111.64) 3. A horizontal line at approximately y=150 (to fit the next 2 values - 157.26 to 143.8) 4. A horizontal line at approximately y=65 (to fit the last 8 values - 40.19 to 64.99) -sashi. ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On Nov 6, 2013, at 9:19 AM, Sashikanth Chandrasekaran wrote:
I am not trying to fit a horizontal line at every unique value of y. I am trying fit the y values with as few horizontal lines by trading off the number of horizontal lines with the error. The actual problem I am trying to solve is to smooth data in a time series. Here is a realistic example of y y=c(134.45,141.82,143.81,141.81,145,141.61,143.72,145.71,200,175,140,200,148.77,71.64,111.57,118.15,119.15,112.8,111.64,111.64,157.26,143.8,40.19,64.99,64.99,129.98,64.99,65,64.98,64.99) An example fit for y using multiple horizontal lines (may not be the best fit in terms of squared error or another error metric, but I have included the y value for concreteness)
The human brain searches for patterns and often finds them where there is no underlying mechanism. If you are asking for a regime-change method that is statistically based and will replicate your brain-driven pencil-and-paper methods you will probably be disappointed.
plot(y) abline(h=c(140,110,150,65) ) abline(v=c(13,20,22,30) ,col="red")
1. A horizontal line at approximately y=140 (to fit the first 13 values - 134.45 to 148.77) 2. A horizontal line at approximately y=110 (to fit the next 7 values - 71.64 to 111.64) 3. A horizontal line at approximately y=150 (to fit the next 2 values - 157.26 to 143.8) 4. A horizontal line at approximately y=65 (to fit the last 8 values - 40.19 to 64.99) -sashi.
If you want a method that is driven by the magnitude of the shift in adjacent values, then this would find some but not all of your proposed breakpoints:
which( abs(diff(y)) >55)
[1] 11 13 22 25 26 You could perhaps refine that set of candidates by requiring that the next value have some other defining feature but I was unable to come up with a simple rule-set that agreed with your candidates . There are packages that do segmented regression but hey are not generall set up to assume all regression coefficients are 0 and that you are only interested in
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
David Winsemius Alameda, CA, USA
Possibly see the strucchange package
Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 11/6/13 9:19 AM, "Sashikanth Chandrasekaran" <sashikanth.chandrasekaran at gmail.com> wrote: >I am not trying to fit a horizontal line at every unique value of y. I am >trying fit the y values with as few horizontal lines by trading off the >number of horizontal lines with the error. The actual problem I am trying >to solve is to smooth data in a time series. Here is a realistic example >of >y > >y=c(134.45,141.82,143.81,141.81,145,141.61,143.72,145.71,200,175,140,200,1 >48.77,71.64,111.57,118.15,119.15,112.8,111.64,111.64,157.26,143.8,40.19,64 >.99,64.99,129.98,64.99,65,64.98,64.99) > >An example fit for y using multiple horizontal lines (may not be the best >fit in terms of squared error or another error metric, but I have included >the y value for concreteness) > >1. A horizontal line at approximately y=140 (to fit the first 13 values - >134.45 to 148.77) >2. A horizontal line at approximately y=110 (to fit the next 7 values - >71.64 to 111.64) >3. A horizontal line at approximately y=150 (to fit the next 2 values - >157.26 to 143.8) >4. A horizontal line at approximately y=65 (to fit the last 8 values - >40.19 to 64.99) >-sashi. > > [[alternative HTML version deleted]] > >______________________________________________ >R-help at r-project.org mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code.
You already asked this on StackOverflow. The answer remains the same, pretty much what David W. wrote: this is not a question about fitting lines to data. You need to step back and think about what message you want to deliver to those who will view your graph, and what the meaning of your dataset is in the first place. -- View this message in context: http://r.789695.n4.nabble.com/Fitting-multiple-horizontal-lines-to-data-tp4679891p4679952.html Sent from the R help mailing list archive at Nabble.com.