Hi, is it possible to perform a geometric mean regression with R ? Thanks. ------------------------------------------------ Emmanuel Poizot Cnam/Intechmer B.P. 324 50103 Cherbourg Cedex Phone (Direct) : (00 33)(0)233887342 Fax : (00 33)(0)233887339 ------------------------------------------------
geometric mean regression
5 messages · Poizot Emmanuel, Kjetil Halvorsen, (Ted Harding) +1 more
Poizot Emmanuel wrote:
Hi, is it possible to perform a geometric mean regression with R ? Thanks.
As has been said on this list before, "This is R, there is no if, only how", but if you actually wanted to ask how it is possible, it would help if you explained what is "geometric mean regression". Kjetil
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
--
Kjetil Halvorsen.
Peace is the most effective weapon of mass construction.
-- Mahdi Elmandjra
I presume the reference is to the 'geometric mean functional regression', also known as the 'line of organic correlation' or 'reduced major axis regression'. If so, this is relatively easy, almost trivial, to implement in R. Maybe it's in a package, but I never looked. I worked from Helsel's description in his classic water resources statistics book. See Chapter 10 here: http://water.usgs.gov/pubs/twri/twri4a3/
Now, if you are after confidence intervals or prediction intervals, I haven't found anything on that yet. I seem to recall doing something a couple of years ago by computing approximate residuals from the LOC line and the data, and then feeding those into the CL and PL equations for OLS. (Be advised that I'm not a statistician and did that in the spirit of approximation--who knows? :O) )
By coincidence I've been looking at this again recently. Maybe bootstrapping....
Regards,
Michael Grant
2 days later
On 03-Jun-05 Michael Grant wrote:
I presume the reference is to the 'geometric mean functional regression' or the 'line of organic correlation' or 'reduced major axis regression'. If so, this is relatively easy, almost trivial, to implement in R.
This somewhat contentious method is indeed trivial to implement in R. The idea is that if you plot the two regression lines (y on x, x on y) on the same axes (y vertical, x horizontal), the slope of the GMR is the geometric mean of the slopes of these two lines. Since the slope of the y-on-x line is Sxy/Sxx, and the slope of the x-on-y line (measured as dy/dx on those same axes) is Syy/Sxy, the GMR slope is therefore sqrt(Syy/Sxx) = sd(y)/sd(x), taken with the sign of Sxy. All three lines go through the same point, (mean(x),mean(y)).
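For anyone wanting to try this, here is a minimal sketch of the calculation just described, in base R. The function name gmr_fit and the simulated data are purely illustrative (neither comes from this thread), and attaching the sign of the covariance to the slope is an assumption of the sketch. It also checks the closed form against the geometric mean of the two regression slopes.

```r
# Sketch of a geometric mean regression (GMR) fit.
# gmr_fit is a hypothetical helper name, not a function from any package.
gmr_fit <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  sxy <- cov(x, y)
  # Magnitude is sd(y)/sd(x); the sign is taken from the covariance.
  # (If the covariance is exactly 0 the sign is undefined and this returns 0.)
  slope <- sign(sxy) * sd(y) / sd(x)
  # The line passes through (mean(x), mean(y))
  intercept <- mean(y) - slope * mean(x)
  c(intercept = intercept, slope = slope)
}

# Illustrative simulated data
set.seed(1)
x <- rnorm(50, mean = 10, sd = 2)
y <- 3 + 1.5 * x + rnorm(50, sd = 1)

fit  <- gmr_fit(x, y)
b_yx <- cov(x, y) / var(x)   # slope of the y-on-x line
b_xy <- var(y) / cov(x, y)   # slope of the x-on-y line on the same axes

print(fit)
print(sqrt(b_yx * b_xy))     # geometric mean of the two slopes
```

The sums of squares and the sample variances differ only by the common factor n - 1, so cov() and var() can stand in for Sxy, Sxx and Syy in the ratios.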
Maybe it's in a package, but I never looked.
It hardly needs a package!
I worked from Helsel's description in his classic water resources statistics book. See Chapter 10 here: http://water.usgs.gov/pubs/twri/twri4a3/
The method goes back a lot further than suggested here. It seems it was proposed in oceanography by H. Sverdrup in 1916, and very influentially promoted by W.E. Ricker (e.g. Jnl Fisheries Research Board of Canada, 1973, vol. 30, 409-434).
Now, if you are after confidence intervals or prediction intervals, I haven't found anything on that yet. Seems that I did something a couple of years ago by hacking some approximate residuals using the LOC line and the data, and then feeding that into the CL and PL equations for OLS. (Be advised that I'm not a statistician and did that in the spirit of approximation--who knows? :O) )
The uncertainty properties, and indeed the interpretation, of this method are elusive. You can, of course, resort to whatever stochastic modelling you choose (including simulation and bootstrap) to estimate the variability of the slope sd(y)/sd(x) and of any predictions you may want to make.
However, the method shows its indeterminate side to the extent that the relationship between y and x is loose rather than tight. At one extreme, where the correlation between x and y = 1, the two regression lines (y on x and x on y) and the GMR all coincide. No problem here. At the other extreme, where there is no correlation, the GMR method still gives you a definite answer (sd(y)/sd(x)) even though by normal standards there is no relationship between y and x. In the latter case, the slope of the GMR depends solely on the two SDs, and we may well ask what is being estimated here (apart from the ratio of the SDs). (Of course, if you go back to the "primitive" definition, you find yourself evaluating sqrt(0 * inf), which is indeterminate; and this is a better outcome than sd(y)/sd(x), but still falls short of telling you directly that y is independent of x).
As you approach the r=0 situation, you therefore have to be mindful that the GMR method will appear to provide a definite answer to a question which in reality has at best a vague answer, i.e. there is a major problem of interpretation.
Therefore I would be suspicious of results obtained by "blind" application of the GMR method which were not accompanied by a good discussion of grounds why the results can be expected to be meaningful in the particular case where it has been applied.
The GMR method seems to be well entrenched in the fisheries, natural resources, and ecology worlds.
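As a rough illustration of the bootstrap route mentioned above, here is a sketch that resamples (x, y) pairs and recomputes the GMR slope each time, once for a tight relationship and once for unrelated variables. The helper names, the simulated data, and the choice of a simple percentile interval are all assumptions made for the example; it is not a recommended inferential procedure.

```r
# Case-resampling bootstrap of the GMR slope (hypothetical helper names).
gmr_slope <- function(x, y) sign(cov(x, y)) * sd(y) / sd(x)

boot_gmr_slope <- function(x, y, B = 2000) {
  n <- length(x)
  replicate(B, {
    i <- sample.int(n, n, replace = TRUE)  # resample the (x, y) pairs together
    gmr_slope(x[i], y[i])
  })
}

# Illustrative simulated data
set.seed(42)
x       <- rnorm(60)
y_tight <- 2 * x + rnorm(60, sd = 0.3)   # strong linear relationship
y_loose <- rnorm(60, sd = 2)             # generated independently of x

b_tight <- boot_gmr_slope(x, y_tight)
b_loose <- boot_gmr_slope(x, y_loose)

# Percentile intervals for the two cases
print(quantile(b_tight, c(0.025, 0.975)))
print(quantile(b_loose, c(0.025, 0.975)))

# Share of resamples in which the slope came out positive
print(mean(b_loose > 0))
```

In the unrelated case the magnitude of each resampled slope stays near sd(y)/sd(x) while its sign depends on the sign of a near-zero covariance, which is one concrete way of seeing the interpretation problem described above.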
I suspect that the reasons for this may be partly "psychological": people are aware that they are looking for a functional relationship, are put off (rightly) by the existence of two regression lines, and are not enthusiastic to tangle with the difficulties (including the potential indeterminacy) of estimating a linear functional relationship. The GMR provides a very simple escape route which, no doubt in many cases, may give you as good a working answer as you can expect.
Nevertheless, I'm inclined to the view that the linear functional relationship is usually the best way to go. When the observed (x,y) points depart from the "true" points on the straight line by normally distributed amounts, the MLE of the relationship is well defined provided the ratio of the "departure" variances is fixed. Therefore it is possible to examine the robustness of the estimated relationship with respect to variation in the assumed value of this ratio. To the extent that this is acceptably robust within plausible variation of the ratio, you have an adequate and reliable perspective. Otherwise, you have to acknowledge that your information is inadequate. The danger of adopting a formulaic solution like GMR is that it tends to conceal inadequacy of information!
Best wishes,
Ted.
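The robustness check suggested above can be sketched as follows, using the standard closed-form ML slope for a linear functional relationship with a known error-variance ratio (often called Deming regression). The helper name lfr_slope, the definition delta = var(error in y) / var(error in x), the grid of delta values, and the simulated data are all assumptions of this sketch.

```r
# ML slope of a linear functional relationship when the ratio
# delta = var(error in y) / var(error in x) is assumed known.
# lfr_slope is a hypothetical helper name.
lfr_slope <- function(x, y, delta) {
  sxx <- var(x); syy <- var(y); sxy <- cov(x, y)
  d <- syy - delta * sxx
  (d + sqrt(d^2 + 4 * delta * sxy^2)) / (2 * sxy)
}

# Illustrative simulated data: both x and y observed with error
set.seed(7)
truth <- runif(80, 0, 10)
x <- truth + rnorm(80, sd = 1)
y <- 1 + 2 * truth + rnorm(80, sd = 1)

# Sweep the assumed ratio and see how much the slope moves
deltas <- c(0.1, 0.25, 0.5, 1, 2, 4, 10)
slopes <- sapply(deltas, function(d) lfr_slope(x, y, d))
print(data.frame(delta = deltas, slope = slopes))

# Reference values: the two regression lines and the GMR
print(c(ols_y_on_x = cov(x, y) / var(x),
        gmr        = sign(cov(x, y)) * sd(y) / sd(x),
        ols_x_on_y = var(y) / cov(x, y)))
```

As delta goes to 0 the slope tends to that of the x-on-y line, as delta grows it tends to the y-on-x slope, and setting delta = var(y)/var(x) reproduces the GMR slope. If the slopes in the table barely move across the plausible range of delta, the estimate is robust in the sense described above.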
-------------------------------------------------------------------- E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk> Fax-to-email: +44 (0)870 094 0861 Date: 06-Jun-05 Time: 10:20:01 ------------------------------ XFMail ------------------------------
Hi Ted, Thank you for your informative comments regarding GMR. TH:
This somewhat contentious method
Contentious...well that says a lot (seriously)! TH:
is indeed trivial to implement in R. ...
I implemented it in a simple brute force manner--elegance is time--following Helsel and Hirsch. Get the two slopes to calculate the GMR slope and then use mean(x) and mean(y) with the new slope to get the intercept... TH:
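The brute force route described here might look something like the sketch below. This is a reconstruction under stated assumptions, not the code actually used in this thread: it fits both regressions with lm(), re-expresses the x-on-y slope as dy/dx, and takes the geometric mean. The data are simulated for illustration.

```r
# Illustrative simulated data
set.seed(3)
x <- rnorm(40, mean = 5, sd = 1.5)
y <- 2 + 0.8 * x + rnorm(40, sd = 0.7)

# Slope of the ordinary regression of y on x
b_yx <- unname(coef(lm(y ~ x))[2])
# Regression of x on y, with its slope re-expressed as dy/dx
b_xy <- 1 / unname(coef(lm(x ~ y))[2])

# Geometric mean of the two slopes; both carry the sign of the correlation
slope <- sign(b_yx) * sqrt(b_yx * b_xy)
# Intercept from the means, since the line passes through (mean(x), mean(y))
intercept <- mean(y) - slope * mean(x)

print(c(intercept = intercept, slope = slope))
print(sd(y) / sd(x))   # closed-form magnitude, for comparison
```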
It hardly needs a package!
By itself, no. Your comment is timely given another help thread currently on the large number of packages :O). But something like Stats-R-Us and/or the R-graphics gallery aimed at useful snippets not worthy of packages.... MWG:
I worked from Helsel's description in his classic water resources statistics book. See Chapter 10 here: http://water.usgs.gov/pubs/twri/twri4a3/
TH:
The method goes back a lot further than suggested here.
So it seems. After all it has to have been around to acquire all the different names it goes by. The USGS book is just good as an online reference. BTW read 'classic' as useful but out of print. I listed the material because I have found it quite lucid and I like the emphasis on non-parametric methods. Making the material available is indeed quite generous of the authors. I find the book quite thought provoking for the non-statistics individual. I'm always looking for insights. MWG:
Now, if you are after confidence intervals or prediction intervals, I haven't found anything on
TH:
The uncertainty properties, and indeed the interpretation, of this method are elusive.
... Now you are getting to the heart of what has been puzzling me lately. To me the question seemed to be: does it make sense to even talk about confidence bands and prediction bands for GMR? It seemed that one can take a stochastic approach to prediction, i.e., one can set up simulations and roll the dice over and over. On one hand it is beyond my knowledge at this time to ascertain whether or not the results of such effort can be couched in the traditional language of confidence bands and prediction bands about such a line--neither variable is (in)dependent. Yet if I view it from the perspective of the minimization of the sum of the areas of the right triangles (Helsel Fig. 10.8) determined by each observation and the GMR (LOC), I am back to a single variable(?)... Oh, well I have not lost sleep over it, and indeed find your use of the term 'elusive' reassuring.
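The triangle-area view can be checked numerically. The sketch below assumes the line is forced through (mean(x), mean(y)) and searches positive slopes only; the function name and the simulated data are illustrative. Each triangle has a vertical leg dy (the vertical distance from the point to the line) and a horizontal leg dy/b, so its area is dy^2 / (2|b|).

```r
# Sum of right-triangle areas between each point and the line y = a + b*x,
# with the line forced through the centroid (hypothetical helper name).
triangle_area_sum <- function(b, x, y) {
  a  <- mean(y) - b * mean(x)
  dy <- y - (a + b * x)   # vertical leg of each triangle
  dx <- dy / b            # horizontal leg of each triangle
  sum(abs(dx * dy)) / 2
}

# Illustrative simulated data with a positive relationship
set.seed(11)
x <- rnorm(50, mean = 20, sd = 4)
y <- 5 + 0.6 * x + rnorm(50, sd = 2)

# One-dimensional search over positive slopes
opt <- optimize(triangle_area_sum, interval = c(0.01, 10),
                x = x, y = y, tol = 1e-10)

print(c(numerical = opt$minimum, closed_form = sd(y) / sd(x)))
```

With the intercept fixed this way, the criterion is proportional to Syy/b - 2*Sxy + b*Sxx for b > 0, and setting its derivative to zero gives b = sqrt(Syy/Sxx), so the numerical minimum should agree with sd(y)/sd(x).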
At the other extreme, where there is no correlation,
Being conservative in such matters, little or no correlation is where I declare defeat and move on to some other tactic ;O).
The GMR method seems to be well entrenched in the fisheries, ... Nevertheless, I'm inclined to the view that the linear functional relationship is usually the best way to go. When the observed (x,y) points depart from the "true" points on the straight line by normally distributed amounts, the MLE of the relationship is well defined provided the ratio of the "departure" variances is fixed. Therefore it is possible to examine the robustness of the estimated relationship with respect to variation in the assumed value of this ratio. To the extent that this is acceptably robust within plausible variation of the ratio, you have an adequate and reliable perspective. Otherwise, you have to acknowledge that your information is inadequate. The danger of adopting a formulaic solution like GMR is that it tends to conceal inadequacy of information!
Hmmm, more fodder for self study. Thank you very much for the insights! Best regards, Michael Grant