[apologies if this appears twice]
Hi,
I have a situation where I have a set of pairs of X & Y variables for
each of which I have a (fairly) well-defined PDF. The PDF(x_i) 's and
PDF(y_i)'s are unfortunately often rather non-Gaussian although most
of the time not multi-modal.
For these data (estimates of gas content in galaxies), I need to
quantify a linear functional relationship and I am trying to do this
as carefully as I can. At the moment I am carrying out a Monte Carlo
estimation, sampling from each PDF(x_i) and PDF(y_i) and using a
orthogonal linear fit for each realisation but that is not very
satisfactory as it leads to different linear relationships depending
on whether I do the orhtogonal fit on x or y (as the errors on X & Y
are quite different & non-Gaussian using the covariance matrix isn't
all that useful
either)
Does anybody know of code in R to do this kind of fitting in a
Bayesian framework? My concern isn't so much on getting _the_ best
slope estimate but rather to have a good estimate of the uncertainty
on the slope.
Cheers,
Jarle.
linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?
7 messages · Brian Ripley, Jarle Brinchmann, Spencer Graves
I wonder if you are using this term in its correct technical sense. A linear functional relationship is V = a + bU X = U + e Y = V + f e and f are random errors (often but not necessarily independent) with distributions possibly depending on U and V respectively. and pairs from (X,Y) are observed. As U and V are not random, there is no PDF of X or Y: each X_i has a different distribution. If you know the error distribution for each X_i and Y_i, you can easily write down a log-likelihood and maximize it. The first hit I got on Google, http://www.rsc.org/Membership/Networking/InterestGroups/Analytical/AMC/Software/FREML.asp, has a reference to a paper for the Gaussian case. But finding R code for the non-Gaussian case seems a very long shot.
Jarle Brinchmann wrote:
[apologies if this appears twice]
It did ...
Hi,
I have a situation where I have a set of pairs of X & Y variables for
each of which I have a (fairly) well-defined PDF. The PDF(x_i) 's and
PDF(y_i)'s are unfortunately often rather non-Gaussian although most
of the time not multi-modal.
For these data (estimates of gas content in galaxies), I need to
quantify a linear functional relationship and I am trying to do this
as carefully as I can. At the moment I am carrying out a Monte Carlo
estimation, sampling from each PDF(x_i) and PDF(y_i) and using a
orthogonal linear fit for each realisation but that is not very
satisfactory as it leads to different linear relationships depending
on whether I do the orhtogonal fit on x or y (as the errors on X & Y
are quite different & non-Gaussian using the covariance matrix isn't
all that useful
either)
Does anybody know of code in R to do this kind of fitting in a
Bayesian framework? My concern isn't so much on getting _the_ best
slope estimate but rather to have a good estimate of the uncertainty
on the slope.
Cheers,
Jarle.
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
I wonder if you are using this term in its correct technical sense. A linear functional relationship is V = a + bU X = U + e Y = V + f e and f are random errors (often but not necessarily independent) with distributions possibly depending on U and V respectively. and pairs from (X,Y) are observed. As U and V are not random, there is no PDF of X or Y: each X_i has a different distribution. If you know the error distribution for each X_i and Y_i, you can easily write down a log-likelihood and maximize it. The first hit I got on Google, http://www.rsc.org/Membership/Networking/InterestGroups/Analytical/AMC/Software/FREML.asp, has a reference to a paper for the Gaussian case. But finding R code for the non-Gaussian case seems a very long shot.
Jarle Brinchmann wrote:
[apologies if this appears twice]
It did ...
Hi,
I have a situation where I have a set of pairs of X & Y variables for
each of which I have a (fairly) well-defined PDF. The PDF(x_i) 's and
PDF(y_i)'s are unfortunately often rather non-Gaussian although most
of the time not multi-modal.
For these data (estimates of gas content in galaxies), I need to
quantify a linear functional relationship and I am trying to do this
as carefully as I can. At the moment I am carrying out a Monte Carlo
estimation, sampling from each PDF(x_i) and PDF(y_i) and using a
orthogonal linear fit for each realisation but that is not very
satisfactory as it leads to different linear relationships depending
on whether I do the orhtogonal fit on x or y (as the errors on X & Y
are quite different & non-Gaussian using the covariance matrix isn't
all that useful
either)
Does anybody know of code in R to do this kind of fitting in a
Bayesian framework? My concern isn't so much on getting _the_ best
slope estimate but rather to have a good estimate of the uncertainty
on the slope.
Cheers,
Jarle.
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
Thanks for the reply!
On Tue, Dec 2, 2008 at 6:34 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
I wonder if you are using this term in its correct technical sense. A linear functional relationship is V = a + bU X = U + e Y = V + f e and f are random errors (often but not necessarily independent) with distributions possibly depending on U and V respectively.
This is indeed what I mean, poor phrasing of me. What I have is the
effectively the PDF for e & f for each instance, and I wish to get a &
b. For Gaussian errors there are certainly various ways to approach it
and the maximum-likelihood estimator is fine and is what I normally
use when my errors are sort-of-normal.
However in this instance my uncertainty estimates are strongly
non-Gaussian and even defining the mode of the distribution becomes
rather iffy so I really prefer to sample the likelihoods properly.
Using the maximum-likelihood estimator naively in this case is not
terribly useful and I have no idea what the derived confidence limits
"means".
Ah well, I guess what I have to do at the moment is to use brute force
and try to calculate P(a,b|X,Y) properly using a marginalisation over
U (I hadn't done that before, I expect that was part of my problem).
Hopefully that will give reasonable uncertainty estimates, lots of
pain for a figure nobody will look at for much time :)
Thanks,
Jarle.
Isn't this a special case of structural equation modeling, handled
by the 'sem' package?
Spencer
Jarle Brinchmann wrote:
Thanks for the reply! On Tue, Dec 2, 2008 at 6:34 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
I wonder if you are using this term in its correct technical sense.
A linear functional relationship is
V = a + bU
X = U + e
Y = V + f
e and f are random errors (often but not necessarily independent) with
distributions possibly depending on U and V respectively.
This is indeed what I mean, poor phrasing of me. What I have is the
effectively the PDF for e & f for each instance, and I wish to get a &
b. For Gaussian errors there are certainly various ways to approach it
and the maximum-likelihood estimator is fine and is what I normally
use when my errors are sort-of-normal.
However in this instance my uncertainty estimates are strongly
non-Gaussian and even defining the mode of the distribution becomes
rather iffy so I really prefer to sample the likelihoods properly.
Using the maximum-likelihood estimator naively in this case is not
terribly useful and I have no idea what the derived confidence limits
"means".
Ah well, I guess what I have to do at the moment is to use brute force
and try to calculate P(a,b|X,Y) properly using a marginalisation over
U (I hadn't done that before, I expect that was part of my problem).
Hopefully that will give reasonable uncertainty estimates, lots of
pain for a figure nobody will look at for much time :)
Thanks,
Jarle.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Yes I think so if the errors were normally distributed. Unfortunately
I'm far from that but the combination of sem & its bootstrap is a good
way to deal with it in the normal case.
I must admit as a non-statistician I'm a not 100% sure what the
difference (if there is one) between a linear functional relationship
and a linear structural equation model is as they both deal with
hidden variables as far as I can see.
J.
On Tue, Dec 2, 2008 at 9:33 PM, Spencer Graves <spencer.graves at pdf.com> wrote:
Isn't this a special case of structural equation modeling, handled by
the 'sem' package?
Spencer
Jarle Brinchmann wrote:
Thanks for the reply! On Tue, Dec 2, 2008 at 6:34 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
I wonder if you are using this term in its correct technical sense. A linear functional relationship is V = a + bU X = U + e Y = V + f e and f are random errors (often but not necessarily independent) with distributions possibly depending on U and V respectively.
This is indeed what I mean, poor phrasing of me. What I have is the
effectively the PDF for e & f for each instance, and I wish to get a &
b. For Gaussian errors there are certainly various ways to approach it
and the maximum-likelihood estimator is fine and is what I normally
use when my errors are sort-of-normal.
However in this instance my uncertainty estimates are strongly
non-Gaussian and even defining the mode of the distribution becomes
rather iffy so I really prefer to sample the likelihoods properly.
Using the maximum-likelihood estimator naively in this case is not
terribly useful and I have no idea what the derived confidence limits
"means".
Ah well, I guess what I have to do at the moment is to use brute force
and try to calculate P(a,b|X,Y) properly using a marginalisation over
U (I hadn't done that before, I expect that was part of my problem).
Hopefully that will give reasonable uncertainty estimates, lots of
pain for a figure nobody will look at for much time :)
Thanks,
Jarle.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
On Tue, 2 Dec 2008, Jarle Brinchmann wrote:
Yes I think so if the errors were normally distributed. Unfortunately I'm far from that but the combination of sem & its bootstrap is a good way to deal with it in the normal case. I must admit as a non-statistician I'm a not 100% sure what the difference (if there is one) between a linear functional relationship and a linear structural equation model is as they both deal with hidden variables as far as I can see.
U and V are not 'variables' (not random variables) in a linear functional relationship (they are in the closely related linear structural relationship).
J.
On Tue, Dec 2, 2008 at 9:33 PM, Spencer Graves <spencer.graves at pdf.com> wrote:
Isn't this a special case of structural equation modeling, handled by
the 'sem' package?
Spencer
Jarle Brinchmann wrote:
Thanks for the reply! On Tue, Dec 2, 2008 at 6:34 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
I wonder if you are using this term in its correct technical sense. A linear functional relationship is V = a + bU X = U + e Y = V + f e and f are random errors (often but not necessarily independent) with distributions possibly depending on U and V respectively.
This is indeed what I mean, poor phrasing of me. What I have is the
effectively the PDF for e & f for each instance, and I wish to get a &
b. For Gaussian errors there are certainly various ways to approach it
and the maximum-likelihood estimator is fine and is what I normally
use when my errors are sort-of-normal.
However in this instance my uncertainty estimates are strongly
non-Gaussian and even defining the mode of the distribution becomes
rather iffy so I really prefer to sample the likelihoods properly.
Using the maximum-likelihood estimator naively in this case is not
terribly useful and I have no idea what the derived confidence limits
"means".
Ah well, I guess what I have to do at the moment is to use brute force
and try to calculate P(a,b|X,Y) properly using a marginalisation over
U (I hadn't done that before, I expect that was part of my problem).
Hopefully that will give reasonable uncertainty estimates, lots of
pain for a figure nobody will look at for much time :)
Thanks,
Jarle.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595