Skip to content

linear functional relationships with heteroscedastic & non-Gaussian errors - any packages around?

7 messages · Brian Ripley, Jarle Brinchmann, Spencer Graves

#
[apologies if this appears twice]

Hi,

I have a situation where I have a set of pairs of X & Y variables for
each of which I have a (fairly) well-defined PDF. The PDF(x_i) 's and
PDF(y_i)'s  are unfortunately often rather non-Gaussian although most
of the time not multi-modal.

For these data (estimates of gas content in galaxies), I need to
quantify a linear functional relationship and I am trying to do this
as carefully as I can. At the moment I am carrying out a Monte Carlo
estimation, sampling from each PDF(x_i) and PDF(y_i) and using a
orthogonal linear fit for each realisation but that is not very
satisfactory as it leads to different linear relationships depending
on whether I do the orhtogonal fit on x or y (as the errors on X & Y
are quite different & non-Gaussian using the covariance matrix isn't
all that useful
either)

Does anybody know of code in R to do this kind of fitting in a
Bayesian framework? My concern isn't so much on getting _the_ best
slope estimate but rather to have a good estimate of the uncertainty
on the slope.

      Cheers,
            Jarle.
#
I wonder if you are using this term in its correct technical sense.
A linear functional relationship is

V = a + bU
X = U + e
Y = V + f

e and f are random errors (often but not necessarily independent) with 
distributions possibly depending on U and V respectively.

and pairs from (X,Y) are observed.  As U and V are not random, there is 
no PDF of X or Y: each X_i has a different distribution.  If you know 
the error distribution for each X_i and Y_i, you can easily write down a 
log-likelihood and maximize it.

The first hit I got on Google, 
http://www.rsc.org/Membership/Networking/InterestGroups/Analytical/AMC/Software/FREML.asp, 

has a reference to a paper for the Gaussian case.

But finding R code for the non-Gaussian case seems a very long shot.
Jarle Brinchmann wrote:
It did ...

  
    
#
I wonder if you are using this term in its correct technical sense.
A linear functional relationship is

V = a + bU
X = U + e
Y = V + f

e and f are random errors (often but not necessarily independent) with 
distributions possibly depending on U and V respectively.

and pairs from (X,Y) are observed.  As U and V are not random, there is 
no PDF of X or Y: each X_i has a different distribution.  If you know 
the error distribution for each X_i and Y_i, you can easily write down a 
log-likelihood and maximize it.

The first hit I got on Google, 
http://www.rsc.org/Membership/Networking/InterestGroups/Analytical/AMC/Software/FREML.asp, 

has a reference to a paper for the Gaussian case.

But finding R code for the non-Gaussian case seems a very long shot.
Jarle Brinchmann wrote:
It did ...

  
    
#
Thanks for the reply!
On Tue, Dec 2, 2008 at 6:34 PM, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
This is indeed what I mean, poor phrasing of me. What I have is the
effectively the PDF for e & f for each instance, and I wish to get a &
b. For Gaussian errors there are certainly various ways to approach it
and the maximum-likelihood estimator is fine and is what I normally
use when my errors are sort-of-normal.

However in this instance my uncertainty estimates are strongly
non-Gaussian and even defining the mode of the distribution becomes
rather iffy so  I really prefer to sample the likelihoods properly.
Using the maximum-likelihood estimator naively in this case is not
terribly useful and I have no idea what the derived confidence limits
"means".

Ah well, I guess what I have to do at the moment is to use brute force
and try to calculate P(a,b|X,Y) properly using a marginalisation over
U (I hadn't done that before, I expect that was part of my problem).
Hopefully that will give reasonable uncertainty estimates, lots of
pain for a figure nobody will look at for much time :)

                 Thanks,
                     Jarle.
#
Isn't this a special case of structural equation modeling, handled 
by the 'sem' package? 

      Spencer
Jarle Brinchmann wrote:
#
Yes I think so if the errors were normally distributed. Unfortunately
I'm far from that but the combination of sem & its bootstrap is a good
way to deal with it in the normal case.

I must admit as a non-statistician I'm a not 100% sure what the
difference (if there is one) between a linear functional relationship
and a linear structural equation model is as they both deal with
hidden variables as far as I can see.

            J.
On Tue, Dec 2, 2008 at 9:33 PM, Spencer Graves <spencer.graves at pdf.com> wrote:
#
On Tue, 2 Dec 2008, Jarle Brinchmann wrote:

            
U and V are not 'variables' (not random variables) in a linear functional 
relationship (they are in the closely related linear structural 
relationship).