Skip to content

confidence / prediction ellipse

9 messages · Giuseppe Amatulli, John Fox, Bert Gunter +2 more

#
Well, I'd guess you have to first define what you mean by "prediction
ellipse," as the confidence ellipses are for the bivariate
distribution of 2 parameter estimates -- as I understand it --
whereas predictions depend on the covariate values and are for a
single response value (unless you have fitted multiple responses, I
suppose).

-- Bert

On Sat, Jan 26, 2013 at 1:12 PM, Giuseppe Amatulli
<giuseppe.amatulli at gmail.com> wrote:

  
    
#
You appear to be quite confused. I have no idea what your "a" and "b"
below mean. The SAS documentation that you quote is for prediction
from a bivariate normal. I don't know what that has to do with your
problem, nor with car's confidence ellipses for parameters for a
univariate regression. I would suggest you get local statistical help,
though perhaps someone with more time on their hands on this list than
I may help sort you out.

Cheers,
Bert

On Sun, Jan 27, 2013 at 8:41 AM, Giuseppe Amatulli
<giuseppe.amatulli at gmail.com> wrote:

  
    
#
Dear Giuseppe and Bert,

I also didn't follow what's intended, more or less for the same reasons as
Bert mentioned, which is why I didn't reply to the initial posting. In the
car package, confidenceEllipse() draws confidence ellipses for a pair of
coefficients from a statistical model, and dataEllipse() draws
bivariate-normal concentration ellipses for the bivariate distribution of
two variables.

I'm copying to Georges Monette and Michael Friendly, coauthors of these
functions, in case they have something to add.

I hope that this helps, but I doubt that it does.

John

-----------------------------------------------
John Fox
Senator McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
#
All:

Aha! -- The light dawneth, methinks (maybe...)

Giuseppe: I re-read the SAS link you sent and if I have parsed it
correctly, what SAS chooses to call an ellipse for "prediction" -- a
rather idiosyncratic way to describe it, imo -- I believe the rest of
us would call a contour for a population quantile. The key phrase that
indicates this is "It [the elliptical region] also approximates a
region containing a specified percentage of the population. "

So, if I'm right, I believe you just need to use car's data.ellipse() function.

And if I'm wrong, my little corner of the globe remains cloaked in
darkness confusion.

Cheers,
Bert
On Sun, Jan 27, 2013 at 12:43 PM, John Fox <jfox at mcmaster.ca> wrote:

  
    
#
On Jan 27, 2013, at 8:16 PM, Giuseppe Amatulli wrote:

            
install.packages("hdrcde")  # from Rob Hyndman
require(hdrcde) x <- c(rnorm(200,0,1),rnorm(200,4,1))
y <- c(rnorm(200,0,1),rnorm(200,4,1))
par(mfrow=c(1,2))
plot(x,y, pch="+", cex=.5)
hdr.boxplot.2d(x,y)
#
I believe that the value of "radius" that you are using is incorrect. 
If you have a data
matrix X whose columns  are jointly distributed N(mu,Sigma) then a 
confidence
ellipse for mu is determined by

     n * (x - Xbar)' S^{-1}(x - Xbar) ~ T^2

where Xbar is the mean vector for X and S is the sample covariance matrix,
and where "T^2" means Hotelling's T-squared distribution, which is equal to

     (n-1)*2/(n-2) * F_{2,n-2}

the latter representing the F distribution on 2 and n-2 degrees of freedom.

Thus (I think) your radius should be

     radius <- sqrt(2 * (npts-1) * qf(0.95, 2, npts-2)/(npts*(npts-2)))

where npts <- length(a).  Note that it is qf(0.95,2,npts-2) and *NOT*
qf(0.95,2,npts-1).

To get the corresponding *prediction* ellipse simply multiply the foregoing
radius by sqrt(npts+1).  By "prediction ellipse" I mean an ellipse such that
the probability that a new independent observation from the same population
will fall in that ellipse is the given probability (e.g. 0.95). Note 
that this does
not mean that 95% of the data will fall in the calculated ellipse 
(basically because
of the *dependence* between S and the individual observations).

These confidence and prediction ellipses are (I'm pretty sure) valid under
the assumption that the data are (two dimensional, independent) Gaussian,
and that you use the sample covariance and sample mean as "shape" and
"centre".  I don't know what impact your robustification procedure of using
cov.trob() will/would have on the properties of these ellipses.

A script which does the ellipses for your toy data, using the sample 
covariance
and sample mean (rather than output from cov.trob()) is as follows:

#
# Script scr.amatulli
#

require(car)
a <- c(12,12,4,5,63,63,23)
b <- c(13,15,7,10,73,83,43)
npts   <- length(a)
shape  <- var(cbind(a, b))
center <- c(mean(a),mean(b))
rconf  <- sqrt(2 * (npts-1) * qf(0.95, 2, npts-2)/(npts*(npts-2)))
rpred  <- sqrt(npts+1)*rconf

conf.elip <- ellipse(center, shape, rconf,draw = FALSE)
pred.elip <- ellipse(center, shape, rpred,draw = FALSE)
plot(pred.elip, type='l')
points(a,b)
lines(conf.elip,col="red")

     cheers,

         Rolf Turner
On 01/27/2013 10:12 AM, Giuseppe Amatulli wrote: