stat question - not R question so ignore if not interested

Michael Kubovy · 2006-12-05T22:21:00Z

On Dec 5, 2006, at 3:42 PM, Leeds, Mark ((IED)) wrote:

> If do a scattrplot of data ( x and y ) and there are two clouds of
> points. One cloud is in the left
> bottom corner of the plot and the other cloud is in the upper right.
>
> If I fit a regression line to this data ( or equivalently ,
> calculate a
> correlation ), then obviously, it is going to seem like
> x and y are related because a line has to be connected between the 2
> clouds. But, there must be a regression assumption that

One needs only to look at diagnostic plots:

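Suppose
# Two clouds, one centred at (0, 0) and one at (5, 5);
# x and y are independent within each cloud
set.seed(2)
xy <- data.frame(y = c(rnorm(300), rnorm(300, 5)),
                 x = c(rnorm(300), rnorm(300, 5)))
# Show the lm diagnostic plots in a 2 x 2 grid
op <- par(mfrow = c(2, 2))
plot(lm(y ~ x, xy))
par(op)  # restore the previous graphics settings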

The model does not fit well because the residuals aren't flat as a
function of the fitted values (Residuals vs Fitted panel) and because
homoscedasticity is violated (Scale-Location panel).
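
A numeric companion to the residual plot, as a minimal sketch. The `cloud` label is a hypothetical helper that is not part of the example above; it relies on rows 1-300 being the lower-left cloud, which holds only because the data are simulated. Over all points, OLS residuals are uncorrelated with the fitted values by construction, but within each cloud they should trend downward, which is the non-flat pattern in the plot.

```r
# Same simulated data as in the example above.
set.seed(2)
xy <- data.frame(y = c(rnorm(300), rnorm(300, 5)),
                 x = c(rnorm(300), rnorm(300, 5)))

# Hypothetical label (an assumption for illustration): rows 1-300 form the
# lower-left cloud and rows 301-600 the upper-right one.
cloud <- factor(rep(c("lower", "upper"), each = 300))

fit <- lm(y ~ x, data = xy)
res <- resid(fit)
fv  <- fitted(fit)

# Over all points, OLS residuals are uncorrelated with the fitted values.
overall <- cor(res, fv)

# Within each cloud, correlate the residuals with the fitted values.
within <- tapply(seq_along(res), cloud, function(i) cor(res[i], fv[i]))

print(overall)
print(within)
```

A clearly negative value in both clouds alongside an overall value of essentially zero would indicate that the line is fitting the gap between the clouds rather than a trend inside them.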

When this happens we might try a different approach:
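library(sm)  # nonparametric smoothing package; must be installed
# Kernel-smoothed regression of y on x; plots the data with the fitted curve
xy.sm <- sm.regression(xy$x, xy$y)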
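
If the sm package is not installed, a similar comparison can be sketched with loess() from base R. This is a swapped-in smoother, not the sm.regression() fit, and the default span used here is an untuned assumption.

```r
# Same simulated data as in the example above.
set.seed(2)
xy <- data.frame(y = c(rnorm(300), rnorm(300, 5)),
                 x = c(rnorm(300), rnorm(300, 5)))

ols    <- lm(y ~ x, data = xy)
smooth <- loess(y ~ x, data = xy)  # default span and degree (assumption)

# Overlay the straight OLS line (dashed) and the local smooth (solid).
ord <- order(xy$x)
plot(y ~ x, data = xy, col = "grey60")
abline(ols, lty = 2)
lines(xy$x[ord], fitted(smooth)[ord], lwd = 2)

# Largest gap between the two fits at the observed x values.
gap <- max(abs(fitted(smooth) - fitted(ols)))
print(gap)
```

Where the solid curve bends away from the dashed line, the straight-line description is doing a poor job.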

Whenever there's a big discrepancy between an OLS fit and a
nonparametric smooth such as this one, we should not pursue the OLS
fit without reinterpretation, which others have discussed in their
replies.
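
One possible reinterpretation, sketched under the assumption that cloud membership is known. The `cloud` factor is hypothetical and is available here only because the data were simulated; with real data the grouping would have to come from somewhere else. The idea is to compare the slope on x with and without a term for the cloud.

```r
# Same simulated data as in the example above.
set.seed(2)
xy <- data.frame(y = c(rnorm(300), rnorm(300, 5)),
                 x = c(rnorm(300), rnorm(300, 5)))

# Hypothetical grouping variable (an assumption for illustration).
xy$cloud <- factor(rep(c("lower", "upper"), each = 300))

fit_pooled <- lm(y ~ x, data = xy)          # ignores the two clouds
fit_cloud  <- lm(y ~ x + cloud, data = xy)  # separate intercept per cloud

slope_pooled <- unname(coef(fit_pooled)["x"])
slope_within <- unname(coef(fit_cloud)["x"])

print(c(pooled = slope_pooled, within = slope_within))

# F test for whether the cloud term improves on the pooled model.
print(anova(fit_pooled, fit_cloud))
```

If the slope on x shrinks toward zero once `cloud` is in the model, the apparent relation between x and y is mostly the difference between the two clouds.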
_____________________________
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O.Box 400400    Charlottesville, VA 22904-4400
Parcels:    Room 102        Gilmer Hall
         McCormick Road    Charlottesville, VA 22903
Office:    B011    +1-434-982-4729
Lab:        B019    +1-434-982-4751
Fax:        +1-434-982-4766
WWW:    http://www.people.virginia.edu/~mk9y/
