
HELP! Excel and R give me totally different regression results using the exact same data

On Nov 7, 2012, at 11:47 AM, frauke wrote:

            
Well, the second point might be more correctly stated as: the data do not meet the conditions for valid inference using linear regression. Since the goals of the exercise have never been stated, it is difficult to say whether other regression methods might be more applicable.
That is generally the reason people use data.frames.
It shouldn't, but it seems unnecessarily convoluted and prone to errors.
That is only going to change the first element of 'collection'. You should study the help page for "[". If you were changing the first column it would need to be a different call on the LHS.
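A minimal sketch of the indexing difference, assuming 'collection' is a matrix (the original object was never posted, so the shape and values here are hypothetical). A single index on the LHS addresses one element; an empty row index with a column index addresses the whole column.

```r
# Hypothetical stand-in for 'collection'; the real object was not shown.
collection <- matrix(0, nrow = 3, ncol = 2)

by_element <- collection
by_element[1] <- 99      # single index: only element [1, 1] is replaced

by_column <- collection
by_column[, 1] <- 99     # empty row index: the entire first column is replaced

print(by_element)
print(by_column)
```

If 'collection' were a data.frame instead, a single index would select a column rather than an element, which is one more reason to check ?"[" before assigning.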
Again, possibly not what you thought you were doing. Lack of context prevents further analysis.
'data.frame':	3548 obs. of  5 variables:
 $ V1: num  1.91 1.9 1.93 2.16 1.9 1.87 1.87 2.01 2.8 2.11 ...
 $ V2: num  1.86 1.9 1.91 1.88 1.87 1.88 6.94 2.01 2.03 2.09 ...
 $ V3: num  1.89 1.94 1.9 1.85 1.86 1.88 2.01 2 2.03 2.06 ...
 $ V4: num  1.92 1.96 1.91 1.83 1.85 1.87 2.01 2.03 2.04 2.03 ...
 $ V5: num  2.1 2 1.93 1.92 1.85 1.86 2.02 2.15 2.08 2.03 ...
Call:
lm(formula = V1 ~ ., data = dat)

Coefficients:
(Intercept)           V2           V3           V4           V5  
     0.1291       0.3378       0.2079       0.2635       0.1460
Call:
lm(formula = V1 ~ ., data = dat)

Residuals:
     Min       1Q   Median       3Q      Max 
-13.3116  -0.1825  -0.0304   0.0959  27.0989 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.12906    0.03840   3.361 0.000784 ***
V2           0.33783    0.01768  19.111  < 2e-16 ***
V3           0.20789    0.01686  12.329  < 2e-16 ***
V4           0.26346    0.01784  14.768  < 2e-16 ***
V5           0.14596    0.01672   8.728  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 1.781 on 3543 degrees of freedom
Multiple R-squared: 0.7693,	Adjusted R-squared: 0.7691 
F-statistic:  2954 on 4 and 3543 DF,  p-value: < 2.2e-16
Hit <Return> to see next plot: 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplot.png
Type: image/png
Size: 139409 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20121107/ecd2057a/attachment.png>
-------------- next part --------------


There appears to be quite a bit of "structure" in that plot. And a rather similar structure in 

with(dat, plot(V3, V1) )
What are these data and what are the scientific questions? You appear to think a) I can look over your shoulder and see your display and b) deduce your goals from extremely fragmentary evidence. I have a lower opinion of my ability to accomplish those tasks.
Not generally the biggest concern. But again you provide no code. Nabble users are unfortunately notorious on rhelp for not reading the Posting Guide, and some do not even seem to understand that rhelp is not Nabble.
Well, that second outcome would be the expected (even the desired) outcome of a regression wouldn't it? You would want the relationships to be in the prediction and the residuals to have zero correlations with
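The point about residuals can be checked numerically. The sketch below uses simulated data, not the poster's; the variable names and coefficients are assumptions chosen only to mimic the V1 ~ . call above. For a least-squares fit that includes an intercept, the sample correlation between the residuals and each predictor (and the fitted values) should be zero up to floating-point error.

```r
# Simulated data; names and coefficients are illustrative assumptions only.
set.seed(1)
n <- 200
sim <- data.frame(V2 = rnorm(n), V3 = rnorm(n))
sim$V1 <- 0.5 + 0.3 * sim$V2 + 0.2 * sim$V3 + rnorm(n)

fit <- lm(V1 ~ ., data = sim)
res <- residuals(fit)

# Correlation of the residuals with each predictor and with the fitted values
res_cor <- c(V2     = cor(res, sim$V2),
             V3     = cor(res, sim$V3),
             fitted = cor(res, fitted(fit)))
print(res_cor)
```

Zero linear correlation does not rule out nonlinear structure or unequal variance, which is why the residual plots still need to be looked at.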
I'm rapidly running out of patience, however. Please read the Posting Guide more thoroughly than you appear to have done so far.
David Winsemius, MD
Alameda, CA, USA