I would appreciate pointers on what I should read to understand this
output:
summary(lm(TDS ~ Cond + Ca + Cl + Mg + Na + SO4))
Call:
lm(formula = TDS ~ Cond + Ca + Cl + Mg + Na + SO4)
Residuals:
ALL 1 residuals are 0: no residual degrees of freedom!
Coefficients: (6 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)      125         NA      NA       NA
Cond              NA         NA      NA       NA
Ca                NA         NA      NA       NA
Cl                NA         NA      NA       NA
Mg                NA         NA      NA       NA
Na                NA         NA      NA       NA
SO4               NA         NA      NA       NA
Residual standard error: NaN on 0 degrees of freedom
(63 observations deleted due to missingness)
When I look at the summary for the data frame used for this model I do not
see an excessive number of missing values or indications why there are no
residual degrees of freedom. The same model applied to 8 other data frames
did not produce similar results.
Puzzled,
Rich
Interpreting Multiple Linear Regression Summary
15 messages · David Winsemius, Daniel Nordlund, Marc Schwartz +4 more
Please see ?dput: use dput(your data) and paste the output into a reply, thanks. This way we know what you are working with.
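For readers who have not used dput() before, here is a minimal sketch of what is being asked for. The data frame `d` is a hypothetical three-row stand-in, not the poster's data; dput(), capture.output(), parse() and eval() are base R.

```r
# Hypothetical stand-in for the data frame being discussed.
d <- data.frame(Cond = c(190, NA, 248),
                TDS  = c(112, 181, 125))

# dput() prints R code that recreates the object; that printed
# text is what gets pasted into the reply.
dput(d)

# A reader can rebuild the object by evaluating the pasted text.
txt <- paste(capture.output(dput(d)), collapse = "\n")
d2  <- eval(parse(text = txt))
all.equal(d, d2)
```

The point of the round trip is that column types and NA positions survive, which a copy of the printed table does not guarantee.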
______________________________________________ R-help@ mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
-- View this message in context: http://r.789695.n4.nabble.com/Interpreting-Multiple-Linear-Regression-Summary-tp4020516p4020567.html Sent from the R help mailing list archive at Nabble.com.
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Rich Shepard
Sent: Wednesday, November 09, 2011 9:05 AM
To: r-help at r-project.org
Subject: [R] Interpreting Multiple Linear Regression Summary
Rich,
I don't see a 'data=' parameter in your call to lm(). How does lm() know where to find the variables referenced in the model parameter? If that is not the problem, then we need to see str() output for the data frame that you are analyzing.
Dan
Daniel Nordlund
Bothell, WA USA
On Nov 9, 2011, at 12:04 PM, Rich Shepard wrote:
I would appreciate pointers on what I should read to understand this output: summary(lm(TDS ~ Cond + Ca + Cl + Mg + Na + SO4))
I don't see a data= argument specified, so you are telling lm() that your workspace has individual vectors by those names in the formula. That is not what is implied by the rest of your message.
David Winsemius, MD West Hartford, CT
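To make the data= point concrete, here is a small sketch. The data frame `stream` and its five rows are hypothetical (the column names simply mirror two used in the thread); only base R's lm() is involved.

```r
# Hypothetical data; the column names mirror two from the thread.
stream <- data.frame(Cond = c(190, 184, 248, 304, 379),
                     TDS  = c(112, 114, 125, 176, 220))

# With data=, lm() looks the variables up in the data frame, and the
# stored call records where they came from.
fit <- lm(TDS ~ Cond, data = stream)
fit$call

# Without data=, lm() searches the calling environment and the search
# path instead, so the same formula works only if objects named TDS
# and Cond can be found there (for example after attach(stream)).
```

Passing data= explicitly also means a reader of the printed Call can tell which data frame was fitted.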
On Wed, 9 Nov 2011, David Winsemius wrote:
I don't see a data= argument specified, so you are telling lm() that your workspace has individual vectors by those names in the formula. That is not what is implied by the rest of your message.
David,
That's because I attached the data frame before running the model. However, looking again at the scatter plots of the individual predictor variables with the response variable answered my question after I posted it. There are no patterns to the relationships in these scatter plots so there's nothing to model. I became caught up in the repetitive processing for all these data and stopped really seeing what was in front of me.
My apologies to the list,
Rich
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Rich Shepard
Sent: Wednesday, November 09, 2011 9:42 AM
To: r-help at r-project.org
Subject: Re: [R] Interpreting Multiple Linear Regression Summary
Rich,
The problem is not just that there was 'nothing to model.' If that were the case, you would have gotten non-significant parameter estimates, not NAs. I would guess that there is something problematic with how the data frame is structured relative to what lm() is expecting. So, I would not give up looking for a solution just yet. Can you show us the result of str() on the data frame that you attached?
Dan
Daniel Nordlund
Bothell, WA USA
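The distinction between "no relationship" and "NA coefficients" can be checked with simulated data. This is only a sketch: the variables y, x1 and x2 are made-up independent noise, and the seed is arbitrary.

```r
set.seed(1)                      # arbitrary seed, for reproducibility
n <- 50
noise <- data.frame(y  = rnorm(n),
                    x1 = rnorm(n),
                    x2 = rnorm(n))

fit <- lm(y ~ x1 + x2, data = noise)

# Every coefficient gets an estimate and a standard error, even though
# the predictors are unrelated to the response by construction.
summary(fit)$coefficients
anyNA(coef(fit))      # FALSE: nothing is "not defined"
df.residual(fit)      # n minus the 3 estimated coefficients
```

NA coefficients therefore point to a rank problem in the data actually used for the fit (too few rows, or collinear columns), not to weak relationships.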
On Wed, 9 Nov 2011, Daniel Nordlund wrote:
I would guess that there is something problematic with how the data frame is structured relative to what lm() is expecting.
Dan, I was not comfortable with my explanation, but the formula (and data frame) was equivalent to those of the other 8 streams.
So, I would not give up looking for a solution just yet.
OK. I'm always up for learning more about R and its processes.
Can you show us the result of str() on the data frame that you attached?
Sure. I subset the original data frame to select only the 6 predictor
variables and the response variable. Same lm() results. I'll provide the
data frame, too.
summary(lm(formula = TDS ~ Cond + Ca + Cl + Mg + Na + SO4, data = mod.stump.cast))
Call:
lm(formula = TDS ~ Cond + Ca + Cl + Mg + Na + SO4, data = mod.stump.cast)
Residuals:
ALL 1 residuals are 0: no residual degrees of freedom!
Coefficients: (6 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)
(Intercept)      125         NA      NA       NA
Cond              NA         NA      NA       NA
Ca                NA         NA      NA       NA
Cl                NA         NA      NA       NA
Mg                NA         NA      NA       NA
Na                NA         NA      NA       NA
SO4               NA         NA      NA       NA
Residual standard error: NaN on 0 degrees of freedom
(63 observations deleted due to missingness)
str(mod.stump.cast)
'data.frame': 64 obs. of 7 variables:
$ Ca : num NA NA 24.4 NA 21.4 NA NA NA NA NA ...
$ Cl : num 1.58 5.6 3 NA 1 5 1.2 4 4 8.4 ...
$ Cond: num NA NA 190 187 184 NA NA NA NA NA ...
$ Mg : num NA NA 10 NA 9.1 NA NA NA NA NA ...
$ Na : num NA NA NA NA NA NA NA NA NA NA ...
$ SO4 : num 9.4 6.5 9 NA 7 55 6.8 105 15.6 8.4 ...
$ TDS : num 105 181 112 144 114 308 96 430 108 108 ...
summary(mod.stump.cast)
Ca Cl Cond Mg Na
Min. : 0.60 Min. : 1.000 Min. : 2.2 Min. : 9.10 Min. : 4
1st Qu.:23.35 1st Qu.: 2.000 1st Qu.:214.8 1st Qu.:11.00 1st Qu.: 4
Median :28.35 Median : 4.000 Median :282.5 Median :17.40 Median : 4
Mean :32.77 Mean : 4.076 Mean :294.6 Mean :17.85 Mean : 4
3rd Qu.:40.55 3rd Qu.: 5.600 3rd Qu.:372.0 3rd Qu.:22.10 3rd Qu.: 4
Max. :64.30 Max. :13.000 Max. :636.0 Max. :32.40 Max. : 4
NA's :50.00 NA's :11.000 NA's : 42.0 NA's :51.00 NA's :62
SO4 TDS
Min. : 4.00 Min. : 14.0
1st Qu.: 7.00 1st Qu.:131.2
Median : 9.40 Median :174.0
Mean : 16.31 Mean :176.9
3rd Qu.: 17.00 3rd Qu.:195.5
Max. :105.00 Max. :430.0
NA's : 3.00 NA's : 2.0
mod.stump.cast
Ca Cl Cond Mg Na SO4 TDS
1 NA 1.58 NA NA NA 9.4 105
2 NA 5.60 NA NA NA 6.5 181
3 24.4 3.00 190.0 10.0 NA 9.0 112
4 NA NA 187.0 NA NA NA 144
5 21.4 1.00 184.0 9.1 NA 7.0 114
6 NA 5.00 NA NA NA 55.0 308
7 NA 1.20 NA NA NA 6.8 96
8 NA 4.00 NA NA NA 105.0 430
9 NA 4.00 NA NA NA 15.6 108
10 NA 8.40 NA NA NA 8.4 108
11 NA 1.00 NA NA NA 8.8 125
12 NA 1.40 NA NA NA 19.4 129
13 NA 4.90 NA NA NA 37.0 360
14 NA 1.70 NA NA NA 12.0 140
15 NA 2.00 NA NA NA 10.0 95
16 NA 1.60 NA NA NA 9.1 120
17 NA 3.30 NA NA NA 34.0 280
18 NA 2.20 NA NA NA 11.0 130
19 NA 9.00 NA NA NA 69.0 352
20 NA 1.00 NA NA NA 18.0 148
21 NA 2.00 NA NA NA 9.0 107
22 28.0 1.00 248.0 11.0 4 13.0 125
23 32.0 1.00 NA 12.0 4 9.0 139
24 NA 5.00 NA NA NA 7.0 188
25 NA 4.00 NA NA NA 6.0 201
26 NA 3.00 NA NA NA 5.0 178
27 NA 2.27 NA NA NA 7.8 197
28 NA 1.76 NA NA NA 7.8 187
29 NA 5.81 NA NA NA 7.5 182
30 NA 4.23 NA NA NA 6.0 165
31 NA 4.23 NA NA NA 7.3 186
32 NA 6.25 NA NA NA 7.0 191
33 NA 6.72 NA NA NA 7.5 190
34 34.7 4.00 304.0 17.4 NA 6.0 176
35 NA NA 354.0 NA NA 7.0 175
36 42.5 5.00 379.0 21.1 NA 7.0 220
37 NA 5.80 NA NA NA 5.6 163
38 26.0 5.80 300.0 24.0 NA 5.6 163
39 NA 2.20 NA NA NA 5.4 152
40 NA 5.40 NA NA NA 11.0 221
41 NA 5.40 NA NA NA 10.5 171
42 NA 4.80 NA NA NA 9.9 204
43 NA 8.00 NA NA NA 11.7 174
44 NA 1.00 NA NA NA 8.4 190
45 NA 4.80 NA NA NA 12.1 174
46 NA 5.90 NA NA NA 16.0 210
47 NA 5.90 NA NA NA 20.0 190
48 NA 13.00 NA NA NA 7.6 180
49 NA 5.60 NA NA NA 17.0 200
50 NA 1.20 NA NA NA 6.5 180
51 0.6 NA 2.2 NA NA NA NA
52 21.4 NA 187.0 9.5 NA 8.0 120
53 NA NA 285.0 NA NA 22.0 135
54 48.3 3.00 378.0 22.1 NA 24.0 228
55 63.5 7.00 533.0 29.9 NA 44.0 14
56 NA NA 207.0 NA NA NA NA
57 NA NA 262.0 NA NA 13.0 156
58 28.7 2.00 244.0 12.6 NA 13.0 140
59 NA NA 238.0 NA NA 12.0 128
60 NA NA 280.0 NA NA 18.0 160
61 NA NA 380.0 NA NA 23.0 215
62 NA NA 402.0 NA NA 23.0 230
63 64.3 7.00 636.0 32.4 NA 73.0 316
64 23.0 4.10 300.0 21.0 NA 4.0 163
Thanks,
Rich
Here is your problem:
# 'DF' is the result of copying your data above from the
# clipboard on OSX
DF <- read.table(pipe("pbpaste"), header = TRUE)
str(DF)
'data.frame': 64 obs. of 7 variables:
$ Ca : num NA NA 24.4 NA 21.4 NA NA NA NA NA ...
$ Cl : num 1.58 5.6 3 NA 1 5 1.2 4 4 8.4 ...
$ Cond: num NA NA 190 187 184 NA NA NA NA NA ...
$ Mg : num NA NA 10 NA 9.1 NA NA NA NA NA ...
$ Na : int NA NA NA NA NA NA NA NA NA NA ...
$ SO4 : num 9.4 6.5 9 NA 7 55 6.8 105 15.6 8.4 ...
$ TDS : int 105 181 112 144 114 308 96 430 108 108 ...
na.omit(DF)
   Ca Cl Cond Mg Na SO4 TDS
22 28  1  248 11  4  13 125
After removing incomplete records (any records with NA values) which is the default behavior for R model functions, you only have one record left to fit the model to.
HTH,
Marc Schwartz
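A quick check along these lines can be run before fitting. The sketch below uses only four rows copied from the data posted earlier in the thread (rows 3, 22, 23 and 34), so the counts are for this excerpt, not for the full 64-row frame; complete.cases(), na.omit() and colSums() are base R.

```r
# Four rows copied from the posted data (rows 3, 22, 23 and 34).
d <- data.frame(
  Ca   = c(24.4, 28, 32, 34.7),
  Cl   = c(3, 1, 1, 4),
  Cond = c(190, 248, NA, 304),
  Mg   = c(10, 11, 12, 17.4),
  Na   = c(NA, 4, 4, NA),
  SO4  = c(9, 13, 9, 6),
  TDS  = c(112, 125, 139, 176)
)

colSums(is.na(d))        # NAs per column: few in any single column
sum(complete.cases(d))   # rows with no NA in any column
nrow(na.omit(d))         # the same count, via na.omit()
```

Per-column NA counts, which is what summary() of a data frame reports, can look harmless while the number of fully observed rows is very small, because the NAs fall in different rows for different columns.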
As far as I know if there is an NA in any variable in an observation the default is to drop the entire observation. Thus there are no observations in your calculation
Best Regards
John
John C Frain
Economics Department
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:frainj at tcd.ie
mailto:frainj at gmail.com
On Wed, 9 Nov 2011, Marc Schwartz wrote:
# 'DF' is the result of copying your data above from the
# clipboard on OSX
DF <- read.table(pipe("pbpaste"), header = TRUE)
Marc, Oh? I don't do Apple so there's no OSX here.
After removing incomplete records (any records with NA values) which is the default behavior for R model functions, you only have one record left to fit the model to.
That's what I saw from the scatter plots. Thanks, Rich
On Nov 9, 2011, at 2:17 PM, Rich Shepard wrote:
On Wed, 9 Nov 2011, Daniel Nordlund wrote:
I would guess that there is something problematic with how the data frame is structured relative to what lm() is expecting.
Dan, I was not comfortable with my explanation, but the formula (and data frame) was equivalent to those of the other 8 streams.
So, I would not give up looking for a solution just yet.
OK. I'm always up for learning more about R and its processes.
I count exactly 1 line in the data.frame you posted that has all columns with non-NA values. It should be no surprise that its 'TDS' value (=125) is the same as the estimated Intercept. I cannot understand why you misled us to such an extent about the degree of missing-ness in that data. (Failing to indicate that you have attached a dataframe is also very discourteous.)
David.
David Winsemius, MD West Hartford, CT
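The link between the single complete row and the estimated intercept can be reproduced directly. As a sketch, the one-row data frame `row22` below holds the values of row 22 from the posted data; the name is introduced here for illustration.

```r
# Row 22 of the posted data, the only row with no missing values.
row22 <- data.frame(Ca = 28, Cl = 1, Cond = 248, Mg = 11,
                    Na = 4, SO4 = 13, TDS = 125)

fit <- lm(TDS ~ Cond + Ca + Cl + Mg + Na + SO4, data = row22)

coef(fit)          # intercept equals the row's TDS; the slopes are NA
df.residual(fit)   # no residual degrees of freedom
```

With one observation the model matrix has rank 1, so lm() can estimate only the first coefficient, and it fits that single point exactly.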
On Wed, 9 Nov 2011, John C Frain wrote:
As far as I know if there is an NA in any variable in an observation the default is to drop the entire observation. Thus there are no observations in your calculation
John, Hadn't realized that. I know there are NA's in other data frames that yield model results. Perhaps it is the excessive numbers in this set that are the problem. Thanks, Rich
It is not so much the number of NAs, as the number of observations that get dropped through having at least 1 NA. Provided enough observations remain to get a meaningful fit, you will be OK (though interpretation may be dubious).
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 09-Nov-11 Time: 20:06:24
------------------------------ XFMail ------------------------------
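A small sketch of this distinction, using two made-up data frames (`clustered` and `spread` are hypothetical names and values) that contain the same number of NAs arranged differently:

```r
# Two hypothetical frames, each with 4 NAs in 6 rows.
clustered <- data.frame(y  = c(1, 2, 3, 5, 4, 6),
                        x1 = c(NA, NA, 3, 4, 5, 6),
                        x2 = c(NA, NA, 1, 0, 2, 1))
spread    <- data.frame(y  = c(1, 2, 3, 5, 4, 6),
                        x1 = c(NA, 2, NA, 4, 5, 6),
                        x2 = c(1, NA, 1, NA, 2, 1))

sum(is.na(clustered))            # total NAs
sum(is.na(spread))               # same total
sum(complete.cases(clustered))   # rows a model can use
sum(complete.cases(spread))      # fewer rows survive

# nobs() reports how many rows a fitted model actually used.
nobs(lm(y ~ x1 + x2, data = clustered))
nobs(lm(y ~ x1 + x2, data = spread))
```

When the NAs share rows, few observations are lost; when they are scattered across rows, each one can remove a different observation.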
There is only one row with a complete set of observations; I think lm() is throwing out the rest.
This is the output of dput(your data)
structure(list(Ca = c(NA, NA, 24.4, NA, 21.4, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 28, 32, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, 34.7, NA, 42.5, NA, 26, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.6, 21.4, NA, 48.3,
63.5, NA, NA, 28.7, NA, NA, NA, NA, 64.3, 23), Cl = c(1.58, 5.6,
3, NA, 1, 5, 1.2, 4, 4, 8.4, 1, 1.4, 4.9, 1.7, 2, 1.6, 3.3, 2.2,
9, 1, 2, 1, 1, 5, 4, 3, 2.27, 1.76, 5.81, 4.23, 4.23, 6.25, 6.72,
4, NA, 5, 5.8, 5.8, 2.2, 5.4, 5.4, 4.8, 8, 1, 4.8, 5.9, 5.9,
13, 5.6, 1.2, NA, NA, NA, 3, 7, NA, NA, 2, NA, NA, NA, NA, 7,
4.1), Cond = c(NA, NA, 190, 187, 184, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 248, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, 304, 354, 379, NA, 300, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, 2.2, 187, 285, 378, 533,
207, 262, 244, 238, 280, 380, 402, 636, 300), Mg = c(NA, NA,
10, NA, 9.1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, 11, 12, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
17.4, NA, 21.1, NA, 24, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, 9.5, NA, 22.1, 29.9, NA, NA, 12.6, NA, NA, NA, NA,
32.4, 21), Na = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 4L, 4L, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA), SO4 = c(9.4, 6.5, 9, NA, 7, 55, 6.8, 105,
15.6, 8.4, 8.8, 19.4, 37, 12, 10, 9.1, 34, 11, 69, 18, 9, 13,
9, 7, 6, 5, 7.8, 7.8, 7.5, 6, 7.3, 7, 7.5, 6, 7, 7, 5.6, 5.6,
5.4, 11, 10.5, 9.9, 11.7, 8.4, 12.1, 16, 20, 7.6, 17, 6.5, NA,
8, 22, 24, 44, NA, 13, 13, 12, 18, 23, 23, 73, 4), TDS = c(105L,
181L, 112L, 144L, 114L, 308L, 96L, 430L, 108L, 108L, 125L, 129L,
360L, 140L, 95L, 120L, 280L, 130L, 352L, 148L, 107L, 125L, 139L,
188L, 201L, 178L, 197L, 187L, 182L, 165L, 186L, 191L, 190L, 176L,
175L, 220L, 163L, 163L, 152L, 221L, 171L, 204L, 174L, 190L, 174L,
210L, 190L, 180L, 200L, 180L, NA, 120L, 135L, 228L, 14L, NA,
156L, 140L, 128L, 160L, 215L, 230L, 316L, 163L)), .Names = c("Ca",
"Cl", "Cond", "Mg", "Na", "SO4", "TDS"), class = "data.frame",
row.names = c(NA, -64L))