G'morning
What does the error message "Error in x %*% coef(object) : non-conformable arguments" indicate when calculating the response values for newdata with a model from bigglm (in package biglm), and how can I debug it? I am attempting to do Monte Carlo simulations, which explains the loop in the code that follows. After the code I have included the output, which shows that the simulations are changing the response and input values, and that there are not any atypical values for the factors in the seventh iteration. At the end of the output is the aforementioned error message. Finally, I have included the model from biglm.
Thanks in advance!
Code:
=======
iter <- nrow(nov.2010)
predict.nov.2011 <- vector(mode='numeric', length=iter)
for (i in 1:iter) {
    iter.df <- nov.2010
    ##---------- Update values of dynamic variables ------------------
    iter.df$age <- iter.df$age + 12
    iter.df$pct_utilize <-
        iter.df$pct_utilize + mc.util.delta[i]

    iter.df$updated_varname1 <-
        ceiling(iter.df$updated_varname1 + mc.varname1.delta[i])

    if(iter.df$state=="WI")
        iter.df$varname3 <- iter.df$varname3 + mc.wi.varname3.delta[i]
    if(iter.df$state=="MN")
        iter.df$varname3 <- iter.df$varname3 + mc.mn.varname3.delta[i]
    if(iter.df$state=="IL")
        iter.df$varname3 <- iter.df$varname3 + mc.il.varname3.delta[i]
    if(iter.df$state=="US")
        iter.df$varname3 <- iter.df$varname3 + mc.us.varname3.delta[i]

    ##--- Bin Variables ------------------
    iter.df$bin_varname1 <- as.factor(recode(iter.df$updated_varname1,
        "300:499 = '300 - 499';
         500:549 = '500 - 549';
         550:599 = '550 - 599';
         600:649 = '600 - 649';
         650:699 = '650 - 699';
         700:749 = '700 - 749';
         750:799 = '750 - 799'; 800:849 = 'GE 800'; else = 'missing';
         "))
    iter.df$bin_age <- as.factor(recode(iter.df$age,
        "0:23 = ' < 24mo.';
         24:72 = '24 - 72mo.';
         72:300 = '72 - 300mo'; else = 'missing';
         "))
    iter.df$bin_util <- as.factor(recode(iter.df$pct_utilize,
        "0.0:0.2 = ' 0 - 20%';
         0.2:0.4 = ' 20 - 40%';
         0.4:0.6 = ' 40 - 60%';
         0.6:0.8 = ' 60 - 80%';
         0.8:1.0 = ' 80 - 100%';
         1.0:1.2 = '100 - 120%'; else = 'missing';
         "))
    iter.df$bin_varname2 <- as.factor(recode(iter.df$varname2_prop,
        "0:70 = ' < 70%';
         70:85 = ' 70 - 85%';
         85:95 = ' 85 - 95%';
         95:110 = '95 - 110%'; else = 'missing';
         "))
    iter.df$bin_varname1 <- relevel(iter.df$bin_varname1, 'missing')
    iter.df$bin_age <- relevel(iter.df$bin_age, 'missing')
    iter.df$bin_util <- relevel(iter.df$bin_util, 'missing')
    iter.df$bin_varname2 <- relevel(iter.df$bin_varname2, 'missing')

#~     print(head(iter.df))
    if (i>=6 & i<=8){
        print('---------------------------------')
        browser()
        print(i)
        print(table(iter.df$bin_varname1))
        print(table(iter.df$bin_age))
        print(table(iter.df$bin_util))
        print(table(iter.df$bin_varname2))
#~         debug(predict.nov.2011[i] <-
#~             sum(predict(logModel.1, newdata=iter.df, type='response')))
    }

    predict.nov.2011[i] <-
        sum(predict(logModel.1, newdata=iter.df, type='response'))

    print(predict.nov.2011[i])

}
Output
==========
[1] 36.56073
[1] 561.4516
[1] 4.83483
[1] 5.01398
[1] 7.984146
[1] "---------------------------------"
Called from: top level
Browse[1]>
[1] 6

  missing 300 - 499 500 - 549 550 - 599 600 - 649 650 - 699 700 - 749 750 - 799    GE 800
      842       283       690      1094      1695      3404      6659     18374     21562

   missing    < 24mo. 24 - 72mo. 72 - 300mo
        16       2997      19709      31881

   missing    0 - 20%   20 - 40%   40 - 60%   60 - 80%  80 - 100% 100 - 120%
     17906       4832       4599       5154       7205      14865         42

  missing     < 70%  70 - 85%  85 - 95% 95 - 110%
    10423     19429     10568      8350      5833
[1] 11.04090
[1] "---------------------------------"
Called from: top level
Browse[1]>
[1] 7

  missing 300 - 499 500 - 549 550 - 599 600 - 649 650 - 699 700 - 749 750 - 799
      847       909      1059      1586      3214      6304     16349     24335

   missing    < 24mo. 24 - 72mo. 72 - 300mo
        16       2997      19709      31881

   missing    0 - 20%   20 - 40%   40 - 60%   60 - 80%  80 - 100% 100 - 120%
     17145       4972       4617       5020       6634      16139         76

  missing     < 70%  70 - 85%  85 - 95% 95 - 110%
    10423     19429     10568      8350      5833
Error in x %*% coef(object) : non-conformable arguments
Model
=======
Large data regression model: bigglm(outcome ~ bin_varname1 + bin_varname2 + bin_age + bin_util +
    state + varname3 + varname3:state, family = binomial(link = "logit"),
    data = dev.data, maxit = 75, sandwich = FALSE)
Sample size = 1372250
debug biglm response error on bigglm model
3 messages · Mike Harwood, Greg Snow
Not sure, but one possible candidate problem is that in your simulations one iteration ended up with fewer levels of a factor than the overall dataset, and that caused the error. There is no recode function in the default packages, and there are at least 6 recode functions in other packages, so we cannot tell which one you were using from the code you posted.
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111
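
The posted output is consistent with this suggestion: the iteration 6 table for bin_varname1 has nine levels, while the iteration 7 table has eight and no 'GE 800'. Below is a minimal base-R sketch of how a dropped level could lead to this error. It assumes, judging only by the call shown in the error message, that predict() builds a design matrix from newdata and multiplies it by coef(object). The data frames, the formula, and the `beta` vector are invented stand-ins, not the poster's data or model.

```r
# Toy "training" data: a three-level factor and a numeric predictor.
train <- data.frame(
  bin = factor(c("missing", "low", "high", "low", "high", "missing")),
  x   = c(1, 2, 3, 4, 5, 6)
)

# Stand-in for coef(object): intercept, two dummy columns for bin, and x.
beta <- c(0.5, -1, 2, 0.1)
stopifnot(ncol(model.matrix(~ bin + x, train)) == length(beta))

# New data in which the level "high" never occurs, so the factor built
# from it has only two levels and the design matrix loses a column.
new_raw <- data.frame(bin = factor(c("missing", "low", "low")),
                      x   = c(7, 8, 9))
X_bad <- model.matrix(~ bin + x, new_raw)

# 3 columns times 4 coefficients: this is the non-conformable product.
res <- try(X_bad %*% beta, silent = TRUE)

# Possible fix: rebuild the factor with the full set of levels used when
# the model was fitted, so every dummy column is present.
new_fixed <- new_raw
new_fixed$bin <- factor(as.character(new_raw$bin), levels = levels(train$bin))
X_ok <- model.matrix(~ bin + x, new_fixed)
pred <- X_ok %*% beta
```

In the simulation loop, the analogous check would be to compare levels() of each binned factor in iter.df with the levels present in the data used to fit the model before calling predict(), and to build the factors with an explicit `levels` argument rather than relying on whatever values happen to occur in that iteration.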
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Mike Harwood
> Sent: Monday, January 10, 2011 6:29 AM
> To: r-help at r-project.org
> Subject: [R] debug biglm response error on bigglm model
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
1 day later
Thank you, Greg. The issue was in the simulation logic, where one of the values was not changing correctly for some iterations...
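
The follow-up does not say which value was affected. One candidate in the posted loop is the group of `if(iter.df$state=="WI")` statements: `if` expects a single logical value, so with a whole column as the condition older versions of R used only the first element (with a warning), and R 4.2.0 and later raise an error. A minimal sketch of a row-wise alternative follows, using a named lookup vector. The toy data frame and the `delta` values are invented for illustration and stand in for the `mc.*.varname3.delta[i]` values of one iteration.

```r
# Toy stand-in for iter.df; column names follow the post, values are invented.
iter.df <- data.frame(
  state    = c("WI", "MN", "IL", "US", "WI"),
  varname3 = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# Hypothetical per-state shifts for a single iteration.
delta <- c(WI = 0.1, MN = 0.2, IL = 0.3, US = 0.4)

# Index the named vector by each row's state: one shift per row, no if().
shift <- unname(delta[as.character(iter.df$state)])

# A state missing from the lookup would yield NA, so fail early if that happens.
stopifnot(!anyNA(shift))

iter.df$varname3 <- iter.df$varname3 + shift
```

With this form every row receives the shift for its own state, whereas the `if` version applied (at most) one state's shift to the entire column depending on the first row.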
On Jan 10, 3:20 pm, Greg Snow <Greg.S... at imail.org> wrote:
> Not sure, but one possible candidate problem is that in your simulations one iteration ended up with fewer levels of a factor than the overall dataset, and that caused the error. There is no recode function in the default packages, and there are at least 6 recode functions in other packages, so we cannot tell which one you were using from the code you posted.
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s... at imail.org
> 801.408.8111