Predict on zero-inflated model using design matrix and parameter coefficients

2 messages · Lee, Laura, Peter Solymos

#
Hi all,

I would like to predict on a data set using a zero-inflated Poisson (ZIP) model without calling the predict function, so that I don't have to refit the model in a new script each time I want a prediction. I know I will have to supply the estimated coefficients and the design matrix, but I don't know what the code would look like. I would appreciate any assistance.

Cheers,

Laura

Laura M. Lee
Stock Assessment Program Manager
Division of Marine Fisheries
Department of Environmental Quality

252 808 8072 office
Laura.Lee at ncdenr.gov

3441 Arendell Street
P.O. Box 769
Morehead City, NC 28557-0769



Email correspondence to and from this address is subject to the
North Carolina Public Records Law and may be disclosed to third parties.
#
Laura,

The details depend on how you fit the ZIP model, so the approach might
differ slightly. Using the pscl package you can do this:

r$> library(pscl)

r$> m <- zeroinfl(art ~ . | 1, data = bioChemists)

r$> cf0 <- coef(m, "zero")

r$> cf0
(Intercept)
  -1.681349

r$> cf1 <- coef(m, "count")

r$> cf1
 (Intercept)     femWomen   marMarried         kid5          phd         ment
 0.553995385 -0.231609021  0.131971507 -0.170473908  0.002525835  0.021542720

r$> X0 <- model.matrix(m, "zero")

r$> head(X0)
  (Intercept)
1           1
2           1
3           1
4           1
5           1
6           1

r$> X1 <- model.matrix(m, "count")

r$> head(X1)
  (Intercept) femWomen marMarried kid5  phd ment
1           1        0          1    0 2.52    7
2           1        1          0    0 2.05    6
3           1        1          0    0 3.75    6
4           1        0          1    1 1.18    3
5           1        1          0    0 3.75   26
6           1        1          1    2 3.59    2

r$> pr <- drop((1 - plogis(X0 %*% cf0)) * exp(X1 %*% cf1))

r$> summary(pr - predict(m))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0       0       0       0       0       0
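
To see what the pr line computes, the first row can be worked by hand.
This is a minimal base R sketch: the numbers are copied from the printed
output above (rounded as shown), and the names x0, x1, p_zero, mu and pr1
are only illustrative.

```r
# Coefficients copied from the printed output above (rounded as shown)
cf0 <- c("(Intercept)" = -1.681349)
cf1 <- c("(Intercept)" = 0.553995385, femWomen = -0.231609021,
         marMarried = 0.131971507, kid5 = -0.170473908,
         phd = 0.002525835, ment = 0.021542720)

# First row of each design matrix, as printed by head(X0) and head(X1)
x0 <- c("(Intercept)" = 1)
x1 <- c("(Intercept)" = 1, femWomen = 0, marMarried = 1,
        kid5 = 0, phd = 2.52, ment = 7)

p_zero <- plogis(sum(x0 * cf0))  # probability of a structural zero (logit link)
mu     <- exp(sum(x1 * cf1))     # mean of the Poisson count part (log link)
pr1    <- (1 - p_zero) * mu      # expected response for row 1
pr1
```

Because the inputs are rounded, pr1 should agree with predict(m)[1] only
to roughly the printed precision.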

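If you are predicting for a new data set, build each design matrix from
the new data with model.matrix(), using the right-hand side of the
corresponding model component (for the intercept-only zero component
above that is model.matrix(~1, newdata)). Also make sure the inverse link
function matches the link you fitted: plogis() is only correct for the
default logit link, not for probit or cloglog.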
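
For prediction in a separate script without refitting, one possible
sketch is below. It uses base R only. The helper zip_mean() is
hypothetical (not part of pscl), the newdata values are made up, and the
coefficients are typed in from the output above; in practice they could
be stored once from the fitted model with saveRDS() and read back with
readRDS(). The factor levels are assumed to be those of bioChemists
(Men/Women, Single/Married).

```r
# Hypothetical helper: expected response of a ZIP model from stored coefficients
zip_mean <- function(X_count, cf_count, X_zero, cf_zero, linkinv = plogis) {
  # Guard against design-matrix columns and coefficients being misaligned
  stopifnot(identical(colnames(X_count), names(cf_count)),
            identical(colnames(X_zero), names(cf_zero)))
  p_zero <- linkinv(drop(X_zero %*% cf_zero))  # structural-zero probability
  mu     <- exp(drop(X_count %*% cf_count))    # Poisson mean (log link)
  (1 - p_zero) * mu
}

# Coefficients as printed above (rounded); normally loaded from a saved file
cf_zero  <- c("(Intercept)" = -1.681349)
cf_count <- c("(Intercept)" = 0.553995385, femWomen = -0.231609021,
              marMarried = 0.131971507, kid5 = -0.170473908,
              phd = 0.002525835, ment = 0.021542720)

# Made-up new observations; factor levels must match those used in fitting
newdata <- data.frame(
  fem  = factor(c("Men", "Women"), levels = c("Men", "Women")),
  mar  = factor(c("Married", "Single"), levels = c("Single", "Married")),
  kid5 = c(0, 2),
  phd  = c(2.52, 3.10),
  ment = c(7, 10)
)

# One design matrix per model component (count: all covariates, zero: intercept)
X_count <- model.matrix(~ fem + mar + kid5 + phd + ment, newdata)
X_zero  <- model.matrix(~ 1, newdata)

pr_new <- zip_mean(X_count, cf_count, X_zero, cf_zero)
pr_new
```

For a zero component fitted with a probit or cloglog link, the same
helper could be called with linkinv = pnorm or
linkinv = make.link("cloglog")$linkinv instead of the default plogis.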

Cheers,

Péter Sólymos
On Tue, May 3, 2022 at 8:11 AM Lee, Laura <laura.lee at ncdenr.gov> wrote: