On Dec 9, 2016, at 2:45 PM, Hu Xinghai <huxinghai1989 at gmail.com> wrote:
I came across the following error when training a logistic regression model
using cv.glmnet:
Error in drop(y %*% rep(1, nc)) : error in evaluating the argument 'x' in
selecting a method for function 'drop': Error in y %*% rep(1, nc) :
non-conformable arguments
error in evaluating the argument 'x' in selecting a method for function
'drop': Error in y %*% rep(1, nc) : non-conformable arguments
The error appears only occasionally. However, since I need to run over a
parameter grid to optimize a parameter, the logistic regression has to be
fitted multiple times, so the error is almost certain to be hit at some
point.
Below is my code:
cellDF = df[(df$cell_id == cellid), ]
X = cellDF[, c(5:(ncol(cellDF)-2) )]
X$median_age = as.numeric(X$median_age)
X = data.matrix(X)
Y = cellDF$signup
impWeights = as.double(cellDF$trW)
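# Row-wise NA flag: elementwise | keeps one logical per row
# (union() on logical vectors would collapse them to their unique values)
has_NA = apply(is.na(X), 1, any) | is.na(Y)
has_NA = has_NA | is.na(impWeights)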
X = X[!has_NA,]
Y = Y[!has_NA]
impWeights = impWeights[!has_NA]
nfolds = 8
YPosIdx = which(Y == 1)
YNegIdx = which(Y == 0)
LYPos = length(YPosIdx)
LYNeg = length(YNegIdx)
samplePos = sample(c(1:nfolds), LYPos, replace = TRUE)
sampleNeg = sample(c(1:nfolds), LYNeg, replace = TRUE)
order = match(c(1: length(Y)), c(YPosIdx, YNegIdx))
foldid = c(samplePos, sampleNeg)[order]
model = cv.glmnet(x = X, y = Y, weights = impWeights, family = "binomial",
                  type.measure = "auc", lambda = lambdaGrid, nfolds = nfolds,
                  foldid = foldid)
fit = predict(model, censusX, s = "lambda.1se", type = "response")
I read some posts online about the issue. They suggest that there might be
NA values in the data, that I should use data.matrix instead of as.matrix,
and that I need to fix foldid so that every fold contains both positive and
negative samples. I tried all of these tricks, but none of them helps.
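For reference, the checks described above can be sketched in base R roughly as follows. This is only a sketch, and it is not claimed to resolve the error. The helper names balanced_foldid and check_cv_inputs are hypothetical (they are not part of glmnet), and the code assumes Y is coded as 0/1 as in my code above.

```r
# Hypothetical helper: assign folds separately within each class, cycling
# through 1..nfolds, so that every fold receives rows from a class whenever
# that class has at least nfolds rows.
balanced_foldid <- function(Y, nfolds) {
  foldid <- integer(length(Y))
  for (cls in c(0, 1)) {
    idx <- which(Y == cls)
    # rep_len() cycles through the fold labels; sample() shuffles them
    foldid[idx] <- sample(rep_len(seq_len(nfolds), length(idx)))
  }
  foldid
}

# Hypothetical helper: return a character vector describing any problems
# found in the inputs (empty if none of the checked problems is present).
check_cv_inputs <- function(X, Y, w, foldid, nfolds) {
  stopifnot(nrow(X) == length(Y),
            length(Y) == length(w),
            length(Y) == length(foldid))
  problems <- character(0)
  if (anyNA(X) || anyNA(Y) || anyNA(w)) {
    problems <- c(problems, "NA values present")
  }
  # Cross-tabulate fold against class; a zero cell means a fold lacks a class
  tab <- table(factor(foldid, levels = seq_len(nfolds)),
               factor(Y, levels = c(0, 1)))
  if (any(tab == 0)) {
    problems <- c(problems, "a fold is missing one class")
  }
  problems
}
```

With my variables this would be called as foldid = balanced_foldid(Y, nfolds) followed by check_cv_inputs(X, Y, impWeights, foldid, nfolds) before each cv.glmnet call in the grid loop.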
Does anyone have any thoughts on what might be causing this?