> On Dec 9, 2016, at 2:45 PM, Hu Xinghai <huxinghai1989 at gmail.com> wrote:
>
> I came across the following error when training a logistic regression model
> using cv.glmnet:
>
>> Error in drop(y %*% rep(1, nc)) : error in evaluating the argument 'x' in
>> selecting a method for function 'drop': Error in y %*% rep(1, nc) :
>> non-conformable arguments
>> error in evaluating the argument 'x' in selecting a method for function
>> 'drop': Error in y %*% rep(1, nc) : non-conformable arguments
>
>
> The error appears only occasionally. However, since I need to run over a
> parameter grid to optimize a parameter, the logistic regression has to
> run many times, so the error is almost certain to be hit.
>
> Below is my code:
>
>> cellDF = df[(df$cell_id == cellid), ]
>> X = cellDF[, c(5:(ncol(cellDF)-2) )]
>> X$median_age = as.numeric(X$median_age)
>> X = data.matrix(X)
>> Y = cellDF$signup
>> impWeights = as.double(cellDF$trW)
>> has_NA = union(apply(is.na(X), 1, any), sapply(Y, is.na) )
>> has_NA = union(has_NA, sapply(impWeights, is.na))
>> X = X[!has_NA,]
>> Y = Y[!has_NA]
>> impWeights = impWeights[!has_NA]
>> nfolds = 8
>> YPosIdx = which(Y == 1)
>> YNegIdx = which(Y == 0)
>> LYPos = length(YPosIdx)
>> LYNeg = length(YNegIdx)
>> samplePos = sample(c(1:nfolds), LYPos, replace = TRUE)
>> sampleNeg = sample(c(1:nfolds), LYNeg, replace = TRUE)
>> order = match(c(1: length(Y)), c(YPosIdx, YNegIdx))
>> foldid = c(samplePos, sampleNeg)[order]
>> model = cv.glmnet(x = X, y = Y, weights = impWeights,
>> family="binomial", type.measure="auc", lambda = lambdaGrid,
>> nfolds = nfolds, foldid = foldid)
>> fit = predict(model, censusX, s = "lambda.1se", type = "response")
>
>
> I read some posts online about the issue, suggesting that there might be
> NAs, that I should use data.matrix instead of as.matrix, and that I need to
> fix foldid to make sure both positive and negative samples exist. I tried
> all these tricks, but none helps.
>
> Does anyone have any thoughts about it?
This duplicates a posting on StackOverflow. If you read the Posting Guide you
will find advice cautioning you against cross-posting. If you do not get a
satisfactory answer within a reasonable interval (I would say 24 hours at a
minimum), you can justify cross-posting, but it should be accompanied by a
reference to the original posting:

http://stackoverflow.com/questions/41070103/glmnet-error-non-conformable-arguments

The Posting Guide (as well as the StackOverflow help pages) asks that you post
sufficient sample data to support demonstration and testing. You already have
two close votes for failing to do that. On StackOverflow you can search on
`[MCVE]` and "great reproducible example", or you can read the relevant
sections of the Posting Guide. Using dput() to post data objects conveys the
data structure most precisely.
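
As a minimal sketch of the dput() suggestion: the data frame below is invented, and only the column names (median_age, signup, trW) are borrowed from the code in the question. It shows that the text printed by dput() can be pasted into a fresh R session to rebuild the same object, including its column types and NA values.

```r
# Hypothetical stand-in for a few rows of the poster's `cellDF`;
# the values are made up for illustration only.
small <- data.frame(
  median_age = c(34, 41, NA, 29, 52, 38),
  signup     = c(0, 1, 0, 0, 1, 1),
  trW        = c(1.0, 0.5, 2.0, 1.5, 1.0, 0.8)
)

# dput() prints R code that recreates the object; this printed text is
# what would be pasted into the posting.
dput(small)

# Round trip through a file: write the dput() text, then read it back
# with dget() and check that the rebuilt object matches the original.
tf <- tempfile(fileext = ".R")
dput(small, file = tf)
rebuilt <- dget(tf)
stopifnot(isTRUE(all.equal(small, rebuilt)))
```

For a large data set, applying dput() to a small subset that still triggers the error (for example, the output of head() or a random sample of rows) keeps the posting readable.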
--
david.
>
> Thanks
>
David Winsemius
Alameda, CA, USA