thr3ads.net - R help - [R] error in "predict.gam" used with "bam" [Jul 2013]

julian.bothe at elitepartner.de

2013-Jul-08 09:02 UTC

[R] error in "predict.gam" used with "bam"

Hello everyone.



I am fitting a logistic GAM (package mgcv) to a fairly large data frame
(130,000 cases with 100 variables).

Because of that, the GAM is fitted on a random subset of 10,000 cases. When I
then try to predict the values for the rest of the data, I get the following
error:




> gam.basis_alleakti.1.pr=predict(gam.basis_alleakti.1,
+   newdata=activisale_join[gam.basis_alleakti.1.complete_cases,all.vars(gam.basis_alleakti.1.formula)],type="response")

Error in predict.gam(gam.basis_alleakti.1, newdata = activisale_join[gam.basis_alleakti.1.complete_cases,  :
  number of items to replace is not a multiple of replacement length





The following is the code:

# formula with some factors and a lot of variables to be fitted
gam.basis_alleakti.1.formula=as.formula( paste("verlängerung ~",
      paste( names(activisale_join)[c(2:10)], collapse="+"), # factors
      paste("s(",names(activisale_join)[c(17,19:29,31:42,44)],")",
            collapse="+")) # numeric variables, all count data
)
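A note on the formula construction: `paste()` joins its arguments with a space by default, so as posted the factor terms and the smooth terms would be separated by a space rather than a `+` (the separator may simply have been lost in the mail). The following is a minimal sketch of one way to build such a formula with `reformulate()` from base R. The data frame `toy` and its column names are hypothetical stand-ins for `activisale_join`, not the poster's data.

```r
# Hypothetical stand-in for activisale_join: a response, two factors, two counts
toy <- data.frame(
  y  = c(0, 1, 0, 1),
  f1 = factor(c("a", "b", "a", "b")),
  f2 = factor(c("u", "u", "v", "v")),
  n1 = c(0, 2, 5, 1),
  n2 = c(3, 0, 1, 4)
)

factor_terms <- names(toy)[2:3]                     # parametric (factor) terms
smooth_terms <- paste0("s(", names(toy)[4:5], ")")  # one smooth per count variable

# reformulate() joins all term labels with "+" and attaches the response
toy_formula <- reformulate(c(factor_terms, smooth_terms), response = "y")

print(toy_formula)
all.vars(toy_formula)  # variable names, usable for complete.cases() or subsetting
```

Building the term labels as a character vector first keeps the joining step in one place, so a missing separator cannot slip in between the two groups of terms.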



# complete cases
gam.basis_alleakti.1.complete_cases =
  complete.cases(activisale_join[,all.vars(gam.basis_alleakti.1.formula) ])



# model fitting works on random subset
gam.basis_alleakti.1=bam(gam.basis_alleakti.1.formula,
                         data = activisale_join[subset.10000, ],
                         family="binomial")



# error, no idea why
gam.basis_alleakti.1.pr=predict(gam.basis_alleakti.1,
    newdata=activisale_join[gam.basis_alleakti.1.complete_cases, ],
    type="response")
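One way to narrow down which rows trigger such an error is to predict in chunks and record the chunks that fail. The sketch below is only an illustration: the helper `predict_in_chunks()` and the chunk size are hypothetical, and a small `glm` stands in for the `bam` fit so that the example needs nothing beyond base R.

```r
# Predict chunk by chunk; chunks whose predict() call errors are recorded
# (by their first row index) and their predictions are left as NA.
predict_in_chunks <- function(fit, newdata, chunk_size = 1000, ...) {
  stopifnot(nrow(newdata) > 0)
  out <- rep(NA_real_, nrow(newdata))
  failed <- integer(0)
  for (start in seq(1, nrow(newdata), by = chunk_size)) {
    rows <- start:min(start + chunk_size - 1, nrow(newdata))
    res <- tryCatch(
      as.numeric(predict(fit, newdata = newdata[rows, , drop = FALSE], ...)),
      error = function(e) NULL
    )
    if (is.null(res)) failed <- c(failed, start) else out[rows] <- res
  }
  list(pred = out, failed_chunk_starts = failed)
}

# Stand-in example: fit on the first 100 rows, predict the remaining 150
set.seed(42)
d <- data.frame(x = rnorm(250))
d$y <- rbinom(250, 1, plogis(d$x))
m <- glm(y ~ x, data = d[1:100, ], family = binomial)

chk <- predict_in_chunks(m, d[101:250, ], chunk_size = 40, type = "response")
chk$failed_chunk_starts  # empty when every chunk predicts without error
```

With a failing chunk identified, shrinking `chunk_size` on those rows can isolate a handful of cases that are small enough to share as a reproducible example.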





The prediction on the same subset (subset.10000) works.
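Since prediction works on the fitting subset but not on the remaining rows, one thing worth comparing is what the two sets of rows actually contain. The sketch below checks for factor values that occur in the new data but never in the fitting data. This is an assumption about a possible difference, not a cause confirmed anywhere in this thread, and the helper `unseen_levels()` and the toy data frames are hypothetical.

```r
# For every factor column of `train`, list the values observed in `new`
# that were never observed in `train`. Observed values are compared rather
# than levels(), because row-subsetting a data frame keeps unused levels.
unseen_levels <- function(train, new) {
  factor_cols <- names(train)[vapply(train, is.factor, logical(1))]
  res <- lapply(factor_cols, function(col) {
    new_vals   <- unique(as.character(new[[col]]))
    train_vals <- unique(as.character(train[[col]]))
    setdiff(new_vals[!is.na(new_vals)], train_vals)
  })
  names(res) <- factor_cols
  res[lengths(res) > 0]
}

# Toy data: level "c" of g occurs only in the rows to be predicted
train <- data.frame(g = factor(c("a", "b", "a")),
                    h = factor(c("x", "y", "x")),
                    n = 1:3)
full  <- data.frame(g = factor(c("a", "b", "c")),
                    h = factor(c("x", "y", "y")),
                    n = 4:6)

u <- unseen_levels(train, full)
u
```

Applied to the thread's objects, the call would compare `activisale_join[subset.10000, ]` with the complete-case rows passed as `newdata`.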





This error may be similar to the one raised as a side question in

http://r.789695.n4.nabble.com/gamm-tensor-product-and-interaction-td4526188.html, where Simon answered the following:

"> Here is the error message I obtain:
> vis.gam(gm1$gam,plot.type="contour",n.grid=200,color="heat",zlim=c(0,4))
> Error in predict.gam(x, newdata = newd, se.fit = TRUE, type = type) :
>   number of items to replace is not a multiple of replacement length

- hmm, possibly a bug. I'll look into it.

best,
Simon"



All the best



Julian



PS:

> version
               _
platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          3
minor          0.1
year           2013
month          05
day            16
svn rev        62743
language       R
version.string R version 3.0.1 (2013-05-16)
nickname       Good Sport



package mgcv version 1.7-22





Simon Wood

2013-Jul-09 07:07 UTC


Hi Julian,

Any chance you could send me (offline) a short version of your data, 
which reproduces the problem? I can't reproduce it in a quick attempt 
(but it is quite puzzling, given that bam calls predict.gam internally 
in pretty much the same way that you are doing here).
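A short reproducible version of the workflow might look like the sketch below: simulate data, fit `bam` on a random subset, and predict for the remaining rows. All names, sizes, and the simulated relationship are assumptions made for illustration; on simulated data like this the prediction is expected to run without the error, so the sketch only provides a scaffold into which problematic rows could be substituted.

```r
library(mgcv)

set.seed(1)
n <- 2000

# Simulated stand-in for the real data: one factor and two count covariates
dat <- data.frame(
  f  = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  x1 = rpois(n, 3),
  x2 = rpois(n, 5)
)
eta <- -1 + 0.5 * (dat$f == "b") + sin(dat$x1 / 2) - 0.1 * dat$x2
dat$y <- rbinom(n, 1, plogis(eta))

# Fit on a random subset, as in the original post
fit_rows <- sample(n, 500)
fit <- bam(y ~ f + s(x1, k = 5) + s(x2, k = 5),
           data = dat[fit_rows, ], family = binomial)

# Predict for the rows that were not used for fitting
pr <- predict(fit, newdata = dat[-fit_rows, ], type = "response")
summary(as.numeric(pr))
```

The basis dimension `k = 5` is chosen only because the simulated count covariates take few distinct values.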

By the way (and unrelated to the error): since you are using R 3.0.1,
it's a good idea to upgrade to mgcv 1.7-23 or above, for the following
reason (taken from the mgcv changeLog):

1.7-23
------

*** Fix of severe bug introduced with R 2.15.2 LAPACK change. The 
shipped version of dsyevr can fail to produce orthogonal eigenvectors 
when uplo='U' (upper triangle of symmetric matrix used), as opposed to 
'L'. This led to a substantial number of gam smoothing parameter 
estimation convergence failures, as the key stabilizing 
re-parameterization was substantially degraded. The issue did not affect 
gaussian additive models with GCV model selection. Other models could 
fail to converge any further as soon as any smoothing parameter became 
`large', as happens when a smooth is estimated as a straight line. 
check.gam reported the lack of full convergence, but the issue could 
also generate complete fit failures. Picked up late as full test suite 
had only been run on R > 2.15.1 with an external LAPACK.
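Whether an installation predates this fix can be checked by comparing version numbers. A minimal sketch, in which the helper name `needs_upgrade()` is hypothetical and the threshold 1.7-23 is taken from the changeLog entry above:

```r
# TRUE when `installed` is older than `minimum`.
# package_version() parses strings such as "1.7-22" and compares them numerically.
needs_upgrade <- function(installed, minimum = "1.7-23") {
  package_version(installed) < package_version(minimum)
}

needs_upgrade("1.7-22")  # the version reported in the original post
needs_upgrade("1.7-23")

# Check the locally installed mgcv, if there is one
if (requireNamespace("mgcv", quietly = TRUE)) {
  print(needs_upgrade(as.character(packageVersion("mgcv"))))
}
```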

best,
Simon



-- 
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603               http://people.bath.ac.uk/sw283
