Martin Maechler
2023-Oct-23 08:38 UTC
[R] running crossvalidation many times MSE for Lasso regression
>>>>> Jin Li
>>>>>     on Mon, 23 Oct 2023 15:42:14 +1100 writes:

    > If you are interested in other validation methods (e.g., LOO or n-fold)
    > with more predictive accuracy measures, the function, glmnetcv, in the spm2
    > package can be directly used, and some reproducible examples are
    > also available in ?glmnetcv.

... and once you open that can of w..: the glmnet package itself
contains a function cv.glmnet() which we (our students) use when teaching.

What's the advantage of the spm2 package?
At least, the glmnet package is authored by the same people who originated and
first published (as in "peer reviewed" ..) these algorithms.

    > On Mon, Oct 23, 2023 at 10:59 AM Duncan Murdoch <murdoch.duncan at gmail.com>
    > wrote:

    >> On 22/10/2023 7:01 p.m., Bert Gunter wrote:
    >> > No error message shown. Please include the error message so that it is
    >> > not necessary to rerun your code. This might enable someone to see the
    >> > problem without running the code (e.g. downloading packages, etc.)
    >>
    >> And it's not necessarily true that someone else would see the same error
    >> message.
    >>
    >> Duncan Murdoch
    >>
    >> > -- Bert
    >> >
    >> > On Sun, Oct 22, 2023 at 1:36 PM varin sacha via R-help
    >> > <r-help at r-project.org> wrote:
    >> >>
    >> >> Dear R-experts,
    >> >>
    >> >> Here below my R code with an error message. Can somebody help me to fix this error?
    >> >> Really appreciate your help.
    >> >>
    >> >> Best,
    >> >>
    >> >> ############################################################
    >> >> # MSE CROSSVALIDATION Lasso regression
    >> >>
    >> >> library(glmnet)
    >> >>
    >> >> x1=c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
    >> >> x2=c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,6,1,3,2,5,6,8,7,1,1,2,9)
    >> >> y=c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)
    >> >> T=data.frame(y,x1,x2)
    >> >>
    >> >> z=matrix(c(x1,x2), ncol=2)
    >> >> cv_model=glmnet(z,y,alpha=1)
    >> >> best_lambda=cv_model$lambda.min
    >> >> best_lambda
    >> >>
    >> >> # Create a list to store the results
    >> >> lst<-list()
    >> >>
    >> >> # This statement does the repetitions (looping)
    >> >> for(i in 1 :1000) {
    >> >>
    >> >> n=45
    >> >>
    >> >> p=0.667
    >> >>
    >> >> sam=sample(1 :n,floor(p*n),replace=FALSE)
    >> >>
    >> >> Training =T [sam,]
    >> >> Testing = T [-sam,]
    >> >>
    >> >> test1=matrix(c(Testing$x1,Testing$x2),ncol=2)
    >> >>
    >> >> predictLasso=predict(cv_model, newx=test1)
    >> >>
    >> >> ypred=predict(predictLasso,newdata=test1)
    >> >> y=T[-sam,]$y
    >> >>
    >> >> MSE = mean((y-ypred)^2)
    >> >> MSE
    >> >> lst[i]<-MSE
    >> >> }
    >> >> mean(unlist(lst))
    >> >> ##################################################################
    >> >>
    >> >> ______________________________________________
    >> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
    >> >> https://stat.ethz.ch/mailman/listinfo/r-help
    >> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    >> >> and provide commented, minimal, self-contained, reproducible code.

    > --
    > Jin
    > ------------------------------------------
    > Jin Li, PhD
    > Founder, Data2action, Australia
    > https://www.researchgate.net/profile/Jin_Li32
    > https://scholar.google.com/citations?user=Jeot53EAAAAJ&hl=en
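For readers following the thread, here is a hedged sketch of how the quoted loop might look with cv.glmnet(), as suggested above. It is one possible reading of the original poster's intent, not a fix confirmed in the thread. In the quoted code, glmnet() returns a fit along a whole lambda path and has no lambda.min component (that belongs to cv.glmnet() objects), and the second predict() call is handed a matrix of predictions rather than a model, which is probably where the reported error arises. The names dat, n_rep and mse, the seed, the reduced number of repetitions and nfolds = 5 are all assumptions made for this illustration.

```r
library(glmnet)

x1 <- c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,
        68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
x2 <- c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,
        6,1,3,2,5,6,8,7,1,1,2,9)
y  <- c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,
        2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)

# Use a name other than T, which is also the shorthand for TRUE.
dat <- data.frame(y, x1, x2)

set.seed(1)      # assumed seed, only so that the sketch is reproducible
n     <- nrow(dat)
p     <- 0.667
n_rep <- 100     # the post used 1000; fewer repetitions keep the sketch quick
mse   <- numeric(n_rep)

for (i in seq_len(n_rep)) {
  # Random split into training and testing rows.
  sam   <- sample(n, floor(p * n))
  train <- dat[sam, ]
  test  <- dat[-sam, ]

  x_train <- as.matrix(train[, c("x1", "x2")])
  x_test  <- as.matrix(test[, c("x1", "x2")])

  # Choose lambda by cross-validation on the training rows only.
  cv_fit <- cv.glmnet(x_train, train$y, alpha = 1, nfolds = 5)

  # Predict the held-out rows once, at the selected lambda.
  y_pred <- predict(cv_fit, newx = x_test, s = "lambda.min")

  # Keep y untouched: compare against the test responses directly.
  mse[i] <- mean((test$y - y_pred)^2)
}

mean(mse)
```

Fitting inside the loop means each test set is scored by a model that never saw it; whether that matches what the original poster wanted is a guess.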
Ben Bolker
2023-Oct-23 17:58 UTC
[R] running crossvalidation many times MSE for Lasso regression
For what it's worth it looks like spm2 is specifically for *spatial*
predictive modeling; presumably its version of CV is doing something
spatially aware.

I agree that glmnet is old and reliable. One might want to use a
tidymodels wrapper to create pipelines where you can more easily switch
among predictive algorithms (see the `parsnip` package), but otherwise
sticking to glmnet seems wise.

On 2023-10-23 4:38 a.m., Martin Maechler wrote:
>>>>>> Jin Li
>>>>>>     on Mon, 23 Oct 2023 15:42:14 +1100 writes:
>
>     > If you are interested in other validation methods (e.g., LOO or n-fold)
>     > with more predictive accuracy measures, the function, glmnetcv, in the spm2
>     > package can be directly used, and some reproducible examples are
>     > also available in ?glmnetcv.
>
> ... and once you open that can of w..: the glmnet package itself
> contains a function cv.glmnet() which we (our students) use when teaching.
>
> What's the advantage of the spm2 package ?
> At least, the glmnet package is authored by the same who originated and
> first published (as in "peer reviewed" ..) these algorithms.
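As a rough illustration of the `parsnip` idea mentioned in the reply above, the sketch below defines two model specifications and runs both through the same fit-and-score helper, so that switching algorithms means changing only the specification. It is a minimal sketch under stated assumptions: the built-in mtcars data stands in for the poster's data, and the helper fit_and_mse, the fixed penalty of 0.1, the seed and the 21/11 split are all hypothetical choices made for this example.

```r
library(parsnip)  # the "glmnet" engine also needs the glmnet package installed

# Two interchangeable specifications: lasso via glmnet, and ordinary least squares.
lasso_spec <- linear_reg(penalty = 0.1, mixture = 1) |> set_engine("glmnet")
ols_spec   <- linear_reg() |> set_engine("lm")

# Hypothetical helper: fit a specification on train, return the MSE on test.
fit_and_mse <- function(spec, train, test) {
  fitted <- fit(spec, mpg ~ wt + hp, data = train)
  pred   <- predict(fitted, new_data = test)  # tibble with a .pred column
  mean((test$mpg - pred$.pred)^2)
}

set.seed(1)                       # assumed seed, for reproducibility only
idx   <- sample(nrow(mtcars), 21)
train <- mtcars[idx, ]
test  <- mtcars[-idx, ]

res <- sapply(list(lasso = lasso_spec, ols = ols_spec),
              fit_and_mse, train = train, test = test)
res
```

Here the penalty is fixed by hand; tuning it by cross-validation would need further tidymodels tooling that this sketch leaves out.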