thr3ads.net - R help - [R] running crossvalidation many times MSE for Lasso regression [Oct 2023]

Martin Maechler

2023-Oct-23 08:38 UTC

[R] running crossvalidation many times MSE for Lasso regression

>>>>> Jin Li 
>>>>>     on Mon, 23 Oct 2023 15:42:14 +1100 writes:
    > If you are interested in other validation methods (e.g., LOO or n-fold)
    > with more predictive accuracy measures, the function, glmnetcv, in the spm2
    > package can be directly used, and some reproducible examples are
    > also available in ?glmnetcv.

... and once you open that can of w..:   the glmnet package itself
contains a function cv.glmnet(), which we (and our students) use when teaching.

What is the advantage of the spm2 package?
At least the glmnet package is authored by the same people who originated and
first published (as in "peer reviewed") these algorithms.
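
For readers following along, here is a minimal sketch of repeated random-split validation built on cv.glmnet(), using the data from the original post. Note that glmnet() returns a fit over a whole lambda path and has no lambda.min element; that element belongs to the object returned by cv.glmnet(). Several choices below are assumptions, not something stated in the thread: lambda is re-selected by cv.glmnet() on the training rows of every split, predictions use s = "lambda.min", 100 repetitions replace the 1000 in the post to keep the run short, and the seed and the names X, n_rep, n_train and mse are arbitrary.

```r
library(glmnet)

# Data copied from the original post
x1 <- c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,
        68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
x2 <- c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,
        6,1,3,2,5,6,8,7,1,1,2,9)
y  <- c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,
        2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)

X <- cbind(x1, x2)               # predictor matrix expected by glmnet
n <- length(y)

set.seed(1)                      # arbitrary seed, only for reproducibility
n_rep   <- 100                   # assumption: fewer repetitions than the post's 1000
n_train <- floor(0.667 * n)      # same training fraction as in the post
mse     <- numeric(n_rep)        # one test-set MSE per random split

for (i in seq_len(n_rep)) {
  train <- sample(n, n_train)    # row indices of the training set

  # Choose lambda by cross-validation on the training rows only
  cv_fit <- cv.glmnet(X[train, ], y[train], alpha = 1)

  # Predict the held-out rows at the CV-selected lambda
  pred <- predict(cv_fit, newx = X[-train, ], s = "lambda.min")

  mse[i] <- mean((y[-train] - pred)^2)
}

mean(mse)                        # average held-out MSE over all splits
```

Fitting inside the loop is a deliberate choice in this sketch: a model fitted once on all 45 rows would already have seen the "test" rows, so the resulting MSE would be optimistic.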



    > On Mon, Oct 23, 2023 at 10:59 AM Duncan Murdoch <murdoch.duncan at gmail.com>
    > wrote:

    >> On 22/10/2023 7:01 p.m., Bert Gunter wrote:
    >> > No error message shown. Please include the error message so that it is
    >> > not necessary to rerun your code. This might enable someone to see the
    >> > problem without running the code (e.g. downloading packages, etc.)
    >> 
    >> And it's not necessarily true that someone else would see the same error
    >> message.
    >> 
    >> Duncan Murdoch
    >> 
    >> >
    >> > -- Bert
    >> >
    >> > On Sun, Oct 22, 2023 at 1:36 PM varin sacha via R-help
    >> > <r-help at r-project.org> wrote:
    >> >>
    >> >> Dear R-experts,
    >> >>
    >> >> Here below my R code with an error message. Can somebody help me to fix
    >> >> this error?
    >> >> Really appreciate your help.
    >> >>
    >> >> Best,
    >> >>
    >> >>
    >> >> ############################################################
    >> >> # MSE CROSSVALIDATION Lasso regression
    >> >>
    >> >> library(glmnet)
    >> >>
    >> >>
    >> >> x1=c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
    >> >> x2=c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,6,1,3,2,5,6,8,7,1,1,2,9)
    >> >> y=c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)
    >> >> T=data.frame(y,x1,x2)
    >> >>
    >> >> z=matrix(c(x1,x2), ncol=2)
    >> >> cv_model=glmnet(z,y,alpha=1)
    >> >> best_lambda=cv_model$lambda.min
    >> >> best_lambda
    >> >>
    >> >>
    >> >> # Create a list to store the results
    >> >> lst<-list()
    >> >>
    >> >> # This statement does the repetitions (looping)
    >> >> for(i in 1 :1000) {
    >> >>
    >> >> n=45
    >> >>
    >> >> p=0.667
    >> >>
    >> >> sam=sample(1 :n,floor(p*n),replace=FALSE)
    >> >>
    >> >> Training =T [sam,]
    >> >> Testing = T [-sam,]
    >> >>
    >> >> test1=matrix(c(Testing$x1,Testing$x2),ncol=2)
    >> >>
    >> >> predictLasso=predict(cv_model, newx=test1)
    >> >>
    >> >>
    >> >> ypred=predict(predictLasso,newdata=test1)
    >> >> y=T[-sam,]$y
    >> >>
    >> >> MSE = mean((y-ypred)^2)
    >> >> MSE
    >> >> lst[i]<-MSE
    >> >> }
    >> >> mean(unlist(lst))
    >> >>
    >> >> ##################################################################
    >> >>
    >> >> ______________________________________________
    >> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
    >> >> https://stat.ethz.ch/mailman/listinfo/r-help
    >> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
    >> >> and provide commented, minimal, self-contained, reproducible code.


    > -- 
    > Jin
    > ------------------------------------------
    > Jin Li, PhD
    > Founder, Data2action, Australia
    > https://www.researchgate.net/profile/Jin_Li32
    > https://scholar.google.com/citations?user=Jeot53EAAAAJ&hl=en


Ben Bolker

2023-Oct-23 17:58 UTC

[R] running crossvalidation many times MSE for Lasso regression

For what it's worth, it looks like spm2 is specifically for *spatial*
predictive modeling; presumably its version of CV is doing something
spatially aware.

   I agree that glmnet is old and reliable.  One might want to use a
tidymodels wrapper to create pipelines where you can more easily switch
among predictive algorithms (see the `parsnip` package), but otherwise
sticking to glmnet seems wise.
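
To make the "switch among predictive algorithms" point concrete, here is a rough sketch of how a parsnip model specification can be swapped while the fitting and prediction code stays the same. Everything specific in it is an assumption for illustration only: the data are simulated, the penalty value 0.1 is arbitrary rather than tuned, and holdout_mse() is a hypothetical helper, not a parsnip function. The glmnet package must be installed for the "glmnet" engine to work.

```r
library(parsnip)

# Simulated data, for illustration only (not the data from the thread)
set.seed(42)
dat   <- data.frame(x1 = rnorm(60), x2 = rnorm(60))
dat$y <- 1 + 2 * dat$x1 + rnorm(60)
train <- dat[1:40, ]
test  <- dat[41:60, ]

# Two model specifications sharing one interface:
# mixture = 1 requests the lasso penalty; penalty = 0.1 is an arbitrary value
lasso_spec <- set_engine(linear_reg(penalty = 0.1, mixture = 1), "glmnet")
ols_spec   <- set_engine(linear_reg(), "lm")

# Hypothetical helper: fit a specification on `train`, return the MSE on `test`
holdout_mse <- function(spec) {
  fitted <- fit(spec, y ~ x1 + x2, data = train)
  pred   <- predict(fitted, new_data = test)   # tibble with a .pred column
  mean((test$y - pred$.pred)^2)
}

mse_lasso <- holdout_mse(lasso_spec)
mse_ols   <- holdout_mse(ols_spec)
c(lasso = mse_lasso, ols = mse_ols)
```

Only the specification object changes between the two calls, which is the convenience being described; for a single lasso fit, calling glmnet directly remains simpler.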

