thr3ads.net - R help - [R] running crossvalidation many times MSE for Lasso regression [Oct 2023]

Ben Bolker

2023-Oct-23 17:58 UTC

[R] running crossvalidation many times MSE for Lasso regression

For what it's worth it looks like spm2 is specifically for *spatial* 
predictive modeling; presumably its version of CV is doing something 
spatially aware.

   I agree that glmnet is old and reliable.  One might want to use a 
tidymodels wrapper to create pipelines where you can more easily switch 
among predictive algorithms (see the `parsnip` package), but otherwise 
sticking to glmnet seems wise.
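
A minimal sketch of the parsnip idea mentioned above, assuming the parsnip and glmnet packages are installed. The simulated data, the penalty value of 0.1 and all object names are illustrative assumptions, not taken from this thread. The point is only that the fitting and prediction calls stay the same when the engine changes.

```r
library(parsnip)

# Simulated data, for illustration only
set.seed(42)
df <- data.frame(x1 = rnorm(60), x2 = rnorm(60))
df$y <- 2 * df$x1 - df$x2 + rnorm(60)

# Two model specifications: lasso via glmnet (mixture = 1), and ordinary
# least squares via lm. The penalty of 0.1 is an arbitrary placeholder.
lasso_spec <- set_engine(linear_reg(penalty = 0.1, mixture = 1), "glmnet")
ols_spec   <- set_engine(linear_reg(), "lm")

# Identical fit/predict calls regardless of the engine behind the spec
lasso_fit <- fit(lasso_spec, y ~ x1 + x2, data = df)
ols_fit   <- fit(ols_spec,   y ~ x1 + x2, data = df)

lasso_pred <- predict(lasso_fit, new_data = df)  # tibble with a .pred column
ols_pred   <- predict(ols_fit,   new_data = df)
```

In practice the penalty would be tuned rather than fixed; that part is left out here to keep the sketch short.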

On 2023-10-23 4:38 a.m., Martin Maechler wrote:
>>>>>> Jin Li
>>>>>>      on Mon, 23 Oct 2023 15:42:14 +1100 writes:
> 
>      > If you are interested in other validation methods (e.g., LOO or n-fold)
>      > with more predictive accuracy measures, the function, glmnetcv, in the spm2
>      > package can be directly used, and some reproducible examples are
>      > also available in ?glmnetcv.
> 
> ... and once you open that can of w..:   the  glmnet package itself
> contains a function  cv.glmnet()  which we (our students) use when teaching.
> 
> What's the advantage of the spm2 package ?
> At least, the glmnet package is authored by the same who originated and
> first published (as in "peer reviewed" ..) these algorithms.
> 
> 
> 
>      > On Mon, Oct 23, 2023 at 10:59 AM Duncan Murdoch <murdoch.duncan at gmail.com>
>      > wrote:
> 
>      >> On 22/10/2023 7:01 p.m., Bert Gunter wrote:
>      >> > No error message shown. Please include the error message so that it is
>      >> > not necessary to rerun your code. This might enable someone to see the
>      >> > problem without running the code (e.g. downloading packages, etc.)
>      >>
>      >> And it's not necessarily true that someone else would see the same error
>      >> message.
>      >>
>      >> Duncan Murdoch
>      >>
>      >> >
>      >> > -- Bert
>      >> >
>      >> > On Sun, Oct 22, 2023 at 1:36 PM varin sacha via R-help
>      >> > <r-help at r-project.org> wrote:
>      >> >>
>      >> >> Dear R-experts,
>      >> >>
>      >> >> Here below my R code with an error message. Can somebody help me to fix
>      >> >> this error?
>      >> >> Really appreciate your help.
>      >> >>
>      >> >> Best,
>      >> >>
>      >> >> ############################################################
>      >> >> # MSE CROSSVALIDATION Lasso regression
>      >> >>
>      >> >> library(glmnet)
>      >> >>
>      >> >> x1=c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
>      >> >>
>      >> >> x2=c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,6,1,3,2,5,6,8,7,1,1,2,9)
>      >> >>
>      >> >> y=c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)
>      >> >> T=data.frame(y,x1,x2)
>      >> >>
>      >> >> z=matrix(c(x1,x2), ncol=2)
>      >> >> cv_model=glmnet(z,y,alpha=1)
>      >> >> best_lambda=cv_model$lambda.min
>      >> >> best_lambda
>      >> >>
>      >> >>
>      >> >> # Create a list to store the results
>      >> >> lst<-list()
>      >> >>
>      >> >> # This statement does the repetitions (looping)
>      >> >> for(i in 1 :1000) {
>      >> >>
>      >> >> n=45
>      >> >>
>      >> >> p=0.667
>      >> >>
>      >> >> sam=sample(1 :n,floor(p*n),replace=FALSE)
>      >> >>
>      >> >> Training =T [sam,]
>      >> >> Testing = T [-sam,]
>      >> >>
>      >> >> test1=matrix(c(Testing$x1,Testing$x2),ncol=2)
>      >> >>
>      >> >> predictLasso=predict(cv_model, newx=test1)
>      >> >>
>      >> >>
>      >> >> ypred=predict(predictLasso,newdata=test1)
>      >> >> y=T[-sam,]$y
>      >> >>
>      >> >> MSE = mean((y-ypred)^2)
>      >> >> MSE
>      >> >> lst[i]<-MSE
>      >> >> }
>      >> >> mean(unlist(lst))
>      >> >>
>      >> >> ##################################################################
>      >> >>
>      >> >> ______________________________________________
>      >> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>      >> >> https://stat.ethz.ch/mailman/listinfo/r-help
>      >> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>      >> >> and provide commented, minimal, self-contained, reproducible code.
> 
> 
>      > --
>      > Jin
>      > ------------------------------------------
>      > Jin Li, PhD
>      > Founder, Data2action, Australia
>      > https://www.researchgate.net/profile/Jin_Li32
>      > https://scholar.google.com/citations?user=Jeot53EAAAAJ&hl=en

varin sacha

2023-Oct-23 19:12 UTC

Dear R-experts,

Thank you all very much for your responses. Here are the error and warning
messages produced at the end of my R code.

Many thanks for your help.


Error in UseMethod("predict") :
  no applicable method for 'predict' applied to an object of class
  "c('matrix', 'array', 'double', 'numeric')"
> mean(unlist(lst))
[1] NA
Warning message:
In mean.default(unlist(lst)) :
  argument is not numeric or logical: returning NA
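
For readers hitting the same error: it appears to come from calling predict() a second time on predictLasso, which is already a matrix of predictions rather than a fitted model. Two other things in the posted code look unintended. glmnet() does not cross-validate, so cv_model$lambda.min is NULL (cv.glmnet() is the function that returns lambda.min), and the model is fitted once on all 45 rows, so the Training subset is never used. A minimal sketch of one possible fix follows, assuming the goal is repeated random train/test splits with lambda chosen on the training rows only. The seed, the 100 repetitions (instead of 1000, to keep it quick), nfolds = 5 and the variable names are illustrative choices, not from the thread.

```r
library(glmnet)

# Data as posted in the original message
x1 <- c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,
        68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
x2 <- c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,
        6,1,3,2,5,6,8,7,1,1,2,9)
y  <- c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,
        2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)
x  <- cbind(x1, x2)            # glmnet expects a numeric matrix of predictors

set.seed(123)                  # assumption: any seed, for reproducibility only
n       <- length(y)
n_train <- floor(0.667 * n)    # same split proportion as the original code
n_rep   <- 100                 # assumption: fewer repetitions than the original
mse     <- numeric(n_rep)      # preallocated numeric vector instead of a list

for (i in seq_len(n_rep)) {
  train <- sample(n, n_train)  # row indices of the training set

  # Choose lambda by cross-validation using the training rows only
  cv_fit <- cv.glmnet(x[train, ], y[train], alpha = 1, nfolds = 5)

  # predict() is called once, on the fitted model, for the held-out rows
  pred <- predict(cv_fit, newx = x[-train, ], s = "lambda.min")

  mse[i] <- mean((y[-train] - pred)^2)
}

mean(mse)                      # average test MSE over the repetitions
```

The sketch also avoids naming the data frame T, which masks the shorthand for TRUE, and avoids overwriting y inside the loop.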









Jin Li

2023-Oct-24 05:01 UTC

Hi Ben, Martin and all,

The function glmnetcv in the spm2 package was developed for the following
main reasons:
1. The training and testing samples are generated using stratified random
sampling instead of simple random sampling. By doing this, we hoped to
decluster the spatial data, as Ben mentioned, and also to reduce the
variation in predictive accuracy among iterations and so produce a more
reliable estimate of predictive accuracy.
2. It can produce various predictive accuracy measures (e.g., VEcv), as
shown in the reproducible examples.
3. We also wanted all the methods compared in Spatial Predictive Modeling
with R to be based on cv functions that use the same sampling method
(a number of cv functions were developed for this purpose), so that we
could conclude that differences in the accuracy of the predictive methods
resulted from the methods themselves.
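
To make the first point concrete, here is a rough base-R sketch of stratified fold assignment, assuming the strata are formed from the sorted values of the response. This is an illustration of the general idea only and is not taken from the spm2 source; stratified_folds is a hypothetical helper and the data are simulated.

```r
# Hypothetical helper (not part of spm2 or glmnet): assign each observation
# to one of k folds so that every fold spans the full range of the response.
stratified_folds <- function(y, k = 5) {
  n     <- length(y)
  ord   <- order(y)              # observation indices sorted by response
  folds <- integer(n)
  # Walk through the sorted observations in blocks of k; within each block,
  # hand out the fold labels 1..k in random order.
  for (start in seq(1, n, by = k)) {
    idx <- ord[start:min(start + k - 1, n)]
    folds[idx] <- sample(k)[seq_along(idx)]
  }
  folds
}

set.seed(1)
y <- rnorm(45)                   # simulated response, for illustration only
folds <- stratified_folds(y, k = 5)

table(folds)                     # fold sizes
tapply(y, folds, mean)           # fold means, for comparison across folds
```

Because each block of neighbouring response values is spread over all folds, the folds tend to have more similar response distributions than under simple random assignment, which is the variance-reduction effect described in point 1.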

Anyone interested can test it on their own data and see.

Best,
Jin


-- 
Jin
------------------------------------------
Jin Li, PhD
Founder, Data2action, Australia
https://www.researchgate.net/profile/Jin_Li32
https://scholar.google.com/citations?user=Jeot53EAAAAJ&hl=en
