Many thanks David, it works perfectly.
Now, one last thing.
If I want my R code below to run, let's say, B=500 times, and at the end get the average of MSE_GAM and of MSE_MARS, how can I do that?

library(mgcv)
library(earth)
n<-2000
x<-runif(n, 0, 5)
z <- runif(n, 0, 5)
a <- runif(n, 0, 5)

y_model<- 0.1*x^3 - 0.5 * z^2 - a + 10
y_obs <- c( rnorm(n*0.5, y_model, 0.1), rnorm(n*0.5, y_model, 0.5) )
gam_model<- gam(y_obs~s(x)+s(z)+s(a))
mars_model<-earth(y_obs~x+z+a)

MSE_GAM<-mean((gam_model$fitted.values - y_model)^2)
MSE_MARS<-mean((mars_model$fitted.values - y_model)^2)

MSE_GAM
MSE_MARS

On Tuesday, 17 September 2019 at 22:27:54 UTC+2, David Winsemius <dwinsemius at comcast.net> wrote:

On 9/17/19 12:48 PM, varin sacha via R-help wrote:
> Dear R-helpers,
>
> Doing dput(x) and dput(y_obs), the 2 vectors are not the same length (1800 for y_obs and 2000 for x).
> How can I solve the problem?
>
> Here is the reproducible R code
>
> # # # # # # # # # #
> library(mgcv)
> library(earth)
>
> n<-2000
> x<-runif(n, 0, 5)
> y_model<- 0.1*x^3 - 0.5 * x^2 - x + 10
> # y_obs<-rnorm(n*0.9, y_model, 0.1)+rnorm(n*0.1, y_model, 0.5) # maybe not exactly your goal?

You didn't lay out any goals for analysis, so let me guess what was intended:

I suspect that you were hoping to model a mixture composed of 90% from one distribution and 10% from another. If I'm right about that guess then you would instead want to join the samples from each distribution:

y_obs<-c( rnorm(n*0.9, y_model, 0.1), rnorm(n*0.1, y_model, 0.5) )

-- David

> gam_model<- gam(y_obs~s(x))
> mars_model<- earth(y_obs~x)
> MSE_GAM<-mean((gam_model$fitted.values - y_model)^2)
> MSE_MARS<-mean((mars_model$fitted.values - y_model)^2)
> MSE_GAM
> MSE_MARS
> # # # # # # # # # # # # # # # #
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

On 9/17/19 1:35 PM, varin sacha wrote:
> Many thanks David, it works perfectly.
> Now, one last thing.
> If I want my R code below to run, let's say, B=500 times, and at the end get the average of MSE_GAM and of MSE_MARS, how can I do that?

The `replicate` function is designed for that purpose.

-- David.

> library(mgcv)
> library(earth)
> n<-2000
> x<-runif(n, 0, 5)
> z <- runif(n, 0, 5)
> a <- runif(n, 0, 5)
> y_model<- 0.1*x^3 - 0.5 * z^2 - a + 10
> y_obs <- c( rnorm(n*0.5, y_model, 0.1), rnorm(n*0.5, y_model, 0.5) )
> gam_model<- gam(y_obs~s(x)+s(z)+s(a))
> mars_model<-earth(y_obs~x+z+a)
> MSE_GAM<-mean((gam_model$fitted.values - y_model)^2)
> MSE_MARS<-mean((mars_model$fitted.values - y_model)^2)
> MSE_GAM
> MSE_MARS

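A minimal sketch of how the `replicate` suggestion might be applied to the quoted code. The helper name `one_run`, the seed, and the use of a data frame are illustrative choices, not something from the thread. One deliberate change is flagged in the comments: `rnorm(n*0.5, y_model, 0.1)` only uses the first `n*0.5` elements of `y_model` as means, so in the quoted code both halves of `y_obs` are centred on the first half of `y_model`. The sketch assumes the intent was one noisy observation per row and therefore passes a per-observation `sd` vector instead.

```r
library(mgcv)
library(earth)

# One simulation run: draw fresh data, fit both models, return both MSEs.
# `one_run` is a hypothetical helper name, not part of mgcv or earth.
one_run <- function(n = 2000) {
  d <- data.frame(x = runif(n, 0, 5),
                  z = runif(n, 0, 5),
                  a = runif(n, 0, 5))
  y_model <- 0.1 * d$x^3 - 0.5 * d$z^2 - d$a + 10

  # ASSUMPTION: every row gets its own mean; the first half has noise sd 0.1,
  # the second half sd 0.5. This differs from the thread's c(rnorm(), rnorm()).
  d$y_obs <- rnorm(n, mean = y_model, sd = rep(c(0.1, 0.5), each = n / 2))

  gam_model  <- gam(y_obs ~ s(x) + s(z) + s(a), data = d)
  mars_model <- earth(y_obs ~ x + z + a, data = d)

  # earth stores fitted values as a one-column matrix, hence as.vector()
  c(MSE_GAM  = mean((gam_model$fitted.values - y_model)^2),
    MSE_MARS = mean((as.vector(mars_model$fitted.values) - y_model)^2))
}

set.seed(1)                      # arbitrary seed, only for reproducibility
B <- 500
res <- replicate(B, one_run())   # 2 x B matrix, one column per run
avg_mse <- rowMeans(res)         # average of each MSE over the B runs
print(avg_mse)
```

Because `one_run()` returns a named vector of length two, `replicate` simplifies the results into a matrix with rows `MSE_GAM` and `MSE_MARS`, so `rowMeans` gives the two averages directly.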
On 9/17/19 2:08 PM, David Winsemius wrote:
>
> On 9/17/19 1:35 PM, varin sacha wrote:
>> Many thanks David, it works perfectly.
>> Now, one last thing.
>> If I want my R code here below to run let's say B=500 times and at
>> the end I want to get the average for the MSE_GAM and for the
>> MSE_MARS. How can I do that ?
>
> The `replicate` function is designed for that purpose.

Although I also just noticed that you were separately computing residuals. Many R regression functions return a residual vector. Your code would be a lot faster over the course of 500 repeats if you used the resid function:

> str( resid(gam_model))
 num [1:2000] -0.1385 0.1848 -0.0567 0.0605 -0.3297 ...
> str( resid(mars_model))
 num [1:2000, 1] -0.2181 0.294 -0.0773 0.1626 -0.3512 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2000] "1" "2" "3" "4" ...
  ..$ : chr "y_obs"
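One caveat when swapping in `resid`: it returns observed minus fitted values, so the mean of its squares measures error against `y_obs`, whereas the code in the thread measures error against the noise-free `y_model`. The sketch below illustrates the difference on a single-predictor GAM. The seed, the single noise level of 0.5, and the variable names `mse_resid`, `mse_manual` and `mse_true` are assumptions made for illustration only.

```r
library(mgcv)

set.seed(42)                     # arbitrary seed, only for reproducibility
n <- 2000
x <- runif(n, 0, 5)
y_model <- 0.1 * x^3 - 0.5 * x^2 - x + 10
# ASSUMPTION: one noise level (sd 0.5) to keep the illustration simple
y_obs <- rnorm(n, mean = y_model, sd = 0.5)

gam_model <- gam(y_obs ~ s(x))

# resid() with type = "response" is observed minus fitted
mse_resid  <- mean(resid(gam_model, type = "response")^2)
mse_manual <- mean((y_obs - fitted(gam_model))^2)

# Error against the noise-free curve, as computed in the thread
mse_true <- mean((fitted(gam_model) - y_model)^2)

print(c(resid = mse_resid, manual = mse_manual, true = mse_true))
```

The first two numbers agree, while the third is a different quantity. Which one to average over the 500 repeats depends on whether the goal is fit to the noisy data or recovery of the underlying curve.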