David Winsemius
2017-Jul-16 14:36 UTC
[R] How to formulate quadratic function with interaction terms for the PLS fitting model?
> On Jul 13, 2017, at 7:43 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote: > > Below. > > -- Bert > Bert Gunter > > > > On Thu, Jul 13, 2017 at 3:07 AM, Luigi Biagini <luigi.biagini at gmail.com> wrote: >> I have two ideas about it. >> >> 1- >> i) Entering variables in quadratic form is done with the command I >> (variable ^ 2) - >> plsr (octane ~ NIR + I (nir ^ 2), ncomp = 10, data = gasTrain, validation >> "LOO" >> You could also use a new variable NIR_sq <- (NIR) ^ 2 >> >> ii) To insert a square variable, use syntax I (x ^ 2) - it is very >> important to insert I before the parentheses. > > True, but better I believe: see ?poly. > e.g. poly(cbind(x1,x2,x3), degree = 2, raw = TRUE) is a full quadratic > polynomial in x1,x2,x3 . >Is there any real difference between octane ~ NIR * I(NIR^2) octane ~ NIR * poly(NIR, degree=2, raw=TRUE) ? (I though that adding raw = TRUE prevented the beneficial process of centering the second degree terms.) __ David> >> >> iii) If you want to make the interaction between x and x ^ 2 use the >> command ":" -> x: I(x ^ 2) >> >> iv) For multiple interactions between x and x ^ 2 use the command "*" -> x >> *I (x ^ 2) >> >> i) plsr (octane ~ NIR + NIR_sq, ncomp = 10, data = gasTrain, validation >> "LOO") I (x ^ 2) >> ii)p lsr (octane ~ NIR + I(NIR^2), ncomp = 10, data = gasTrain, validation >> = "LOO") I (x ^ 2) >> iii)p lsr (octane ~ NIR : I(NIR^2), ncomp = 10, data = gasTrain, validation >> = "LOO") I (x ^ 2) >> iv)p lsr (octane ~ NIR * I(NIR^2), ncomp = 10, data = gasTrain, validation >> = "LOO") I (x ^ 2) >> >> 2 - For your regression, did you plan to use MARS instead of PLS? >> >> >> >> >> Dear all, >>> I am using the pls package of R to perform partial least square on a set of >>> multivariate data. Instead of fitting a linear model, I want to fit my >>> data with a quadratic function with interaction terms. But I am not sure >>> how. I will use an example to illustrate my problem: >>> Following the example in the PLS manual: >>> ## Read data >>> data(gasoline) >>> gasTrain <- gasoline[1:50,] >>> ## Perform PLS >>> gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasTrain, validation = "LOO") >>> where octane ~ NIR is the model that this example is fitting with. >>> NIR is a collective of variables, i.e. NIR spectra consists of 401 diffuse >>> reflectance measurements from 900 to 1700 nm. >>> Instead of fitting with octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... >>> I want to fit the data with: >>> octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... + >>> b[0]*NIR[0,i]*NIR[0,i] + b[1] * NIR[0,i]*NIR[1,i] + ... >>> i.e. quadratic with interaction terms. >>> But I don't know how to formulate this. >>> May I have some help please? >>> Thanks, >>> Kelvin >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.David Winsemius Alameda, CA, USA
Bert Gunter
2017-Jul-16 15:47 UTC
[R] How to formulate quadratic function with interaction terms for the PLS fitting model?
?? If I haven't misunderstood, they are completely different! 1) NIR must be a matrix, or poly(NIR,...) will fail. 2) Due to the previously identified bug in poly, degree must be explicitly given as poly(NIR, degree =2,raw = TRUE). Now consider the following example:> df <-matrix(runif(60),ncol=3) > y <- runif(20) > mdl1 <-lm(y~df*I(df^2)) > mdl2 <-lm(y~df*poly(df,degree=2,raw=TRUE)) > length(coef(mdl1))[1] 16> length(coef(mdl2))[1] 40 Explanation: In mdl1, I(df^2) gives the squared values of the 3 columns of df. The formula df*I(df^2) gives the 3 (linear) terms of df, the 3 pure quadratics of I(df^2), the 9 cubic terms obtained by crossing these, and the constant coefficient = 16 coefs. In mdl2, the poly() expression gives 9 variiables: 3 linear, 3 pure quadratic, 3 interactions (1.2, 1.3, 2.3) of these. The df*poly() term would then give the 3 linear terms of df, the 9 terms of poly(), the crossings between these, and the constant coef = 40 coefs. Many of these will be NA since terms are repeated (e.g. the 3 linear terms of poly() and df) and therefore cannot be estimated. Have I totally misunderstood what you meant or committed some other blunder? Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Sun, Jul 16, 2017 at 7:36 AM, David Winsemius <dwinsemius at comcast.net> wrote:> >> On Jul 13, 2017, at 7:43 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote: >> >> Below. >> >> -- Bert >> Bert Gunter >> >> >> >> On Thu, Jul 13, 2017 at 3:07 AM, Luigi Biagini <luigi.biagini at gmail.com> wrote: >>> I have two ideas about it. >>> >>> 1- >>> i) Entering variables in quadratic form is done with the command I >>> (variable ^ 2) - >>> plsr (octane ~ NIR + I (nir ^ 2), ncomp = 10, data = gasTrain, validation >>> "LOO" >>> You could also use a new variable NIR_sq <- (NIR) ^ 2 >>> >>> ii) To insert a square variable, use syntax I (x ^ 2) - it is very >>> important to insert I before the parentheses. >> >> True, but better I believe: see ?poly. >> e.g. poly(cbind(x1,x2,x3), degree = 2, raw = TRUE) is a full quadratic >> polynomial in x1,x2,x3 . >> > > Is there any real difference between > > octane ~ NIR * I(NIR^2) > octane ~ NIR * poly(NIR, degree=2, raw=TRUE) > > ? > (I though that adding raw = TRUE prevented the beneficial process of centering the second degree terms.) > __ > David >> >>> >>> iii) If you want to make the interaction between x and x ^ 2 use the >>> command ":" -> x: I(x ^ 2) >>> >>> iv) For multiple interactions between x and x ^ 2 use the command "*" -> x >>> *I (x ^ 2) >>> >>> i) plsr (octane ~ NIR + NIR_sq, ncomp = 10, data = gasTrain, validation >>> "LOO") I (x ^ 2) >>> ii)p lsr (octane ~ NIR + I(NIR^2), ncomp = 10, data = gasTrain, validation >>> = "LOO") I (x ^ 2) >>> iii)p lsr (octane ~ NIR : I(NIR^2), ncomp = 10, data = gasTrain, validation >>> = "LOO") I (x ^ 2) >>> iv)p lsr (octane ~ NIR * I(NIR^2), ncomp = 10, data = gasTrain, validation >>> = "LOO") I (x ^ 2) >>> >>> 2 - For your regression, did you plan to use MARS instead of PLS? >>> >>> >>> >>> >>> Dear all, >>>> I am using the pls package of R to perform partial least square on a set of >>>> multivariate data. Instead of fitting a linear model, I want to fit my >>>> data with a quadratic function with interaction terms. But I am not sure >>>> how. I will use an example to illustrate my problem: >>>> Following the example in the PLS manual: >>>> ## Read data >>>> data(gasoline) >>>> gasTrain <- gasoline[1:50,] >>>> ## Perform PLS >>>> gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasTrain, validation = "LOO") >>>> where octane ~ NIR is the model that this example is fitting with. >>>> NIR is a collective of variables, i.e. NIR spectra consists of 401 diffuse >>>> reflectance measurements from 900 to 1700 nm. >>>> Instead of fitting with octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... >>>> I want to fit the data with: >>>> octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... + >>>> b[0]*NIR[0,i]*NIR[0,i] + b[1] * NIR[0,i]*NIR[1,i] + ... >>>> i.e. quadratic with interaction terms. >>>> But I don't know how to formulate this. >>>> May I have some help please? >>>> Thanks, >>>> Kelvin >>> >>> [[alternative HTML version deleted]] >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> >> ______________________________________________ >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > > David Winsemius > Alameda, CA, USA >
David Winsemius
2017-Jul-16 16:52 UTC
[R] How to formulate quadratic function with interaction terms for the PLS fitting model?
> On Jul 16, 2017, at 8:47 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote: > > ?? > If I haven't misunderstood, they are completely different! > > 1) NIR must be a matrix, or poly(NIR,...) will fail. > 2) Due to the previously identified bug in poly, degree must be > explicitly given as poly(NIR, degree =2,raw = TRUE). > > Now consider the following example: > >> df <-matrix(runif(60),ncol=3) >> y <- runif(20) >> mdl1 <-lm(y~df*I(df^2)) >> mdl2 <-lm(y~df*poly(df,degree=2,raw=TRUE)) >> length(coef(mdl1)) > [1] 16 >> length(coef(mdl2)) > [1] 40 > > Explanation: > In mdl1, I(df^2) gives the squared values of the 3 columns of df. The > formula df*I(df^2) gives the 3 (linear) terms of df, the 3 pure > quadratics of I(df^2), the 9 cubic terms obtained by crossing these, > and the constant coefficient = 16 coefs. > > In mdl2, the poly() expression gives 9 variiables: 3 linear, 3 pure > quadratic, 3 interactions (1.2, 1.3, 2.3) of these. The df*poly() > term would then give the 3 linear terms of df, the 9 terms of poly(), > the crossings between these, and the constant coef = 40 coefs. Many of > these will be NA since terms are repeated (e.g. the 3 linear terms of > poly() and df) and therefore cannot be estimated. > > Have I totally misunderstood what you meant or committed some other blunder?I was thinking about different model specifications, but clearly I had failed to test my assumptions and will need to do more study:> df <-matrix(runif(60),ncol=3) > y <- runif(20) > mdl1 <-lm(y~ df +I(df^2) ) > mdl2 <-lm(y~ poly(df,degree=2,raw=TRUE)) > mdl1Call: lm(formula = y ~ df + I(df^2)) Coefficients: (Intercept) df1 df2 df3 I(df^2)1 I(df^2)2 I(df^2)3 1.3382 -1.1431 -1.7894 -1.2675 0.9686 1.6605 1.0411> mdl2Call: lm(formula = y ~ poly(df, degree = 2, raw = TRUE)) Coefficients: (Intercept) poly(df, degree = 2, raw = TRUE)1.0.0 1.28217 -0.98032 poly(df, degree = 2, raw = TRUE)2.0.0 poly(df, degree = 2, raw = TRUE)0.1.0 0.89955 -1.89019 poly(df, degree = 2, raw = TRUE)1.1.0 poly(df, degree = 2, raw = TRUE)0.2.0 0.09528 1.63065 poly(df, degree = 2, raw = TRUE)0.0.1 poly(df, degree = 2, raw = TRUE)1.0.1 -1.03744 -0.29368 poly(df, degree = 2, raw = TRUE)0.1.1 poly(df, degree = 2, raw = TRUE)0.0.2 0.09357 0.93400> length(coef(mdl2))[1] 10 I had been reason from my experience with atomic vectors. Clearly poly() handles matrices differently than the combination of `+.formula` with `I`. Thanks for furthering my education:> df <-data.frame(y = runif(20), x=runif(20) ) > > (mdl1 <-lm(y~x +I(x^2) ,data=df) )Call: lm(formula = y ~ x + I(x^2), data = df) Coefficients: (Intercept) x I(x^2) 0.6435 -1.4477 1.8282> (mdl2 <-lm(y~ poly(x,degree=2,raw=TRUE), data=df) )Call: lm(formula = y ~ poly(x, degree = 2, raw = TRUE), data = df) Coefficients: (Intercept) poly(x, degree = 2, raw = TRUE)1 0.6435 -1.4477 poly(x, degree = 2, raw = TRUE)2 1.8282 Best; David.> > Cheers, > Bert > Bert Gunter > > "The trouble with having an open mind is that people keep coming along > and sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Sun, Jul 16, 2017 at 7:36 AM, David Winsemius <dwinsemius at comcast.net> wrote: >> >>> On Jul 13, 2017, at 7:43 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote: >>> >>> Below. >>> >>> -- Bert >>> Bert Gunter >>> >>> >>> >>> On Thu, Jul 13, 2017 at 3:07 AM, Luigi Biagini <luigi.biagini at gmail.com> wrote: >>>> I have two ideas about it. >>>> >>>> 1- >>>> i) Entering variables in quadratic form is done with the command I >>>> (variable ^ 2) - >>>> plsr (octane ~ NIR + I (nir ^ 2), ncomp = 10, data = gasTrain, validation >>>> "LOO" >>>> You could also use a new variable NIR_sq <- (NIR) ^ 2 >>>> >>>> ii) To insert a square variable, use syntax I (x ^ 2) - it is very >>>> important to insert I before the parentheses. >>> >>> True, but better I believe: see ?poly. >>> e.g. poly(cbind(x1,x2,x3), degree = 2, raw = TRUE) is a full quadratic >>> polynomial in x1,x2,x3 . >>> >> >> Is there any real difference between >> >> octane ~ NIR * I(NIR^2) >> octane ~ NIR * poly(NIR, degree=2, raw=TRUE) >> >> ? >> (I though that adding raw = TRUE prevented the beneficial process of centering the second degree terms.) >> __ >> David >>> >>>> >>>> iii) If you want to make the interaction between x and x ^ 2 use the >>>> command ":" -> x: I(x ^ 2) >>>> >>>> iv) For multiple interactions between x and x ^ 2 use the command "*" -> x >>>> *I (x ^ 2) >>>> >>>> i) plsr (octane ~ NIR + NIR_sq, ncomp = 10, data = gasTrain, validation >>>> "LOO") I (x ^ 2) >>>> ii)p lsr (octane ~ NIR + I(NIR^2), ncomp = 10, data = gasTrain, validation >>>> = "LOO") I (x ^ 2) >>>> iii)p lsr (octane ~ NIR : I(NIR^2), ncomp = 10, data = gasTrain, validation >>>> = "LOO") I (x ^ 2) >>>> iv)p lsr (octane ~ NIR * I(NIR^2), ncomp = 10, data = gasTrain, validation >>>> = "LOO") I (x ^ 2) >>>> >>>> 2 - For your regression, did you plan to use MARS instead of PLS? >>>> >>>> >>>> >>>> >>>> Dear all, >>>>> I am using the pls package of R to perform partial least square on a set of >>>>> multivariate data. Instead of fitting a linear model, I want to fit my >>>>> data with a quadratic function with interaction terms. But I am not sure >>>>> how. I will use an example to illustrate my problem: >>>>> Following the example in the PLS manual: >>>>> ## Read data >>>>> data(gasoline) >>>>> gasTrain <- gasoline[1:50,] >>>>> ## Perform PLS >>>>> gas1 <- plsr(octane ~ NIR, ncomp = 10, data = gasTrain, validation = "LOO") >>>>> where octane ~ NIR is the model that this example is fitting with. >>>>> NIR is a collective of variables, i.e. NIR spectra consists of 401 diffuse >>>>> reflectance measurements from 900 to 1700 nm. >>>>> Instead of fitting with octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... >>>>> I want to fit the data with: >>>>> octane[i] = a[0] * NIR[0,i] + a[1] * NIR[1,i] + ... + >>>>> b[0]*NIR[0,i]*NIR[0,i] + b[1] * NIR[0,i]*NIR[1,i] + ... >>>>> i.e. quadratic with interaction terms. >>>>> But I don't know how to formulate this. >>>>> May I have some help please? >>>>> Thanks, >>>>> Kelvin >>>> >>>> [[alternative HTML version deleted]] >>>> >>>> ______________________________________________ >>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>>> and provide commented, minimal, self-contained, reproducible code. >>> >>> ______________________________________________ >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> >> David Winsemius >> Alameda, CA, USA >>David Winsemius Alameda, CA, USA
Maybe Matching Threads
- How to formulate quadratic function with interaction terms for the PLS fitting model?
- How to formulate quadratic function with interaction terms for the PLS fitting model?
- How to formulate quadratic function with interaction terms for the PLS fitting model?
- Quadratic function with interaction terms for the PLS fitting model?
- Quadratic function with interaction terms for the PLS fitting model?