Hello,

I need help with a partial least squares regression in R. I have read both
the vignette and the post on R-bloggers, but it is hard to figure out how to
do it. Here is the script I wrote:

library(pls)
plsrcue <- plsr(cue ~ fb + cn + n + ph + fung + bact + resp, data = cue,
                ncomp = 7, na.action = NULL, method = "kernelpls",
                scale = FALSE, validation = "LOO", model = TRUE,
                x = FALSE, y = FALSE)
summary(plsrcue)

and I got this output, where I think I can choose the number of components
based on RMSEP, but how do I choose it?

Data:   X dimension: 33 7
        Y dimension: 33 1
Fit method: kernelpls
Number of components considered: 7

VALIDATION: RMSEP
Cross-validated using 33 leave-one-out segments.
       (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps  7 comps
CV         0.09854  0.07014  0.05366  0.04712  0.01935  0.01943  0.01882  0.01900
adjCV      0.09854  0.06999  0.05357  0.04703  0.01930  0.01942  0.01876  0.01893

TRAINING: % variance explained
     1 comps  2 comps  3 comps  4 comps  5 comps  6 comps  7 comps
X      42.33    78.82    99.15    99.95   100.00   100.00   100.00
cue    56.77    76.14    81.98    97.05    97.11    97.56    97.75

- and also, how to proceed from here?
- and how to make a correlation plot?
- what to do with the values and coefficients that I get in the Environment
  (pls values)?

Thanks for your help!

margarida soares
Margarida Soares <margaridapmsoares at gmail.com> writes:

> library(pls)
> plsrcue<- plsr(cue~fb+cn+n+ph+fung+bact+resp, data = cue, ncomp=7,
> na.action = NULL, method = "kernelpls", scale=FALSE, validation = "LOO",
> model = TRUE, x = FALSE, y = FALSE)
> summary(plsrcue)
>
> and I got this output, where I think I can choose the number of components
> based on RMSEP, but how do I choose it?

There are no "hard" rules for how to choose the number of components, but
one rule of thumb is to stop when the RMSEP starts to flatten out or to
increase. In your case, I would say 4 components. An easier way to look at
the RMSEP values is with plot(RMSEP(plsrcue)).

(There are some algorithms that can suggest the number of components for
you. Two of those are implemented in the development version of the pls
package (hopefully released during Christmas). You can check it out here if
you wish: https://github.com/bhmevik/pls . Disclaimer: I am the maintainer
of the package. :) )

> - and also, how to proceed from here?

That depends on what you want to do with, or learn about, the system you are
modelling. Many researchers in fields like spectroscopy or chemometrics
(where PLSR originated) plot loadings and scores and infer things
graphically.

> - and how to make a correlation plot?

corrplot(plsrcue), at least if you mean a correlation loadings plot. See
?corrplot for details.

> - what to do with the values, coefficients that I get in the Environment
> (pls values)

Again, that depends on what you want to do with your model.

--
Regards,
Bjørn-Helge Mevik
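A minimal sketch of the rule of thumb from the reply above, in R. It first applies a simple tolerance check to the CV RMSEP values posted in the original question, then repeats the inspection on a fitted model. The helper pick_ncomp and its 5% tolerance are assumptions made for illustration and are not part of pls; the gasoline data set that ships with pls stands in for the poster's data, which is not available here.

```r
# CV RMSEP values copied from the summary() output in the original question;
# the first entry is the intercept-only model (0 components).
cv_rmsep <- c(0.09854, 0.07014, 0.05366, 0.04712,
              0.01935, 0.01943, 0.01882, 0.01900)
names(cv_rmsep) <- 0:7

# Hypothetical helper, NOT part of pls: the smallest number of components
# whose RMSEP is within a relative tolerance `tol` of the minimum.
pick_ncomp <- function(rmsep, tol = 0.05) {
  as.integer(names(rmsep)[which(rmsep <= min(rmsep) * (1 + tol))[1]])
}
chosen <- pick_ncomp(cv_rmsep)
print(chosen)

# The same inspection on a fitted model. 'gasoline' is a stand-in data set
# shipped with pls, not the poster's data frame.
if (requireNamespace("pls", quietly = TRUE)) {
  library(pls)
  data(gasoline)
  fit <- plsr(octane ~ NIR, ncomp = 10, data = gasoline, validation = "LOO")

  # RMSEP curve: look for where it flattens out or starts to rise.
  plot(RMSEP(fit), legendpos = "topright")

  # CV RMSEP as a plain vector (element 1 is the intercept-only model).
  cv <- RMSEP(fit, estimate = "CV")$val[1, 1, ]
  k <- pick_ncomp(setNames(cv, seq_along(cv) - 1))

  # Typical next steps once the number of components is fixed.
  b <- coef(fit, ncomp = k, intercept = TRUE)   # regression coefficients
  print(dim(b))
  scoreplot(fit, comps = 1:2)                   # sample scores
  loadingplot(fit, comps = 1:2)                 # variable loadings
}
```

On the posted values the helper returns 4, which agrees with the suggestion in the reply. A fixed tolerance is only a crude stand-in for reading the plot by eye, so it is worth checking both.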
Margarida Soares <margaridapmsoares at gmail.com> writes:

> Thanks for your reply on pls!
> I have tried to do a correlation plot but I get the following group of
> graphs. Any way of having only 1 plot?
> This is my script:
>
> corrplot(plsrcue1, comp = 1:4, radii = c(sqrt(1/2), 1), identify = FALSE,
> type = "p" )

"Correlation loadings" are the correlations between each variable and the
selected components, so I don't see how you can have more than two sets of
correlations (i.e., more than two components) in a single scatter plot. You
could have three sets in a 3d plot, of course, but that you would have to
implement yourself. :)

--
Regards,
Bjørn-Helge Mevik
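One way to end up with a single panel, sketched in R under stated assumptions: pass exactly two components to corrplot and repeat the call for any other pair of interest. The gasoline data and the object names below are stand-ins, not the poster's model, and the manual cor() step is included only to show what the plot displays.

```r
if (requireNamespace("pls", quietly = TRUE)) {
  library(pls)
  data(gasoline)   # stand-in data shipped with pls, not the poster's data
  fit <- plsr(octane ~ NIR, ncomp = 4, data = gasoline)

  # Exactly two components give one scatter plot instead of a grid of plots.
  corrplot(fit, comps = c(1, 2))

  # Other pairs can be placed side by side on one device.
  op <- par(mfrow = c(1, 2))
  corrplot(fit, comps = c(1, 3))
  corrplot(fit, comps = c(2, 4))
  par(op)

  # What is being plotted: the correlation of each X variable with the
  # scores of the two selected components (one row per variable).
  cl <- cor(model.matrix(fit), unclass(scores(fit))[, 1:2])
  print(head(cl))
}
```

Each point in such a plot is one row of cl, which is why a single two-dimensional scatter plot can show only two components at a time.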