Dear R-users,

I would like to process some spectroscopic data with R, and I was hoping some people might have example code showing how to do this.

I would like to be able to do the following things:

* Detect outlier spectra -> This can be done by using scoreplot from the pls package
* Determine the range of the spectrum to be used -> For this, I should be able to calculate the regression coefficients
* Determine the optimal number of elements in a model
* Anything else that you guys think could be useful :-)

Any help is greatly appreciated.

Dirk

--
Dirk De Becker
dirk.debecker at biw.kuleuven.be

Dirk De Becker wrote:
> * Determine the range of the spectrum to be used -> For this, I should
>   be able to calculate the regression coefficients

You can get the regression coefficients from a PLSR/PCR model with the coef() function. See ?coef.mvr

However, using the regression coefficients alone to select variables or regions can be 'dangerous', because the variables are highly correlated.

One alternative is 'variable importance' measures, e.g. VIP (variable importance in projections), as described in Chong, Il-Gyo & Jun, Chi-Hyuck, 2005, Performance of some variable selection methods when multicollinearity is present, Chemometrics and Intelligent Laboratory Systems 78, 103-112. A crude implementation of VIP can be found at http://mevik.net/work/software/pls.html

Another alternative is to use jackknife-estimated uncertainties of the regression coefficients in significance tests. (I don't have any reference or implementation, sorry. :-)

The correlation loadings can also give valuable information about which variables might be important for the regression. See ?corrplot in the pls package.

--
Bjørn-Helge Mevik

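The coef() and corrplot() suggestions above could look roughly like this. It is only a minimal sketch: it assumes the pls package is installed and uses its bundled yarn NIR data set. The data set, the 6 fitted components and the 3-component model are illustrative choices, not something taken from this thread.

```r
# Assumes the 'pls' package is installed; 'yarn' is an example data set shipped with it.
library(pls)
data(yarn)

# Fit a PLSR model with leave-one-out cross-validation.
# ncomp = 6 is an arbitrary upper limit chosen for illustration.
fit <- plsr(density ~ NIR, ncomp = 6, data = yarn, validation = "LOO")

# Regression coefficients of the 3-component model, one per spectral channel.
# coef() returns an array; drop() turns it into a plain vector.
b <- drop(coef(fit, ncomp = 3))

# Plot the coefficients against channel index. Because neighbouring channels
# are highly correlated, a large |b| alone is not evidence of importance.
plot(b, type = "l", xlab = "channel index", ylab = "regression coefficient")

# Correlation loadings for the first two components.
corrplot(fit, comps = 1:2)
```

Channels whose correlation loadings lie close to the outer circle in the corrplot are well explained by the plotted components, which is one more hint to weigh alongside the coefficients.
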
In the biophysics group of Vrije Universiteit Amsterdam we are working on an R package implementing a problem solving environment for multi-way spectroscopic modeling. We do not have a public version yet, but one will be made available in the future. We plan to describe the package at useR2006. If you are interested in the sort of (parametric model-based) analysis we are doing, you can see some project documentation at http://www.nat.vu.nl/comp/proj4.html.

To reply to one of your questions: to determine the optimal number of spectrally distinct components, you could simply take the SVD of your measurements and plot the singular values. The number of values that "stand out" may be used as an estimate of the number of components.

Feel free to contact me for more information. We look forward to providing a complete and powerful package soon!

----
Katharine Mullen
Department of Physics and Astronomy
Faculty of Sciences
Vrije Universiteit Amsterdam
e-mail: kate at nat.vu.nl
http://www.nat.vu.nl/~kate/
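
The singular-value idea above can be tried out in base R. The sketch below runs on simulated data, so every number in it (50 spectra, 200 channels, three Gaussian bands, the noise level) is a made-up assumption for illustration. The "largest drop" rule at the end is only a crude numerical stand-in for judging the plot by eye.

```r
set.seed(1)

# Simulated measurements (hypothetical): 50 spectra over 200 channels,
# each a random mixture of 3 Gaussian bands plus a little noise.
wl   <- seq(0, 1, length.out = 200)
band <- function(centre, width) exp(-((wl - centre) / width)^2)
S <- rbind(band(0.25, 0.05), band(0.50, 0.08), band(0.75, 0.05))  # 3 x 200 pure spectra
C <- matrix(runif(50 * 3), nrow = 50)                             # 50 x 3 concentrations
X <- C %*% S + matrix(rnorm(50 * 200, sd = 0.01), nrow = 50)      # 50 x 200 data matrix

# Singular values of the data matrix, in decreasing order.
sv <- svd(X)$d

# Plot on a log scale; the values that "stand out" above the noise floor
# suggest the number of spectrally distinct components.
plot(sv, log = "y", type = "b", xlab = "index", ylab = "singular value")

# Crude estimate: position of the largest drop between consecutive
# singular values on the log scale.
n_comp <- which.max(-diff(log(sv)))
```

With real measurements the gap is usually less clear-cut than in this simulation, so the plot deserves more trust than the single number in n_comp.
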