jz7 at duke.edu
2006-Jul-31 02:45 UTC
[R] question about dataframe ("sensory") in PLS package
Dear all,

I am trying to build my own data frame for PLS analysis with the pls package, but I am having trouble generating it correctly. The main problem is how to use one name to represent several columns in the data frame.

The example data frame in the pls package is called "sensory". I cannot read the data file directly because it is a binary file. If I run names(sensory), I get two names: "Quality" and "Panel". But if I run summary(sensory), I get information on five columns for "Quality" and six columns for "Panel" (such as "Quality.Acidity", "Quality.Peroxide", ...). So when I use PLS regression, the formula is simply "Panel ~ Quality" (although it is actually a multiple regression).

Does anyone know how to build such a data frame? Please share your experience. I really appreciate the help!

Sincerely,
Jeny
Gabor Grothendieck
2006-Jul-31 03:09 UTC
[R] question about dataframe ("sensory") in PLS package
Try:

    ?sensory
    str(sensory)
    dput(sensory)
    lapply(sensory, class)
    lapply(sensory, dim)

to see what it looks like inside. It seems that sensory is a data frame consisting of two columns, each of which is a matrix, except that each has a class of "AsIs". Thus try this (where I(...) creates objects of class "AsIs"):

    mat1 <- cbind(a = 1:5, b = 11:15)
    mat2 <- cbind(x = 21:25, y = 31:35)
    DF <- data.frame(A = I(mat1), B = I(mat2))
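As a rough check of the I(...) approach, the sketch below builds a data frame with the same shape as the one described in the question (five "Quality" columns, six "Panel" columns) and confirms that a formula can refer to each block by a single name. The data are random stand-ins, and the row count, column names, and variable names are assumptions made only for illustration.

```r
# Hypothetical stand-in data: 8 samples, 5 "Quality" and 6 "Panel" variables
set.seed(1)
quality <- matrix(rnorm(8 * 5), nrow = 8,
                  dimnames = list(NULL, paste0("q", 1:5)))
panel   <- matrix(rnorm(8 * 6), nrow = 8,
                  dimnames = list(NULL, paste0("p", 1:6)))

# I() keeps each matrix as a single column of the data frame
DF <- data.frame(Quality = I(quality), Panel = I(panel))

names(DF)        # two names, one per matrix
dim(DF$Quality)  # 8 rows, 5 columns
dim(DF$Panel)    # 8 rows, 6 columns

# The formula interface sees each matrix as one term
mf <- model.frame(Panel ~ Quality, data = DF)
dim(mf$Panel)

# With the pls package installed, the same formula could be passed on, e.g.
# library(pls); fit <- plsr(Panel ~ Quality, data = DF)
```

The plsr() call is left commented out so that the sketch runs with base R alone.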
jz7 at duke.edu
2006-Jul-31 23:29 UTC
[R] question about prediction etc. in Ridge regression (MASS library)
Dear all,

I am trying to apply ridge regression to my dataset, and then I would like to predict the Y responses for new data points using the ridge model (at a certain lambda). The only ridge regression function I have found is in the "MASS" library. However, very few functions are available for it: lm.ridge(), plot(), and select(). I did not see any option to "predict" the Y response.

Does anyone know what other functions I could use to make predictions with a ridge model, or how I should write my own code to do the prediction? Also, is there any way to calculate R^2 (or q^2) or the LOO-CV error for a ridge model?

I really appreciate your kind help!

Sincerely,
Jeny
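One possible way to do this by hand, sketched under stated assumptions: coef() on an lm.ridge fit with a single lambda returns the intercept followed by the slopes on the original scale of the predictors, so a prediction is just the design matrix times that vector. The helper names predict_ridge and loo_q2, the simulated data, and the choice lambda = 1 are all hypothetical and only there to make the example runnable.

```r
library(MASS)

# Simulated training and new data (illustrative only)
set.seed(42)
n <- 50
train <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
train$y <- 1 + 2 * train$x1 - train$x2 + 0.5 * train$x3 + rnorm(n, sd = 0.3)
newdata <- data.frame(x1 = rnorm(5), x2 = rnorm(5), x3 = rnorm(5))

# Hypothetical helper: predict from an lm.ridge fit with ONE lambda.
# coef(fit) is c(intercept, slopes) on the original predictor scale.
predict_ridge <- function(fit, newdata) {
  beta <- coef(fit)
  X <- as.matrix(newdata[, names(beta)[-1], drop = FALSE])
  drop(cbind(1, X) %*% beta)
}

fit  <- lm.ridge(y ~ x1 + x2 + x3, data = train, lambda = 1)
pred <- predict_ridge(fit, newdata)

# Hypothetical helper: leave-one-out q^2 = 1 - PRESS / total sum of squares
loo_q2 <- function(formula, data, lambda) {
  yname <- all.vars(formula)[1]
  press <- 0
  for (i in seq_len(nrow(data))) {
    fit_i <- lm.ridge(formula, data = data[-i, ], lambda = lambda)
    err_i <- data[[yname]][i] - predict_ridge(fit_i, data[i, , drop = FALSE])
    press <- press + err_i^2
  }
  unname(1 - press / sum((data[[yname]] - mean(data[[yname]]))^2))
}

q2 <- loo_q2(y ~ x1 + x2 + x3, train, lambda = 1)
```

With lambda = 0 the ridge fit reduces to ordinary least squares, so comparing predict_ridge() against predict() on an lm() fit is a quick sanity check of the helper.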
Andris Jankevics
2006-Aug-01 07:15 UTC
[R] question about dataframe ("sensory") in PLS package
Hello,

I do it this way:

    DATAX <- matrix(seq(1, 6, 1), 2, 3)
    DATAY <- matrix(seq(1, 4, 1), 2, 2)
    rownames(DATAX) <- c("s1", "s2")
    rownames(DATAY) <- c("s1", "s2")
    colnames(DATAX) <- c("v1", "v2", "v3")
    colnames(DATAY) <- c("response_1", "response_2")
    KAL <- data.frame(N = rownames(DATAX))
    KAL$Y <- DATAY
    KAL$X <- DATAX
    KAL$X
    KAL$Y

DATAX is a matrix of test data and DATAY is a matrix of responses.

Andris Jankevics
jz7 at duke.edu
2006-Aug-02 15:29 UTC
[R] question about correlation coefficient and root mean square
Dear all,

I am using different multiple regression models (OLS and principal component regression (PCR)) to make predictions for my test set. Both models come from the same training set, except that the number of variables or descriptors (columns of X) used in OLS is smaller than the number used in PCR. I use the squared correlation coefficient (r^2) and the root mean square (RMS) error to assess the relationship between my predictions and the experimental measurements of the test set.

Here is the problem: the r^2 from the PCR prediction is higher than the r^2 from the OLS prediction (0.8 vs. 0.7). However, the RMS of the PCR prediction is also higher than that of OLS (0.55 vs. 0.48). I would expect r^2 and RMS to show a consistent trend, so why am I getting opposite results? Is it because PCR is a biased method? Which one (r^2 or RMS) is more reliable for evaluating the model?

I really appreciate your kind help!

Sincerely,
Jeny
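One general property that can produce this pattern, assuming r^2 here means the squared Pearson correlation between predicted and measured values: correlation ignores any constant offset or rescaling of the predictions, while RMS error does not. The toy numbers below are made up purely to illustrate that property and say nothing about which effect is at work in the data above; the function names r2 and rmse are defined here for the example.

```r
# Squared Pearson correlation and root mean square error
r2   <- function(obs, pred) cor(obs, pred)^2
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

obs <- 1:10

# Predictions A: no systematic offset, small alternating scatter
pred_a <- obs + rep(c(0.3, -0.3), 5)

# Predictions B: perfectly correlated with obs, but shifted by a constant
pred_b <- obs + 1

res <- rbind(A = c(r2 = r2(obs, pred_a), rmse = rmse(obs, pred_a)),
             B = c(r2 = r2(obs, pred_b), rmse = rmse(obs, pred_b)))
print(res)
# B has the higher r^2 (its points lie exactly on a line)
# and also the higher RMS error (every prediction is off by 1)
```

If the RMS error is computed against the line y = x, as above, it reflects how close the predicted values are to the measurements themselves, whereas r^2 only reflects how well they line up after an arbitrary linear adjustment.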