nicolas baurin
2000-Sep-11 17:25 UTC
[R] SAMPLS R implementation: problem applying the algorithm
Hello R people,

I'm trying to implement the Partial Least Squares algorithm called SAMPLS, from "J. Comp.-Aided Molecular Design", 7 (1993), 587-619. It is faster than the classical PLS algorithm for fat matrices (many more columns than rows, n >> m).

Here is the algorithm from the article by Bush B. L. and Nachbar R. B.:

X is the matrix of explanatory properties (m*n), y the matrix of responses, h the number of latent variables extracted.
XT denotes the transpose of X.
x* denotes the quantities for one sample (y* is the response predicted from the derived model; I used one sample to test my R translation against the R pls module).

Calculate the covariance matrix C = X XT, and c* = X x* for prediction
y is centered and becomes y_1
y*_1 = 0
For h = 1, 2, 3, ..., hmax
    s = C y_h
    center s
    working scalar for the prediction sample: s* = c*T y_h
    orthogonalize s to previous t:   for g = 1, ..., (h-1):  s  = s  - (t_gT s / t_gT t_g) t_g
    orthogonalize s* to previous t*: for g = 1, ..., (h-1):  s* = s* - (t_gT s / t_gT t_g) t*_g
    t*_h = s*
    t_h = s
    t_h2 = t_hT t_h
    beta_h = (t_hT y_h) / t_h2
    update: y_(h+1) = y_h - beta_h t_h
    build up prediction: y*_(h+1) = y*_h + beta_h t*_h
end of cycle

-----------------------------------
R-code

## xe and ye are the explanatory and response matrices;
## xtest and ytestsampls are the variables for one sample
x2 <- scale(xe, scale=FALSE)
y2 <- scale(ye, scale=FALSE)
lv <- 1
xtest <- as.matrix(x2[1,])
t <- matrix(0, nrow(ye), 1)
c <- xe %*% t(xe)
yh <- y2
ytestsampls <- 0
ctest <- xe %*% xtest
for (h in 1:lv)
{
    s <- c %*% yh
    s <- scale(s, scale=FALSE)
    stest <- t(ctest) %*% yh
    ## what follows works only for h=1 and 2, I know
    if (h > 1)
    {
        s <- s - ( as.numeric( (t(t) %*% s) / (t(t) %*% t) ) * t )
        stest <- stest - ( as.numeric( (t(t) %*% s) / (t(t) %*% t) ) * ttest )
    }
    ttest <- stest
    t <- s
    t2 <- t(t) %*% t
    beta <- t(t) %*% yh
    beta <- as.numeric(beta / t2)
    ytestsampls <- ytestsampls + as.numeric(beta) * (ttest)
    yh <- yh - (beta * t)
}
ytestsampls2 <- ytestsampls + mean(ye)
-------------------

When lv (the number of latent variables extracted) is 1, there is no problem: the predicted y (ytestsampls2) is the same as the one obtained with the R pls module (library(pls)).
But with lv=2 there is a difference, so there must be an error in my code, presumably in the update steps. Does it come from the original algorithm or from my translation?

Thanks in advance, sorry for the length of this e-mail, and thanks for reading it to the end,

-- 
Nicolas Baurin
PhD student
Institut de Chimie Organique et Analytique, UPRES-A 6005
Université d'Orléans, BP 6759
45067 ORLEANS Cedex 2, France
Tel: (33+) 2 38 49 45 77
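
One possible explanation for the lv=2 discrepancy, offered as a sketch rather than a verified diagnosis: in the posted loop, s is orthogonalized first, and the projection coefficient for stest is then recomputed from the already-updated s. That s is orthogonal to t, so the coefficient is numerically zero and stest is never deflated. The version below computes each coefficient once, before s changes, and reuses it for the test score. The function name sampls_predict and its arguments are hypothetical; a single response column is assumed, and X is centered with the training column means before the kernel is formed.

```r
# Sketch of SAMPLS prediction for one new sample (single response).
# sampls_predict, X, y, xnew and lv are illustrative names, not from the article.
sampls_predict <- function(X, y, xnew, lv) {
  mu <- colMeans(X)
  Xc <- sweep(X, 2, mu)                  # centre the columns of X
  xc <- as.numeric(xnew) - mu            # centre the new sample with training means
  C  <- Xc %*% t(Xc)                     # m x m kernel, C = X XT
  cs <- as.numeric(Xc %*% xc)            # c* = X x*
  yh <- as.numeric(y) - mean(y)          # y_1: centred response
  Tm <- matrix(0, nrow(X), lv)           # stored training scores t_1..t_lv
  ts <- numeric(lv)                      # stored test scores t*_1..t*_lv
  ypred <- 0                             # y*_1 = 0
  for (h in seq_len(lv)) {
    s  <- as.numeric(C %*% yh)
    s  <- s - mean(s)                    # "center s"
    ss <- sum(cs * yh)                   # s* = c*T y_h
    for (g in seq_len(h - 1)) {
      # coefficient taken BEFORE s is modified, then used for both updates
      coef <- sum(Tm[, g] * s) / sum(Tm[, g]^2)
      s  <- s  - coef * Tm[, g]
      ss <- ss - coef * ts[g]
    }
    Tm[, h] <- s
    ts[h]   <- ss
    beta  <- sum(s * yh) / sum(s^2)      # beta_h = t_hT y_h / t_hT t_h
    ypred <- ypred + beta * ss           # y*_(h+1) = y*_h + beta_h t*_h
    yh    <- yh - beta * s               # y_(h+1) = y_h - beta_h t_h
  }
  ypred + mean(y)                        # add the response mean back
}
```

Called as sampls_predict(xe, ye, xe[1, ], 2), this should be directly comparable with the two-component prediction from the pls module for the first sample. When lv equals the number of columns of a full-rank X with more rows than columns, the result should coincide with an ordinary least-squares prediction, which gives a check that does not depend on any PLS package.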