Giulio Di Giovanni
2009-Jun-04 16:31 UTC
[R] help needed with ridge regression and choice of lambda with lm.ridge!!!
Hi,

I'm a beginner in the field. I have to perform ridge regression with lm.ridge on many datasets, and I would like to do it automatically. How can I choose lambda automatically?

Right now I'm using the lm.ridge function from MASS, which I find quite simple and fast. Among the returned values I see the HKB estimate of the ridge constant and the L-W estimate of the ridge constant, together with the GCV values. I have found other studies on the web where people simply pick one of these quantities. Doing the same would be perfect for me, but how? What are the criteria for deciding, if there are any? HKB, L-W, or neither?

Another question that is important for me: isn't lambda generally supposed to increase as the number of predictors increases? And isn't ridge regression supposed to work fine even when the number of predictors is greater than the number of observations? At least that is what I was told... But with a dataset of 16 observations and 34 predictors I get:

> fmr <- lm.ridge(y ~ 0 + . + ., data = x, lambda = seq(0, 10, 0.01))
> select(fmr)
modified HKB estimator is -1.850770e-28
modified L-W estimator is -2.012264e-28
smallest value of GCV at 0.01

I get similar values if I reduce the number of predictors in the dataset to any number between 17 and 34. But if I build a dataset with only 16 predictors (equal to the number of rows) I get:

> select(fmr)
modified HKB estimator is 0.1511719
modified L-W estimator is 3.322775
smallest value of GCV at 0.51

Likewise, I get other acceptable values for any smaller dataset.

Could anybody please help me?

Thanks in advance,
Giulio
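
The three candidate values mentioned in the post (HKB, L-W, and the lambda with the smallest GCV) are all stored in the object returned by lm.ridge, so they can be collected as numbers without parsing the printed output of select(). A minimal sketch in R, assuming simulated stand-in data (not the data from the post) with fewer predictors than observations; `dat` and `ridge_candidates` are hypothetical names used only for illustration:

```r
library(MASS)  # provides lm.ridge

# Simulated stand-in data (an assumption, not the data from the post):
# 16 observations and 8 predictors.
set.seed(1)
n <- 16
p <- 8
X <- matrix(rnorm(n * p), n, p)
dat <- data.frame(y = drop(X %*% rnorm(p)) + rnorm(n), X)

# Fit over the same lambda grid as in the post, without an intercept.
lambdas <- seq(0, 10, 0.01)
fmr <- lm.ridge(y ~ 0 + ., data = dat, lambda = lambdas)

# The same three quantities that select(fmr) prints, as plain numbers.
ridge_candidates <- c(
  HKB = fmr$kHKB,                       # HKB estimate of the ridge constant
  LW  = fmr$kLW,                        # L-W estimate of the ridge constant
  GCV = fmr$lambda[which.min(fmr$GCV)]  # grid value with the smallest GCV
)
print(ridge_candidates)
```

Wrapping these lines in a function and applying it over a list of data frames would be one way to run the same extraction on many datasets. Which of the three values to prefer is left open here.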