Hi,

I have optimized the shrinkage parameter for ridge regression by GCV and obtained an r2 of 70%. To check the sensitivity of the result, I ran a permutation test: I permuted the response vector 1000 times and plotted the distribution of the resulting r2 values. Now the highest permuted r2 is 98%, and several others exceed 70%. Is this expected from this type of test?

I was under the impression that the r2 of the real data set would always be the maximum, and that permutation could not change this, i.e. that a permuted r2 would always be lower than the real one!

Thanks a lot,
Alex
Meyners, Michael, LAUSANNE, AppliedMathematics
2009-Aug-14 12:41 UTC
[R] Permutation test and R2 problem
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Alex Roy
> Sent: Friday, 14 August 2009 12:05
> To: r-help at r-project.org
> Subject: [R] Permutation test and R2 problem
>
> Hi,
>
> I have optimized the shrinkage parameter (GCV)for ridge and
> got my r2 value is 70% . to check the sensitivity of the
> result, I did permutation test. I permuted the response
> vector and run for 1000 times and draw a distribution. But
> now, I get r2 values highest 98% and some of them more than
> 70 %. Is it expected from such type of test?

Depends on what exactly you are doing and on your data, but surely this is not "unexpected" (even less so given the information we have).

> *I was under impression that, r2 with real data set will
> always maximum! And permutation will not be effected i.e.
> permuted r2 will always less than real one! *

Why would that be? And even more, why would you do a permutation test at all if you knew in advance that all permuted values are below your observed one? You optimize the shrinkage parameter for your data, not your data for your shrinkage parameter. In the latter case you would have been right. Given any fixed shrinkage parameter, you can always find "some data" (e.g. the predicted values) that fit better than the original. I guess that in most non-artificial cases with a reasonable amount of data, there are at least some permutations that give a higher r2.

Not sure what kind of sensitivity you want to check, but you would probably have to re-optimize the shrinkage parameter for each permutation as well. And of course you would have to make sure that your permutations correspond to the original constraints. But this opens a wide field; refer to any textbook on this matter for further detail (e.g. Edgington & Onghena, Randomization tests).

HTH, Michael
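
The suggestion above (re-optimize the shrinkage parameter for every permutation, then compare the observed r2 with the whole permutation distribution) can be sketched roughly as follows. This is only an illustration in plain Python on simulated data, not the code used in this thread: the data, the lambda grid, the GCV form n*RSS/(n - df)^2, and all function names (`prepare`, `ridge_r2`, `permutation_test`) are assumptions made for the example.

```python
import random

def center(v):
    """Subtract the mean so that no explicit intercept column is needed."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def invert(a):
    """Gauss-Jordan inverse with partial pivoting; fine for a small p x p matrix."""
    n = len(a)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def prepare(X, lambdas):
    """For each lambda, precompute (X'X + lambda*I)^-1 and the effective df."""
    p = len(X[0])
    XtX = matmul(transpose(X), X)
    out = []
    for lam in lambdas:
        A = [[XtX[i][j] + (lam if i == j else 0.0) for j in range(p)]
             for i in range(p)]
        Ainv = invert(A)
        H = matmul(Ainv, XtX)
        df = sum(H[i][i] for i in range(p)) + 1.0  # +1 for the intercept
        out.append((lam, Ainv, df))
    return out

def ridge_r2(X, y, prepared):
    """Pick lambda by GCV for this particular y; return (r2, chosen lambda)."""
    n, p = len(X), len(X[0])
    Xty = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    tss = sum(v * v for v in y)  # y is centered
    best = None
    for lam, Ainv, df in prepared:
        beta = [sum(Ainv[j][k] * Xty[k] for k in range(p)) for j in range(p)]
        rss = sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2
                  for i in range(n))
        gcv = n * rss / (n - df) ** 2
        if best is None or gcv < best[0]:
            best = (gcv, 1.0 - rss / tss, lam)
    return best[1], best[2]

def permutation_test(X, y, lambdas, n_perm=200, seed=1):
    rng = random.Random(seed)
    prepared = prepare(X, lambdas)  # X never changes, so this is done once
    r2_obs, lam_obs = ridge_r2(X, y, prepared)
    perm = y[:]
    r2_perm = []
    for _ in range(n_perm):
        rng.shuffle(perm)                                # permute the response only
        r2_perm.append(ridge_r2(X, perm, prepared)[0])   # lambda re-chosen each time
    # Share of arrangements at least as good as the observed one;
    # the "+1" counts the observed arrangement itself.
    p_value = (1 + sum(r >= r2_obs for r in r2_perm)) / (n_perm + 1)
    return r2_obs, lam_obs, r2_perm, p_value

# Simulated example data (assumption): 40 cases, 5 predictors, one real effect.
rng = random.Random(42)
n, p = 40, 5
cols = [center([rng.gauss(0, 1) for _ in range(n)]) for _ in range(p)]
X = transpose(cols)
y = center([2.0 * X[i][0] + rng.gauss(0, 1) for i in range(n)])
lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]

r2_obs, lam_obs, r2_perm, p_value = permutation_test(X, y, lambdas)
print(f"observed r2 = {r2_obs:.3f} (lambda = {lam_obs})")
print(f"largest permuted r2 = {max(r2_perm):.3f}, p-value = {p_value:.4f}")
```

Because only the response is permuted, X stays the same, so the inverse and the effective degrees of freedom for each lambda need to be computed only once. The quantity of interest is the proportion of permuted r2 values at or above the observed one, not whether any single permutation happens to exceed it.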