汤靖
2014-Jan-10 15:20 UTC
[R] Selecting individuals to maximize the correlation of two variables
Hi,

Maybe this is not directly related to R, but since many people here are statistical experts, I am posting it here for help.

I have two variables (say x and y) of length n, and cor(x,y) is close to 0. I need to find the subset of {1, ..., n} for which the correlation between x and y, computed on the subset only, is maximized. A trivial choice is to select only 2 individuals, so that cor(x,y) = 1. As the size of the subset increases, cor(x,y) will go down towards 0, but I assume the best correlation for each subset size need not decrease monotonically.

Any idea of how to find the solution?

Thanks,
Jing
Enrico Schumann
2014-Jan-11 14:38 UTC
[R] Selecting individuals to maximize the correlation of two variables
On Fri, 10 Jan 2014, 汤靖 <mr_tangjing at hotmail.com> writes:

> Hi,
> Maybe it is not directly related to R but since many are statistical experts so I post it here for help:
>
> I have two variables (say x and y) of length n. Now the cor(x,y) is close to 0. I need to find the subset in {1,.. n} so that the correlation between x and y using the subset data is maximized. A trivial choice would be selecting 2 individuals only so that cor(x,y) =1. As the size of the subset increases, cor(x,y) will go down to 0, but I am assuming the best correlation for each size of the subsets would not be monotonically decreasing.
>
> Any idea of how to find the solution?
>
> Thanks,
> Jing

Hi Jing,

in chapter 1 of the NMOF manual I discuss a very similar problem; perhaps it helps you (but it's a draft only...)

http://enricoschumann.net/NMOF.htm#NMOFmanual

--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net
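
For readers who want something concrete to start from, here is a minimal sketch of one heuristic for the fixed-size version of the question: for a given subset size k, start from a random subset and repeatedly try swapping one selected individual for an unselected one, keeping the swap whenever cor(x,y) on the subset improves. It uses base R only and is not taken from the NMOF manual; the function name best_subset_cor and the arguments k and n_iter are assumptions made for this illustration. A plain local search like this can stop at a local optimum, so the result is a good subset, not a guaranteed best one.

```r
# Local search for a size-k subset with high cor(x, y).
# Hypothetical helper for illustration; not part of any package.
best_subset_cor <- function(x, y, k, n_iter = 5000L) {
  n <- length(x)
  stopifnot(length(y) == n, k >= 3L, k < n)

  inside <- sample.int(n, k)                 # random starting subset
  best <- cor(x[inside], y[inside])
  if (is.na(best)) best <- -Inf              # e.g. constant x or y in subset

  for (i in seq_len(n_iter)) {
    outside <- setdiff(seq_len(n), inside)
    cand <- inside
    # swap one randomly chosen member for one randomly chosen non-member
    cand[sample.int(k, 1L)] <- outside[sample.int(length(outside), 1L)]
    val <- cor(x[cand], y[cand])
    if (!is.na(val) && val > best) {         # accept improvements only
      inside <- cand
      best <- val
    }
  }
  list(subset = sort(inside), cor = best)
}

set.seed(1)
x <- rnorm(100)
y <- rnorm(100)
res <- best_subset_cor(x, y, k = 50L)
cor(x, y)      # correlation on the full sample
res$cor        # correlation on the selected subset
```

Running the function over a range of k values gives an approximate picture of how the best attainable correlation changes with subset size. Repeating each run from several random starts, or accepting small deteriorations as threshold-type methods do, would make the search less prone to getting stuck.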