Dear R community,
I am using 6 variables to test for an effect (by linear regression).
These 6 variables are strongly correlated with each other and I would like
to find out the number of independent tests that I perform in this
calculation. For this I calculated a matrix of pairwise adjusted R^2 values
between the variables (see below). But finding the rank of that matrix in R
does not seem to be the right approach... What else could I do to find the
effective number of independent tests?
Any suggestion would be very welcome!
Thanking you and with my best regards, Georg.
> r <- matrix(NA, 6, 6, dimnames = list(names(d), names(d)))
> for (a in 1:6) {
+   for (b in 1:6) {
+     # na.action belongs in the lm() call, not in summary()
+     r[a, b] <- summary(lm(unlist(d[a]) ~ unlist(d[b]),
+                           na.action = na.exclude))$adj.r.squared
+   }
+ }
> r
SR SU ST DR DU DT
SR 1.0000000 0.9636642 0.9554952 0.2975892 0.3211303 0.3314694
SU 0.9636642 1.0000000 0.9101678 0.3324979 0.3331389 0.3323826
ST 0.9554952 0.9101678 1.0000000 0.2756876 0.3031676 0.3501157
DR 0.2975892 0.3324979 0.2756876 1.0000000 0.9981733 0.9674843
DU 0.3211303 0.3331389 0.3031676 0.9981733 1.0000000 0.9977780
DT 0.3314694 0.3323826 0.3501157 0.9674843 0.9977780 1.0000000
*************************
Georg Ehret
Johns Hopkins University
Baltimore, US
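Incidentally, the pairwise table above can be produced in one call with cor(), and the eigenvalues of the correlation matrix feed one published heuristic for the effective number of tests (Nyholt 2004; that formula is not from this thread). A sketch, assuming d is a data frame whose six numeric columns are SR, SU, ST, DR, DU and DT:

```
# Pairwise Pearson correlations, handling NAs pair by pair
R <- cor(d, use = "pairwise.complete.obs")

# Note: the table in the post holds adjusted R^2 values from lm(),
# which are close to, but not the same as, squared correlations.
R2 <- R^2

# Nyholt (2004) heuristic: effective number of independent tests
# estimated from the spread of the eigenvalues of the correlation matrix
lambda <- eigen(R)$values
M <- length(lambda)
Meff <- 1 + (M - 1) * (1 - var(lambda) / M)
Meff
```

With eigenvalues as unequal as this correlation matrix suggests, Meff will come out well below 6.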
It looks like SR, SU and ST are strongly correlated with each other, as are DR, DU and DT. You could try PCA on your 6 variables, keep the first two principal components as your new variables, and use those in the regression.

--- On Fri, 11/7/08, Georg Ehret <georgehret at gmail.com> wrote:
> From: Georg Ehret <georgehret at gmail.com>
> Subject: [R] number of effective tests
> To: "r-help" <r-help at stat.math.ethz.ch>
> Received: Friday, 11 July, 2008, 11:46 AM
> [quoted message trimmed]
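A sketch of that suggestion, assuming the six variables sit in a data frame d and the outcome is a vector y (both names are placeholders):

```
# Standardize and run PCA on the six predictors
pc <- prcomp(d, center = TRUE, scale. = TRUE)

# Keep the first two principal components as new predictors
scores <- as.data.frame(pc$x[, 1:2])

# Regress the outcome on the two components
fit <- lm(y ~ PC1 + PC2, data = cbind(scores, y = y))
summary(fit)

# Check how much variance the first two components capture
summary(pc)
```

If the first two components do not capture most of the variance, keep more of them.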
Hi,

what do you mean by the effective number of tests? How you approach it also depends on the research tradition in your field. Some fields just include the variables in alternative regressions and then include them jointly.

However, since your variables are so highly correlated (i.e. they convey almost the same information), you almost certainly have to reduce the dimensionality of your data if you want to include them "jointly" (basically you make 2 variables out of your 6, or whatever number). PCA, as Moshe suggested, is a good way. It is typically used when your variables are measured without error (that is, if each of them is a hard-fact number). If the variables are measured with error (e.g. subject responses on a survey), you would typically perform factor analysis instead.

You may want to standardize each of the six variables before performing PCA or factor analysis so that each of the six has the same scale. Otherwise the variables with the greater variance will be much more influential than the others (that is not the best description of it, but I hope it makes the point).

Look at prcomp() or princomp() for PCA and at factanal() for factor analysis (there are packages available for factor analysis too, I think).

Best,
Daniel
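A minimal sketch of the factor-analysis route, again assuming the six variables are the columns of a data frame d:

```
# Maximum-likelihood factor analysis with 2 factors; factanal()
# works on the correlation matrix, so the variables are
# effectively standardized internally
fa <- factanal(d, factors = 2, scores = "regression")
print(fa)

# The factor scores can then replace the six original
# variables as predictors in a regression
head(fa$scores)
```

The number of factors (2 here) is an assumption based on the two blocks visible in the correlation matrix; the test reported by factanal() can help check it.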
On Thu, 10 Jul 2008, Georg Ehret wrote:

> Dear R community,
> I am using 6 variables to test for an effect (by linear regression).
> These 6 variables are strongly correlated among each other and I would like
> to find out the number of independent tests that I perform in this
> calculation.

For what purpose? If you are trying to perform a multiple comparisons adjustment, you might do better to skip this step and go straight to a resampling or permutational procedure. There is an enormous literature on this subject. One example:

@book{West:Youn:1993,
  author    = {Westfall, Peter H. and Young, S. Stanley},
  title     = {Resampling-based multiple testing: {E}xamples and methods for $p$-value adjustment},
  year      = {1993},
  pages     = {340},
  ISBN      = {0471557617},
  publisher = {John Wiley \& Sons},
  keywords  = {Simultaneous inference; Bootstrap}
}

HTH,
Chuck
Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu             UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
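As an illustration of the resampling idea, a single-step max-|t| permutation adjustment might look like the sketch below (y and d are placeholders for the outcome vector and the data frame of six predictors; this is a sketch of the general approach, not code from the book cited above):

```
set.seed(1)
B <- 1000

# Observed |t| statistics for the six single-predictor regressions
obs_t <- sapply(d, function(x) abs(summary(lm(y ~ x))$coefficients[2, 3]))

# Null distribution of the maximum |t| across the six tests,
# obtained by permuting the outcome (breaking any real association
# while preserving the correlation structure among the predictors)
max_t <- replicate(B, {
  yp <- sample(y)
  max(sapply(d, function(x) abs(summary(lm(yp ~ x))$coefficients[2, 3])))
})

# Adjusted p-values: the fraction of permutations in which the
# maximum |t| exceeds each observed |t|
p_adj <- sapply(obs_t, function(t0) mean(max_t >= t0))
p_adj
```

Because the permutations keep the predictors' correlations intact, this adjustment is automatically less conservative than a Bonferroni correction for 6 tests, which is exactly what an "effective number of tests" correction tries to approximate.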