thr3ads.net - R help - [R] Calculating signicance value [Jan 2009]

Moumita Das

2009-Jan-02 14:09 UTC

[R] Calculating signicance value

Hi friends,
I would be really grateful if someone could find the time to go through my
problem.

I have a dataset (dataset1) as shown below:
  recmeanC1 recmeanC2 recmeanC3 recmeanC4 i1 i2 i3 i4 i5 i6 i7 i8 i9 i10 i11
1        NA         1      1.00  1.800000 NA  1 NA  1  1 NA  2  2  2  NA   2
2         2         2      1.00  1.833333  2  2 NA NA NA  1  1  3  2   2   2
3         2         2      2.00  2.000000  2  2 NA NA NA  2  2  2  2   2   2
4         2         2      2.00  1.333333  2  2 NA NA NA  2  1  1  2   2   1
5         2        NA      1.00  2.000000  2 NA NA NA NA  1  2  3  2   2   2
6         2         2      2.00  2.333333  2  2 NA NA NA  2  1  3  3   3   2
7         1        NA      1.00  2.333333  1 NA NA NA NA  1  2  3  2   3   3

I want the correlation results exactly as SPSS produces them, with the
significance value and N-size.
Here recmeanC1, C2, C3 and C4 are the category means of the items. Category 1
has only item 1, so its mean is the same as item 1; category 2 has item 2;
category 3 has items 3, 4, 5 and 6; and category 4 has items 7, 8, 9, 10, 11
and 12. I have prepared the dataset for the correlation function from all 7
records fetched.


My correlation function looks like this:
# Function for correlation
getCorrelationVal <- function(corr_dataset)
{
    # Correlation of items and categories
    if (is.null(corr_dataset))
    {
        print("Correlation cannot be performed on this null dataset.")
    }
    else
    {
        # use = "complete.obs" drops every row that contains an NA
        BPcor <- cor(x = corr_dataset, y = NULL,
                     use = "complete.obs", method = "pearson")
        return(list(matrix = BPcor))
    }
}

Here corr_dataset is the dataset I pass in, as shown above. Now how do I
find the significance level for each correlation? The valid N-size, however,
I can find.
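
As a rough illustration of the valid N-size mentioned here, the sketch below counts, for every pair of columns, the rows in which both values are present. The data frame `toy` and the variable names are made up for the example and are not the dataset from this post.

```r
# Hypothetical toy data; `toy` is NOT the dataset from the post
toy <- data.frame(a = c(1, 2, NA, 4, 5),
                  b = c(2, NA, 3, 5, 7),
                  c = c(5, 3, 2, 2, 1))

# 1/0 matrix marking the non-missing cells (adding 0 turns TRUE/FALSE into 1/0)
present <- (!is.na(as.matrix(toy))) + 0

# crossprod(present) is t(present) %*% present: entry [j, k] counts the rows
# in which both column j and column k are observed (the pairwise valid N)
n_pairwise <- crossprod(present)
print(n_pairwise)

# With use = "complete.obs" every pair shares a single N:
# the number of rows that have no NA at all
n_listwise <- sum(complete.cases(toy))
print(n_listwise)
```

Which of the two counts applies depends on the `use` argument given to cor(): the pairwise matrix goes with "pairwise.complete.obs", the single count with "complete.obs".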


This will generate correlation values like this (I have not shown the
whole dataset):
           recmeanC1   recmeanC2   recmeanC3 recmeanC4        i1          i2
recmeanC1  1.0000000  0.77020798  0.72965359 0.6352532 1.0000000  0.77020798
recmeanC2  0.7702080  1.00000000  0.99016409 0.3057984 0.7702080  1.00000000
recmeanC3  0.7296536  0.99016409  1.00000000 0.3138384 0.7296536  0.99016409
recmeanC4  0.6352532  0.30579837  0.31383836 1.0000000 0.6352532  0.30579837
i1         1.0000000  0.77020798  0.72965359 0.6352532 1.0000000  0.77020798
i2         0.7702080  1.00000000  0.99016409 0.3057984 0.7702080  1.00000000
i3         0.7702080  1.00000000  0.99016409 0.3057984 0.7702080  1.00000000
i4         0.7702080  1.00000000  0.99016409 0.3057984 0.7702080  1.00000000
i5         0.7702080  1.00000000  0.99016409 0.3057984 0.7702080  1.00000000
i6         0.4970501  0.82035423  0.89229418 0.2960185 0.4970501  0.82035423
i7         0.3614032  0.69588900  0.76912242 0.2981885 0.3614032  0.69588900
i8         0.1756620  0.13529629  0.11867817 0.3254706 0.1756620  0.13529629
i9         0.5606119  0.43178777  0.37186590 0.3895178 0.5606119  0.43178777
i10        0.5380528  0.58589367  0.60919478 0.6058848 0.5380528  0.58589367
i11        0.4413674 -0.06798894 -0.07156563 0.7973308 0.4413674 -0.06798894



The problem is that when I calculate the correlations I just pass the above
dataset, without taking into consideration the significance of each pair in
the correlation values shown above. But how do I get the significance of,
say, recmeanC1 & recmeanC2 or recmeanC1 & i1?
I can add this to my correlation function shown above:

#Finding out significance of the two items whose correlations are being found
sig_value<-cor.test(corr_dataset)

and also return it:

return(list(matrix=BPcor,sig=sig_value))

But recmeanC1 and i1, for example, have to be passed to cor.test() as two
separate objects; if I pass the data for (recmeanC1 & i1) as a single data
frame, cor.test() doesn't accept it. Moreover, cor() took care of what is
crossed with what when producing the correlations. Do I now have to manually
build all possible pairs of the column names of my dataset (dataset1, shown
above), extract the data for each pair, and then pass it to cor.test() to
calculate the significance?
Isn't there an easier way to do this, with a minimum number of lines of
code? I am dealing with huge datasets.
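
For reference, the pair-by-pair route described above could look roughly like the sketch below. It assumes Pearson correlations and uses a made-up data frame `toy` in place of corr_dataset; the column names `var1`, `var2`, `r`, `p` and `n` are chosen only for this example.

```r
# Hypothetical toy data standing in for corr_dataset
toy <- data.frame(a = c(1, 2, 3, 4, 5, 6),
                  b = c(2, 1, 4, 3, 6, 5),
                  c = c(6, 5, 3, 4, 2, NA))

# cor.test() takes two numeric vectors, so columns are passed one pair at a time
single <- cor.test(toy$a, toy$b, method = "pearson")

# combn() enumerates every unordered pair of column names (one pair per column)
pairs <- combn(names(toy), 2)

# One output row per pair: r, two-sided p-value, and N (rows complete in both)
result <- do.call(rbind, lapply(seq_len(ncol(pairs)), function(k) {
    x <- toy[[pairs[1, k]]]
    y <- toy[[pairs[2, k]]]
    ct <- cor.test(x, y, method = "pearson")   # incomplete pairs are dropped
    data.frame(var1 = pairs[1, k], var2 = pairs[2, k],
               r = unname(ct$estimate), p = ct$p.value,
               n = sum(complete.cases(x, y)))
}))
print(result)
```

This calls cor.test() once per pair, so the number of calls grows with the square of the number of columns; for wide datasets a matrix-based calculation is likely to be faster.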



-- 
Thanks In Advance :)
Moumita

Ben Bolker

2009-Jan-03 17:49 UTC

Moumita Das <das.moumita.online <at> gmail.com> writes:

 [snip snip snip]
> But how do i get significance of
> e say:--
> recmeanC1&recmeanC2 or say recmeanC1 & i1.
> I can add this in my corr function shown above but:----
> 
> #Finding out significance of the two items whose correlations are being
> found
> sig_value<-cor.test(corr_dataset)
> and also return that :-
> return(list(matrix=BPcor,sig=sig_value))
> 
>  For example recmeanC1 and i1 has to be passed here..as 2 separate
> dataframes,shown below if i pass the dataset for (recmeanC1 & i1 ) as as
> single datframe,cor.test() function doesn't accept it.Moreover cor()
> function took care of what will be crossed with what and the correlation
> produced.Now do i have to manually get possible pairs of the column names of
> my dataset(shown above dataset 1),and also the data and then pass to
> cor.test and calculate the significance.
> Isn't there any easier way to do this,with minimum number of lines of
> code.Because I am dealing with huge datasets.
  Take a look at stats:::cor.test.default .  It's pretty
long and complicated but most of the complication is for
dealing with different correlations (i.e., other than Pearson),
and the key lines for your purposes are:

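r <- cor(x, y)
df <- n - 2    # n = number of complete (x, y) pairs
ESTIMATE <- c(cor = r)
PARAMETER <- c(df = df)
STATISTIC <- c(t = sqrt(df) * r/sqrt(1 - r^2))
# pt(STATISTIC, df) alone is the lower-tail (one-sided) probability;
# the two-sided p-value is:
p <- 2 * pt(-abs(STATISTIC), df)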

  You can incorporate this in your function.
(I'm assuming you're not treating these computed values
as actual probabilities of observing the data given
the null hypothesis, since there is a huge multiple testing
issue ...)
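
A minimal sketch of how those key lines might be applied to a whole correlation matrix at once, assuming Pearson correlations with pairwise deletion of missing values. The function name `corWithP`, the data frame `toy`, and the choice of Holm adjustment are assumptions made for this example, not something taken from cor.test itself.

```r
# Hypothetical helper: returns matrices of r, pairwise N, and two-sided p-values
corWithP <- function(d) {
    m <- as.matrix(d)
    r <- cor(m, use = "pairwise.complete.obs", method = "pearson")
    n <- crossprod((!is.na(m)) + 0)      # rows observed in both columns
    df <- n - 2
    df[df < 1] <- NA                     # too few pairs for a t test
    tstat <- sqrt(df) * r / sqrt(1 - r^2)
    p <- 2 * pt(-abs(tstat), df)         # two-sided p-value
    diag(p) <- NA                        # a variable against itself is not a test
    list(r = r, n = n, p = p)
}

# Hypothetical toy data, not the dataset from the original post
toy <- data.frame(a = c(1, 2, 3, 4, 5, 6),
                  b = c(2, 1, 4, 3, 6, 5),
                  c = c(6, 5, 3, 4, 2, NA))
res <- corWithP(toy)
print(res$p)

# One possible response to the multiple testing issue: adjust the p-values
# of the unique pairs (upper triangle) with p.adjust()
p_raw <- res$p[upper.tri(res$p)]
p_holm <- p.adjust(p_raw, method = "holm")
print(p_holm)
```

The whole matrix is computed with a handful of vectorised operations rather than one cor.test() call per pair, which is the main reason to prefer this form on large datasets.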

 good luck
   Ben Bolker
