Hi friends,
If someone can find some time to go through my problem, I would be really
grateful.
I have a dataset (dataset1) as shown below:
  recmeanC1 recmeanC2 recmeanC3 recmeanC4 i1 i2 i3 i4 i5 i6 i7 i8 i9 i10 i11
1        NA         1      1.00  1.800000 NA  1 NA  1  1 NA  2  2  2  NA   2
2         2         2      1.00  1.833333  2  2 NA NA NA  1  1  3  2   2   2
3         2         2      2.00  2.000000  2  2 NA NA NA  2  2  2  2   2   2
4         2         2      2.00  1.333333  2  2 NA NA NA  2  1  1  2   2   1
5         2        NA      1.00  2.000000  2 NA NA NA NA  1  2  3  2   2   2
6         2         2      2.00  2.333333  2  2 NA NA NA  2  1  3  3   3   2
7         1        NA      1.00  2.333333  1 NA NA NA NA  1  2  3  2   3   3
I want the correlation results exactly as SPSS produces them, with the
significance value and the N for each pair.
Here recmeanC1, recmeanC2, recmeanC3 and recmeanC4 are the category means of
the items. Category 1 has only item 1, so its mean is the same as item 1;
category 2 has item 2; category 3 has items 3, 4, 5, 6; and category 4 has
items 7, 8, 9, 10, 11, 12. For all the 7 records fetched, I have prepared the
dataset for the correlation function.
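For illustration, here is a minimal sketch of how one category mean could be derived from its items. It assumes that missing items are simply ignored (na.rm = TRUE), which is consistent with the recmeanC3 values shown above but is an assumption about how the means were actually prepared. The data frame name "items" is hypothetical; the values are items 3 to 6 copied from dataset1.

```r
# Items 3-6 for the 7 records, copied from dataset1 above
items <- data.frame(
  i3 = rep(NA_real_, 7),
  i4 = c( 1, NA, NA, NA, NA, NA, NA),
  i5 = c( 1, NA, NA, NA, NA, NA, NA),
  i6 = c(NA,  1,  2,  2,  1,  2,  1)
)

# Category 3 mean per record, ignoring missing items (assumption)
recmeanC3 <- rowMeans(items, na.rm = TRUE)
print(recmeanC3)
```

A record whose items are all NA would get NaN from rowMeans(), so that case may need separate handling.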
My correlation function looks like this:
# Function for correlation
getCorrelationVal <- function(corr_dataset)
{
  # Correlation of items and categories
  if (is.null(corr_dataset))
  {
    print("Correlation cannot be performed on this null dataset.")
  }
  else
  {
    BPcor <- cor(x = corr_dataset, y = NULL, use = "complete.obs",
                 method = "pearson")
    return(list(matrix = BPcor))
  }
}
Here corr_dataset is the dataset I pass, as shown above. Now, how do I find
the significance level for each correlation? The valid N, however, I can
find.
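As a sketch of the valid-N part only: assuming N is wanted per pair (pairwise deletion, not the same N for every pair as with use = "complete.obs"), it can be counted for all pairs at once. The helper name pairwiseN and the data frame "small" are hypothetical, made up for this example.

```r
# Hypothetical helper: number of rows where both columns are non-missing,
# for every pair of columns
pairwiseN <- function(d) {
  ok <- (!is.na(as.matrix(d))) + 0   # 1 where a value is present, 0 where NA
  crossprod(ok)                      # t(ok) %*% ok: counts of jointly present rows
}

small <- data.frame(a = c(1, 2, NA, 4),
                    b = c(NA, 2, 3, 4),
                    c = c(1, 2, 3, 4))
n <- pairwiseN(small)
print(n)
```

The diagonal holds the number of non-missing values in each single column.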
This generates correlation values like this (I have not shown the whole
matrix):
          recmeanC1   recmeanC2   recmeanC3 recmeanC4        i1          i2
recmeanC1 1.0000000  0.77020798  0.72965359 0.6352532 1.0000000  0.77020798
recmeanC2 0.7702080  1.00000000  0.99016409 0.3057984 0.7702080  1.00000000
recmeanC3 0.7296536  0.99016409  1.00000000 0.3138384 0.7296536  0.99016409
recmeanC4 0.6352532  0.30579837  0.31383836 1.0000000 0.6352532  0.30579837
i1        1.0000000  0.77020798  0.72965359 0.6352532 1.0000000  0.77020798
i2        0.7702080  1.00000000  0.99016409 0.3057984 0.7702080  1.00000000
i3        0.7702080  1.00000000  0.99016409 0.3057984 0.7702080  1.00000000
i4        0.7702080  1.00000000  0.99016409 0.3057984 0.7702080  1.00000000
i5        0.7702080  1.00000000  0.99016409 0.3057984 0.7702080  1.00000000
i6        0.4970501  0.82035423  0.89229418 0.2960185 0.4970501  0.82035423
i7        0.3614032  0.69588900  0.76912242 0.2981885 0.3614032  0.69588900
i8        0.1756620  0.13529629  0.11867817 0.3254706 0.1756620  0.13529629
i9        0.5606119  0.43178777  0.37186590 0.3895178 0.5606119  0.43178777
i10       0.5380528  0.58589367  0.60919478 0.6058848 0.5380528  0.58589367
i11       0.4413674 -0.06798894 -0.07156563 0.7973308 0.4413674 -0.06798894
The problem is that when I calculate the correlations I just pass the above
dataset, without taking into consideration the significance of each pair in
the correlation values shown above. But how do I get the significance of,
say, recmeanC1 & recmeanC2, or recmeanC1 & i1?
I can add this to my correlation function shown above, but it does not work
as written:
# Find the significance of the two items whose correlation is being computed
sig_value <- cor.test(corr_dataset)
and also return it:
return(list(matrix = BPcor, sig = sig_value))
For example, recmeanC1 and i1 have to be passed here as two separate
arguments (x and y); if I pass the data for recmeanC1 and i1 as a single
data frame, cor.test() does not accept it. Moreover, cor() took care of which
column is crossed with which and produced all the correlations. Do I now have
to manually build every possible pair of column names of my dataset (dataset1
above), extract the data for each pair, pass it to cor.test() and calculate
the significance?
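The manual approach described above (every pair of column names passed to cor.test()) could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name corWithSig is hypothetical, pairwise deletion of missing values is assumed, the p-values are the two-sided ones that cor.test() returns by default, and the built-in mtcars data with two values set to NA stands in for the real dataset.

```r
# Hypothetical helper: r, two-sided p-value and pairwise N for all column pairs
corWithSig <- function(d) {
  d <- as.matrix(d)
  vars <- colnames(d)
  blank <- matrix(NA_real_, length(vars), length(vars),
                  dimnames = list(vars, vars))
  r <- blank; p <- blank; n <- blank
  diag(r) <- 1
  diag(n) <- colSums(!is.na(d))
  pairs <- combn(vars, 2)              # each column is one pair of names
  for (j in seq_len(ncol(pairs))) {
    a <- pairs[1, j]; b <- pairs[2, j]
    ok <- complete.cases(d[, a], d[, b])
    n[a, b] <- n[b, a] <- sum(ok)
    # cor.test() needs at least 3 complete pairs; leave NA otherwise
    if (sum(ok) >= 3) {
      ct <- cor.test(d[ok, a], d[ok, b], method = "pearson")
      r[a, b] <- r[b, a] <- ct$estimate
      p[a, b] <- p[b, a] <- ct$p.value
    }
  }
  list(r = r, p = p, n = n)
}

# Stand-in data (not the real dataset): three mtcars columns with some NAs
demo <- mtcars[, c("mpg", "wt", "hp")]
demo$wt[c(2, 5)] <- NA
res <- corWithSig(demo)
print(res$n)
print(round(res$p, 4))
```

The loop runs cor.test() once per pair, so for k columns it makes k(k-1)/2 calls; a column with no variation among its complete pairs gives NA for r and p, along with a warning.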
Isn't there an easier way to do this, with a minimum number of lines of
code? I am dealing with huge datasets.
--
Thanks In Advance :)
Moumita