Hi,
I would like to obtain correlation parameters (e.g., coefficients, p-value)
for multiple samples relative to a reference. I have my data in a table
with the reference as the second row (the first row holds the headers) and
then one sample per row. What I do so far is load the data, get the reference
sample and use "apply" and "lm" to do the regression:
x <- read.table("file.txt",header=TRUE)
x <- t(x)
ref <- x[,1]
test <- apply(x,2,function(z)lm(z~ref))
My problem is that while I can see the results in test and even obtain the
F-statistic with summary for individual rows (i.e., summary(test[[1]])), I
can't seem to apply summary to all rows or extract only the
coefficients from test so I can save them to a new file. Ideally, I would
like to save a new table to a file with the correlation coefficients and the
F-statistic for each of my samples.
Any suggestions would be extremely useful.
Thanks.
--
View this message in context:
http://www.nabble.com/Obtaining-correlation-parameters-for-multiple-rows-tp16851980p16851980.html
Sent from the R help mailing list archive at Nabble.com.
Jorge Ivan Velez
2008-Apr-25 04:31 UTC
[R] Obtaining correlation parameters for multiple rows
Hi,
I'm sure it could be better but try this:
# F statistic from a simple linear regression of y on x
FSTAT=function(y,x) summary(lm(y~x))$fstatistic[1]
# Spearman correlation (rho) and its two-sided p-value
CORR=function(y,x){
  tc=cor.test(x,y,method="spearman",alternative="two.sided")
  temp=matrix(c(tc$estimate,tc$p.value),ncol=2)
  colnames(temp)=c('rho','pvalue')
  temp
}
# Simulated data: 1000 rows (the reference plus 999 samples), 100 observations each
set.seed(124)
X=matrix(rnorm(1000*100),ncol=100)
# Correlation coefficient, p-value and F statistic for each sample row
corr=t(apply(X[-1,],1,CORR,x=X[1,])) # Your reference is X[1,]
fs=apply(X[-1,],1,FSTAT,x=X[1,]) # Your reference is X[1,]
# Report
temp=data.frame(fstats=fs,rho=corr[,1],pvalue=corr[,2])
rownames(temp)=paste("X",2:nrow(X),sep="")
dim(temp)
[1] 999   3
temp[1:10,]
         fstats         rho     pvalue
X2  1.421307790 -0.05038104 0.61807912
X3  0.051423768 -0.04614461 0.64795111
X4  0.128000634  0.01795380 0.85902211
X5  0.990235820 -0.06540654 0.51730942
X6  5.569006085  0.24232823 0.01532172
X7  0.001862766 -0.01436544 0.88703532
X8  1.025363077 -0.10628263 0.29206908
X9  0.679794149  0.06509451 0.51927479
X10 1.296034903  0.09492949 0.34686211
X11 0.126636867  0.05137714 0.61110106
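The original question also asks how to pull the numbers out of a list of lm fits and save them to a file. A minimal sketch of one way that might be done is below. The small simulated table, the row names, the helper name get_stats and the use of tempfile() as the output path are all assumptions made for illustration, not part of the original data.

```r
# Simulated stand-in for the poster's table: row 1 is the reference,
# rows 2..6 are samples (names and sizes are made up for this sketch).
set.seed(1)
x <- matrix(rnorm(6 * 20), nrow = 6,
            dimnames = list(c("ref", paste0("s", 1:5)), NULL))
ref <- x[1, ]

# One lm fit per sample row, kept in a named list
fits <- lapply(rownames(x)[-1], function(nm) lm(x[nm, ] ~ ref))
names(fits) <- rownames(x)[-1]

# Hypothetical helper: coefficients, R-squared, F statistic and its p-value
get_stats <- function(fit) {
  s <- summary(fit)
  f <- s$fstatistic                      # named: value, numdf, dendf
  c(intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    r.squared = s$r.squared,
    fstat     = unname(f["value"]),
    pvalue    = unname(pf(f["value"], f["numdf"], f["dendf"],
                          lower.tail = FALSE)))
}

# sapply returns one column per fit; transpose to get one row per sample
res <- t(sapply(fits, get_stats))

# Save as a tab-delimited table; tempfile() stands in for a real path
out <- tempfile(fileext = ".txt")
write.table(res, file = out, sep = "\t", quote = FALSE, col.names = NA)
```

Using col.names = NA makes write.table emit an empty header cell above the row names, so the file reads back cleanly with read.table(out, header = TRUE, sep = "\t", row.names = 1).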
HTH,
Jorge
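A possible shortcut, offered as a sketch rather than as part of the reply above: for a regression with a single predictor, the F statistic is a function of the Pearson correlation r, namely F = r^2 (n - 2) / (1 - r^2) on 1 and n - 2 degrees of freedom. That means the lm-based F can be computed for all rows at once without fitting any models. Note that this relation holds for Pearson's r, not for the Spearman rho reported above. The variable names r, fstat, pval and f_lm are assumptions introduced here.

```r
# Same simulated layout as in the reply: row 1 is the reference
set.seed(124)
X <- matrix(rnorm(1000 * 100), ncol = 100)
n <- ncol(X)                                 # observations per row

# Pearson r of every sample row against the reference row, in one call
r <- as.vector(cor(t(X[-1, ]), X[1, ]))

# F statistic of the simple regression and its p-value
fstat <- r^2 * (n - 2) / (1 - r^2)
pval  <- pf(fstat, 1, n - 2, lower.tail = FALSE)

# Spot check against lm for the first sample row
f_lm <- summary(lm(X[2, ] ~ X[1, ]))$fstatistic[1]
all.equal(fstat[1], unname(f_lm))
```

On large tables this avoids one lm call per row, at the cost of giving up the other quantities that summary(lm(...)) provides.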
In reply to jpnitya <joao@genetics.med.harvard.edu>, Thu, Apr 24, 2008 at 3:40 PM.