Hi,

I would like to obtain correlation parameters (e.g., coefficients, p-values) for multiple samples relative to a reference. My data are in a table with the reference in the second row (the first row holds the headers), followed by one sample per row. So far I load the data, extract the reference sample, and use "apply" and "lm" to do the regression:

x <- read.table("file.txt", header=TRUE)
x <- t(x)
ref <- x[,1]
test <- apply(x, 2, function(z) lm(z ~ ref))

My problem is that while I can see the results in test and even obtain the F-statistic with summary for individual rows (i.e., summary(test[[1]])), I can't seem to apply summary to all rows at once, or extract only the coefficients from test, so that I can save them to a new file. Ideally, I would like to save a new table to a file with the correlation coefficients and the F-statistic for each of my samples.

Any suggestions would be extremely useful.

Thanks.

--
View this message in context: http://www.nabble.com/Obtaining-correlation-parameters-for-multiple-rows-tp16851980p16851980.html
Sent from the R help mailing list archive at Nabble.com.
Jorge Ivan Velez
2008-Apr-25 04:31 UTC
[R] Obtaining correlation parameters for multiple rows
Hi,

I'm sure it could be better, but try this:

# F statistic based on lm
FSTAT=function(y,x) summary(lm(y~x))$fstatistic[1]

# Spearman correlation and its p-value
CORR=function(y,x){
  tc=cor.test(x,y,method="spearman",alternative="two.sided")
  temp=matrix(c(tc$estimate,tc$p.value),ncol=2)
  colnames(temp)=c('rho','pvalue')
  temp
}

# 1000 variables and 100 samples
set.seed(124)
X=matrix(rnorm(1000*100),ncol=100)

# Correlation coefficient, p-value and F statistics
corr=t(apply(X[-1,],1,CORR,x=X[1,]))  # Your reference is X[1,]
fs=apply(X[-1,],1,FSTAT,x=X[1,])      # Your reference is X[1,]

# Report
temp=data.frame(fstats=fs,rho=corr[,1],pvalue=corr[,2])
rownames(temp)=paste("X",2:nrow(X),sep="")
dim(temp)
[1] 999   3
temp[1:10,]
         fstats         rho     pvalue
X2  1.421307790 -0.05038104 0.61807912
X3  0.051423768 -0.04614461 0.64795111
X4  0.128000634  0.01795380 0.85902211
X5  0.990235820 -0.06540654 0.51730942
X6  5.569006085  0.24232823 0.01532172
X7  0.001862766 -0.01436544 0.88703532
X8  1.025363077 -0.10628263 0.29206908
X9  0.679794149  0.06509451 0.51927479
X10 1.296034903  0.09492949 0.34686211
X11 0.126636867  0.05137714 0.61110106

HTH,
Jorge

On Thu, Apr 24, 2008 at 3:40 PM, jpnitya <joao@genetics.med.harvard.edu> wrote:
>
> Hi,
>
> I would like to obtain correlation parameters (e.g., coefficients, p-value)
> for multiple samples in regard to a reference. I have my data in a table
> with the reference as the second row (first row are headers) and then each
> sample in a row. What I do so far is load up the data, get the reference
> sample and use "apply" and "lm" to do the regression:
>
> x <- read.table("file.txt",header=TRUE)
> x <- t(x)
> ref <- x[,1]
> test <- apply(x,2,function(z)lm(z~ref))
>
> My problem is that while I can see the results in test and even obtain the
> F-statistic with summary for individual rows (i.e., summary(test[[1]])), I
> can't seem to be able to apply the summary to all rows or get only the
> coefficients from test so I can save them to a new file. Ideally, I would
> like to save a new table to a file with the correlation coefficients and the
> F-statistic for each of my samples.
>
> Any suggestions would be extremely useful.
>
> Thanks.
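
For the part of the original question about applying summary to every fit in test and saving the result, a minimal sketch might look like the following. It is not taken from either message in the thread. It assumes the layout described in the question (reference first, then one sample per row), replaces file.txt with simulated data so that it runs on its own, and drops the reference column before fitting so the reference is not regressed on itself. The names stats and out_file, and the output file name lm_summary.txt, are hypothetical.

```r
# Simulated stand-in for file.txt (assumption): row 1 is the reference,
# rows 2..6 are samples, 20 measurements each.
set.seed(1)
x <- matrix(rnorm(6 * 20), nrow = 6,
            dimnames = list(c("ref", paste0("sample", 1:5)), NULL))

x   <- t(x)        # samples become columns, as in the original post
ref <- x[, 1]      # the reference is now the first column

# One lm fit per sample; the reference column itself is left out.
# Because lm() returns a list, apply() gives back a list of fits.
test <- apply(x[, -1], 2, function(z) lm(z ~ ref))

# Walk over the list of fits and pull the same quantities out of each.
stats <- t(sapply(test, function(fit) {
  s <- summary(fit)
  f <- s$fstatistic                       # named: value, numdf, dendf
  c(intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    r.squared = s$r.squared,
    F         = unname(f["value"]),
    # p-value of the overall F test, computed from the F distribution
    p.value   = unname(pf(f["value"], f["numdf"], f["dendf"],
                          lower.tail = FALSE)))
}))

# Save one row per sample; col.names = NA writes a blank header cell
# above the row names so the columns line up when read back.
out_file <- file.path(tempdir(), "lm_summary.txt")   # hypothetical location
write.table(stats, file = out_file, sep = "\t", quote = FALSE, col.names = NA)
```

With a single predictor and an intercept, the Pearson correlation for each sample can be recovered as sign(slope) * sqrt(r.squared), so the table holds the same information as calling cor(z, ref) per sample. The file can be read back with read.table(out_file, header = TRUE, sep = "\t", row.names = 1).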