thr3ads.net - R help - [R] glm analysis repeated for 900 variables [Sep 2009]

Terwisscha van Scheltinga, A.

2009-Sep-22 16:14 UTC

[R] glm analysis repeated for 900 variables

Dear R users,

Could you help me with the following problem?
I want to repeat a glm analysis with 2 independent variables for all 900
variables (snps) in my data set. So, I want to check whether snp1 has a
different effect on my outcome variable in patients and
controls (phenotype), and repeat that for snp2 to snp900.
Is there an easy way to get a summary of the data, e.g. a list of P
values of all 900 variables?

I tried something with a loop:

for (i in 1:length(data)) {
  print(summary(glm(outcome ~ data[[i]] * phenotype, data = data)))
}
# This works, but gives 900 written summaries

for (i in 1:length(data)) {
  coef(summary(glm(outcome ~ data[[i]] * phenotype, data = data)))
}
# changing print to coef gives no output

for (i in 1:length(data)) {
  glm.data <- glm(outcome ~ data[[i]] * phenotype, data = data)
}
summary(glm.data)
# gives only output of the last variable

for (i in 1:length(data)) {
  glm.data[[i]] <- glm(outcome ~ data[[i]] * phenotype, data = data)
}
summary(glm.data[[i]])
# gives only output of the last variable

Or should I use tapply or something like that? In what way?
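
A note on the attempts above: inside a for loop R does not auto-print the value of an expression, so coef(...) on its own shows nothing, and summary(glm.data[[i]]) after the loop only sees the final value of i. tapply splits a single vector by groups, so looping (or lapply/sapply) over column names is probably closer to what is needed. A minimal sketch, assuming a toy data set; the names dat, snp1 to snp3 and coefs are made up for illustration, and the default gaussian family is assumed because the thread does not state one:

```r
set.seed(1)
# Hypothetical toy data: 3 SNP columns plus phenotype and outcome
dat <- data.frame(snp1 = rbinom(50, 2, 0.3),
                  snp2 = rbinom(50, 2, 0.4),
                  snp3 = rbinom(50, 2, 0.5),
                  phenotype = rbinom(50, 1, 0.5),
                  outcome = rnorm(50))

# Loop over the SNP columns only, not over phenotype and outcome
snps <- setdiff(names(dat), c("phenotype", "outcome"))

# Create the list before the loop, then store each coefficient table in it
coefs <- vector("list", length(snps))
names(coefs) <- snps
for (s in snps) {
  f <- as.formula(paste("outcome ~", s, "* phenotype"))
  coefs[[s]] <- coef(summary(glm(f, data = dat)))
}

coefs[["snp2"]]  # coefficient table for one SNP
```

Building the formula from the column name keeps the SNP name in the coefficient labels (e.g. snp2:phenotype) instead of data[[i]].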

Thanks!

Afke Terwisscha
email: aterwiss@umcutrecht.nl

------------------------------------------------------------------------------

The information contained in this message may be confidential and is
intended solely for the addressee. If you have received this message in
error, you are requested not to use its contents and to inform the sender
immediately by returning the message. The University Medical Center Utrecht
is a legal entity under public law within the meaning of the W.H.W.
(Wet Hoger Onderwijs en Wetenschappelijk Onderzoek, the Dutch Higher
Education and Research Act) and is registered with the Chamber of Commerce
for Midden-Nederland under no. 30244197.

Please consider the environment before printing this e-mail.

------------------------------------------------------------------------------

Christian Schulz

2009-Sep-23 11:26 UTC

Hi,


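# Simulated data: 900 SNP columns plus phenotype and outcome
nvars <- 902
data <- as.data.frame(matrix(runif(100 * nvars), ncol = nvars))
colnames(data)[901] <- c('phenotype')
colnames(data)[902] <- c('outcome')

### catch all aic values ###
# A data frame keeps the AIC numeric; a matrix holding both the names
# and the AIC values would coerce everything to character
nsnp <- length(data) - 2
res <- data.frame(variable = names(data)[1:nsnp], aic = NA_real_)
for (i in 1:nsnp) {
  res$aic[i] <- glm(outcome ~ data[, i] * phenotype, data = data)$aic
}
res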
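
The loop above collects AIC values, while the original question asked for P values. A minimal sketch of one way to get them, assuming the quantity of interest is the P value of the snp:phenotype interaction term and that the default gaussian family applies (the thread does not say which family is used). The helper interaction_p and the names dat and snp1 to snp20 are hypothetical, and only 20 simulated columns are used to keep it quick:

```r
set.seed(42)
nsnp <- 20  # small for illustration; the thread has 900
dat <- as.data.frame(matrix(runif(100 * (nsnp + 2)), ncol = nsnp + 2))
names(dat) <- c(paste0("snp", 1:nsnp), "phenotype", "outcome")

# Hypothetical helper: P value of the snp:phenotype interaction for one SNP
interaction_p <- function(snp, d) {
  fit <- glm(as.formula(paste("outcome ~", snp, "* phenotype")), data = d)
  tab <- coef(summary(fit))
  # Column 4 of the coefficient table holds the P values
  tab[paste0(snp, ":phenotype"), 4]
}

snps <- paste0("snp", 1:nsnp)
pvals <- sapply(snps, interaction_p, d = dat)  # named numeric vector

head(sort(pvals))  # smallest P values first
```

sapply returns one number per SNP, named by column, so the result can be sorted or filtered directly instead of reading through 900 printed summaries.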


Christian Schulz

2009-Sep-23 21:05 UTC

> 
> On 23/09/2009, at 11:26 PM, Christian Schulz wrote:
> 
> > Hi,
> >
> >
> > nvars <- 902
> > data <-  as.data.frame(matrix(runif(100*nvars),ncol=nvars))
> > colnames(data)[901] <- c('phenotype')
> > colnames(data)[902] <- c('outcome')
> 
> 	<snip>
> 
> Just ***WHAT*** do you think the ``c( )'' is doing for you in
> the construction ``c('phenotype')'' etc. ???
> 
> Such complete misunderstanding of what c() does or is useful
> for exasperates me, and is unfortunately very widespread.  If people
> are going to use R, why don't they learn the basic syntax?
> 
> 	cheers,
> 
> 		Rolf Turner
> 
Sorry!
colnames(data)[901:902] <- c('phenotype','outcome')

cheers, Christian
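
For readers wondering about the point being made: c() combines its arguments into a vector, so wrapping a single string in it changes nothing, while assigning two names at once does need it. A small sketch on a throwaway data frame d (a made-up name for illustration):

```r
# With a single argument, c() returns the same length-one vector
identical(c('phenotype'), 'phenotype')  # TRUE

# Throwaway data frame with default column names V1..V4
d <- as.data.frame(matrix(0, nrow = 2, ncol = 4))

# Two names assigned in one step: here c() is doing real work
colnames(d)[3:4] <- c('phenotype', 'outcome')
colnames(d)  # "V1" "V2" "phenotype" "outcome"
```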
