Giambartolomei, Claudia
2010-Dec-18 15:27 UTC
[R] association analysis with multiple outcome variables
Hi,

I am using the package snpMatrix to do a genetic association analysis, but I think my problem comes down to a simple R trick (sorry, I am an R newbie, so I apologize in advance if I am not using the correct terms). I am trying to figure out a way to loop through different dependent variables without having to repeat the analysis by hand for each one.

My genotype data is stored in a raw object of class "snp.matrix" (called "snp.matrix"), in which the column names are the SNP names and the row names are the subject identifiers:

    > snp.matrix@.Data[1:5,1:5]
               Broad10449636 Broad10450135 Broad10459352 Broad10462884
    P160012117            03            03            03            03
    P160014466            03            03            03            03
    P160021123            03            01            03            03
    P160052107            03            03            03            03
    P160053905            03            03            03            03
               Broad10468812
    P160012117            03
    P160014466            03
    P160021123            03
    P160052107            03
    P160053905            03

My phenotype data is stored in a data frame called "pheno". It contains the dependent variables that I want to loop over, plus the variables that I want to keep constant in the model as covariates (age and sex):

    > head(pheno)
                    IDOUT SEX ETHN AGE      BMI APO_A CHOL  LDL
    P160012117 P160012117   2    1  59       NA    NA   NA   NA
    P160014466 P160014466   2    1  60 26.88092  1.92 7.38 5.59
    P160021123 P160021123   2    1  54 28.68583  2.39 5.80 3.63
    P160052107 P160052107   2    1  48 22.59757  2.51 7.25 5.03
    P160053905 P160053905   2    1  46 20.33101  2.52 6.40 4.39
    P160076582 P160076582   2    1  50 23.84064  2.19 4.47 2.74

The association works for a single outcome when I do this (using the function "snp.rhs.tests"):

    BMI <- snp.rhs.tests(BMI~SEX+AGE, family="gaussian", snp.data=snp.matrix)

I then convert the result, which is an S4 object, into a data frame like this:

    res <- data.frame(SNPs = names(BMI), pvalues = p.value(BMI))

But when I try to loop through the other dependent variables, it does not work anymore.

This is what I am doing:

    #Created a file with only the variables to use in the loop:
    pheno2 <- subset(pheno, select=c("IDOUT", "BMI", "APO_A", "CHOL", "LDL"))

    for (cov in names(pheno2)) {
      res <- snp.rhs.tests(as.formula(paste(pheno2$cov, "~pheno$SEX+pheno$AGE")), family="gaussian", snp.data=snp.matrix)
      res <- data.frame(SNPs = names(HDL.reg.rhs), pvalues = p.value(HDL.reg.rhs))
      output.file <- paste("myres_", cov, ".tab", sep = "")
      write.table(res, file = output.file, sep = "\t", quote = FALSE, row.names = FALSE)
      print(output.file)
    }

But I get the following error:

    "Error in snp.rhs.tests(as.formula(paste(pheno3$cov, "~pheno2$SEX+pheno2$XAGE_C")), : Argument error - Y"

I have been looking for this error, but I cannot find anything on the help pages. If you could help me figure this out, it would be great!

Thank you very much in advance!

- claudia
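One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: `pheno2$cov` looks for a column literally named "cov" and returns NULL, so `paste()` produces a formula with no left-hand side. The example below builds each formula from the column name held in the loop variable instead. It uses base R's `lm()` on a made-up toy data frame as a stand-in for `snp.rhs.tests()`, so it runs without snpMatrix. The toy values, the `outcomes` vector, and the `fits` list are illustrative assumptions and are not part of the original post.

```r
# Toy phenotype data (made-up values, for illustration only).
set.seed(1)
pheno <- data.frame(
  SEX  = rep(c(1, 2), each = 10),
  AGE  = round(runif(20, 40, 65)),
  BMI  = rnorm(20, 25, 3),
  CHOL = rnorm(20, 6, 1),
  LDL  = rnorm(20, 4, 1)
)

# Outcome columns to loop over (hypothetical selection; the ID column is left out).
outcomes <- c("BMI", "CHOL", "LDL")

fits <- list()
for (y in outcomes) {
  # Build e.g. BMI ~ SEX + AGE from the column NAME stored in y,
  # rather than pasting the column's values into the formula.
  f <- reformulate(c("SEX", "AGE"), response = y)

  # lm() stands in for snp.rhs.tests(); variables are looked up in `data`.
  fits[[y]] <- lm(f, data = pheno)

  # One output file name per outcome, as in the original loop.
  output.file <- paste("myres_", y, ".tab", sep = "")
  print(output.file)
}
```

With snpMatrix loaded, the `lm()` line would presumably become something like `snp.rhs.tests(f, family = "gaussian", data = pheno, snp.data = snp.matrix)`. That call assumes `snp.rhs.tests()` accepts a `data` argument, which should be checked against the package help page. The result would also need to be read from the object assigned inside the loop, not from `HDL.reg.rhs`.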