jinwei wang
2012-Aug-24 03:15 UTC
[R] A question about GRAMMAR calculations in the FAM_MDR algorithm
Dear R developers: I am a PHD candidate student in the school of public health of Peking University and my major is genetic epidemiology. I am learning the FAM-MDR algorithm, which is used to detect the gene-gene and gene-environment interactions in the data of pedigree. The codes were written by Tom Cattaert of the University of Liege. The algorithms and the sample datasets are available at http://www.statgen.ulg.ac.be/index.php?option=com_content&view=article&id=90&Itemid=85. The R I used is R-2.15.1 in the Windows XP system. The command lines I have run are as follows: # data files pedfile="pedigree1.ped" phenofil="pheno1.dat" mapfile="simulation.map" # solver settings npermute = 1000 # fixed number of permutations, 1000 is recommended alpha = 0.05 # type-I error Pval = 0.1 # significance level at first step of mbmdr set.seed(1982) # for reproducibility # loading libraries and code library(kinship2) library(GenABEL) source("mbmdr_gaussian.r") # removing all possible existing files first as a safety measure file.remove("permoutput.txt") file.remove("simulation.raw") file.remove("permuted.txt") file.remove("permchisq.txt") file.remove("rawoutput.txt") # loading data and bringing in GenABEL format rawfile="simulation.raw" convert.snp.ped(pedfile, mapfile, rawfile) simulation.GenABEL = load.gwaa.data(phenofile = phenofil, genofile rawfile, force=F,makemap=F,sort=F) pedigree=read.table(pedfile) pedsize=nrow(pedigree) nsnps=(ncol(pedigree)-6)/2 # minor allele count and handling missing genotype data allelic = function(k){ geno=pedigree[,(5+2*k):(6+2*k)] allelic=rowSums(geno==2)-(geno[,1]==0 & geno[,2]==0) # -1 for missing, 0,1,2 gives count of variant allele } # preparing MB-MDR SNPS = matrix(0,nrow=pedsize,ncol=nsnps) for (k in 1:nsnps){ SNPS[,k] = allelic(k) } SNPS = as.data.frame(SNPS) SNPS[SNPS==-1] = NA # GRAMMAR calculations # here one has to include main effect and/or covariate adjustments in the polygenic model statement pkin = kinship(pedigree[,2],pedigree[,3],pedigree[,4]) maineff1 = as.factor(SNPS[,1]) maineff2 = as.factor(SNPS[,2]) Yfit polygenic(trait~maineff1+maineff2,pkin,simulation.GenABEL,trait.type="gaussian") resi = Yfit$pgresidualY When I calculated the GRAMMAR and ran the command, "*Yfit polygenic(trait~maineff1+maineff2,pkin,simulation.GenABEL,trait.type="gaussian") *", the software gave an error alert as "*Error in polygenic(trait ~ maineff1 + maineff2, pkin, simulation.GenABEL, : subscript out of bounds *". I have tried to solve the problem by checking all the commands written above and the datasets created before. I assume the reason of the problem is that the "pkin" file is a 994*994 matrix, while "simulation.GenABEL" is a data frame of 994 subjects and 10 SNPs. The dimentions of the two files are different, so "subscript out of bounds". However, I don't know how to solve the problem. I have sent the messege to the authors of FAMMDR, but have got no reply. So I hope the developers of R can give me some help. Best regards Jinwei Wang [[alternative HTML version deleted]]