thr3ads.net - similar to: "how to use GenABEL genetic information??"

2010 Nov 03

0

how to handle 'gwaa@gtdata' ?

I have a few questions about GenABEL gwaa data. 1) Is there a standard way that most GenABEL users add more individuals to a 'gwaa' data object? For example, I have a 'gwaa' data object, but I need to add some dummy parents. For 'gwaa@phdata' it's easy to add these rows, but for 'gwaa@gtdata' I think I need to create SNP data as '0 0 0 0 0.....'

a question about 'read.table' with or without 'read.table'.(urgent)

2010 Aug 05

2

Hi, I've got a quite tricky question. I have a txt file, named 'temp.txt', as the following:
    snp1 snp2 snp3
    AA   00   00
    GG   GG   00
    00   AA   00
I want to read the file into R. 1) when I use 'read.table' without 'header=T' option,
    > temp <- read.table('temp.txt')  # I got
    > temp
    V1

questions about string handling

2010 Aug 05

2

Hi, I have a question about data handling. I have a dataset as follows:
    ID   snp1 snp2 snp3
    1001 0/0  1/1  1/1
    1002 2/2  3/3  1/1
    1003 4/4  3/3  2/2
I want to convert the dataset to the following format:
    ID   snp1 snp2 snp3
    1001 00   AA   AA
    1002 GG

SNP Tables

2011 Jul 27

1

Hello, I have indicators for the presence or absence of SNPs in columns, plus a category (case/control) column. I would like to extract ONLY the tables and the indices (SNPs) that give me 2 x 3 tables. Some give 2 x 2 tables when one of the alleles is missing. The data look like the matrix snpmat below: so the first snp should give me the following table: (aa=0, Aa=1 and AA=2) aa

Using PCA to correct p-values from snpMatrix

2011 Jan 03

0

Hi R-help folks, I have been doing some single SNP association work using snpMatrix. This works well, but produces a lot of false positives, because of population structure in my data. I would like to correct the p-values (which snpMatrix gives me) for population structure, possibly using principal component analysis (PCA). My data is complicated, so here's a simple example of what

reordering huge data file

2008 Jan 21

2

Dear R-experts, My problem is how to handle a 10GB data file containing genotype data. The file is in a particular format (Illumina final report) and needs to be altered and merged with phenotype data for further analysis. Perl seems to be a frequently used solution for this type of work, however I am inclined to think it should be doable with R. How do I open a text-file, line by line,

array dimension changes with assignment

2008 May 13

2

Why does the assignment of a 3178x93 object to another 3178x93 object remove the dimension attribute?
    > GT <- array(dim = c(6,nrow(InData),ncol(InSNPs)))
    > dim(GT)
    [1]    6 3178   93
    > SNP1 <- InSNPs[InData[,"C1"],]
    > dim(SNP1)
    [1] 3178   93
    > SNP2 <- InSNPs[InData[,"C2"],]
    > dim(SNP2)
    [1] 3178   93
    > dim(pmin(SNP1,SNP2))
    [1] 3178   93

snp-chip table

2011 Mar 10

1

Dear R helpers, I have a table and I need to make a new table. table1:
    sire snp1 snp2 snp3 snp4 snp5 snp6 snp7 snp8 snp9 snp10 snp11 snp12 snp13 snp14 snp15
    8877   -1   -1   -1   -1    0    0   -1   -1   -1     0     1     1     1    -1    -1
    7765    1    1    1    0    0    0   -1    1    1     1     0     0     0     1     0
    8766    1    1   -1    0   -1   -1    0   -1    0    -1    -1    -1     0     1     0
    6756    0    1    0   -1    1   -1   -1    0    0     0     0    -1     0     1     1
    5644   -1    0    1   -1    0    0    0    0   -1    -1     0     0     0     0     1
I have table2 sire

A question about GRAMMAR calculations in the FAM_MDR algorithm

2012 Aug 24

0

Dear R developers: I am a PhD candidate in the School of Public Health of Peking University and my major is genetic epidemiology. I am learning the FAM-MDR algorithm, which is used to detect gene-gene and gene-environment interactions in pedigree data. The code was written by Tom Cattaert of the University of Liege. The algorithms and the sample datasets are available at

reshape dataframe

2009 Mar 20

1

Hi, I have a large dataset on which I would like to do the following:
    x<-data.frame(id=c(1,2,3), snp1=c("AA","GG", "AG"),snp2=c("GG","AG","GG"),snp3=c("GG","AG","AA"))
    > x
      id snp1 snp2 snp3
    1  1   AA   GG   GG
    2  2   GG   AG   AG
    3  3   AG   GG   AA
And then

R TABELS

2011 Jan 22

1

Hi, I have one table that looks like:
           SNP1 SNP2 SNP3 SNP4 SNP5
    SIRE1     1   -1   -1    1   -1
    SIRE2     1   -1    1    1    1
    SIRE3    -1   -1    1    1    0
    SIRE4    -1    1    1    0    1
    SIRE5    -1    1   -1   -1    1
    SIRE6     0    0    0    1   -1
    SIRE7    -1    0   -1    1    1
    SIRE8     1   -1   NA    0   NA
    SIRE9    -1    1    1   -1   -1
    SIRE10    1    1    1    1    1
table 2, only one line:
            SNP1 SNP2 SNP3 SNP4 SNP5
    SIRE100   -1   -1    1

glm analysis repeated for 900 variables

2009 Sep 22

2

Dear R users, Could you help me with the following problem? I want to repeat a glm analysis with 2 independent variables for all 900 variables (snps) in my data set. So, I want to check whether snp1 has a different effect on my outcome variable in patients and controls (phenotype), and repeat that for snp2 to snp900. Is there an easy way to get a summary of the data, e.g. a list of P values of all

same value in column-->delete

2009 Mar 26

4

Hi Readers, I have a question. I have a large dataset and want to throw away columns that have the same value in the column itself and I want to know which column this was. For example > x<-data.frame(id=c(1,2,3), snp1=c("A","G", "G"),snp2=c("G","G","G"),snp3=c("G","G","A"))

SNP IMPUTATION

2011 Jan 23

1

Hi, I have one table that looks like:
           SNP1 SNP2 SNP3 SNP4 SNP5
    SIRE1     1   -1   -1    1   -1
    SIRE2     1   -1    1    1    1
    SIRE3    -1   -1    1    1    0
    SIRE4    -1    1    1    0    1
    SIRE5    -1    1   -1   -1    1
    SIRE6     0    0    0    1   -1
    SIRE7    -1    0   -1    1    1
    SIRE8     1   -1   NA    0   NA
    SIRE9    -1    1    1   -1   -1
    SIRE10    1    1    1    1    1
table 2, only one line:
            SNP1 SNP2 SNP3 SNP4 SNP5
    SIRE100   -1   -1    1    1   -1
I need to make

permutation and reshuffling

2009 Sep 01

1

Hi, I'm looking for efficient code that will let me reshuffle data (phenotype) for a certain number of individuals, in a loop that randomly simulates it 10000 times (permutation). I also need to find out how to keep the information (p value for each SNP) gathered over all 10000 iterations. My data set looks like this (n=500): Individual # Phenotype SNP1 SNP2

Filtering a dataset's columns by another dataset's column names

2009 Feb 27

5

Hello all, I hope some of you can come to my rescue, yet again. I have two genetic datasets, and I want one of the datasets to have only the columns that are in common with the other dataset. Here is a toy example (my real datasets have hundreds of columns): Dataset 1:
    Individual SNP1 SNP2 SNP3 SNP4 SNP5
    1          A    G    T    C    A
    2          T    C    A    G    T
    3          A    C    T

problems when loading package GenABEL

2013 Jan 08

1

Dear all, since yesterday, I have been experiencing problems with the package GenABEL. When I try to load the package (library(GenABEL)) I get the following error message:
    Loading required package: MASS
    Error : .onLoad failed in loadNamespace() for 'GenABEL', details:
      call: stringSplit[[1]]
      error: subscript out of bounds
    Error: package/namespace load failed for 'GenABEL'
The funny

Kinship2 and GenABEL

2012 Nov 09

0

Hi, I'm using kinship2 to calculate heritability, but I would like to calculate it in GenABEL too. I am trying this code:
    > require(kinship2)
    > require(GenABEL)
    > pedig = with(Dados, pedigree(id=IID, dadid=PAT, momid=MAT, sex=SEX, famid=FID, missid=0))
    > kmat = kinship(pedig)
    > (mod1 = polygenic(altura ~ SEX + idade, data=Dados, kin=kmat))
    Error in intI(i, n = d[1],

GenABEL - problems with load.gwaa.data

2010 Feb 23

1

Hi all! I am using GenABEL in R for GWAS analysis. I am having a couple of issues: First, I am having a problem reading files (.map and .ped, size 900 MB, on 32-bit Windows) into R in the "convert.snp.ped" statement. I think this problem is likely due to the large size of the files and my version of R not being able to handle them, since I can read in smaller files.

Recoding variables based on reference values in data frame

2013 Jul 02

2

I'm new to R (previously used SAS primarily) and I have a genetics data frame consisting of genotypes for each of 300+ subjects (ID1, ID2, ID3, ...) at 3000+ genetic locations (SNP1, SNP2, SNP3...). A small subset of the data is shown below:
    SNP_ID     SNP1 SNP2 SNP3 SNP4
    Maj_Allele C    G    C    A
    Min_Allele T    A    T    G
    ID1        CC   GG   CT   AA
    ID2        CC   GG   CC   AA
    ID3        CC   GG   nc   AA
