thr3ads.net - similar to: "permutation test - query"

Displaying 20 results from an estimated 2000 matches similar to: "permutation test - query"

2009 Sep 01

permutation and reshuffling

Hi, I'm looking for efficient code that will let me reshuffle data (phenotype) for a certain number of individuals and create a loop that randomly simulates it 10000 times *(permutation)*. I also need to find out how to keep the information (p value for each SNP) gathered across all 10000 iterations. My data set looks like this (n=500): Individual # Phenotype SNP1 SNP2
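The reshuffle-and-store loop described in this post could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated, the per-SNP test is assumed to be a simple linear regression, and every name (`phenotype`, `geno`, `snp_pvalue`, `perm_p`, `emp_p`) is hypothetical rather than taken from the thread.

```r
# Sketch only: simulated data stand in for the poster's file (all names hypothetical).
set.seed(1)
n_ind  <- 500    # individuals, as in the post
n_snp  <- 2      # SNP1 and SNP2
n_perm <- 1000   # the post asks for 10000; reduced here to keep the sketch quick

phenotype <- rnorm(n_ind)
geno <- matrix(rbinom(n_ind * n_snp, size = 2, prob = 0.3), nrow = n_ind,
               dimnames = list(NULL, paste0("SNP", seq_len(n_snp))))

# Assumed test: p-value of the slope from a linear regression of phenotype on one SNP
snp_pvalue <- function(y, g) summary(lm(y ~ g))$coefficients[2, 4]

# Observed p-values, one per SNP
obs_p <- apply(geno, 2, function(g) snp_pvalue(phenotype, g))

# Storage: one row per permutation, one column per SNP
perm_p <- matrix(NA_real_, nrow = n_perm, ncol = n_snp,
                 dimnames = list(NULL, colnames(geno)))
for (i in seq_len(n_perm)) {
  y_perm <- sample(phenotype)   # reshuffle phenotypes across individuals
  perm_p[i, ] <- apply(geno, 2, function(g) snp_pvalue(y_perm, g))
}

# Empirical p-value per SNP: share of permuted p-values at least as small as the observed one
emp_p <- (colSums(sweep(perm_p, 2, obs_p, "<=")) + 1) / (n_perm + 1)
```

Preallocating `perm_p` keeps every iteration's p-values available afterwards, which is one way to address the "how do I keep the information" part of the question.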

reordering huge data file

2008 Jan 21

Dear R-experts, My problem is how to handle a 10GB data file containing genotype data. The file is in a particular format (Illumina final report) and needs to be altered and merged with phenotype data for further analysis. PERL seems to be a frequently used solution for this type of work, however I am inclined to think it should be doable with R. How do I open a text-file, line by line,

efficient code. how to reduce running time?

2007 Jan 21

Hi, I am new to R, and even though I've got my code to run and do what it needs to, it is taking forever and I can't use it like this. I was wondering if you could help me find ways to make the code run faster. Here is my code. The data set is a bunch of 0s and 1s in a data.frame. What I am doing is this: I pick a column and make up a new column Y with values associated with that

merging multiple columns from two dataframes

2011 May 04

Hello, I have data in a data frame with 139104 rows, which is a multiple of 96x1449. I have a phenotype file which contains the phenotype information for the 96 samples. The SNP name is repeated 1449x96 samples. I have to merge the two data frames based on sid and sen. This is what my two data frames look like: dat<-data.frame(snpname=rep(letters[1:12],12),sid=rep(1:12,each=12),

Using something like the "by" command, but on rows instead of columns

2009 Nov 09

Hello R Forum users, I was hoping someone could help me with the following problem. Consider the following "toy" dataset:
Accession  SNP_CRY2  SNP_FLC  Phenotype
1          NA        A        0.783143079
2          BQ        A        0.881714811
3          BQ        A        0.886619488
4          AQ        B        0.416893034
5          AQ        B        0.621392903
6          AS        B        0.031719125
7          AS        NA       0.652375037
"Accession"

help in R

2006 Apr 26

Hi, I can't understand where I am going wrong. Below is my code. I would really appreciate your help. Thanks.
> genfile<-read.table("c:/tina/phd/bs871/hw/genfile.txt",skip=1)
>
> #read in SNP data
> snp.dat <- as.matrix(genfile)
> snp.name <- scan("c:/tina/phd/bs871/hw/genfile.txt",nline=1,what="character")
Read 100 items

R-help Digest, Vol 31, Issue 9

2005 Sep 09

Hi: I use lm (linear model) to analyze 47 variables and 8 responses, so I use a loop to do it. I want the program to show the results whose P-value is less than 0.05. How can I extract the P-values from the lm result? Ping
The code:
#using LM to model general fati
for (j in 48:52) {
  for (i in 3:46){
    gen.fat<-y_x[,j]
    gen.fat<-as.numeric(gen.fat)
    snp_marker<-y_x[,i]
    x<-colnames(y_x)

Speeding up lots of calls to GLM

2012 Mar 12

Dear useRs, First off, sorry about the long post. Figured it's better to give context to get good answers (I hope!). Some time ago I wrote an R function that will get all pairwise interactions of variables in a data frame. This worked fine at the time, but now a colleague would like me to do this with a much larger dataset. They don't know how many variables they are going to have in the

loop through columns in S4 objects

2011 Nov 24

Dear experts, I am trying to perform an association using snpStats. I have a snp matrix called 'plink' which contains my genotype data (as a list of $genotypes, $map, $fam), and a phenotype data frame which contains the outcomes (outcome1, outcome2,...) I would like to associate with the genotype. My question is, how do I loop through the outcomes? This type of data seems different from

R package: pbatR

2011 Jul 14

Dear All, Does anybody have experience with the R package pbatR (http://cran.r-project.org/web/packages/pbatR/index.html)? I am trying to use it to analyze family-based case-control data, but the package doesn't work at all on my computer. I contacted the authors of the package, but I haven't heard anything from them. Following the package manual, I tried the simple example as below:

Using PCA to correct p-values from snpMatrix

2011 Jan 03

Hi R-help folks, I have been doing some single SNP association work using snpMatrix. This works well, but produces a lot of false positives, because of population structure in my data. I would like to correct the p-values (which snpMatrix gives me) for population structure, possibly using principal component analysis (PCA). My data is complicated, so here's a simple example of what

Message for R-help mailing list

2011 Jul 26

Dear r-helpers, I would be very grateful if you could post the message below on the r-help discussion board. Thank you very much! Best Wishes, Pawel Hello R community, I am generating lots of results using the fisher.test function, testing many 2x2 tables of SNPs for association with a particular phenotype. A typical output of the fisher.test function would be (for example): data: data1

Reshaping genetic data from long to wide

2006 Apr 06

Bottom Line Up Front: How does one reshape genetic data from long to wide? I currently have a lot of data: about 180 individuals (some probands/patients, some parents, rare siblings) and SNP data from 6000 loci on each. The standard formats seem to be something along the lines of Famid, pid, fatid, motid, affected, sex, locus1Allele1, locus1Allele2, locus2Allele1, locus2Allele2, etc. In other

data grouping and fitting mixed model with lme function

2013 Feb 28

Dear all, I have data from the following experimental design and am trying to fit a mixed model with the lme function according to the following steps, but I am struggling. Any help is deeply appreciated. 1) Experimental design: I have 40 plants, each of which has 4 clones. Each clone was planted in one of 4 blocks. Phenotypes were collected from each clone for 3 consecutive years. I have genotypes of the plants. I

Fastest way to do HWE.exact test on 100K SNP data?

2006 Jun 05

Hi everyone, I'm using the function 'HWE.exact' of 'genetics' package to compute p-values of the HWE test. My data set consists of ~600 subjects (cases and controls) typed at ~ 10K SNP markers; the test is applied separately to cases and controls. The genotypes are stored in a list of 'genotype' objects, all.geno, and p-values are calculated inside the loop over all

SNPRelate: Plink conversion

2013 Nov 08

Hi, Following my earlier posts about having problems performing a PCA, I have worked out what the problem is. The problem lies within the PLINK to gds conversion. It seems as though the SNPs are imported as "samples" and in turn, the samples are recognised as SNPs: >snpsgdsSummary("chr2L") Some values of snp.position are invalid (should be > 0)! Some values of

glm analysis repeated for 900 variables

2009 Sep 22

Dear R users, Could you help me with the following problem? I want to repeat a glm analysis with 2 independent variables for all 900 variables (SNPs) in my data set. So, I want to check whether snp1 has a different effect on my outcome variable in patients and controls (phenotype), and repeat that for snp2 to snp900. Is there an easy way to get a summary of the data, e.g. a list of P values of all
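One way to collect one p-value per SNP from repeated glm fits is sketched below. It is not the answer given in the thread: the data are simulated, the outcome is assumed to be continuous (so `glm` uses its default gaussian family), "a different effect in patients and controls" is read as a SNP-by-group interaction, and all object names (`dat`, `snps`, `interaction_p`) are made up for the illustration.

```r
# Sketch only: simulated stand-in data; all names are hypothetical.
set.seed(2)
n     <- 200
n_snp <- 20   # the post has 900 SNPs; fewer here to keep the sketch short

dat <- data.frame(
  outcome = rnorm(n),
  group   = factor(sample(c("control", "patient"), n, replace = TRUE))
)
snps <- as.data.frame(matrix(rbinom(n * n_snp, size = 2, prob = 0.4), nrow = n))
names(snps) <- paste0("snp", seq_len(n_snp))

# Fit outcome ~ snp * group once per SNP and keep the interaction p-value
interaction_p <- vapply(names(snps), function(s) {
  fit <- glm(outcome ~ snp * group, data = cbind(dat, snp = snps[[s]]))
  coef(summary(fit))["snp:grouppatient", "Pr(>|t|)"]
}, numeric(1))

# Named vector of p-values, smallest first
head(sort(interaction_p))
```

For a binary outcome, `family = binomial` would be passed to `glm`, and the p-value column of the coefficient table would then be named `"Pr(>|z|)"` instead.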

lme with nested factor and random effect

2011 Dec 15

Hello all, I'm having difficulty with setting up a mixed model using lme in the nlme package. To summarize my study, I am testing for effects of ornamentation on foraging behavior of wolf spiders. I tested spiders at two different ages (penultimate vs. mature) and of two different phenotypes (one species tested lacks ornamentation throughout life [non-ornamented males] while the other

Dominance in qtl model

2007 Apr 23

Hi, I'm using R for a QTL analysis of SNP data. I was wondering if anyone had any advice on fitting a dominance effect into the following function:
> myfun4
function (x)
{
    x <- scan(con, nmax=169)
    y <- unique(x[which(!is.na(x))])
    if(length(y)>1) {
        summary(lme(Ad ~ x, random= ~1|sire, na.action="na.omit"))
    } else {print("no.infomation")}
}
Con is the

Error in inherits(x, "data.frame") : subscript out of bounds

2010 Mar 05

Hi, I have a list p of data frames of different sizes, with a length of over 8000. I'm trying to calculate correlations between the rows of the data frames in this list and the columns of another dataset (also of type data.frame), so that the first column is correlated with all the rows in the first data frame of the list. Some information from another dataset is also included in the final output (all.corrs). This worked a
