thr3ads.net - similar to: "2 x 3 Probability under the null"

Displaying 20 results from an estimated 9000 matches similar to: "2 x 3 Probability under the null"

2011 Jun 21

Re: Getting SNPs from PLINK to R

I am using PLINK on a large SNP dataset with a .map and a .ped file. I want to get some sort of file, say a list of all the SNPs that PLINK says I have. Any ideas on how to do this? -- Thanks, Jim.

2011 Jul 13

Hardy Weinberg Case Control Test in gap R package

Hi, I am using the gap R package to run the Hardy-Weinberg case-control test for many SNPs. I am not sure what the values of initial1 and initial2 should be for the test. I tried some values but they failed. I emailed the author, but to no avail. Some documentation seems to have been deleted at the top; if anyone can direct me to it I would be grateful. -- Thanks, Jim.

2011 Jul 27

SNP Tables

Hello, I have indicators for the presence or absence of SNPs in columns, plus a category (case/control) column. I would like to extract ONLY the tables, and the indices (SNPs), that give me 2 x 3 tables. Some give 2 x 2 tables when one of the alleles is missing. The data look like the matrix snpmat below, so the first SNP should give me the following table (aa=0, Aa=1 and AA=2): aa
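One way to keep only the SNPs whose genotype-by-status table is 2 x 3 is sketched below. The matrix snpmat and the vector status are made-up stand-ins, since the poster's data is not shown in full, and genotypes are assumed to be coded 0, 1, 2 as in the post.

```r
# Hypothetical toy data: rows are individuals, columns are SNPs (0 = aa, 1 = Aa, 2 = AA).
snpmat <- cbind(snp1 = c(0, 1, 2, 0, 1, 2),
                snp2 = c(0, 0, 1, 1, 0, 1),   # one genotype never observed
                snp3 = c(2, 1, 0, 2, 1, 0))
status <- c("case", "case", "case", "control", "control", "control")

# Cross-tabulate status against each SNP column.
tabs <- lapply(as.data.frame(snpmat), function(g) table(status, g))

# Flag the tables that really are 2 x 3.
is_2x3 <- vapply(tabs, function(t) all(dim(t) == c(2, 3)), logical(1))

# Names (indices) of the SNPs to keep, and their tables.
keep_names <- names(which(is_2x3))
tabs_2x3 <- tabs[is_2x3]
```

Here keep_names holds the SNP names and tabs_2x3 the matching tables; which(is_2x3) gives the column positions if numeric indices are preferred.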

2013 Jan 05

package metafor: error when setting 'col' and 'at' for a forest plot

I am using metafor to create forest plots. This code gives me the expected plot (setting x axis tick marks):
forest(forest$OR, ci.lb=forest$Low, ci.ub=forest$High, at=log(c(.05, .25, 1, 10)), slab=forest$SNP, atransf=exp)
As does this (setting colors):
forest(forest$OR, ci.lb=forest$Low, ci.ub=forest$High, col=c(1,2,3), slab=forest$SNP, atransf=exp)
But if I try to set both 'at' and

2011 May 09

Stirlings Approximation

I have some big combinations like 4444444444444444444444444444 choose 784645433. Can R compute these? Is there any package that does Stirling's approximation in R? -- Thanks, Jim.
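A minimal sketch of how such a coefficient could be handled on the log scale: base R's lchoose() returns the natural log of the binomial coefficient and accepts doubles, so the result stays finite where choose() itself would overflow. The helper stirling_lfactorial is a hypothetical name introduced here to show Stirling's formula. Note that n is stored as a double, so it is only accurate to about 16 significant digits.

```r
# n does not fit in an integer; R stores it as a double (about 4.44e27).
n <- 4444444444444444444444444444
k <- 784645433

log_c   <- lchoose(n, k)      # natural log of choose(n, k)
log10_c <- log_c / log(10)    # same quantity on the base-10 scale

# Hypothetical helper: Stirling's approximation to log(m!).
stirling_lfactorial <- function(m) m * log(m) - m + 0.5 * log(2 * pi * m)

# Stirling-based log binomial coefficient, for moderate sizes where it can be checked.
stirling_lchoose <- function(n, k) {
  stirling_lfactorial(n) - stirling_lfactorial(k) - stirling_lfactorial(n - k)
}

small_exact  <- lchoose(50, 25)
small_approx <- stirling_lchoose(50, 25)
```

For everyday use lchoose(), lfactorial() and lgamma() are usually enough; the Stirling helper is only there to show what the approximation looks like.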

2006 Dec 18

Memory problem on a linux cluster using a large data set

Hello, I have a large data set with 320.000 rows and 1000 columns. All the data have the values 0, 1, 2. I wrote a script to remove all the rows with more than 46 missing values. This works perfectly on a smaller data set, but when I try to run it on the larger data set I get the error “cannot allocate vector size 1240 kb”. I’ve searched through previous posts and found out that it might
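The row filter described here can be written without an explicit loop, as in the sketch below. It assumes missing values are coded as NA and that the data are held in an integer matrix, which takes 4 bytes per cell rather than the 8 of a double. The names geno and max_missing are made up for illustration, and the toy threshold is 3 rather than the poster's 46. This does not by itself resolve the allocation error.

```r
# Toy stand-in for the genotype matrix (values 0, 1, 2, with NA for missing).
set.seed(1)
geno <- matrix(sample(c(0L, 1L, 2L, NA), 50 * 20, replace = TRUE,
                      prob = c(0.3, 0.3, 0.3, 0.1)),
               nrow = 50, ncol = 20)

max_missing <- 3                        # the post uses 46 on 1000 columns

# Count missing values per row in one vectorised pass.
n_missing <- rowSums(is.na(geno))

# Keep only rows at or below the threshold; drop = FALSE preserves the matrix shape.
geno_kept <- geno[n_missing <= max_missing, , drop = FALSE]
```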

2006 Apr 26

help in R

Hi, I can't understand where I am going wrong. Below is my code. I would really appreciate your help. Thanks.
> genfile<-read.table("c:/tina/phd/bs871/hw/genfile.txt",skip=1)
>
> #read in SNP data
> snp.dat <- as.matrix(genfile)
> snp.name <- scan("c:/tina/phd/bs871/hw/genfile.txt",nline=1,what="character")
Read 100 items

2008 Feb 28

Errors melt()ing data...

Hi, I'm trying to melt() some data for subsequent cast()ing and am encountering errors. The overall process requires a couple of cast()s and melt()s.
########Start Session 1##########
## I have the data in a (fully) melted format and can cast it fine...
> norm1[1:10,]
  Pool SNP Sample.Name variable value
1 1 rs1045485 CA0092 Height.1 0.003488853
2 1 rs1045485

2007 Jan 10

Fw: Memory problem on a linux cluster using a large data set [Broadcast]

Hi, I listened to all your advice and ran my data on a computer with a 64-bit processor, but I still get the same error saying "it cannot allocate a vector of that size 1240 kb". I don't want to cut my data into smaller pieces because we are looking at interaction. So are there any other options for me to try, or should I wait for the development of more advanced computers?

2007 Jan 21

efficient code. how to reduce running time?

Hi, I am new to R, and even though I've got my code to run and do what it needs to, it is taking forever and I can't use it like this. I was wondering if you could help me find ways to make the code run faster. Here is my code. The data set is a bunch of 0s and 1s in a data.frame. What I am doing is this: I pick a column and make up a new column Y with values associated with that

2011 Apr 04

automating regression or correlations for many variables

Dear All, I have a large data frame with 10 rows and 82 columns. I want to apply the same function to all of the columns with a single command. e.g. zl <- lm (snp$a_109909 ~ snp$lat) will fit a linear model to the values in lat and a_109909. What I want to do is fit linear models for the values in each column against lat. I tried doing zl <- (snp[,2:82] ~ snp$lat[,1]) but got the following
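A hedged sketch of one way to fit the same model to every column: loop over the column names with lapply() and build each formula with reformulate(). The data frame below is a made-up stand-in with only three response columns; apart from lat and a_109909, which appear in the post, the column names are hypothetical.

```r
# Toy stand-in for the poster's data frame: one predictor (lat) and several responses.
set.seed(1)
snp <- data.frame(lat      = runif(10, 40, 60),
                  a_109909 = rnorm(10),
                  a_109910 = rnorm(10),   # hypothetical column name
                  a_109911 = rnorm(10))   # hypothetical column name

# Every column except the predictor is treated as a response.
response_cols <- setdiff(names(snp), "lat")

# Fit response ~ lat for each column; reformulate() builds the formula from strings.
fits <- lapply(response_cols, function(col) {
  lm(reformulate("lat", response = col), data = snp)
})
names(fits) <- response_cols

# Collect the slope on lat from each fitted model.
slopes <- vapply(fits, function(f) coef(f)[["lat"]], numeric(1))
```

Passing data = snp and using bare column names keeps each fit tied to the data frame, which makes later calls such as summary(fits$a_109909) easier to read than the snp$... form.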

2009 Jul 26

splitting multiple data in one column into multiple rows with one entry per column

Dear R colleagues, I annotated a list of single nucleotide polymorphisms (SNPs) with the corresponding genes using biomaRt. The result is the following data.frame (pasted from R):
  snp        ensembl_gene_id
1 rs8032583
2 rs1071600  ENSG00000101605
3 rs13406898 ENSG00000167165
4 rs7030479

2012 Mar 05

Order a data frame based on the order of another data frame

Hi, I am trying to match the order of the rownames of a dataframe with the rownames of another dataframe (I can't simply sort both sets because I would have to change the order of many other connected datasets if I did that): Also, the second dataset (snp.matrix$fam) is a snp matrix slot: so for example: data_one: x y
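Reordering one data frame to follow the row names of another is usually done with match(), as in this small sketch. Both data frames are made-up stand-ins (data_two is a hypothetical name for the second dataset), and every row name in data_one is assumed to occur in data_two.

```r
# Toy data: data_one defines the target order, data_two is to be reordered.
data_one <- data.frame(x = 1:3, row.names = c("b", "c", "a"))
data_two <- data.frame(y = c(10, 20, 30), row.names = c("a", "b", "c"))

# Position of each data_one row name within data_two.
idx <- match(rownames(data_one), rownames(data_two))

# Reorder data_two; drop = FALSE keeps it a data frame even with one column.
data_two_ordered <- data_two[idx, , drop = FALSE]
```

If some row names are missing from the second dataset, idx contains NA for them, so it is worth checking anyNA(idx) before subsetting.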

2013 Nov 08

SNPRelate: Plink conversion

Hi, Following my earlier posts about having problems performing a PCA, I have worked out what the problem is. The problem lies within the PLINK to gds conversion. It seems as though the SNPs are imported as "samples" and in turn, the samples are recognised as SNPs: >snpsgdsSummary("chr2L") Some values of snp.position are invalid (should be > 0)! Some values of

2020 Oct 09

2 D density plot interpretation and manipulating the data

Hi Abby, Thanks for getting back to me. Yes, I believe I did that by doing this:
SNP$density <- get_density(SNP$mean, SNP$var)
> summary(SNP$density)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0     383     696     738    1170    1789
where get_density() is a function from here: https://slowkow.com/notes/ggplot2-color-by-density/ and keep only entries with density > 400

2011 Jul 07

Simulating from the null distribution of a 2 x 3 table

Dear all, I want to simulate from the null distribution of the following 2 x 3 table:
2 5 10
4 8 5
I am using a chi-squared test. Does anyone have any idea how to do this? -- Thanks, Jim.
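A minimal sketch of one common approach, assuming the six counts are read row by row and that "the null" means independence with both sets of margins held fixed. Base R's r2dtable() draws random tables with given margins, and chisq.test() with simulate.p.value = TRUE does the same thing internally. The helper chisq_stat is a hypothetical name introduced here.

```r
# Observed 2 x 3 table, filled row by row (an assumption about the layout).
tab <- matrix(c(2, 5, 10,
                4, 8, 5), nrow = 2, byrow = TRUE)

set.seed(42)
n_sim <- 1000

# Random tables with the same row and column totals as the observed one.
sims <- r2dtable(n_sim, rowSums(tab), colSums(tab))

# Expected counts under independence, and the Pearson chi-squared statistic.
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
chisq_stat <- function(m) sum((m - expected)^2 / expected)

observed   <- chisq_stat(tab)
null_stats <- vapply(sims, chisq_stat, numeric(1))

# Monte Carlo p-value (the +1 terms count the observed table itself).
p_sim <- (1 + sum(null_stats >= observed)) / (1 + n_sim)

# The built-in equivalent.
mc <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)
```

The vector null_stats is the simulated null distribution of the statistic, so hist(null_stats) shows it directly. If only the total count should be fixed rather than both margins, rmultinom() with the expected cell probabilities would be the sampler to use instead.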

2024 Nov 15

R coding to extract allele frequencies from NCBI for ALL alleles of one SNP?

Dear All, The following code extracts from NCBI very nice output for ONE allele of a SNP (often the allele with the second largest frequency - usually termed the minor allele). It gives an average minor allele frequency from all NCBI sources (which is what I want, except I'd like the addition of data for all the other alleles of one SNP) plus a table of minor allele frequencies from each

2009 Dec 15

RFC: lchoose() vs lfactorial() etc

lgamma(x) and lfactorial(x) are defined to return ln|Gamma(x)| {= log(abs(gamma(x)))} or ln|Gamma(x+1)| respectively. Unfortunately, we haven't chosen the analogous definition for lchoose(). So, currently
> lchoose(1/2, 1:10)
 [1] -0.6931472 -2.0794415        NaN -3.2425924        NaN -3.8869494
 [7]        NaN -4.3357508        NaN -4.6805913
Warning message:
In

2008 Feb 07

Problems reshaping data with cast()

Hi, I'm trying to cast() some data, but keep on getting the following error...
> norm.all.melted.height <- transform(all.melted.height,
+ norm.height = value / ave(value, SNP, Pool, FUN = max)
+ )
Warning messages:
1: In FUN(X[[147L]], ...) : no non-missing arguments to max; returning -Inf
2: In FUN(X[[147L]],

2009 Aug 31

permutation test - query

Hi, My query is regarding permutation tests and reshuffling of genotype/phenotype data. I have been using the haplo.stats package of R for haplotype analysis, and I would like to perform an analysis on which I'm requesting your advice. I have a data set of individuals genotyped for 12 SNPs and a dichotomous phenotype. At first, I tested each of those SNPs independently in order to bypass
