thr3ads.net - similar to: "can you help me please :)"

Displaying 20 results from an estimated 1200 matches similar to: "can you help me please :)"

Plotting question: how to plot SNP location data?

2007 Oct 30

Hello, I would like to plot specific SNPs with their exact locations on a chromosome. Based on my genotyping results, I would like to separate these SNPs into three categories (1, 2 and 3) and use different colours to represent these categories. The script below generates the sample data. I can plot these with the image function using the following: val <- 1:3 samp <- sample(val,

Memory problem on a linux cluster using a large data set [Broadcast]

2006 Dec 21

Thank you all for your help! With all your suggestions, we will try to run it on a computer with a 64-bit processor. But I've been told that the new R versions all work on a 32-bit processor. I read in other posts that only the old R versions were capable of handling larger data sets and ran on 64-bit processors. I also read that they are adapting the new R version for 64 bits

help in R

2006 Apr 26

Hi, I can't understand where I am going wrong. Below is my code. I would really appreciate your help. Thanks.
> genfile <- read.table("c:/tina/phd/bs871/hw/genfile.txt", skip=1)
>
> #read in SNP data
> snp.dat <- as.matrix(genfile)
> snp.name <- scan("c:/tina/phd/bs871/hw/genfile.txt", nline=1, what="character")
Read 100 items

Using PCA to correct p-values from snpMatrix

2011 Jan 03

Hi R-help folks, I have been doing some single SNP association work using snpMatrix. This works well, but produces a lot of false positives because of population structure in my data. I would like to correct the p-values (which snpMatrix gives me) for population structure, possibly using principal component analysis (PCA). My data is complicated, so here's a simple example of what

Memory problem on a linux cluster using a large data set

2006 Dec 18

Hello, I have a large data set with 320,000 rows and 1000 columns. All the data has the values 0, 1, 2. I wrote a script to remove all the rows with more than 46 missing values. This works perfectly on a smaller data set, but when I try to run it on the larger data set I get the error “cannot allocate vector size 1240 kb”. I’ve searched through previous posts and found out that it might

Errors melt()ing data...

2008 Feb 28

Hi, I'm trying to melt() some data for subsequent cast()ing and am encountering errors. The overall process requires a couple of cast()s and melt()s.
########Start Session 1##########
## I have the data in a (fully) melted format and can cast it fine...
> norm1[1:10,]
  Pool       SNP Sample.Name variable       value
1    1 rs1045485      CA0092 Height.1 0.003488853
2    1 rs1045485

Fw: Memory problem on a linux cluster using a large data set [Broadcast]

2007 Jan 10

Hi, I followed all your advice and ran my data on a computer with a 64-bit processor, but I still get the same error saying "it cannot allocate a vector of that size 1240 kb". I don't want to cut my data into smaller pieces because we are looking at interactions. So are there any other options for me to try, or should I wait for the development of more advanced computers?

FW: Index out SNP position

2013 Jan 04

I think you mean between column 1 and 2 of A? Why is 36003918 not included? It is clearly between 35838396 and 36151202 in the first row of A. My earlier solution should work fine. Just create a new matrix AX that has the columns switched so that the start is always column 1 and use that to identify the ones you want to select. That way you are not modifying B. This will be faster than checking

2 x 3 Probability under the null

2011 Oct 27

I have a 2 x 3 matrix called snp and I want to compute the following probability: choose(sum(snp[,1]), snp[1,1]) * choose(sum(snp[,2]), snp[1,2]) * choose(sum(snp[,3]), snp[1,3])/choose(sum(snp), sum(snp[1,])) but I keep getting Infs and NaNs. Is there a function that can do this in R? -- Thanks, Jim.
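One way to avoid the Infs and NaNs in the expression above is to work on the log scale with base R's lchoose() and exponentiate only at the end. This is a minimal sketch, not the reply given in the thread: the helper name table_prob and the example counts are made up for illustration.

```r
# Hypothetical helper: probability of the first row of a 2 x k table of
# counts, given the column totals and the first-row total.
# Same formula as in the post, evaluated on the log scale.
table_prob <- function(tab) {
  col_tot <- colSums(tab)
  log_p <- sum(lchoose(col_tot, tab[1, ])) -
    lchoose(sum(tab), sum(tab[1, ]))
  exp(log_p)
}

# Made-up counts, large enough that choose() overflows to Inf
snp <- matrix(c(600, 900, 500,
                700, 800, 400), nrow = 2, byrow = TRUE)

# Direct evaluation: Inf / Inf gives NaN
naive <- prod(choose(colSums(snp), snp[1, ])) /
  choose(sum(snp), sum(snp[1, ]))

# Log-scale evaluation stays finite
p <- table_prob(snp)
```

For small tables the two approaches agree, so the log-scale version can stand in for the direct formula. If the true probability is smaller than the smallest representable double, exp() will underflow to 0, in which case it may be better to keep and report the log probability.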

efficient code. how to reduce running time?

2007 Jan 21

Hi, I am new to R, and even though I've got my code to run and do what it needs to, it is taking forever and I can't use it like this. I was wondering if you could help me find ways to make the code run faster. Here is my code. The data set is a bunch of 0s and 1s in a data.frame. What I am doing is this: I pick a column and make up a new column Y with values associated with that

automating regression or correlations for many variables

2011 Apr 04

Dear All, I have a large data frame with 10 rows and 82 columns. I want to apply the same function to all of the columns with a single command. e.g. zl <- lm (snp$a_109909 ~ snp$lat) will fit a linear model to the values in lat and a_109909. What I want to do is fit linear models for the values in each column against lat. I tried doing zl <- (snp[,2:82] ~ snp$lat[,1]) but got the following
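Fitting the same model to every column can be sketched by building one formula per column name and passing it to lm() inside lapply(). This is only an illustration under assumptions: the data frame below is synthetic, the column a_109910 is invented, and lat is assumed to be an ordinary numeric column of snp.

```r
# Synthetic stand-in for the poster's data: 10 rows, a predictor `lat`
# and two response columns (the second name is made up).
set.seed(1)
snp <- data.frame(lat      = runif(10, 40, 60),
                  a_109909 = rnorm(10),
                  a_109910 = rnorm(10))

# Every column except the predictor is treated as a response
responses <- setdiff(names(snp), "lat")

# reformulate() builds e.g. a_109909 ~ lat from character strings
fits <- lapply(responses, function(nm) {
  lm(reformulate("lat", response = nm), data = snp)
})
names(fits) <- responses

# Collect the slope on lat from each fitted model
slopes <- sapply(fits, function(f) coef(f)[["lat"]])
```

Each element of fits is an ordinary lm object, so summary(fits[["a_109909"]]) works as usual. Using data = snp with bare column names also avoids the snp$ prefix inside the formula.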

Request for help on manipulation large data sets

2012 Jan 26

Dear All, I would like to ask for help on how to read different files automatically and do analysis using scripts. 1. Description of the data 1.1. there are 5 text files, each of which contains cleaned data for the same 100 SNPs. Observations (e.g., position on genome, allele type, ...) for SNPs are rows ordered by the SNP numbers, 1.2. there is 1 text file, containing the expression level of

2 D density plot interpretation and manipulating the data

2020 Oct 09

My understanding is that this represents bivariate normal approximation of the data which uses the kernel density function to test for inclusion within a level set. (please correct me) In order to exclude the outlier to these ellipses/contours is it advisable to do something like this: SNP$density <- get_density(SNP$mean, SNP$var) > summary(SNP$density) Min. 1st Qu. Median Mean 3rd

loop through columns in S4 objects

2011 Nov 24

Dear experts, I am trying to perform an association using snpStats. I have a snp matrix called 'plink' which contains my genotype data (as a list of $genotypes, $map, $fam), and a phenotype data frame which contains the outcomes (outcome1, outcome2,...) I would like to associate with the genotype. My question is, how do I loop through the outcomes? This type of data seems different from

Order a data frame based on the order of another data frame

2012 Mar 05

Hi, I am trying to match the order of the rownames of a dataframe with the rownames of another dataframe (I can't simply sort both sets because I would have to change the order of many other connected datasets if I did that): Also, the second dataset (snp.matrix$fam) is a snp matrix slot: so for example: data_one: x y

R coding to extract allele frequencies from NCBI for ALL alleles of one SNP?

2024 Nov 15

Dear All, The following code extracts from NCBI very nice output for ONE allele of a SNP (often the allele with the second largest frequency - usually termed the minor allele). It gives an average minor allele frequency from all NCBI sources (which is what I want, except I'd like the addition of data for all the other alleles of one SNP) plus a table of minor allele frequencies from each

splitting multiple data in one column into multiple rows with one entry per column

2009 Jul 26

Dear R colleagues, I annotated a list of single nucleotide polymorphisms (SNPs) with the corresponding genes using biomaRt. The result is the following data.frame (pasted from R):
         snp ensembl_gene_id
1  rs8032583
2  rs1071600 ENSG00000101605
3 rs13406898 ENSG00000167165
4  rs7030479

snpStats imputed SNP probabilities

2011 Dec 13

Hi, Does anybody know how to obtain the imputed SNP genotype probabilities from the snpStats package? I am interested in using an imputation method implemented in R to be further used in a simulation study context. I have found the snpStats package that seems to contain suitable functions to do so. As far as I could find out from the package vignette examples and its help, it gives the

permutation test - query

2009 Aug 31

Hi, My query is regarding permutation tests and reshuffling of genotype/phenotype data. I have been using the haplo.stats package in R for haplotype analysis, and I would like to perform an analysis on which I'm requesting your advice. I have a data set of individuals genotyped for 12 SNPs and a dichotomous phenotype. At first, I have tested each of those SNPs independently in order to bypass

problem with the assignment function

2005 Nov 05

Hello, I ran into the weirdest problem I have ever met. I wrote a function "rhopair", which calls a .C function. I cannot assign its value to a variable using either "=" or "<-". After I did the assignment, "rhopair" cannot reproduce the same result as before with the same argument. Here is the code and results: >
