thr3ads.net - similar to: "RSNPper SNPinfo and making it handle a vector"

Displaying 20 results from an estimated 400 matches similar to: "RSNPper SNPinfo and making it handle a vector"

Needing a better solution to a lookup problem.

2012 Mar 14

I have a solution (actually a few) to this problem, but none are computationally efficient enough to be useful. I'm hoping someone can point me to a better solution. I have a data frame of chromosome/position pairs (along with other data for the location). For each pair I need to determine whether it is within a given data frame of ranges. I need to keep only the pairs that are within any of

factor or character

2012 Oct 23

Hi, The program below works very well. (snps = c('rs621782_G', 'rs8087639_G', 'rs8094221_T', 'rs7227515_A', 'rs537202_C')) Selec = todos[ , colnames(todos) %in% snps] head(Selec) But I have a data set with 1.000 columns and I need to extract 70 of them to use (like snps in the command above). These 70 SNPs are in a file. So I create a file to extract them with

SNPRelate: Plink conversion

2013 Nov 08

Hi, Following my earlier posts about having problems performing a PCA, I have worked out what the problem is. The problem lies within the PLINK to GDS conversion. It seems as though the SNPs are imported as "samples" and, in turn, the samples are recognised as SNPs: > snpgdsSummary("chr2L") Some values of snp.position are invalid (should be > 0)! Some values of

Multiple versions of data in a package

2014 Jul 21

Dear R-devel, I am writing for help on how I should include parallel sets of data in my package. Brief summary: I am new to using data within packages. I want a user to be able to specify one of two alternative versions of within-package datasets to use, and I want to load just that one. I have a solution that works, but it doesn't seem as simple as it should be from a user's

piece wise application of functions

2004 Feb 19

Dear all, After struggling for some time with *apply() and eval() without success, I decided to ask for help. I have 3 lists, each of which contains 3 different interpolation functions with identical names: > names(missgp0) [1] "spl.1mb" "spl.2mb" "spl.5mb" > > names(missgp1) [1] "spl.1mb" "spl.2mb" "spl.5mb" > >

Working with massive matrices in R

2011 Apr 18

Hello, I'm (eventually) attempting a singular value decomposition of a 3200 x 527829 matrix in R version 2.10.1. The script is as follows: ###---------Begin Script here-------### library(Matrix) snps <- 527829 ## Number of SNPs N <- 3200 ## Sample size y <- rnorm(N, 100,1) ## simulated phenotype system.time( ## read in matrix

Reshaping genetic data from long to wide

2006 Apr 06

Bottom Line Up Front: How does one reshape genetic data from long to wide? I currently have a lot of data. About 180 individuals (some probands/patients, some parents, rare siblings) and SNP data from 6000 loci on each. The standard formats seem to be something along the lines of Famid, pid, fatid, motid, affected, sex, locus1Allele1, locus1Allele2, locus2Allele1, locus2Allele2, etc In other

Re: Getting SNPS from PLINK to R

2011 Jun 21

I am using PLINK on a large SNP dataset with a .map and a .ped file. I want to get some sort of file, say a list of all the SNPs that PLINK says I have. Any ideas on how to do this? -- Thanks, Jim.

prcomp - surprising structure

2013 Oct 03

Hello, I did a PCA with over 200000 SNPs for 340 observations (ids). If I plot eigenvectors 2, 3 and 4 (called rotation in prcomp), e.g. plot(rotation[,2]), I see a strange "column" in my data (see attachment). I suspect it is an artefact (but of what?). I used prcomp this way: prcomp(mat), where mat is a matrix with the column means already subtracted followed by a

logistic regression weights problem

2005 Apr 13

Hi All, I have a problem with weighted logistic regression. I have a number of SNPs and a case/control scenario, but not all genotypes are as "guaranteed" as others, so I am using weights to downsample the importance of individuals whose genotype has been heavily "inferred". My data is quite big, but with a dummy example: > status <- c(1,1,1,0,0) > SNPs <-

R: sim1000G

2020 Oct 29

Hi, I am using the sim1000G R package to simulate data for a case/control study. I cannot figure out how to modify this code to generate 10% or 50% causal SNPs in R. This is the whole code provided as an example on GitHub: library(sim1000G) vcf_file = "region-chr4-357-ANK2.vcf.gz" #nvariants = 442, ss=1000 vcf = readVCF( vcf_file, maxNumberOfVariants = 442 ,min_maf =

bug in codetools/R CMD check?

2011 Feb 03

Hi Mr Tierney, I have noticed an error message from R 2.12.x's CMD check for a while (apparently Prof. Ripley completely rewrote CMD check in R 2.12+), e.g.: http://bioconductor.org/checkResults/2.7/bioc-LATEST/snpMatrix/lamb2-checksrc.html ---------------- * checking R code for possible problems ... NOTE Warning: non-unique value when setting 'row.names': 'new' Error in

SNP Tables

2011 Jul 27

Hello, I have indicators for the presence or absence of SNPs in columns, and the category (case/control) in another column. I would like to extract ONLY the tables and the indices (SNPs) that give me 2 x 3 tables. Some give 2 x 2 tables when one of the alleles is missing. The data look like the matrix snpmat below: so the first SNP should give me the following table: (aa=0, Aa=1 and AA=2) aa

Speeding up lots of calls to GLM

2012 Mar 12

Dear useRs, First off, sorry about the long post. Figured it's better to give context to get good answers (I hope!). Some time ago I wrote an R function that will get all pairwise interactions of variables in a data frame. This worked fine at the time, but now a colleague would like me to do this with a much larger dataset. They don't know how many variables they are going to have in the

Getting SNPS from PLINK to R

2011 Jun 21

The snpMatrix package is quite nice (see read.plink()).

R package: pbatR

2011 Jul 14

Dear All, Does anybody have experience with the R package pbatR (http://cran.r-project.org/web/packages/pbatR/index.html)? I am trying to use it to analyze family-based case-control data, but the package doesn't work at all on my computer. I contacted the authors of the package, but I haven't heard anything from them. Following the package manual, I tried the simple example as below:

"drop if missing" command?

2010 Feb 12

This will probably seem very simple to experienced R programmers: I am doing a snp association analysis and am at the model-fitting stage. I am using the Stats package's "drop1" with the following code: ##geno is the dataset ## the dependent variable (casectrln) is dichotomous and coded 0,1 ## rs743572_2 is one of the snps (which is coded 0,1,2 for the 3 genotypes)

minor allele frequency comparison

2011 Dec 09

Hi all, We are using two methods to identify SNPs. One is based on resequencing the genome and aligning the reads to the sequenced genome to identify SNPs (data available for 44 individuals). Another is based on SNP array with selected loci (30000 loci, 870 individuals). I want to compare the results from the resequencing based minor allele frequency and Array based minor allele frequency.

reordering huge data file

2008 Jan 21

Dear R-experts, My problem is how to handle a 10GB data file containing genotype data. The file is in a particular format (Illumina final report) and needs to be altered and merged with phenotype data for further analysis. Perl seems to be a frequently used solution for this type of work; however, I am inclined to think it should be doable with R. How do I open a text file, line by line,

mhplot error with test example: "ylim not found"

2010 Jun 23

Hello all, I am trying to make a genome association plot for p-values related to SNPs and was fortunate to find that R contains a package that produces Manhattan plots which is what's preferred for my current project. The function mhtplot() is found in the 'gap' package which I installed in R 2.11.1 on Windows. I thought I'd test out the function first with the examples they
