Displaying 20 results from an estimated 7000 matches similar to: "Needing a better solution to a lookup problem."
2013 Nov 08
1
SNPRelate: Plink conversion
Hi,
Following my earlier posts about having problems performing a PCA, I have
worked out what the problem is. The problem lies within the PLINK to gds
conversion.
It seems as though the SNPs are imported as "samples" and in turn, the
samples are recognised as SNPs:
> snpgdsSummary("chr2L")
Some values of snp.position are invalid (should be > 0)!
Some values of
2011 Dec 09
1
minor allele frequency comparison
Hi all,
We are using two methods to identify SNPs. One is based on resequencing
the genome and aligning the reads to the sequenced genome to identify SNPs
(data available for 44 individuals). The other is based on a SNP array with
selected loci (30000 loci, 870 individuals). I want to compare the minor
allele frequencies obtained from the resequencing-based and the array-based
methods.
2020 Oct 29
1
R: sim1000G
Hi,
I am using the sim1000G R package to simulate data for case/control study.
I cannot figure out how to manipulate this code to be able to generate 10%
or 50% causal SNPs in R.
This is the whole code provided as an example on GitHub:
library(sim1000G)
vcf_file = "region-chr4-357-ANK2.vcf.gz" #nvariants = 442, ss=1000
vcf = readVCF( vcf_file, maxNumberOfVariants = 442 ,min_maf =
2007 Feb 05
3
RSNPper SNPinfo and making it handle a vector
If I run an analysis which generates statistical tests on many SNPs I would
naturally want to get more details on the most significant SNPs. Directly
from within R one can get the information by loading RSNPper (from
Bioconductor) and simply issuing a command SNPinfo(2073285). Unfortunately,
the command cannot handle a vector and therefore only wants to do one at a
time.
I tried the lapply and
2013 Oct 03
1
prcomp - surprising structure
Hello,
I did a PCA with over 200000 SNPs for 340 observations (ids). If I plot the
eigenvectors (called rotation in prcomp) 2, 3 and 4 (e.g.
plot(rotation[,2])) I see a strange "column" in my data (see attachment). I
suspect it is an artefact (but of what?).
Suggestion:
I used prcomp this way: prcomp(mat), where mat is a matrix with the column
means already subtracted followed by a
2010 Jun 23
1
mhtplot error with test example: "ylim not found"
Hello all,
I am trying to make a genome association plot for p-values related to SNPs
and was fortunate to find that R contains a package that produces Manhattan
plots which is what's preferred for my current project. The function
mhtplot() is found in the 'gap' package which I installed in R 2.11.1 on
Windows. I thought I'd test out the function first with the examples they
2006 Apr 06
4
Reshaping genetic data from long to wide
Bottom Line Up Front: How does one reshape genetic data from long to wide?
I currently have a lot of data. About 180 individuals (some
probands/patients, some parents, rare siblings) and SNP data from 6000 loci
on each. The standard formats seem to be something along the lines of Famid,
pid, fatid, motid, affected, sex, locus1Allele1, locus1Allele2,
locus2Allele1, locus2Allele2, etc
In other
2012 Sep 26
3
replace string values with numbers
Hi everyone, I have a data frame Gene with SNPs, e.g.
P1 P2 P3
CG CG GG
-- -- AC
-- AC CC
AC -- AC
I tried to replace all the GG with a value 3.
Gene[Gene=="GG"]<-3
It always gives me:
Warning in `[<-.factor`(`*tmp*`, thisvar, value = 3) :
invalid factor level, NAs generated
Does anyone know if there is anything wrong with my code? Thanks, Zhengyu
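The warning above is what R reports when a value that is not an existing level is assigned into a factor column. A minimal sketch of one possible workaround follows, assuming the columns were read in as factors; the data frame is a hypothetical reconstruction of the rows shown in the post, not the poster's actual data.

```r
# Hypothetical reconstruction of the data frame from the post.
# stringsAsFactors = TRUE mimics the default of R versions before 4.0.
Gene <- data.frame(
  P1 = c("CG", "--", "--", "AC"),
  P2 = c("CG", "--", "AC", "--"),
  P3 = c("GG", "AC", "CC", "AC"),
  stringsAsFactors = TRUE
)

# Convert every factor column to character so new values can be assigned.
Gene[] <- lapply(Gene, as.character)

# Logical-matrix indexing now replaces "GG" without the factor-level warning.
Gene[Gene == "GG"] <- 3

print(Gene)
```

Because the affected column still holds other genotype strings, the replacement is stored as the character "3" rather than the number 3. A fully numeric column would require recoding every genotype in that column.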
2009 Oct 12
2
R commander.
I have two RData files. I want to print them to check the format of the
tables in these files. I can load both files and read them as well:
> load('ann.RData')
> str(ann)
List of 4
$ Name : chr [1:561466] "rs3094315" "rs12562034" "rs3934834"
"rs9442372" ...
$ Position : int [1:561466] 742429 758311 995669 1008567 1011278 1011521
2007 Oct 30
0
Plotting question: how to plot SNP location data?
Hello,
I would like to plot specific SNPs with their exact locations on a
chromosome. Based on my genotyping results I would like to separate
these SNPs in three different categories: 1, 2 and 3 and use different
colours to represent these categories. The script below generates the
sample data. I can plot these with the image function using the
following:
val <- 1:3
samp <- sample(val,
2012 Oct 23
1
factor or character
Hi,
The program below works very well.
(snps = c('rs621782_G', 'rs8087639_G', 'rs8094221_T', 'rs7227515_A',
'rs537202_C'))
Selec = todos[ , colnames(todos) %in% snps]
head(Selec)
But I have a data set with 1,000 columns and I need to extract 70 to use
(like snps in the command above).
These 70 SNPs are in a file. So I create a file to extract them with
2011 Aug 10
2
Loops for repetitive task
Hello,
I have an R script that I use as a template to perform a task for multiple
files (in this case, multiple chromosomes).
What I would like to do is to utilize a simple loop to parse through each
chromosome number so that I don't have to type the same code over and over
again in the R console.
I've tried using:
for(i in 1:22){
etc..
}
and replacing each chromosome number with
2011 Oct 25
4
comparing two tables
Hi everybody,
I would like to know whether it is possible to compare two tables for certain
parameters.
I have these two tables:
gene table
name chr start end str accession Length
gen1 4 646752 646838 + MI0005806 86
gen12 2L 243035 243141 - MI0005821 106
gen3 2L 159838 159928 + MI0005813 90
gen7 2L
2018 Mar 05
2
Help with apply and new column?
Thanks. I think nabble is good for programming questions. Bear with me if I'm incorrect.
Data: Genomics SNP information
Goal: I need to add Chromosome and SNP position to the data frame I'm using through apply.
I'd like to add new column from text processed through apply function.
For example: 10:60523:T:G (Column 2)
CHR: 10
Position: 60523
Dataset:
chr rs ps n_miss allele1
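One way to approach the goal stated above, sketched under assumptions: the identifiers are taken to follow the CHR:POS:REF:ALT pattern of the example "10:60523:T:G" and to sit in the column named rs. The rows below are hypothetical, and `dat`, `parts`, `CHR` and `Position` are illustrative names only.

```r
# Hypothetical example rows; only the identifier format (CHR:POS:REF:ALT)
# and the column name "rs" are taken from the post.
dat <- data.frame(
  rs = c("10:60523:T:G", "10:61331:A:G", "2:12345:C:T"),
  stringsAsFactors = FALSE
)

# Split each identifier on ":". strsplit is vectorised, so apply() is not needed.
parts <- strsplit(dat$rs, ":", fixed = TRUE)

# First field is the chromosome, second field is the position.
dat$CHR <- vapply(parts, `[`, character(1), 1)
dat$Position <- as.integer(vapply(parts, `[`, character(1), 2))

print(dat)
```

The chromosome is kept as character in this sketch so that labels such as X or 2L would survive; the position is converted to integer.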
2010 Feb 10
2
How to create probeAnno object?
Hi,
I want to use the segChrom() method in the tilingArray package. For that I need to create a probeAnno object. I could not find much info with ?probeAnno. I need help in creating a probeAnno object.
Snapshot of the file (.txt):
chr1 2500014 2500038 + 0.232689943122845
chr1 2500039 2500063 + 2.60502410304227
chr1 2500062 2500086 + 0.0756595313279895
chr1 2500080 2500104 + 0.78574617788405
chr1
2004 Aug 06
1
questions related to plotting in R
Dear all.
I need to draw a scatter plot of 23 chromosome copy numbers (y axes) against
chromosome and physical location within each chromosome in one plot. The
data matrix looks as below:
chr location copy_num
1 118345 1.320118
1 3776202 1.133879
1 4798845 0.989997
1 5350951 1.100967
. more data here
.
.
2 118345 2.459119
2 157739 1.915919
2 1530065 1.924372
2
2014 Jul 21
1
Multiple versions of data in a package
Dear R-devel,
I am writing for help on how I should include parallel sets of data in
my package.
Brief summary: I am new to using data within packages. I want a user to
be able to specify one of two alternative versions of within-package
datasets to use, and I want to load just that one. I have a solution
that works, but it doesn't seem as simple as it should be from a user's
2009 Nov 27
1
problem tick marker and text
Hi R-ers,
I am struggling with my x-axis in an association plot. What I would like is
to place the labels of the x-axis between the tick markers; normally the
labels are printed at the place where the tick marker is placed. I don't
want to move the tick marker (it gives the switch between one chromosome and
the next) but I just want to put the chromosome number in between the
2011 Apr 18
2
Working with massive matrices in R
Hello,
I'm (eventually) attempting a singular value decomposition of a 3200 x
527829 matrix in R version 2.10.1. The script is as follows:
###---------Begin Script here-------###
library(Matrix)
snps <- 527829 ## Number of SNPs
N <- 3200 ## Sample size
y <- rnorm(N, 100,1) ## simulated phenotype
system.time(
## read in matrix
2011 Oct 17
2
Histogram for each ID value
I have a dataframe in the general format:
chr1 0.5
chr1 0
chr1 0.75
chr2 0
chr2 0
chr3 1
chr3 1
chr3 0.5
chr7 0.75
chr9 1
chr9 1
chr22 0.5
chr22 0.5
where the first column is the chromosome location and the second column is
some value. What I'd like to do is have a histogram created for each chr
location (i.e. a separate histogram for chr1, chr2, chr3, chr7, chr9, and
chr22). I am just