search for: snp

Displaying 20 results from an estimated 238 matches for "snp".

2006 Dec 18
1
Memory problem on a linux cluster using a large data set
...here a way to change the settings or processor under R? I want to run the function Random Forest on my large data set; it should be able to cope with that amount. Perhaps someone has tried this before in R, or is Fortran a better choice? I added my R script down below. Best regards, Iris Kolder SNP <- read.table("file.txt", header=FALSE, sep="") # read in data file SNP[SNP==9]<-NA # change missing values from a 9 to a NA SNP$total.NAs = rowSums(is.na(SN # calculate the number of NA per row and adds a column with total Na...
2006 Apr 26
2
help in R
Hi, I can't understand where I am going wrong. Below is my code. I would really appreciate your help. Thanks. > genfile<-read.table("c:/tina/phd/bs871/hw/genfile.txt",skip=1) > > #read in SNP data > snp.dat <- as.matrix(genfile) > snp.name <- scan("c:/tina/phd/bs871/hw/genfile.txt",nline=1,what="character") Read 100 items > n.snp <- length(snp.name) > n.id <- 1 #number of fields for ids, sex and affection status > > ###form gntp using t...
2011 Oct 27
3
2 x 3 Probability under the null
I have a 2 x 3 matrix called snp and I want to compute the following probability: choose(sum(snp[,1]), snp[1,1]) * choose(sum(snp[,2]), snp[1,2]) * choose(sum(snp[,3]), snp[1,3])/choose(sum(snp), sum(snp[1,])) but I keep getting Infs and NaNs. Is there a function that can do this in R? -- Thanks, Jim. [[alternative HTML ve...
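One common way around the Infs and NaNs in the question above is to work on the log scale with lchoose() and exponentiate only at the end. The following is a minimal sketch, not the answer given in the thread; the counts in snp are made up for illustration.

```r
# Hypothetical 2 x 3 table of counts (invented values, filled column by column)
snp <- matrix(c(300, 200, 450, 500, 250, 300), nrow = 2)

# log of choose(sum(snp[,j]), snp[1,j]) for each column j, summed,
# minus log of choose(sum(snp), sum(snp[1,]))
log_p <- sum(lchoose(colSums(snp), snp[1, ])) -
  lchoose(sum(snp), sum(snp[1, ]))

# Back-transform only at the end; the individual binomial coefficients
# are never formed, so they cannot overflow
p <- exp(log_p)
```

For very unlikely tables exp(log_p) can still underflow to 0, in which case log_p itself is the more useful quantity to report.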
2008 Feb 28
1
Errors melt()ing data...
Hi, I'm trying to melt() some data for subsequent cast()ing and am encountering errors. The overall process requires a couple of cast()s and melt()s. ########Start Session 1########## ## I have the data in a (fully) melted format and can cast it fine... > norm1[1:10,] Pool SNP Sample.Name variable value 1 1 rs1045485 CA0092 Height.1 0.003488853 2 1 rs1045485 CA0142 Height.2 0.333274200 3 1 rs1045485 CO0007 Height.2 0.396250961 4 1 rs1045485 CA0047 Height.2 0.535686831 5 1 rs1045485 CO0149 Height.2 0.296611673 6 1 rs1...
2007 Jan 21
2
efficient code. how to reduce running time?
...es ) { a <- anova(lm(newY~factor(newX[,i]))); F[i] <- a$`F value`[1]; } MSSid <- which (F == max(F)); # index of MSS (Most Significant Site) maxF = cbind(maxF,max(F)); } maxF; } # set the output file sink("/tmp/R.out.3932.100") # load the dataset snp = read.table(file("/tmp/msoutput.3932.100")) #print (snp); # pi: desired proportion of variation due to QTN pi = 0.05; print (paste("pi:", pi)); MAF = 0.05; print (paste("MAF:", MAF)); # S: number of segregating sites S = length(snp[1,]); # N: number of samples N = le...
2007 Oct 30
0
Plotting question: how to plot SNP location data?
Hello, I would like to plot specific SNPs with their exact locations on a chromosome. Based on my genotyping results I would like to separate these SNPs in three different categories: 1, 2 and 3 and use different colours to represent these categories. The script below generates the sample data. I can plot these with the image function usi...
2007 Jan 10
1
Fw: Memory problem on a linux cluster using a large data set [Broadcast]
...more sensitive to big-data issues and tracking down > unnecessary memory copying. > > > "cannot allocate vector size 1240 kb". I've searched through > > use traceback() or options(error=recover) to figure out where > this is actually occurring. > > > SNP <- read.table("file.txt", header=FALSE, sep="") # > read in data file > > This makes a data.frame, and data frames have several aspects > (e.g., automatic creation of row names on sub-setting) that > can be problematic in terms of memory use. Probably be...
2013 Jan 03
4
Index out SNP position
Dear R experts, I have 2 matrices: A & B. I am trying to index B against A - (1) find the B rows that fall between col 1 and col 2 of A & put them into a new vector SNP. I wrote the code below, but I cannot think of the right way to do it. Could anyone help me with the code? Thanks, Jiang ---- A <- matrix(c(35838396,35838674,36003908,36004090,36150188,36151202,35838584,35838674,36003908,36003992), ncol = 2) B <- matrix(c(36003918,35838399,35838589,36262559),ncol...
2006 Dec 21
1
Memory problem on a linux cluster using a large data set [Broadcast]
...more sensitive to big-data issues and tracking down > unnecessary memory copying. > > > "cannot allocate vector size 1240 kb". I've searched through > > use traceback() or options(error=recover) to figure out where > this is actually occurring. > > > SNP <- read.table("file.txt", header=FALSE, sep="") # > read in data file > > This makes a data.frame, and data frames have several aspects > (e.g., automatic creation of row names on sub-setting) that > can be problematic in terms of memory use. Probably be...
2009 Jul 26
1
splitting multiple data in one column into multiple rows with one entry per column
Dear R colleagues, I annotated a list of single nucleotide polymorphisms (SNP) with the corresponding genes using biomaRt. The result is the following data.frame (pasted from R): snp ensembl_gene_id 1 rs8032583 2 rs1071600 ENSG00000101605 3 rs13406898 ENSG000001671...
2011 Jan 03
0
Using PCA to correct p-values from snpMatrix
Hi R-help folks, I have been doing some single SNP association work using snpMatrix. This works well, but produces a lot of false positives, because of population structure in my data. I would like to correct the p-values (which snpMatrix gives me) for population structure, possibly using principal component analysis (PCA). My data is complica...
2011 Apr 04
1
automating regression or correlations for many variables
Dear All, I have a large data frame with 10 rows and 82 columns. I want to apply the same function to all of the columns with a single command. e.g. zl <- lm (snp$a_109909 ~ snp$lat) will fit a linear model to the values in lat and a_109909. What I want to do is fit linear models for the values in each column against lat. I tried doing zl <- (snp[,2:82] ~ snp$lat[,1]) but got the following error message "Error in model.frame.default(formula = snp[,...
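One way to fit the same model to every column, as asked above, is to build each formula from the column name and loop with lapply(). This is a sketch under assumptions: the data frame below is a small invented stand-in with a lat column and three response columns, and only a_109909 and lat are names taken from the post.

```r
set.seed(1)

# Hypothetical stand-in for the poster's data frame: lat plus response columns
snp <- data.frame(
  lat      = runif(10, 40, 60),
  a_109909 = rnorm(10),
  a_2      = rnorm(10),  # invented column name
  a_3      = rnorm(10)   # invented column name
)

# Every column except lat is treated as a response
response_cols <- setdiff(names(snp), "lat")

# reformulate("lat", response = col) builds the formula  col ~ lat
fits <- lapply(response_cols, function(col) {
  lm(reformulate("lat", response = col), data = snp)
})
names(fits) <- response_cols

# Collect the slope on lat from each fitted model
slopes <- sapply(fits, function(fit) coef(fit)[["lat"]])
```

Keeping the fitted models in a named list means summary(fits[["a_109909"]]) still works for any single column, while sapply() pulls out one statistic across all of them.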
2013 Jan 04
0
FW: Index out SNP position
...ect. That way you are not modifying B. This will be faster than checking the order of the columns in A each time you process a line from B. > Ax <- t(apply(A, 1, function(x) c(min(x), max(x)))) > indx <- sapply(1:nrow(B), function(i) any(B[i]>Ax[,1] & B[i]<Ax[,2])) > SNP <- B[indx] > SNP [1] 36003918 35838399 35838589 -------------------- David C > From: JiangZhengyu [mailto:zhyjiang2006 at hotmail.com] > Sent: Friday, January 04, 2013 9:03 AM > To: dcarlson at tamu.edu > Subject: RE: [R] Index out SNP position > > Hi David, > >...
2013 Nov 08
1
SNPRelate: Plink conversion
Hi, Following my earlier posts about having problems performing a PCA, I have worked out what the problem is. The problem lies within the PLINK to gds conversion. It seems as though the SNPs are imported as "samples" and in turn, the samples are recognised as SNPs: >snpsgdsSummary("chr2L") Some values of snp.position are invalid (should be > 0)! Some values of snp.chromosome are invalid (should be finite and >=1)! Some of snp.allele are not standard! E.g,...
2012 Mar 05
1
Order a data frame based on the order of another data frame
Hi, I am trying to match the order of the rownames of a dataframe with the rownames of another dataframe (I can't simply sort both sets because I would have to change the order of many other connected datasets if I did that): Also, the second dataset (snp.matrix$fam) is a snp matrix slot: so for example: data_one: x y z sample_1110001 -0.3352623 -1.141462 -0.4032494 sample_1110005 0.1862424 0.015944 0.1329059 sample_1110420 0.1309120 0.004005596...
2009 Oct 22
2
Replacing multiple elements in a vector !
Hi, I have a vector with elements rs.id=c('rs100','rs101','rs102','rs103') And a dataframe 'snp.id' 1 SNP_100 rs100 2 SNP_101 rs101 3 SNP_102 rs102 4 SNP_103 rs103 Task is to replace rs.id vector with corresponding 'SNP_' ids in snp.id. Thanks in a...
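The replacement task above is a lookup, which match() handles directly. A minimal sketch follows; the column names snp and rs are assumptions, since the excerpt shows the data frame without headers.

```r
rs.id <- c("rs100", "rs101", "rs102", "rs103")

# Lookup table from the post; the column names snp and rs are assumed
snp.id <- data.frame(
  snp = c("SNP_100", "SNP_101", "SNP_102", "SNP_103"),
  rs  = c("rs100", "rs101", "rs102", "rs103"),
  stringsAsFactors = FALSE
)

# match() gives, for each element of rs.id, its row position in snp.id$rs;
# indexing snp.id$snp by those positions returns the corresponding SNP_ ids
new.id <- snp.id$snp[match(rs.id, snp.id$rs)]
```

Any element of rs.id that has no entry in the lookup table comes back as NA, which makes unmatched ids easy to spot.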
2020 Oct 09
0
2 D density plot interpretation and manipulating the data
Hi Abby, Thanks for getting back to me, yes I believe I did that by doing this: SNP$density <- get_density(SNP$mean, SNP$var) > summary(SNP$density) Min. 1st Qu. Median Mean 3rd Qu. Max. 0 383 696 738 1170 1789 where get_density() is function from here: https://slowkow.com/notes/ggplot2-color-by-density/ and keep only entries with density...
2020 Oct 09
3
2 D density plot interpretation and manipulating the data
...that from the plot I provided? Would outliers be > >> outside of ellipses? If so how do I extract those from my data frame, > >> based on which parameter? > >> > >> So I am trying to connect outliers based on what the plot is showing: > >> s <- ggplot(SNP, mapping = aes(x = mean, y = var)) > >> s <- s + geom_density_2d() + geom_point() + my.theme + ggtitle("SNPs") > >> > >> versus what is in the data: > >> > >> > head(SNP) > >> mean var sd > >> FQ...
2008 Feb 07
1
Problems reshaping data with cast()
Hi, I'm trying to cast() some data, but keep on getting the following error... > norm.all.melted.height <- transform(all.melted.height, + norm.height = value / ave(value, SNP, Pool, FUN = max) + ) Warning messages: 1: In FUN(X[[147L]], ...) : no non-missing arguments to max; returning -Inf 2: In FUN(X[[147L]], ...) : no non-missing arguments to max; returning -Inf 3: In FUN(X[[147L]], ...) : no non-missing arguments to max; retu...
2005 Apr 05
2
cat bailing out in a for loop
Dear All, I am trying to calculate the Hardy-Weinberg Equilibrium p-value for 42 SNPs. I am using the function HWE.exact from the package "genetics". In order not to do a lot of coding "by hand", I have a for loop that goes through each column (each column is one SNP) and gives me the p.value for HWE.exact. Unfortunately some SNPs have reached fixation and HWE.e...