Displaying 20 results from an estimated 4000 matches similar to: "replace string values with numbers"
2013 Feb 21
3
Ask for help: find corresponding elements between matrices
Dear R experts,
I have two matrices (seq & mat), and I want to collect in a new matrix all the numbers from mat whose entry at the same row/column position in seq equals 1, or all the numbers in mat whose entry in seq equals -1. Any number whose entry in seq is not 1/-1 should be replaced with NA. There are some "NA"s in seq.
seq=matrix(c(1,-1,0,1,1,-1,0,0,-1,1,1,NA),3,4)
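A minimal sketch of one way to do this with ifelse(). The seq matrix is the one from the post; mat is cut off in the snippet, so the mat below is made up purely for illustration.

```r
# seq is copied from the post; mat is a hypothetical stand-in
seq <- matrix(c(1, -1, 0, 1, 1, -1, 0, 0, -1, 1, 1, NA), 3, 4)
mat <- matrix(1:12, 3, 4)

# Keep the value of mat where seq is 1, NA everywhere else.
# An NA in seq makes the comparison NA, so the result is NA there too.
pos <- ifelse(seq == 1, mat, NA)

# Same idea for the positions where seq is -1
neg <- ifelse(seq == -1, mat, NA)

print(pos)
print(neg)
```

ifelse() keeps the dimensions of its test argument, so pos and neg are 3 x 4 matrices like seq.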
2012 Oct 02
5
smoothScatter plot
Hi, I want to make a plot similar to sm1 (attached). The code I tried is:
dcols <- densCols(x,y)
smoothScatter(x,y, col = dcols, pch=20,xlab="A",ylab="B")
abline(h=0, col="red")
But it turned out like s1 (attached), with big dots. I was wondering if anything is wrong with my code. Thanks, Zhengyu
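One possible cause, going by the smoothScatter() help page: its col and pch arguments style the nrpoints points that are superimposed on the density image, so pch=20 would turn those into large dots. The sketch below uses made-up x and y (the poster's data are not shown) and shows two alternatives, not necessarily what the attached sm1 was made with.

```r
set.seed(1)
x <- rnorm(5000)          # hypothetical data
y <- x + rnorm(5000)

# Option 1: smoothed density only, with no individual points on top
smoothScatter(x, y, nrpoints = 0, xlab = "A", ylab = "B")
abline(h = 0, col = "red")

# Option 2: an ordinary scatter plot with each point colored by local density
dcols <- densCols(x, y)
plot(x, y, col = dcols, pch = 20, cex = 0.5, xlab = "A", ylab = "B")
abline(h = 0, col = "red")
```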
2013 Jan 03
4
Index out SNP position
Dear R experts,
I have 2 matrices: A & B. I am trying to index B against A: (1) find the B rows that fall between columns 1 and 2 of A & put them into a new vector SNP. I wrote the code below, but I cannot think of the right way to do it. Could anyone help me with the code? Thanks, Jiang
A <-
2005 Oct 28
3
replacing a factor value in a data frame
Hi All,
I have the following problem, that's driving me mad.
I have a dataframe of factors, from a genetic scan of SNPs. I DO have
NAs in the dataframe, which would look like:
V4 V5 V6 V7 V8 V9 V10
1 TT GG TT AC AG AG TT
2 AT CC TT AA AA AA TT
3 AT CC TT AC AA <NA> TT
4 TT CC TT AA AA AA TT
5 AT CG TT CC AA AA TT
6 TT CC TT AA AA AA TT
7 AT CC
2006 Jun 30
3
data extraction
Dear mailing list, I have a data set that has 20,000 rows and 20 columns. I
wanted to extract every 10th row only, for example the 10th, 20th, 30th, 40th ... 20000th.
Can you please help me with how to do that? Thank you.
An example is below.
Input:
AG GG GG AG
CC CC CC CC
CT CC CT CT
GG GG GG GG
CC CC CC CC
GG GG GG GG
CC CC CC CC
GG CG CG GG
GG GG GG GG
*CC CC CC CC*
AA AG AG AA
AA AA AA AA
GG AG AG GG
GG AG AG
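A minimal sketch of picking every 10th row by index. The data frame geno below is randomly generated to match the stated size (20,000 rows, 20 columns); its name and contents are assumptions.

```r
# Hypothetical data: 20,000 rows x 20 columns of genotype strings
set.seed(1)
geno <- as.data.frame(
  matrix(sample(c("AA", "AG", "GG", "CC"), 20000 * 20, replace = TRUE),
         nrow = 20000),
  stringsAsFactors = FALSE
)

# Row indices 10, 20, 30, ..., 20000
idx <- seq(10, nrow(geno), by = 10)
every10 <- geno[idx, ]

nrow(every10)
```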
2006 Jun 17
2
managing data
Dear mailing list, would someone be kind enough to help me solve the following problem?
I am trying to write code that will combine two tables "x" and "y". The
first columns of both tables are unique identifiers for the rows. The
first column of table "x" is a subset of the first column of "y". I need to
find the matching rows in both tables by looking on their
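A small sketch of joining two tables on a shared ID column with merge(). The column names (id, trait, geno) and the values are made up for illustration.

```r
# Hypothetical tables: the ids in x are a subset of the ids in y
x <- data.frame(id = c("s2", "s4"), trait = c(1.5, 2.7))
y <- data.frame(id = c("s1", "s2", "s3", "s4"),
                geno = c("AA", "AG", "GG", "AG"))

# Keep only the rows whose id appears in both tables
both <- merge(x, y, by = "id")
print(both)
```

By default merge() keeps only matching rows; all.x = TRUE or all.y = TRUE would keep unmatched rows from one side as well.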
2010 Sep 10
4
Counting occurrences of a letter by a factor
I'm trying to find a more elegant way of doing this. What I'm trying to accomplish is to count the frequency of letters (major / minor alleles) in a string grouped by the factor levels in another column of my data frame.
Ex.
> DF<-data.frame(c("CC", "CC", NA, "CG", "GG", "GC"), c("L", "U", "L",
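One possible approach, sketched on made-up data because the DF in the post is cut off: split each genotype string into letters, repeat the group label once per letter, and cross-tabulate. The column names geno and grp are assumptions.

```r
# Hypothetical data, not the poster's DF
DF <- data.frame(geno = c("CC", "CC", NA, "CG", "GG", "GC"),
                 grp  = c("L", "U", "L", "U", "L", "U"),
                 stringsAsFactors = FALSE)

ok <- !is.na(DF$geno)                                   # skip missing genotypes
alleles <- unlist(strsplit(DF$geno[ok], ""))            # one element per letter
groups  <- rep(DF$grp[ok], times = nchar(DF$geno[ok]))  # group label per letter

counts <- table(groups, alleles)
print(counts)
```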
2006 May 22
1
editing a big file
I have a file that has 90 columns and 20,000 rows and looks like
C/G CC GG CG G/T GG TT GT C/T CC TT CT A/G AA GG AG A/C AA CC AC A/T AA
TT AT
I want to write code that will read through each row: it first looks
at the first column and then replaces the following three columns, with 12 if
the entry is the same as the first column (e.g. third column), 11 if it is a
repeat of the first letter, like the
2018 Mar 15
3
stats 'dist' euclidean distance calculation
Hello,
I am working with a matrix of multilocus genotypes for ~180 individual snail samples, with substantial missing data. I am trying to calculate the pairwise genetic distance between individuals using the stats package 'dist' function, using euclidean distance. I took a subset of this dataset (3 samples x 3 loci) to test how euclidean distance is calculated:
3x3 subset used
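A small check of how dist() treats missing values, using a made-up 3 x 3 matrix (the poster's subset is not shown). Per the dist() help page, pairs with an NA are dropped for that comparison and the sum of squares is scaled up by (number of columns) / (number of columns used).

```r
# Hypothetical genotype codes: 3 individuals x 3 loci, one missing value
g <- rbind(ind1 = c(0, 1, NA),
           ind2 = c(1, 2, 2),
           ind3 = c(0, 0, 1))

d <- dist(g, method = "euclidean")

# ind1 vs ind2 by hand: only loci 1 and 2 are usable
ss <- (0 - 1)^2 + (1 - 2)^2        # sum of squared differences over 2 loci
manual <- sqrt(ss * 3 / 2)         # rescaled to all 3 loci

print(as.matrix(d)["ind1", "ind2"])
print(manual)
```

So with substantial missing data, distances are extrapolated from the loci that both individuals share rather than computed on the raw sums.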
2009 Nov 11
1
loop through variable names
Often I perform the same task on a series of variables in a dataframe, by looping through a character vector that holds the names and using paste(), eval(), and parse() inside the loop.
For instance:
thesevars<-names(environmental)
environmental$ToyOutcome<-rnorm(nrow(environmental))
tableOfResults<-data.frame(var=thesevars)
tableOfResults$Beta<- NA
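A sketch of the same kind of loop without paste(), eval(), and parse(), building each formula with reformulate() instead. The data frame env is a made-up stand-in for environmental, and lm() is assumed as the model since the post's loop body is not shown.

```r
set.seed(1)
# Hypothetical stand-in for the 'environmental' data frame
env <- data.frame(ozone = rnorm(50), temp = rnorm(50), wind = rnorm(50))

thesevars <- names(env)
env$ToyOutcome <- rnorm(nrow(env))
tableOfResults <- data.frame(var = thesevars, Beta = NA_real_)

for (i in seq_along(thesevars)) {
  # Builds the formula ToyOutcome ~ <variable> from a character string
  f <- reformulate(thesevars[i], response = "ToyOutcome")
  fit <- lm(f, data = env)
  tableOfResults$Beta[i] <- coef(fit)[[thesevars[i]]]
}

print(tableOfResults)
```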
2006 Jun 05
3
Fastest way to do HWE.exact test on 100K SNP data?
Hi everyone,
I'm using the function 'HWE.exact' of 'genetics' package to compute p-values of
the HWE test. My data set consists of ~600 subjects (cases and controls) typed
at ~ 10K SNP markers; the test is applied separately to cases and controls. The
genotypes are stored in a list of 'genotype' objects, all.geno, and p-values are
calculated inside the loop over all
2002 Sep 09
1
getting variable names into formulas
Hello,
I have a dataframe with several hundred variables. I would like to
explore updates of some baseline lme fit by including each of some
subset of these variables, one at a time. For various reasons it is
inconvenient to rely on the positions of the numbered columns in the
dataframe. Here is what I want to do:
mod.baseline<-lme(fixed=foo,data=dat,random=bar)
for(thisvar in vars){
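A minimal sketch of adding one named variable at a time with update(), pasting the name into a formula. lm() stands in for lme() here to keep the example self-contained, and dat, vars, and the column names are made up; whether the same call carries over unchanged to an lme fit is not checked here.

```r
set.seed(1)
# Hypothetical data and variable names
dat <- data.frame(y = rnorm(40), base = rnorm(40),
                  v1 = rnorm(40), v2 = rnorm(40))
vars <- c("v1", "v2")

mod.baseline <- lm(y ~ base, data = dat)   # stand-in for the lme baseline

fits <- list()
for (thisvar in vars) {
  # ". ~ . + v1" means: keep the baseline formula and add one term
  new_formula <- as.formula(paste(". ~ . +", thisvar))
  fits[[thisvar]] <- update(mod.baseline, new_formula)
}

lapply(fits, coef)
```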
2010 Feb 12
1
"drop if missing" command?
This will probably seem very simple to experienced R programmers:
I am doing a snp association analysis and am at the model-fitting stage. I
am using the Stats package's "drop1" with the following code:
##geno is the dataset
## the dependent variable (casectrln) is dichotomous and coded 0,1
## rs743572_2 is one of the snps (which is coded 0,1,2 for the 3 genotypes)
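If the aim is to drop rows with missing values before fitting, one option is complete.cases() on the variables used in the model, so that every model drop1() compares is fitted to the same rows. The sketch below simulates geno; casectrln and rs743572_2 are the names from the post, while age and all values are made up.

```r
set.seed(1)
n <- 200
# Simulated data; 'age' is a hypothetical covariate
geno <- data.frame(casectrln  = rbinom(n, 1, 0.5),
                   rs743572_2 = sample(0:2, n, replace = TRUE),
                   age        = rnorm(n, 50, 10))
geno$rs743572_2[sample(n, 15)] <- NA     # introduce some missing genotypes

# Keep only rows that are complete for the model variables
vars <- c("casectrln", "rs743572_2", "age")
geno_cc <- geno[complete.cases(geno[, vars]), ]

fit <- glm(casectrln ~ rs743572_2 + age, family = binomial, data = geno_cc)
res <- drop1(fit, test = "Chisq")
print(res)
```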
2013 Jan 04
0
FW: Index out SNP position
I think you mean between column 1 and 2 of A? Why is 36003918 not
included? It is clearly between 35838396 and 36151202 in the first row of A.
My earlier solution should work fine. Just create a new matrix AX that has
the columns switched so that the start is always column 1 and use that to
identify the ones you want to select. That way you are not modifying B. This
will be faster than checking
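A sketch of the suggestion above: build AX so that the smaller coordinate is always in column 1, then test each position against every interval. Only the first row of A and the first position come from the thread; the other values are made up, and B is simplified to a plain vector of positions.

```r
# First row is from the thread; the second row is hypothetical and reversed
A <- rbind(c(35838396, 36151202),
           c(40200000, 40100000))
# First position is from the thread; the others are made up
B <- c(36003918, 39000000, 40150000)

# AX: start always in column 1, end always in column 2; A and B are untouched
AX <- cbind(pmin(A[, 1], A[, 2]), pmax(A[, 1], A[, 2]))

# TRUE for each position that falls inside at least one interval
inside <- sapply(B, function(p) any(p >= AX[, 1] & p <= AX[, 2]))
SNP <- B[inside]
print(SNP)
```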
2012 Jan 16
1
rho stat from a fasta sequence file
Hi all,
I have a sequence file (FASTA format) and want to calculate the rho
statistic for dinucleotide abundance on my data. The code which I
use is (using the seqinr library and the current working directory)
seq_info<-read.fasta("gene.txt")
rho(seq_info[1],2)
but it yields only the dinucleotides, not their rho values, i.e.,
> rho(seq_info[1],2)
aa ac ag at ca cc cg ct ga gc
2010 Nov 09
5
Question regarding replacing <NA>
Dear r-users,
Basically, I have data as follows,
> data
S s1 s2 s3 s4 s5 prob obs num.strata
1 NNNNN N N N N N 0.0000108 32 <NA>
2 NNNNY N N N N Y 0.0005292 16 <NA>
3 NNNYN N N N Y N 0.0005292 24 <NA>
4 NNNYY N N N Y Y 0.0259308 8 1
....
I want to replace <NA> with 0. When I tried the following
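The <NA> display suggests that num.strata may be a factor or character column, in which case assigning 0 directly can fail or warn. A minimal sketch under that assumption, on a cut-down copy of the data shown above:

```r
# Cut-down, hypothetical copy of the data; num.strata is assumed to be a factor
data <- data.frame(S = c("NNNNN", "NNNNY", "NNNYN", "NNNYY"),
                   obs = c(32, 16, 24, 8),
                   num.strata = factor(c(NA, NA, NA, "1")))

# Convert factor -> character -> numeric so that 0 is a legal value
data$num.strata <- as.numeric(as.character(data$num.strata))

# Replace the missing values
data$num.strata[is.na(data$num.strata)] <- 0
print(data)
```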
2011 Aug 30
2
Error in evaluating a function
Hi,
I am very new to R. So, pardon my dumb question. I was trying to write my own function to run a different model (perform an ordered logistic regression) using the example on the website http://pngu.mgh.harvard.edu/~purcell/plink/rfunc.shtml
But R returns an error "R Error in eval(expr, envir, enclos) : object 's' not found" when I run it. What am I doing wrong here? Here's
2010 May 20
1
Geneland error on unix: Error in MCMC(........ :, unused argument(s) (ploidy = 2, genotypes = geno)
I am receiving the above error (full R session output below). The
script runs OK in Windows, and "genotypes" and "ploidy" are both
correct arguments.
Any suggestions would be most welcome.
Nevil Amos
MERG/ACB
Monash University School of Biological Sciences
> library(Geneland)
Loading required package: RandomFields
Loading required package: fields
Loading required
2009 Sep 16
1
re-code missing value
Dear all,
I have a partial data set with four columns. The first column is "site", a factor
with three levels (i.e., A, B, and C). The second to fourth columns (v1 ~ v3) are my
observations. In the observations of the data set, "." indicates a missing
value. I replaced "." with NA. To replace "." with NA, I used two steps.
First, I replaced "." with NA, and
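If the data come from a text file, one way to avoid a separate recoding step is the na.strings argument of read.table(). The sketch below reads a small made-up table from a string; the values are hypothetical.

```r
# Hypothetical data in the layout described above, with "." for missing
txt <- "site v1 v2 v3
A 1.2 . 3.1
B . 2.4 0.7
C 4.0 1.1 ."

# "." is turned into NA on input, so v1..v3 are read as numeric
d <- read.table(text = txt, header = TRUE, na.strings = ".")
str(d)
```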
2017 Oct 24
2
as.data.frame doesn't set col.names
Why doesn't this work?
> samples$geno <- as.data.frame(sapply(yo, toupper), col.names="geno")
> samples
quant_samples age sapply(yo, toupper)
E11.5 F20het BA40 E11.5 F20het BA40 E11.5 F20HET
E11.5 F20het BA45 E11.5 F20het BA45 E11.5 F20HET
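A possible simplification, assuming yo is a character vector with one element per row (it is not shown in the post): toupper() is already vectorized, so the result can be assigned directly as a column and the name comes from the left-hand side. The samples data frame below is a made-up reduction of the one printed above.

```r
# Hypothetical reduction of the poster's data
samples <- data.frame(quant_samples = c("E11.5 F20het BA40",
                                        "E11.5 F20het BA45"),
                      age = c("E11.5", "E11.5"),
                      stringsAsFactors = FALSE)
yo <- c("F20het", "F20het")   # assumed contents of yo

# Assign the vector itself; no as.data.frame() or col.names needed
samples$geno <- toupper(yo)
print(samples)
```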