Dear all,

I am trying to calculate a score for a string sequence consisting of the following four letters: ACGT. I have got a matrix giving the scores for each pair of letters. So for example the string ACCT has got the pairs: AC, CC and CT.

The matrix has got the following form:

    names<-c("A","C","G","T");
    mscore<-matrix(0,4,4);
    rownames(mscore)<-names;
    colnames(mscore)<-names;

So for the first example pair above I could get the score contained in the matrix with the following code:

    > mscore["A","C"]

I am now trying to sum up all the scores with the following code:

    score<-0;

    for(j in 1:length(sequence)-1){
        score<-score+mscore[sequence[j],sequence[j+1]];
    }

where sequence is the string sequence. Unfortunately, it does not work and I get the following error message:

    Error in "[<-"(`*tmp*`, 1, i, value = numeric(0)) :
        nothing to replace with

What does this mean? Strangely, the command "print(score+mscore[sequence[j],sequence[j+1]])" works, so it really is the assignment which won't work. Why is that?

I am running R 2.0 series (GUI version) on Mac OSX 10.3.7.

Any suggestions are welcome.
Thanks a lot,
Dax
On 29-Dec-04 dax42 wrote:
> I am now trying to sum up all the scores with the following code:
>
> score<-0;
>
> for(j in 1:length(sequence)-1){
>     score<-score+mscore[sequence[j],sequence[j+1]];
> }
>
> where sequence is the string sequence.
> Unfortunately, it does not work and I get the following error message:
>
> Error in "[<-"(`*tmp*`, 1, i, value = numeric(0)) :
>     nothing to replace with

You've fallen into a classic trap: 1:length(sequence)-1 does not mean what one might naturally expect.

    > sequence<-(1:10)
    > length(sequence)
    [1] 10
    > 1:length(sequence)-1
    [1] 0 1 2 3 4 5 6 7 8 9
    > 1:(length(sequence)-1)
    [1] 1 2 3 4 5 6 7 8 9

In other words, "1:length(sequence)" is constructed first, then 1 is subtracted from every element. As a result, you tried to read from sequence[0], which isn't there.

However, you can make it evaluate "length(sequence)-1" first, and only then construct the range, by using parentheses to force precedence: 1:(length(sequence)-1).

Cheers,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Date: 29-Dec-04 Time: 13:45:25
--------------------------------------------------------------------
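The precedence trap described above can be reproduced in a few lines. This is only a sketch: the score matrix is filled with made-up values 1..16, `sequence` is assumed to hold one letter per element, and `nm` is an illustrative name for the letter vector; none of these are the poster's real data.

```r
# Toy stand-ins for the poster's objects (assumed values, not the real data).
nm <- c("A", "C", "G", "T")
mscore <- matrix(1:16, 4, 4, dimnames = list(nm, nm))
sequence <- c("A", "C", "C", "T")          # one letter per element

# ':' binds tighter than binary '-', so the first line is (1:4) - 1.
wrong_idx <- 1:length(sequence) - 1        # starts at 0
right_idx <- 1:(length(sequence) - 1)      # stops one short of the end

# Index 0 selects nothing, so this lookup comes back with length zero.
first_wrong <- mscore[sequence[0], sequence[1]]

# With the parenthesised range, the loop adds up the three pair scores.
score <- 0
for (j in right_idx) {
  score <- score + mscore[sequence[j], sequence[j + 1]]
}
print(score)
```

The zero-length lookup on the first pass is consistent with the `numeric(0)` that appears in the error message.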
You would probably be better off without the loop, for example:

    ns <- length( sequence )
    num.seq <- match( sequence, names )
    scores <- mscore[cbind(num.seq[-ns],num.seq[-1])]
    sum( scores )

I have used the fact that if you index a matrix with a two-column matrix ( here, cbind( , ) ), you select the corresponding elements of the matrix, see ?"[". The first column gives the row index (the first letter of each pair) and the second column gives the column index (the second letter).

But it only works if the indexing matrix is numeric, hence the match().

Bendix Carstensen

----------------------
Bendix Carstensen
Senior Statistician
Steno Diabetes Center
Niels Steensens Vej 2
DK-2820 Gentofte
Denmark
tel: +45 44 43 87 38
mob: +45 30 75 87 38
fax: +45 44 43 07 06
bxc at steno.dk
www.biostat.ku.dk/~bxc
----------------------
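As a rough check of the loop-free approach, the sketch below compares two-column matrix indexing with the explicit loop on an asymmetric toy matrix. The values 1..16 and the name `nm` are assumptions for illustration only.

```r
# Toy data (assumed): an asymmetric score matrix, so row/column order matters.
nm <- c("A", "C", "G", "T")
mscore <- matrix(1:16, 4, 4, dimnames = list(nm, nm))
sequence <- c("A", "C", "C", "T")

ns <- length(sequence)
num.seq <- match(sequence, nm)             # letters -> row/column numbers

# Each row of 'pairs' is (row index, column index) for one adjacent pair.
pairs <- cbind(num.seq[-ns], num.seq[-1])
scores <- mscore[pairs]
total <- sum(scores)

# Cross-check against the explicit loop.
loop_total <- 0
for (j in 1:(ns - 1)) {
  loop_total <- loop_total + mscore[sequence[j], sequence[j + 1]]
}
print(c(total, loop_total))
```

Because the toy matrix is not symmetric, swapping the two columns of `pairs` would give a different total, which makes the row/column order easy to check.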
On Dec 29, 2004, at 8:17 AM, dax42 wrote:
> I am now trying to sum up all the scores with the following code:
>
> score<-0;
>
> for(j in 1:length(sequence)-1){
>     score<-score+mscore[sequence[j],sequence[j+1]];
> }

Is sequence a string? If so, you will probably need to make some modifications like these:

    > sequence <- 'ACGT'
    > sequence[1]
    [1] "ACGT"
    > substr(sequence,1,1)
    [1] "A"
    > substr(sequence,2,1)
    [1] ""
    > substr(sequence,1,2)
    [1] "AC"
    > substr(sequence,2,2)
    [1] "C"
    > substr(sequence,3,3)
    [1] "G"
    > mscore <- matrix(runif(16),4,4)
    > names <- c('A','C','G','T')
    > rownames(mscore) <- names
    > colnames(mscore) <- names
    > mscore
              A         C         G          T
    A 0.6200289 0.6324337 0.1895207 0.28253473
    C 0.5026072 0.6552428 0.7978809 0.43131540
    G 0.1669823 0.8808445 0.6021024 0.01563101
    T 0.4184646 0.9620714 0.7723088 0.33045464
    > mscore[substr(sequence,1,1),substr(sequence,2,2)]
    [1] 0.6324337
    > score <- 0
    > for (j in 1:(nchar(sequence)-1)) {score <- score+
        mscore[substr(sequence,j,j),substr(sequence,j+1,j+1)]}
    > score
    [1] 1.445946

Hope this helps. I imagine this isn't the most efficient way of solving this problem, though.

Sean
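If the sequence really is a single string, one possible alternative to calling substr() inside the loop is to split the string into letters once and index the matrix in one step. The sketch below reuses the matrix values printed in the session above; `nm`, `chars`, `first` and `second` are illustrative names, not part of the original code.

```r
# Matrix values copied from the session output above (entered row by row).
nm <- c("A", "C", "G", "T")
mscore <- matrix(c(0.6200289, 0.6324337, 0.1895207, 0.28253473,
                   0.5026072, 0.6552428, 0.7978809, 0.43131540,
                   0.1669823, 0.8808445, 0.6021024, 0.01563101,
                   0.4184646, 0.9620714, 0.7723088, 0.33045464),
                 4, 4, byrow = TRUE, dimnames = list(nm, nm))

sequence <- "ACGT"
chars <- strsplit(sequence, "")[[1]]       # one letter per element

first  <- head(chars, -1)                  # first letter of each pair
second <- tail(chars, -1)                  # second letter of each pair

# Two-column numeric index: (row, column) for every adjacent pair.
score <- sum(mscore[cbind(match(first, nm), match(second, nm))])
print(score)
```

This should agree with the looped result shown in the session, up to the rounding of the printed matrix entries.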
`names' is a builtin R function for extracting names of elements in a vector, or names of components in a list. If I use

    nm <- c("A","C","G","T")
    rownames(mscore) <- colnames(mscore) <- nm

I don't get any error.

Andy
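For what it is worth, a variable called `names` does not stop the builtin function from being found, because R looks specifically for a function when a name is used in call position. The sketch below (with a made-up vector `x`) illustrates that; using a different name such as `nm` is still the less confusing choice.

```r
# A data vector that shadows the name of the builtin function.
names <- c("A", "C", "G", "T")

# Calling names() still reaches the builtin, since a call looks up functions only.
x <- c(a = 1, b = 2)
x_names <- names(x)

# Both spellings label the matrix the same way.
nm <- c("A", "C", "G", "T")
mscore <- matrix(0, 4, 4)
rownames(mscore) <- colnames(mscore) <- nm
same_labels <- identical(rownames(mscore), names)

rm(names)                                  # remove the shadowing variable again
print(x_names)
print(same_labels)
```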
: Date: Wed, 29 Dec 2004 14:17:35 +0100
: From: dax42 <Dax42 at web.de>
: To: <r-help at stat.math.ethz.ch>
: Subject: [R] numeric(0)
:
: I am now trying to sum up all the scores with the following code:
:
: score<-0;
:
: for(j in 1:length(sequence)-1){
:     score<-score+mscore[sequence[j],sequence[j+1]];
: }

The above code subtracts 1 from the vector 1:length(sequence), giving a vector that starts at 0. I think you meant 1:(length(sequence)-1)

: Any suggestions are welcome.

Try this. (The first line is to ensure that levels that don't appear still get counted.)

    f <- factor(sequence, levels = names)
    sum(table(f[-length(f)], f[-1]) * mscore)
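To see why the table-based one-liner works, it may help to look at the intermediate table of adjacent-pair counts. The sketch below uses assumed toy values (scores 1..16, the letter vector called `nm` instead of `names`) and is not the poster's data.

```r
# Toy data (assumed values).
nm <- c("A", "C", "G", "T")
mscore <- matrix(1:16, 4, 4, dimnames = list(nm, nm))
sequence <- c("A", "C", "C", "T")

# Fixing the levels keeps a full 4 x 4 table even though G never occurs.
f <- factor(sequence, levels = nm)

# Rows: first letter of each pair; columns: second letter.
pair_counts <- table(f[-length(f)], f[-1])

# Weight every pair count by its score and add everything up.
total <- sum(pair_counts * mscore)
print(pair_counts)
print(total)
```

Each adjacent pair contributes its score once per occurrence, so the weighted sum equals the result of the corrected loop.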
From: <Ted.Harding at nessie.mcc.ac.uk>
> What's the best place to look for the details on operator
> precedence and the like in R?

Check out the language reference manual in the directory .../R/rw2001/doc/manual/r-lang.pdf. On Windows it's also accessible from R via

    Help | Manuals | R Language

The manuals can also be found online, e.g. Google for

    r-lang "infix and prefix"

to find the relevant section.

By the way, even without this problem seq is better than : in programs as it has better boundary behavior. Consider 1:n vs. seq(length = n). For n=0 the : operator actually generates the vector c(1,0), whereas the seq solution gives a zero-length vector, as desired. Even seq has problems unless you use length= . The : operator is mainly useful if you are typing throwaway code directly into R.
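The boundary behaviour mentioned above is easy to check. In this sketch `n`, `visits_colon` and `visits_seq` are illustrative names; `seq(length.out = n)` is the fully spelled-out form of `seq(length = n)`, and `seq_len(n)` is a shorthand available in current versions of R.

```r
n <- 0

colon_range <- 1:n                         # counts down: 1 0
seq_range   <- seq(length.out = n)         # zero-length vector

# A loop over 1:n runs twice when n is 0 ...
visits_colon <- 0
for (i in 1:n) visits_colon <- visits_colon + 1

# ... whereas a loop over the seq() range is skipped entirely.
visits_seq <- 0
for (i in seq(length.out = n)) visits_seq <- visits_seq + 1

print(c(visits_colon, visits_seq))
```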