Could someone please help me sort this out? I am trying to write R code to calculate the frequencies of the amino acids in 9 different sequences. I want the code to read the sequences from an external text file, and I used the following to do so:

    x<-read.table("sequence.txt",header=FALSE)

Then I defined an array of the 20 amino acids as follows:

    AA<-c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')

I am using the following code to calculate the frequencies:

    frequency<-function(X)
    {
    y<-rep(0,20)
    for(j in 1:nchar(as.character(x$V1[i]))){
    for(i in 1:9){
    res<-which(AA==substr(x$V1[i],j,j))
    y[res]=y[res]+1
    }
    }
    return(y)
    }

But this code is not working: it reads only one sequence. I don't know why the loop is not working for "i", which is supposed to go over the nine rows of the file sequence.txt. The sequence.txt file is attached to this message.

Cheers

http://n4.nabble.com/file/n997072/sequence.txt sequence.txt

--
View this message in context: http://n4.nabble.com/caculate-the-frequencies-of-the-Amino-Acids-tp997072p997072.html
Sent from the R help mailing list archive at Nabble.com.
On Jan 1, 2010, at 11:59 PM, che wrote:

> Could someone please help me sort this out? I am trying to write R code
> to calculate the frequencies of the amino acids in 9 different sequences.
> I want the code to read the sequences from an external text file, and I
> used the following to do so:
>
> x<-read.table("sequence.txt",header=FALSE)
>
> Then I defined an array of the 20 amino acids as follows:
>
> AA<-c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')
>
> I am using the following code to calculate the frequencies:
>
> frequency<-function(X)
> {
> y<-rep(0,20)
> for(j in 1:nchar(as.character(x$V1[i]))){

# At this point you are referencing "i", but it is not yet being iterated and might not even exist.
# Did you mean "j"?
# It might also be safer to use seq_along().

> for(i in 1:9){
>
> res<-which(AA==substr(x$V1[i],j,j))

# Is that really working for even one sequence? Without an "x" sequence I cannot test, but it "looks wrong".

> y[res]=y[res]+1
> }
> }
> return(y)
> }
>
> But this code is not working: it reads only one sequence. I don't know
> why the loop is not working for "i", which is supposed to go over the
> nine rows of the file sequence.txt. The sequence.txt file is attached to
> this message.
>
> Cheers
> http://n4.nabble.com/file/n997072/sequence.txt sequence.txt
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
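The comments above can be turned into a small sketch, assuming the goal is one pooled count per amino acid over all rows. The loop over sequences ("i") goes on the outside, so that "i" exists before nchar() is evaluated, and seq_along()/seq_len() replace 1:n. The vector seqs is a made-up stand-in for x$V1 (the attached file is not reproduced here), and aa_frequency is a hypothetical name, chosen so that the function does not mask stats::frequency.

```r
# Hypothetical stand-in for the V1 column of sequence.txt (made-up sequences).
seqs <- c("ACDA", "KLMK", "AAY")

AA <- c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')

# Pooled counts of each amino acid over all sequences.
aa_frequency <- function(seqs) {
  seqs <- as.character(seqs)               # read.table may have returned a factor
  y <- setNames(rep(0, length(AA)), AA)    # one named counter per amino acid
  for (i in seq_along(seqs)) {             # outer loop: one sequence at a time
    for (j in seq_len(nchar(seqs[i]))) {   # inner loop: each position in sequence i
      res <- which(AA == substr(seqs[i], j, j))
      y[res] <- y[res] + 1                 # res is empty for letters not in AA, so nothing changes
    }
  }
  y
}

counts <- aa_frequency(seqs)
print(counts)
```

The function takes the sequences as its argument instead of reaching for the global x, so calling aa_frequency(x$V1) should work on the real data under the same assumptions.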
I know it would be better to ask R to generate the data, but I need to work with this particular file, because it contains data for some amino acids and I can't change it. So I need R to go through the sequences one by one and give me the count of each letter in each sequence. I am quite confused about using "i" and "j", how to iterate over both of them, and how to make them work together. I attached sequence.txt to my original message, and I am attaching it again here just in case. Thanks for your help.

http://n4.nabble.com/file/n997087/sequence.txt sequence.txt

--
View this message in context: http://n4.nabble.com/caculate-the-frequencies-of-the-Amino-Acids-tp997072p997087.html
Hi fadialnaji,

Take a look at the Biostrings package in Bioconductor [1]. It might be an alternative way to do what you want.

HTH,
Jorge

[1] http://www.bioconductor.org/packages/release/bioc/html/Biostrings.html
Thanks very much, the code is working perfectly. But I hope you can help me do the same thing using a loop structure; I want to know whether I am doing it right. I want to use loops to scan each sequence from the file sequence.txt (attached) and get the frequency of each amino acid. I have written the following code so far, but then I stopped and got confused, especially as I am a complete beginner in R:

http://n4.nabble.com/file/n997581/sequence.txt sequence.txt

    x<-read.table("sequence.txt",header=FALSE)
    AA<-c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')
    test<-nchar(as.character(x$V1[i]))
    frequency<-function(X)
    {
    y<-rep(0,20)
    for(j in 1:test){
    for(i in 1:nrow(x)){
    res<-which(AA==substr(x$V1[i],j,j))
    y[res]=y[res]+1
    }
    }
    return(y)
    }

So how do I fix this code? How do I bring the "i" and the "j" to life so that the indexing starts? Sorry for bothering you.

--
View this message in context: http://n4.nabble.com/caculate-the-frequencies-of-the-Amino-Acids-tp997072p997581.html
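One possible loop-based answer to the question above, offered as a sketch and not as the only way to do it. In the posted code, test is computed outside any loop, where "i" does not exist yet; the length of sequence i has to be looked up inside the loop over i. The version below also keeps one row of counts per sequence, on the assumption that per-sequence frequencies are wanted. The vector seqs is made-up example data standing in for x$V1, and count_per_sequence is a hypothetical name.

```r
# Made-up example data; with the real file this would be as.character(x$V1).
seqs <- c("ACDA", "KLMK", "AAY")

AA <- c('A','C','D','E','F','G','H','I','K','L','M','N','P','Q','R','S','T','V','W','Y')

# One row per sequence, one column per amino acid.
count_per_sequence <- function(seqs) {
  seqs <- as.character(seqs)
  counts <- matrix(0, nrow = length(seqs), ncol = length(AA),
                   dimnames = list(NULL, AA))
  for (i in seq_along(seqs)) {        # "i" walks over the sequences (rows of the file)
    len <- nchar(seqs[i])             # length of sequence i, known only once i is set
    for (j in seq_len(len)) {         # "j" walks over the positions in sequence i
      res <- which(AA == substr(seqs[i], j, j))
      counts[i, res] <- counts[i, res] + 1
    }
  }
  counts
}

m <- count_per_sequence(seqs)
print(m)
```

Summing the columns with colSums(m) gives pooled counts over all sequences, and dividing each row by its total with m / rowSums(m) gives relative frequencies.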