Dear Sir / Madam,

I am new to R coding. Kindly help me sort out the following problem.

There are 50 rows with six columns (you can see them in the attached .txt file). I wish to filter these 50 rows, keeping any row in which at least one of the six columns has a value >= 64.

I need a final table with the rows having a value >= 64 in any one of the six columns, while the other columns may be below 64. For this purpose I use the following R code:

---------------
datax <- read.table("filter_test.txt", row.names = 1, sep = "\t", header = TRUE, dec = ".",
                    as.is = TRUE, na.strings = "NA", colClasses = NA, check.names = FALSE,
                    strip.white = FALSE, blank.lines.skip = TRUE,
                    allowEscapes = FALSE, flush = FALSE, encoding = "unknown")

filter <- datax[, 1:6]

filtered <- vector()

for(i in 1:(dim(filter)[1]))
{
  for(j in 1:(dim(filter)[2]))
  {
    x = (filter[i, j]) >= 64
    filtered[i] <- x
  }
}

# summing the result of the above
sum(filtered)

which(filtered)
z <- which(filtered)
filtereddata <- filter[z, ]

write.table(filtereddata, file = "filterdgenes.txt", quote = TRUE, sep = "\t ",
            dec = ".", row.names = T, col.names = NA, qmethod = c("escape", "double"))
---------------------------

Something is missing in my code: the filtering is done according to the value of the last (sixth) column only, without taking the other columns into consideration.

For example, in the table in the attached .txt file, the first column has 29 rows with values >= 64, but the last column has only 25 such rows.

The filtered list should have around 29 rows, but it has only 25, since the code has considered only the last column. How can I sort out this problem? Kindly help me out.

Thanking you in advance,
With regards,

antony

--
Thony
University of Cologne
50931 Cologne/Germany

Tel: 004922125918042
Mobile: 004917683142627
Dear John,

It looks like you are stuck in both the second and the third circle of the R inferno (http://www.burns-stat.com/pages/Tutor/R_inferno.pdf).

Your problem is easy to vectorise.

#count the number of columns >= 64 in each row
NumCols <- rowSums(datax >= 64)
#select rows with at least one column >= 64
datax[NumCols >= 1, ]

HTH,

Thierry

----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be

-----Original message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On behalf of John Antonydas Gaspar
Sent: Monday 2 March 2009 11:31
To: r-help at r-project.org
Subject: [R] R-code help for filtering with for loop

[...]
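The loop in the question reflects only the sixth column because filtered[i] is overwritten on every pass of the inner loop, so only the comparison for the last column survives. A minimal sketch on a small made-up data frame compares that loop with the rowSums approach. The values and the names toy, loop_result and any_hit are illustrative assumptions, not the poster's data:

```r
# Toy data, made up for illustration: 4 rows, 3 columns.
toy <- data.frame(a = c(70, 10, 10, 80),
                  b = c(10, 65, 10, 10),
                  c = c(10, 10, 10, 90))

# The loop from the question: the result for row i is overwritten for
# every column j, so only the comparison for the last column is kept.
loop_result <- logical(nrow(toy))
for (i in seq_len(nrow(toy))) {
  for (j in seq_len(ncol(toy))) {
    loop_result[i] <- toy[i, j] >= 64
  }
}

# Vectorised version: count the columns >= 64 in each row and
# keep the rows where that count is at least one.
any_hit <- rowSums(toy >= 64) >= 1

print(loop_result)     # only row 4 is TRUE (last column is 90)
print(any_hit)         # rows 1, 2 and 4 are TRUE
print(toy[any_hit, ])  # the three rows with at least one value >= 64
```

If the loop is to be kept, one possible repair is to initialise the result to FALSE and accumulate with loop_result[i] <- loop_result[i] || (toy[i, j] >= 64) instead of overwriting it.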
The apply function, which can work on either a row-wise or a column-wise basis, can be used with max and ">=" to return a logical vector that will let you separate the rows into those with and without a maximum of at least 64.

datax <- matrix(rnorm(300)*30, nrow=50)
datax <- as.data.frame(datax)
datax[ apply(datax, 1, max) >= 64, ]  # the rows from datax with any value of at least 64
datax[ apply(datax, 1, max) < 64, ]   # the other rows

I am not sure what you mean by a table, since in R "table" generally means a contingency table. If you wanted a vector of row numbers, this might help (this run used a threshold of 60 on the random data above; use >= 64 for your data):

which(apply(datax, 1, max) > 60)
# [1]  7 17 22 25 29 46 49

--
David Winsemius

On Mar 2, 2009, at 5:30 AM, John Antonydas Gaspar wrote:
> [...]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
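One point the apply/max approach leaves open is missing data: the read.table call in the question sets na.strings = "NA", so the file might contain NA values, and max returns NA for any row that has one. A small sketch, assuming it is acceptable to ignore missing values when taking the row maximum (the data and the names toy, row_max, keep, selected and rest are made up for illustration):

```r
# Made-up data with one missing value; not the poster's file.
toy <- data.frame(a = c(70, 10, NA, 10),
                  b = c(10, 65, 80, 10))

# Row-wise maximum, ignoring missing values.
row_max <- apply(toy, 1, max, na.rm = TRUE)

# TRUE for rows whose largest non-missing value is at least 64.
keep <- row_max >= 64

selected <- toy[keep, ]   # rows with at least one value >= 64
rest     <- toy[!keep, ]  # all remaining rows

print(which(keep))        # row numbers of the selected rows
```

A row consisting only of NA would give -Inf with a warning under na.rm = TRUE, so such rows would need separate handling.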