Hi,

I am trying to process tabular data as follows.

Data in the input file is of the form

genome1 genome2 tree-dist log10escore

Genome1 and genome2 are alphabetic.
Tree-dist and log10escore are numeric.

I wish to extract only those rows from this table where the log10escore is less than -3.

data <- read.table(filename);
data$log10escore = data$log10escore[ data$log10escore < -3];

I would like to use this pruned list of escores to get the corresponding genomenames and treedist.

I did not find anything useful in the FAQs and Notes on R for this part of the data extraction.

As I am just beginning programming in R, I would appreciate your input about this.

Thanks
L
lalitha viswanath wrote:
> Hi
> I am trying to process tabular data as follows:
>
> Data in the input file is of the form
>
> genome1 genome2 tree-dist log10escore
>
> Genome1 and genome2 are alphabetic.
> Tree-dist and log10escore are numeric.
>
> I wish to extract only those rows from this table
> where the log10escore is less than -3.
>
> data <-read.table(filename);
> data$log10escore = data$log10escore[ data$log10escore < -3];

  > library(fortunes)
  > fortune("dog")

  Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'?
  Anyway, it might clash with the function 'matrix'.
     -- Barry Rowlingson
        R-help (October 2004)

> I would like to use this pruned list of escores to get
> the corresponding genomenames and treedist.

?subset

df.sub <- subset(df, log10escore < -3)
summary(df.sub)

> I did not find anything useful in the FAQs and Notes
> on R for this part of the data extraction.
>
> As I am just beginning programming in R, I would
> appreciate your input about this.
>
> Thanks
> L
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
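A minimal sketch of the subset() suggestion above, run on a small made-up table rather than the poster's file. The names `txt`, `scores` and `hits` and all the values are assumptions for illustration. It also assumes the file carries the header row shown in the question: without header = TRUE, read.table() would name the columns V1..V4, and with the default check.names = TRUE the header `tree-dist` is read as `tree.dist`.

```r
# Hypothetical sample standing in for the input file; the values are made up.
txt <- "genome1 genome2 tree-dist log10escore
A B 0.12 -5.2
A C 0.40 -1.0
B C 0.33 -3.5
B D 0.25 -3.0"

# header = TRUE keeps the column names from the first line.
# check.names = TRUE (the default) turns 'tree-dist' into 'tree.dist'.
scores <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Keep whole rows whose log10escore is strictly below -3;
# the genome and distance columns come along with them.
hits <- subset(scores, log10escore < -3)
print(hits)
```

Because subset() returns complete rows, there is no need to filter the score column first and then look up the matching genomes separately.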
On Tue, 2007-01-23 at 09:28 -0800, lalitha viswanath wrote:
> Hi
> I am trying to process tabular data as follows:
>
> Data in the input file is of the form
>
> genome1 genome2 tree-dist log10escore
>
> Genome1 and genome2 are alphabetic.
> Tree-dist and log10escore are numeric.
>
> I wish to extract only those rows from this table
> where the log10escore is less than -3.
>
> data <-read.table(filename);
> data$log10escore = data$log10escore[ data$log10escore < -3];
>
> I would like to use this pruned list of escores to get
> the corresponding genomenames and treedist.
>
> I did not find anything useful in the FAQs and Notes
> on R for this part of the data extraction.
>
> As I am just beginning programming in R, I would
> appreciate your input about this.
>
> Thanks
> L

help.search("subset")

would lead you to ?subset, where you could do something like:

DF <- subset(YourData, log10escore < -3)

If you just wanted the values of the two other columns, you could also use:

DF <- subset(YourData, log10escore < -3, select = c(genomenames, treedist))

One additional alternative is to use which(). This will return the _indices_ of the values that match the criteria. For example:

Ind <- which(YourData$log10escore < -3)

In that case, you could then use:

YourData$genomenames[Ind]

and

YourData$treedist[Ind]

These would return vectors of the two columns meeting the criteria.

Which approach you take depends upon what else you may want to do with the data.

See ?which for more information.

HTH,

Marc Schwartz
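The two approaches above can be sketched side by side on a made-up data frame. The names `scores`, `ind` and `picked` and all values are assumptions for illustration. The column names `genomenames` and `treedist` in the reply read as placeholders, so this sketch uses the names the header in the question would produce after read.table(): `genome1`, `genome2` and `tree.dist`.

```r
# Hypothetical data; one score is missing to show how NA is handled.
scores <- data.frame(
  genome1     = c("A", "A", "B", "B"),
  genome2     = c("B", "C", "C", "D"),
  tree.dist   = c(0.12, 0.40, 0.33, 0.25),
  log10escore = c(-5.2, -1.0, NA, -3.5),
  stringsAsFactors = FALSE
)

# which() returns the integer positions where the test is TRUE;
# positions where the comparison is NA are left out.
ind <- which(scores$log10escore < -3)

# Use the same positions to pull matching values from other columns.
scores$genome1[ind]
scores$tree.dist[ind]

# subset() with select = does the row filter and the column choice in one call.
picked <- subset(scores, log10escore < -3,
                 select = c(genome1, genome2, tree.dist))
print(picked)
```

The index vector is handy when the same rows are needed from several objects; subset() is shorter when a single filtered data frame is all that is wanted.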
Here is how you can do the extraction back to the original input:

data <- data[ data$log10escore < -3, ]

On 1/23/07, lalitha viswanath <lalithaviswanath at yahoo.com> wrote:
> Hi
> I am trying to process tabular data as follows:
>
> Data in the input file is of the form
>
> genome1 genome2 tree-dist log10escore
>
> Genome1 and genome2 are alphabetic.
> Tree-dist and log10escore are numeric.
>
> I wish to extract only those rows from this table
> where the log10escore is less than -3.
>
> data <-read.table(filename);
> data$log10escore = data$log10escore[ data$log10escore < -3];
>
> I would like to use this pruned list of escores to get
> the corresponding genomenames and treedist.
>
> I did not find anything useful in the FAQs and Notes
> on R for this part of the data extraction.
>
> As I am just beginning programming in R, I would
> appreciate your input about this.
>
> Thanks
> L

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
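A hedged sketch of the row-indexing form above, again on made-up data (`scores`, `keep`, `attempt`, `raw` and `clean` are illustrative names, not from the thread). It also shows why the assignment in the question does not prune the table: a filtered column is shorter than the data frame, so it cannot simply be assigned back into it. The missing score is an assumption added to show one caveat of plain logical indexing.

```r
# Hypothetical data with one missing score.
scores <- data.frame(
  genome1     = c("A", "A", "B", "B", "C"),
  genome2     = c("B", "C", "C", "D", "D"),
  tree.dist   = c(0.12, 0.40, 0.33, 0.25, 0.51),
  log10escore = c(-5.2, -1.0, -3.5, NA, -0.4),
  stringsAsFactors = FALSE
)

keep <- scores$log10escore < -3   # logical vector; NA where the score is missing

# The pattern from the question: 3 filtered values cannot fill a 5-row column,
# so the assignment stops with an error here. (When the lengths happen to divide
# evenly, R recycles the values instead, which is just as wrong.)
attempt <- tryCatch({
  scores$log10escore <- scores$log10escore[keep]
  "assigned"
}, error = function(e) "error")

# Row indexing: the condition goes before the comma, all columns are kept.
raw <- scores[keep, ]             # an NA in 'keep' yields a row of NAs

# Wrapping the condition in which() drops the NA cases.
clean <- scores[which(keep), ]
print(clean)
```

If the score column can contain missing values, the which() form (or subset()) avoids the all-NA rows that plain logical indexing would return.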