Bogdan Tanasa
2018-Nov-02 00:45 UTC
[R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW
Dear all,

May I ask for a suggestion? Consider a data frame that contains numerical values for gene expression, for example:

    x = data.frame(TTT=c(0,1,0,0),
                   TTA=c(0,1,1,0),
                   ATA=c(1,0,0,0),
                   gene=c("gene1", "gene2", "gene3", "gene4"))

How could I select only the COLUMNS where the value for a given GENE (a ROW) is non-zero?

Thank you!

-- bogdan
William Dunlap
2018-Nov-02 01:08 UTC
[R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW
This would be a bit simpler if 'gene' were the rownames of the data.frame. The '-4' is to remove the gene column from the calculations.

> x[ x[,"gene"]=="gene2",]
  TTT TTA ATA  gene
2   1   1   0 gene2
> colnames(x)[-4][ 1 == x[ x[,"gene"]=="gene2",-4] ]
[1] "TTT" "TTA"
> colnames(x)[-4][ 1 == x[ x[,"gene"]=="gene3",-4] ]
[1] "TTA"
> x
  TTT TTA ATA  gene
1   0   0   1 gene1
2   1   1   0 gene2
3   0   1   0 gene3
4   0   0   0 gene4

Bill Dunlap
TIBCO Software
wdunlap tibco.com

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
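The remark about row names can be sketched as follows. This is a minimal illustration, not code posted in the thread: the names `expr` and `nonzero_cols` are made up here, the gene labels are assumed to be unique, and the test is `!= 0` rather than `1 ==` so that any non-zero value counts.

```r
# Same data as in the original post.
x <- data.frame(TTT = c(0, 1, 0, 0),
                TTA = c(0, 1, 1, 0),
                ATA = c(1, 0, 0, 0),
                gene = c("gene1", "gene2", "gene3", "gene4"),
                stringsAsFactors = FALSE)

# Keep only the numeric columns and move the gene labels into the row names
# ('expr' is a hypothetical name; labels are assumed to be unique).
expr <- x[, setdiff(colnames(x), "gene"), drop = FALSE]
rownames(expr) <- x$gene

# Hypothetical helper: names of the columns in which 'gene' is non-zero.
nonzero_cols <- function(df, gene) {
  colnames(df)[unlist(df[gene, ]) != 0]
}

nonzero_cols(expr, "gene2")  # "TTT" "TTA"
nonzero_cols(expr, "gene3")  # "TTA"
```

With the labels in the row names, no positional index such as '-4' is needed to exclude the gene column.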
William Michels
2018-Nov-02 01:33 UTC
[R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW
Hi Bogdan,

Are you saying you want to drop columns that sum to zero? If so, I'm not sure you've given us a good example dataframe, since all your numeric columns give non-zero sums.

Otherwise, what you're asking for is trivial. Below is an example dataframe ("ygene") with an example "AGA" column that gets dropped:

> xgene <- data.frame(TTT=c(0,1,0,0),
+ TTA=c(0,1,1,0),
+ ATA=c(1,0,0,0),
+ gene=c("gene1", "gene2", "gene3", "gene4"))
>
> xgene[ , colSums(xgene[,1:3]) > 0 ]
  TTT TTA ATA  gene
1   0   0   1 gene1
2   1   1   0 gene2
3   0   1   0 gene3
4   0   0   0 gene4
>
> ygene <- data.frame(TTT=c(0,1,0,0),
+ TTA=c(0,1,1,0),
+ AGA=c(0,0,0,0),
+ gene=c("gene1", "gene2", "gene3", "gene4"))
>
> ygene[ , colSums(ygene[,1:3]) > 0 ]
  TTT TTA  gene
1   0   0 gene1
2   1   1 gene2
3   0   1 gene3
4   0   0 gene4

HTH,

Bill.

William Michels, Ph.D.
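The column-sum filter can also be written so that the logical index covers every column explicitly, rather than using a three-element index on a four-column data frame. A sketch under stated assumptions: `drop_zero_sum_cols` is a hypothetical helper name, and the values are assumed to be non-negative, so a zero sum means an all-zero column.

```r
ygene <- data.frame(TTT = c(0, 1, 0, 0),
                    TTA = c(0, 1, 1, 0),
                    AGA = c(0, 0, 0, 0),
                    gene = c("gene1", "gene2", "gene3", "gene4"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: drop numeric columns whose sum is zero and keep
# every non-numeric column (such as 'gene') untouched.
drop_zero_sum_cols <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  keep <- !is_num                      # non-numeric columns are always kept
  keep[is_num] <- colSums(df[, is_num, drop = FALSE]) > 0
  df[, keep, drop = FALSE]
}

colnames(drop_zero_sum_cols(ygene))  # "TTT" "TTA" "gene"
```

Detecting the numeric columns with `is.numeric` avoids hard-coding `1:3`, so the helper keeps working if columns are added or reordered.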
Bogdan Tanasa
2018-Nov-02 04:07 UTC
[R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW
Dear Bill and Bill,

Many thanks for taking the time to advise, and for your suggestions. Let me rephrase my question with a better example; thank you again in advance for your help.

Let's assume that we start from a data frame:

    x = data.frame(
      TTT=c(0,1,0,0),
      TTA=c(0,1,1,0),
      ATA=c(1,0,0,0),
      ATT=c(0,0,0,0),
      row.names=c("gene1", "gene2", "gene3", "gene4"))

If we select "gene2", we would like to end up with ONLY the COLUMNS where "gene2" is NON-ZERO. In other words, the output contains only the first 2 columns:

    output = data.frame(
      TTT=c(0,1,0,0),
      TTA=c(0,1,1,0),
      row.names=c("gene1", "gene2", "gene3", "gene4"))

With much appreciation,

-- bogdan
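The rephrased request (keep only the columns in which a chosen gene is non-zero) can be sketched with a logical index built from that gene's row. This is an illustration rather than an answer posted in the thread, and `keep_nonzero_for` is a hypothetical helper name.

```r
x <- data.frame(TTT = c(0, 1, 0, 0),
                TTA = c(0, 1, 1, 0),
                ATA = c(1, 0, 0, 0),
                ATT = c(0, 0, 0, 0),
                row.names = c("gene1", "gene2", "gene3", "gene4"))

# Hypothetical helper: keep the columns in which row 'gene' is non-zero.
keep_nonzero_for <- function(df, gene) {
  # One logical value per column, taken from the selected row.
  keep <- unlist(df[gene, , drop = FALSE]) != 0
  # drop = FALSE keeps a data frame even when a single column survives.
  df[, keep, drop = FALSE]
}

output <- keep_nonzero_for(x, "gene2")
colnames(output)  # "TTT" "TTA"
```

All four rows are retained; only the columns are filtered. For a gene that is zero everywhere, such as "gene4", the result is a data frame with four rows and no columns.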