Bogdan Tanasa
2018-Nov-02 04:07 UTC
[R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW
Dear Bill, and Bill,
many thanks for taking the time to advise, and for your suggestions. I
believe I should rephrase my question a bit, with a better example.
Thank you again in advance for your help.
Let's assume that we start from a data frame :
x = data.frame( TTT=c(0,1,0,0),
                TTA=c(0,1,1,0),
                ATA=c(1,0,0,0),
                ATT=c(0,0,0,0),
                row.names=c("gene1", "gene2", "gene3", "gene4"))
If we select "gene2", we would like to end up with ONLY the COLUMNS
where "gene2" is NON-ZERO. In other words, the output contains only
the first 2 columns :
output = data.frame( TTT=c(0,1,0,0),
                     TTA=c(0,1,1,0),
                     row.names=c("gene1", "gene2", "gene3", "gene4"))
with much appreciation,
-- bogdan
On Thu, Nov 1, 2018 at 6:34 PM William Michels <wjm1 at caa.columbia.edu> wrote:
> Hi Bogdan,
>
> Are you saying you want to drop columns that sum to zero? If so, I'm
> not sure you've given us a good example dataframe, since all your
> numeric columns give non-zero sums.
>
> Otherwise, what you're asking for is trivial. Below is an example
> dataframe ("ygene") with an example "AGA" column that gets dropped:
>
> > xgene <- data.frame(TTT=c(0,1,0,0),
> +                     TTA=c(0,1,1,0),
> +                     ATA=c(1,0,0,0),
> +                     gene=c("gene1", "gene2", "gene3", "gene4"))
> >
> > xgene[ , colSums(xgene[,1:3]) > 0 ]
>   TTT TTA ATA  gene
> 1   0   0   1 gene1
> 2   1   1   0 gene2
> 3   0   1   0 gene3
> 4   0   0   0 gene4
> >
> > ygene <- data.frame(TTT=c(0,1,0,0),
> +                     TTA=c(0,1,1,0),
> +                     AGA=c(0,0,0,0),
> +                     gene=c("gene1", "gene2", "gene3", "gene4"))
> >
> > ygene[ , colSums(ygene[,1:3]) > 0 ]
>   TTT TTA  gene
> 1   0   0 gene1
> 2   1   1 gene2
> 3   0   1 gene3
> 4   0   0 gene4
>
>
> HTH,
>
> Bill.
>
> William Michels, Ph.D.
>
>
> On Thu, Nov 1, 2018 at 5:45 PM, Bogdan Tanasa <tanasa at gmail.com> wrote:
> > Dear all, please may I ask for a suggestion :
> >
> > considering a dataframe that contains the numerical values for gene
> > expression, for example :
> >
> > x = data.frame(TTT=c(0,1,0,0),
> >                TTA=c(0,1,1,0),
> >                ATA=c(1,0,0,0),
> >                gene=c("gene1", "gene2", "gene3", "gene4"))
> >
> > how could I select only the COLUMNS where the value of a GENE (a ROW)
> > is non-zero ?
> >
> > thank you !
> >
> > -- bogdan
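A note on the colSums() example quoted above: the logical index there has length 3 while the data frame has 4 columns, so the "gene" column appears to be kept only because R recycles the index. The sketch below is one way to make that explicit. It is not code from the thread; the names num_cols, keep and filtered are made up for illustration, and it assumes the numeric columns contain no NA values.

```r
# Same example data as "ygene" in the quoted reply
ygene <- data.frame(TTT = c(0, 1, 0, 0),
                    TTA = c(0, 1, 1, 0),
                    AGA = c(0, 0, 0, 0),
                    gene = c("gene1", "gene2", "gene3", "gene4"),
                    stringsAsFactors = FALSE)

# Identify the numeric columns instead of hard-coding 1:3
num_cols <- vapply(ygene, is.numeric, logical(1))

# Start by keeping every non-numeric column (here: "gene") ...
keep <- !num_cols
# ... then keep a numeric column only if its sum is greater than zero
keep[num_cols] <- colSums(ygene[, num_cols, drop = FALSE]) > 0

# drop = FALSE keeps a data frame even if a single column survives
filtered <- ygene[, keep, drop = FALSE]
print(filtered)
```

With this layout the index always has one entry per column, so the result does not depend on recycling or on where the non-numeric column sits.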
William Michels
2018-Nov-02 04:59 UTC
[R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW
Perhaps one of the following two methods:

> zgene = data.frame( TTT=c(0,1,0,0),
+                     TTA=c(0,1,1,0),
+                     ATA=c(1,0,0,0),
+                     ATT=c(0,0,0,0),
+                     row.names=c("gene1", "gene2", "gene3", "gene4"))
> zgene
      TTT TTA ATA ATT
gene1   0   0   1   0
gene2   1   1   0   0
gene3   0   1   0   0
gene4   0   0   0   0
>
> zgene[ , zgene[2,1:4] > 0]
      TTT TTA
gene1   0   0
gene2   1   1
gene3   0   1
gene4   0   0
>
> zgene[ , zgene[rownames(zgene) == "gene2",1:4] > 0]
      TTT TTA
gene1   0   0
gene2   1   1
gene3   0   1
gene4   0   0
>

Best Regards,

Bill.

William Michels, Ph.D.
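The row-based selection in this reply can also be wrapped in a small helper. What follows is a minimal sketch rather than anything posted in the thread: the function name select_nonzero_cols is hypothetical, and it assumes that every column is numeric, that the gene names are stored as row names, and that there are no NA values.

```r
# Hypothetical helper (not from the thread): keep only the columns whose
# value in the chosen row is non-zero.
select_nonzero_cols <- function(df, gene) {
  stopifnot(gene %in% rownames(df))
  # unlist() turns the one-row data frame into a named numeric vector
  keep <- unlist(df[gene, , drop = FALSE]) != 0
  # drop = FALSE returns a data frame even when only one column is kept
  df[, keep, drop = FALSE]
}

zgene <- data.frame(TTT = c(0, 1, 0, 0),
                    TTA = c(0, 1, 1, 0),
                    ATA = c(1, 0, 0, 0),
                    ATT = c(0, 0, 0, 0),
                    row.names = c("gene1", "gene2", "gene3", "gene4"))

result <- select_nonzero_cols(zgene, "gene2")
print(result)
```

Selecting by row name avoids depending on the row position, and drop = FALSE matters for a gene such as "gene3", where only one column is non-zero and plain indexing would return a bare vector.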
Bogdan Tanasa
2018-Nov-02 05:02 UTC
[R] selecting the COLUMNS in a dataframe function of the numerical values in a ROW
very helpful, thanks a lot !