Dear R help,

I am new to ggplot, so I apologize if my question is a bit obvious.

I would like to create a plot where I compare the fraction of the values of a variable called PASP out of the number of subjects, for two groups of subjects coded with a dummy variable called SUBJC.

The variable PASP is discrete and only takes the values 0, 4, 8, ...

My data are as follows:

PASP SUBJC
0    0
4    1
0    0
8    0
4    0
0    1
0    1
.    .
.    .
.    .

I would like to calculate the fraction of positive levels of PASP out of the total number of observations, split by the values SUBJC = 0 and 1. I am new to ggplot and I do not know how to organize the data and what to use to summarize them so as to obtain a picture as follows:

I hope my request is clear. Thanks for any help you can provide.

Francesca
Hello,

Your request is not entirely clear. What kind of graph do you want? A bar graph with one bar for the fraction of positive levels of PASP at each level of SUBJC? You need to be more specific.

Also, please post data like this:

# post the output of this command in your next mail
dput(head(data, 30))

Hope this helps,

Rui Barradas

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
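As a rough illustration of why dput() output is a convenient way to share data on a plain-text list, here is a minimal sketch. The data frame name `mydata` is made up for this example, and its seven rows are simply the rows shown in the question; the point is only that the text printed by dput() is R code that recreates the object when pasted back into a session.

```r
# Hypothetical data frame holding the seven rows shown in the question
mydata <- data.frame(
  PASP  = c(0, 4, 0, 8, 4, 0, 0),
  SUBJC = c(0L, 1L, 0L, 0L, 0L, 1L, 1L)
)

# dput() prints R code that rebuilds the object; this is what goes in the mail
dput(head(mydata, 30))

# Round trip: capture that printed code as text, then parse and evaluate it,
# which is what a reader does when pasting it into their own session
code    <- paste(capture.output(dput(mydata)), collapse = "\n")
rebuilt <- eval(parse(text = code))

# The rebuilt object should match the original in values and column types
all.equal(mydata, rebuilt)
```

Because the column types (numeric versus integer, factor versus character) travel with the dput() output, readers see exactly the object you have, which a pasted table of numbers cannot guarantee.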
On Wed, 18 Jul 2018, Francesca wrote:

> Dear R help,
>
> I am new to ggplot so I apologize if my question is a bit obvious.

Or perhaps not, as this is the "R-help" mailing list, not the "Ggplot-help" mailing list. Fortunately for you, what you really need to learn is R, and then ggplot will be much easier to get along with.

> I would like to create a plot where I compare the fraction of the values
> of a variable called PASP out of the number of subjects, for two groups
> of subject codified with a dummy variable called SUBJC.
>
> The variable PASP is discrete and only takes values 0, 4, 8, ...
>
> My data are as following:
>
> PASP SUBJC
>
> 0 0
>
> 4 1
>
> 0 0
>
> 8 0
>
> 4 0
>
> 0 1
>
> 0 1
>
> . .
>
> . .
>
> . .
>
> I would like to calculate the fraction of positive levels of PASP out of
> the total number of observations, divided per values of SUBJ=0 and 1. I
> am new to the use of GGPlot and I do not know how to organize the data
> and what to use to summarize these data as to obtain a picture as
> follows:
>
> I hope my request is clear. Thanks for any help you can provide.

The funky text formatting and the reference to "picture as follows" in the above make me think you composed this in HTML and then converted it to plain text without looking at the result.

* We got no picture... this is a plain-text-only mailing list.
* HTML makes terrible plain text.

The following is an example of how you can send us sample data and code in the body of your email that will survive these plain-text-only limitations. Note that writing R code is the key to communicating unambiguously.
You can start by preparing a sample of your data (usually not all of it) by doing something like

dput(head(mydta,100))

and inserting "dta <- " before the output so you get a line of R code that we can execute and have some rows of your data:

-----
dta <- structure(list(PASP = c(
    0, 12, 8, 0, 12, 12, 12, 8, 12, 8, 8, 8, 8, 4, 0, 12, 12, 0, 12, 0,
    0, 12, 4, 8, 12, 8, 4, 4, 4, 4, 8, 8, 8, 12, 12, 12, 8, 0, 12, 12,
    0, 12, 12, 8, 0, 4, 4, 12, 8, 8, 12, 8, 0, 12, 0, 0, 4, 0, 0, 4,
    4, 12, 0, 4, 8, 8, 8, 4, 0, 0, 4, 0, 12, 4, 12, 12, 8, 0, 0, 0,
    4, 8, 8, 0, 4, 0, 12, 4, 12, 0, 4, 12, 8, 0, 4, 0, 0, 12, 12, 8),
  SUBJC = c(
    0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 1L,
    0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L,
    1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L,
    1L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 0L)),
  .Names = c("PASP", "SUBJC"), row.names = c(NA, -100L), class = "data.frame")
-----

and then ideally you would tell us the results of a sample of the calculation you expect to see, though in this case you might not have thought to present them organized as below:

-----
result <- read.table( text =
" PASP SUBJC Fraction
     0     0    0.279
     4     0    0.186
     8     0    0.395
    12     0    0.140
     0     1    0.263
     4     1    0.211
     8     1    0.123
    12     1    0.404
", header=TRUE)
-----

And with your existing text, we might come up with something like:

-----
library(ggplot2)

dta <- structure(list(PASP = c(
    0, 12, 8, 0, 12, 12, 12, 8, 12, 8, 8, 8, 8, 4, 0, 12, 12, 0, 12, 0,
    0, 12, 4, 8, 12, 8, 4, 4, 4, 4, 8, 8, 8, 12, 12, 12, 8, 0, 12, 12,
    0, 12, 12, 8, 0, 4, 4, 12, 8, 8, 12, 8, 0, 12, 0, 0, 4, 0, 0, 4,
    4, 12, 0, 4, 8, 8, 8, 4, 0, 0, 4, 0, 12, 4, 12, 12, 8, 0, 0, 0,
    4, 8, 8, 0, 4, 0, 12, 4, 12, 0, 4, 12, 8, 0, 4, 0, 0, 12, 12, 8),
  SUBJC = c(
    0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L,
    1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 1L,
    0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L,
    1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L,
    1L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 0L)),
  .Names = c("PASP", "SUBJC"), row.names = c(NA, -100L), class = "data.frame")

# cross-tabulate the raw counts of PASP by SUBJC
table(dta)
#>     SUBJC
#> PASP  0  1
#>   0  12 15
#>   4   8 12
#>   8  17  7
#>   12  6 23

# one row per PASP/SUBJC combination, with the number of observations in each
dtasum <- aggregate( list( Count = rep(1,100) )
                   , dta
                   , FUN = sum
                   )
# divide each count by the total of its SUBJC group
dtasum$Fraction <- ave( dtasum$Count
                      , dtasum$SUBJC
                      , FUN = function(x) ( x/sum(x) )
                      )
# factors so that ggplot treats both variables as discrete
dtasum$PASPfactor <- factor( dtasum$PASP )
dtasum$SUBJCfactor <- factor( dtasum$SUBJC )
dtasum
#>   PASP SUBJC Count  Fraction PASPfactor SUBJCfactor
#> 1    0     0    12 0.2790698          0           0
#> 2    4     0     8 0.1860465          4           0
#> 3    8     0    17 0.3953488          8           0
#> 4   12     0     6 0.1395349         12           0
#> 5    0     1    15 0.2631579          0           1
#> 6    4     1    12 0.2105263          4           1
#> 7    8     1     7 0.1228070          8           1
#> 8   12     1    23 0.4035088         12           1

ggplot( dtasum
      , aes( x=SUBJCfactor, y=Fraction, fill=PASPfactor ) ) +
  geom_bar( stat = "identity" ) +
  xlab( "SUBJ" ) +
  scale_fill_discrete( name = "PASP" )

#' Created on 2018-07-18 by the [reprex package](http://reprex.tidyverse.org) (v0.2.0).
-----

Obviously, since I never saw the figure you thought I was going to see, the plot I made may not be the one you had in mind, but you should at least have some example code to compare with the "Introduction to R" document that comes with R, and some functions to look up help pages on, e.g.
?aggregate
?ave

and you can execute pieces of code to see what they create:

rep(1,100)

You should read the Posting Guide carefully, as there are hints in it as to how to do much of this.

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
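For anyone who wants to cross-check within-group fractions like the ones above, a shorter base-R route may help. This is a sketch and not part of Jeff's reply: the data frame `toy` and its eight rows are made up for illustration, and the approach assumes that "fraction" means each count divided by the total of its SUBJC group. It uses table() and prop.table() with margin = 2 so that each SUBJC column sums to 1, then repeats the calculation in long format with ave().

```r
# Hypothetical toy data, not the poster's real data
toy <- data.frame(
  PASP  = c(0, 4, 0, 8, 4, 0, 0, 8),
  SUBJC = c(0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L)
)

# Counts of each PASP value (rows) within each SUBJC group (columns)
counts <- table(PASP = toy$PASP, SUBJC = toy$SUBJC)

# margin = 2 divides every column by its own total
fractions <- prop.table(counts, margin = 2)
print(fractions)
colSums(fractions)

# Same numbers in long format, the shape ggplot() expects:
# one row per PASP/SUBJC combination
long <- as.data.frame(counts, responseName = "Count")
long$Fraction <- ave(as.numeric(long$Count), long$SUBJC,
                     FUN = function(x) x / sum(x))
long
```

The wide table is handy for checking the numbers by eye, while the long data frame has the one-row-per-bar-segment layout used with ggplot() in the reply.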
Hi Francesca,
This looks like a fairly simple task. Try this:

fpdf<-read.table(text="PASP SUBJC
0 0
4 1
0 0
8 0
4 0
0 1
0 1",
header=TRUE)
# get the number of positive PASP results by group
ppos<-by(fpdf$PASPpos,fpdf$SUBJC,sum)
# get the number of subjects per group
spg<-c(sum(fpdf$SUBJC==0),sum(fpdf$SUBJC==1))
barplot(ppos/spg,names.arg=c(0,1),xlab="Group",
 ylab="Proportion PASP > 0",main="Proportion of PASP positive by group")

Jim
Hi again,
Sorry, forgot this line:

fpdf$PASPpos<-fpdf$PASP > 0

just after reading in the data frame.

Jim
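Putting the two messages together, a self-contained version might look like the sketch below. It is an illustration rather than Jim's exact code: it assumes that the quantity of interest is the share of rows with PASP > 0 within each SUBJC group, and it computes that with tapply() and mean() on the logical PASPpos column (the mean of a logical vector is the proportion of TRUE values). The name `prop_pos` is made up for this example.

```r
# The seven example rows from the question
fpdf <- read.table(text = "PASP SUBJC
0 0
4 1
0 0
8 0
4 0
0 1
0 1", header = TRUE)

# Flag the rows with a positive PASP value
fpdf$PASPpos <- fpdf$PASP > 0

# Proportion of positive PASP values within each SUBJC group
prop_pos <- tapply(fpdf$PASPpos, fpdf$SUBJC, mean)
print(prop_pos)

# One bar per group; when run non-interactively R sends the plot to a file device
barplot(prop_pos, names.arg = names(prop_pos), xlab = "Group",
        ylab = "Proportion PASP > 0",
        main = "Proportion of PASP positive by group")
```

Using mean() on the logical column avoids computing the group sizes separately, so the numerator and denominator cannot get out of step if a group is empty or the groups are reordered.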
Thanks for the answer.