Hi all,

Thanks for the really great help I've received on this board in the past.

I have a very particular graph that I'm trying to plot, and I'm not really sure how to do it. I think I should be able to use ggplot for this, but I'm not really sure how.

I have a data.frame which contains fifty sub-frames containing one hundred data points each.

I can do a histogram of each of these sub-frames individually, and see the distribution. I can also plot the mean & standard deviation of the fifty together in one plot, where the x axis identifies the sub-frame to which it refers.

What I'd like to do is combine these two things, so that I have a 2-d graph.

The x axis specifies the sub-frame.
The y axis is just the data.

Each x column plots the minimum of the data in the sub-frame, the maximum, and the median, as points. AND each x column also displays histogram data, so that the y values which have more density in the sub-frame are darker, and the ones with less density are lighter.

I know this is fairly particular, and may not be possible, but it would be really great for me!

If anyone can help - thanks!

--
Ian Bentley
M.Sc. Candidate
Queen's University
Kingston, Ontario
Hi Ian,

Have a look at the examples in http://had.co.nz/ggplot2/geom_tile.html for some ideas on how to do this with ggplot2.

Hadley

On Sat, Jul 10, 2010 at 8:10 PM, Ian Bentley <ian.bentley at gmail.com> wrote:
> Hi all,
>
> Thanks for the really great help I've received on this board in the past.
>
> I have a very particular graph that I'm trying to plot, and I'm not really
> sure how to do it. I think I should be able to use ggplot for this, but I'm
> not really sure how.
> [...]

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
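A minimal sketch of the tile idea, for readers following along. It is not taken from the thread: the data are simulated, the names dat, dsum, frame, stat and value are made up for illustration, and it assumes a reasonably current ggplot2 in which stat_bin_2d() is available. Each sub-frame becomes one column of tiles whose shade follows the bin count, and the minimum, median and maximum are drawn on top as points.

```r
library(ggplot2)

# Simulated stand-in for the real data: 50 sub-frames of 100 values each.
set.seed(1)
dat <- data.frame(
  frame = rep(1:50, each = 100),
  value = rnorm(50 * 100, mean = rep(1:50, each = 100))
)

# Per-frame summaries in long format: one row per frame and statistic.
stats <- list(min = min, median = median, max = max)
dsum <- do.call(rbind, lapply(names(stats), function(s) {
  data.frame(
    frame = sort(unique(dat$frame)),
    stat  = s,
    value = as.numeric(tapply(dat$value, dat$frame, stats[[s]]))
  )
}))

p <- ggplot(dat, aes(x = frame, y = value)) +
  # One column of tiles per frame; bins holding more observations are darker.
  stat_bin_2d(binwidth = c(1, 0.5)) +
  scale_fill_gradient(low = "grey90", high = "black") +
  # Summary points on top; mapping shape to stat also produces a legend.
  geom_point(data = dsum, aes(shape = stat), colour = "blue")

print(p)
```

The second element of binwidth controls how coarse the per-frame histogram is; 0.5 is an arbitrary choice for the simulated values and would need tuning for real data.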
I've got a couple more changes that I want to make to my plot, and I can't figure things out. Thanks for all the help.

I'm using this R script:

    library(ggplot2)
    library(lattice)
    # Generate 50 data sets of size 100 and assign them to a list object

    low <- 1
    n <- 50
    # Load data from file
    for(i in low:n) assign(paste('df', i, sep = ''),
        read.table(paste("tot-LinkedList", i*100, "query.log", sep=''),
                   header=TRUE))

    dnames <- paste('df', low:n, sep = '')
    l <- vector('list', n)
    for(i in seq_along(dnames)) l[[i]] <- with(get(dnames[i]), Send + Receive)
    ml <- melt(l)

    dsum <- ddply(ml, 'L1', summarise, mins = min(value), meds = median(value),
                  maxs = max(value))

    p <- ggplot(ml, aes(x = L1*100, y = value)) +
        geom_point(alpha = 0.2) +
        geom_point(data = dsum, aes(y = mins), shape = 1, size = 3,
                   solid=TRUE, colour='blue') +
        geom_point(data = dsum, aes(y = meds), shape = 2, size = 3,
                   solid=TRUE, colour='blue') +
        geom_point(data = dsum, aes(y = maxs), shape = 3, size = 3,
                   solid=TRUE, colour='blue') +
        geom_smooth(data = dsum, aes(y = mins)) +
        geom_smooth(data = dsum, aes(y = meds)) +
        geom_smooth(data = dsum, aes(y = maxs)) +
        opts(axis.text.x = theme_text(size = 7, angle = 90, hjust = 1),
             title = 'Linked List Query Costs Increasing Network Size') +
        xlab('Network Complexity (nodes)') + ylab('Battery Cost (uJ)')

    --END--

And this works great, except that I think that I am not being very R'y, since now I want to add a legend saying that the circle (i.e. shape 1) is the minimum, shape 2 is the median, and shape 3 is the maximum.

I'd also like to be able to move the legend to the top left part of the plot, since that area is empty anyways.

Is there any way that I can do it easily?

Thanks
Ian

On 11 July 2010 10:29, Ian Bentley <ian.bentley@gmail.com> wrote:
> Thanks to both of you!
>
> I was able to get exactly the plot I was looking for!
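One possible way to get both the legend and the top-left placement, sketched here under assumptions rather than taken from the thread: the data are simulated in place of the log files, the names wide and long are made up, and the code targets a current ggplot2, where theme() and element_text() have replaced the opts() and theme_text() calls used in the script. The summaries are reshaped to long form so that shape can be mapped to the statistic inside aes(), which is what makes ggplot2 draw a legend.

```r
library(ggplot2)

# Simulated stand-in for the melted data (columns L1 and value).
set.seed(42)
ml <- data.frame(
  L1    = rep(1:50, each = 100),
  value = rexp(50 * 100, rate = 1 / rep(1:50, each = 100))
)

# Per-group summaries, first wide, then stacked into long form.
wide <- data.frame(
  L1     = sort(unique(ml$L1)),
  Min    = as.numeric(tapply(ml$value, ml$L1, min)),
  Median = as.numeric(tapply(ml$value, ml$L1, median)),
  Max    = as.numeric(tapply(ml$value, ml$L1, max))
)
long <- data.frame(
  L1        = rep(wide$L1, times = 3),
  statistic = factor(rep(c("Min", "Median", "Max"), each = nrow(wide)),
                     levels = c("Min", "Median", "Max")),
  value     = c(wide$Min, wide$Median, wide$Max)
)

p <- ggplot(ml, aes(x = L1 * 100, y = value)) +
  geom_point(alpha = 0.2) +
  # shape is mapped to the statistic, so a legend is generated for it.
  geom_point(data = long, aes(shape = statistic), size = 3, colour = "blue") +
  geom_smooth(data = long, aes(group = statistic),
              method = "loess", formula = y ~ x) +
  scale_shape_manual("Statistic", values = c(Min = 1, Median = 2, Max = 3)) +
  # Anchor the legend's top-left corner near the top-left of the panel.
  # Newer releases prefer legend.position = "inside" together with
  # legend.position.inside; the numeric form may emit a deprecation warning.
  theme(legend.position = c(0.02, 0.98), legend.justification = c(0, 1)) +
  xlab("Network Complexity (nodes)") + ylab("Battery Cost (uJ)")

print(p)
```

A single long-format layer replaces the three separate geom_point() calls, so adding another statistic later only means adding rows to long and one entry to values.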
Hi Dennis,

Thanks for the quick reply.

Once I removed solid = TRUE, which was giving errors, the code is accepted fine. It's strange though, no legend appears. Even when I try something simple like:

    p + scale_shape_manual(values=1:3)

No legend appears. I can't find any similar problems on google.

Thanks again,
Ian

On 14 July 2010 03:56, Dennis Murphy <djmuser@gmail.com> wrote:
> Hi:
>
> This is untested, so caveat emptor. I believe Hadley is busy teaching a
> ggplot2 course this week so his availability is limited at best. I guess I
> can give it a shot...
>
> You need a scale_shape_* construct to add to your plot, perhaps something
> like
>
>     scale_shape_manual('Statistic', breaks = 1:3,
>                        labels = c('Min', 'Median', 'Max'), solid = TRUE)
>
> The 'Statistic' puts a title on the legend, the breaks argument should
> supply the values of the shapes, the labels argument should provide the
> label to associate to each shape, and solid = TRUE should produce the same
> behavior as in the geom_point() calls wrt shapes. [Notice how I say
> 'should'...]
>
> No guarantees this will work - scales are one of my greatest frustrations
> in ggplot2. Expect this to be the first of several iterations you'll have to
> go through to get it to work the way you want.
>
> HTH,
> Dennis
>
> On Tue, Jul 13, 2010 at 4:32 PM, Ian Bentley <ian.bentley@gmail.com> wrote:
>> I've got a couple of more changes that I want to make to my plot, and I
>> can't figure things out. Thanks for all the help.
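A likely reason no legend shows up: ggplot2 only builds a legend for an aesthetic that is mapped inside aes(). In the script, shape = 1, 2 and 3 are set as constants outside aes(), so scale_shape_manual() has nothing to describe. The sketch below is an assumption-laden illustration rather than a tested fix for the original files: it keeps the three separate layers but moves the shape into aes() as a label string, then lets scale_shape_manual() translate the labels into shapes. The data are simulated and the layer_* names are made up.

```r
library(ggplot2)

# Simulated stand-ins for ml and dsum from the script.
set.seed(7)
ml <- data.frame(L1 = rep(1:50, each = 100), value = runif(50 * 100))
dsum <- data.frame(
  L1   = sort(unique(ml$L1)),
  mins = as.numeric(tapply(ml$value, ml$L1, min)),
  meds = as.numeric(tapply(ml$value, ml$L1, median)),
  maxs = as.numeric(tapply(ml$value, ml$L1, max))
)

# shape now lives inside aes(), as a label rather than a shape number.
layer_min <- geom_point(data = dsum, aes(y = mins, shape = "Min"),
                        size = 3, colour = "blue")
layer_med <- geom_point(data = dsum, aes(y = meds, shape = "Median"),
                        size = 3, colour = "blue")
layer_max <- geom_point(data = dsum, aes(y = maxs, shape = "Max"),
                        size = 3, colour = "blue")

p <- ggplot(ml, aes(x = L1 * 100, y = value)) +
  geom_point(alpha = 0.2) +
  layer_min + layer_med + layer_max +
  # Translate each label into the shape number used in the original script;
  # breaks fixes the order of the legend entries.
  scale_shape_manual("Statistic",
                     values = c(Min = 1, Median = 2, Max = 3),
                     breaks = c("Min", "Median", "Max"))

print(p)
```

The difference from the original layers is visible in the layer objects themselves: names(layer_min$mapping) now includes "shape", whereas a layer built with shape = 1 outside aes() stores it only as a fixed parameter.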