Hi,

I'm working with a data.frame containing values between 0 and 22000. Most of the values are actually between 0 and 50, and the high ones are outliers. I want to generate a boxplot, and since the outliers are extremely high, I need to scale the y axis logarithmically. Otherwise one wouldn't really see the boxes of the boxplot.

boxplot(dat, log="y", ylim=c(0, max(dat)))

Trying the above doesn't work, since the y scale has to be positive.

But when I generate the boxplot with

ylim=c(1, max(dat))

it doesn't properly draw the whiskers or the lower edges of the boxes, because some of the minima and first quartiles are 0.

Can anybody help and tell me how I can generate a logarithmic y scale starting at 0?

Thanks in advance,
--
Anne Skoeries
What about "starting" the data by replacing the 0's with some small positive amount? Perhaps something like

# example data: a grouping factor and integer values that include zeros
mysample <- data.frame(aa = sample(c("A","B","C"), 20, replace=TRUE),
                       bb = sample(0:9, 20, replace=TRUE))
# replace exact zeros with a small positive value and store the result
mysample$bb <- ifelse(mysample$bb == 0, .1, mysample$bb)

though you may wish to make .1 much smaller.

--- On Thu, 8/20/09, Anne Skoeries <home at anne-skoeries.de> wrote:

> From: Anne Skoeries <home at anne-skoeries.de>
> Subject: [R] boxplot with log="y" and values starting at 0
> To: r-help at r-project.org
> Received: Thursday, August 20, 2009, 9:15 AM
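The "starting" idea above can be sketched end to end as follows. This is only an illustration under assumptions: the original `dat` was never posted, so the data frame below is a made-up stand-in, and the offset `eps` is an arbitrary choice that should sit below the smallest nonzero value in the real data.

```r
# Hypothetical stand-in for `dat`: mostly small counts, some zeros, a few large outliers.
set.seed(1)
dat <- data.frame(
  a = c(0, 0, rpois(46, 10), 5000, 22000),
  b = c(0, rpois(47, 25), 800, 12000)
)

eps <- 0.5                      # assumed offset, not a recommended value
started <- dat
started[started == 0] <- eps    # replace exact zeros only; other values are untouched

# Every value is now strictly positive, so a log axis is valid.
boxplot(started, log = "y", ylim = c(eps, max(started)),
        ylab = "value (zeros plotted at eps)")
```

Note that the position of the lower whiskers then depends entirely on the chosen offset, so the axis label should say that zeros were shifted.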
On 20/08/09 14:15, Anne Skoeries wrote:
> Can anybody help and tell me how I can generate a logarithmic y scale
> starting at 0?

I think that is impossible, unless you redefine mathematics and geometry. Sadly, R only supports a relatively usual form of mathematics where log(0) is by convention -Inf, and the graphics are basically Euclidean, so you can't draw infinities easily. You could try filing a bug report....

What is min(dat)? If that is zero, then you can't use a log scale. If it is small but positive, then you can use that for your ylim.

But your data set spans a large range and is therefore intrinsically hard to visualize. Consider some other way of presenting the data. What is the reader supposed to learn from, or do with, the data you show?

Hope this helps a little.

Allan
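The check described above can be written out roughly like this. It is a sketch on a made-up stand-in for `dat` (the real data were not posted); the variable name `lo` is introduced here for illustration only.

```r
# Hypothetical stand-in for `dat`; the original data were not posted.
dat <- data.frame(a = c(0, 3, 12, 40, 22000),
                  b = c(1, 7, 25, 48, 9000))

is.infinite(log(0))   # TRUE: log(0) is -Inf, so 0 has no position on a log axis

lo <- min(dat)
if (lo > 0) {
  # Smallest value is positive: it can serve as the lower axis limit.
  boxplot(dat, log = "y", ylim = c(lo, max(dat)))
} else {
  # Zeros (or negative values) are present: a log axis cannot show them.
  message("min(dat) is ", lo, "; a log y axis cannot include it")
}
```

With the stand-in data the minimum is 0, so the second branch runs and no plot is drawn.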