Hi,

I'm working with a data.frame containing values between 0 and 22000. Most of the values are actually between 0 and 50, and the high ones are outliers.

I want to generate a boxplot and, since the outliers are extremely high, I need to scale the y axis logarithmically. Otherwise one wouldn't really see the boxes of the boxplot.

boxplot(dat, log="y", ylim=c(0, max(dat)))

Trying the above doesn't work, since the y scale has to be positive.

But when I generate the boxplot with ylim=c(1, max(dat)), it doesn't properly draw the whiskers or the beginning of the boxes, because some of the minima and first quartiles are 0.

Can anybody help and tell me how I can generate a logarithmic y scale starting at 0?

Thanks in advance,
--
Anne Skoeries

What about "starting" the data by adding some small amount to the 0's?

Perhaps something like

    mysample <- data.frame(aa = sample(c("A","B","C"), 20, replace=TRUE),
                           bb = sample(0:9, 20, replace=TRUE))
    # replace the zeros with a small positive value and store the result
    mysample$bb <- ifelse(mysample$bb == 0, 0.1, mysample$bb)

though you may wish to make .1 much smaller.
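
To make the suggestion concrete for the original question, here is a minimal sketch. It assumes simulated stand-in data (the real `dat` was not posted), and the offset of 0.1 and the name `started` are arbitrary choices for illustration only.

```r
# Hypothetical stand-in for the poster's data: mostly 0..50, a few huge outliers.
set.seed(1)
dat <- data.frame(
  a = c(0, 0, sample(0:50, 93, replace = TRUE), 5000, 9000, 12000, 18000, 22000),
  b = c(0, 0, sample(0:50, 93, replace = TRUE), 3000, 7000, 11000, 15000, 20000)
)

offset <- 0.1  # arbitrary "start" value (an assumption); choose to suit the data

# Replace exact zeros with the offset, column by column.
started <- as.data.frame(lapply(dat, function(x) ifelse(x == 0, offset, x)))

# A log axis needs strictly positive values and limits.
stopifnot(min(started) > 0)

boxplot(started, log = "y", ylim = c(offset, max(started)),
        ylab = "value (zeros drawn at the offset)")
```

Note that the position of the lower whiskers then depends entirely on the chosen offset, so it is worth stating the offset in the axis label or caption.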
--- On Thu, 8/20/09, Anne Skoeries <home at anne-skoeries.de> wrote:
> From: Anne Skoeries <home at anne-skoeries.de>
> Subject: [R] boxplot with log="y" and values starting at 0
> To: r-help at r-project.org
> Received: Thursday, August 20, 2009, 9:15 AM
On 20/08/09 14:15, Anne Skoeries wrote:
> [...]
> Can anybody help and tell me how I can generate a logarithmic y scale
> starting at 0?

I think that is impossible, unless you redefine mathematics and geometry. Sadly R only supports a relatively usual form of mathematics where log(0) is by convention -Inf, and the graphics is basically Euclidean so you can't draw infinities easily. You could try filing a bug report....

What is min(dat)? If that is zero, then you can't use a log scale. If it is small but positive, then you can use that for your ylim.

But your data set has a large range and is therefore intrinsically hard to visualize. Consider some other way of presenting the data. What is the reader supposed to learn from / do with the data you show?

Hope this helps a little.

Allan
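
The points above can be checked at the console. The sketch below again uses simulated stand-in data, since the real `dat` was not posted. The last part shows one possible "other way of presenting the data": plotting log10(1 + x), which maps 0 to 0. That transform is an assumption added here for illustration, not something proposed in the thread.

```r
log(0)  # -Inf, which is why a log axis cannot include 0

# Hypothetical stand-in for the poster's data: mostly 0..50, a few huge outliers.
set.seed(1)
dat <- data.frame(
  a = c(0, 0, sample(0:50, 93, replace = TRUE), 5000, 9000, 12000, 18000, 22000),
  b = c(0, 0, sample(0:50, 93, replace = TRUE), 3000, 7000, 11000, 15000, 20000)
)

min(dat)  # if this is 0, log = "y" cannot show the full range

# Alternative presentation (an assumption, not from the thread):
# transform with log10(1 + x), then label the axis in the original units.
transformed <- log10(dat + 1)
boxplot(transformed, yaxt = "n", ylab = "value (log10(1 + x) scale)")
ticks <- c(0, 10, 100, 1000, 10000)
axis(2, at = log10(ticks + 1), labels = ticks)
```

The axis is then no longer a true log scale near 0, so the label should say which transform was applied.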