Spam-detection software on the host hypatia.math.ethz.ch has identified the incoming e-mail as possible spam. The original message has been attached to this report so that you can view it (in case it is a legitimate e-mail after all) or flag similar unwanted messages in the future. If you have questions about this process, please contact the administrator of that system.

Preview: Hello everybody, I want to use R to generate plots from categorical data. The data contains results from OCR scans of images which are preprocessed by different image filtering techniques. A small sample data set looks as follows: [...]

Content analysis details: (5.5 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.0 DKIM_POLICY_SIGNSOME   Domain Keys Identified Mail: policy says domain signs some mails
 1.0 BAYES_60               BODY: Bayes spam probability is 60 to 80% [score: 0.7481]
 4.5 AWL                    AWL: From: address is in the auto white-list

-------------- next part --------------
An embedded message was scrubbed...
From: "Christoph Krammer" <ck at altaica.de>
Subject: Plots from categorial data
Date: Fri, 29 Jun 2007 21:32:39 +0200
Size: 3027
Url: https://stat.ethz.ch/pipermail/r-help/attachments/20070629/fafb8ed3/attachment.mht
Hello everybody,

Since my first message was caught by the spam filter, I am trying again:

I want to use R to generate plots from categorical data. The data contains results from OCR scans of images which are preprocessed by different image filtering techniques. A small sample data set looks as follows:

> data <- read.csv("d:/tmp_da/sql_data/filter_d_tool.csv", header=TRUE)
> data
      ocrtool filter_setting avg.hit.
1  FineReader            2x1    0.383
2  FineReader            2x2    0.488
3  FineReader            3x2    0.268
4  FineReader            3x3    0.198
5  FineReader            4x3    0.081
6  FineReader            4x4    0.056
7        gocr            2x1    0.153
8        gocr            2x2    0.102
9        gocr            3x2    0.047
10       gocr            3x3    0.052
11       gocr            4x3    0.014
12       gocr            4x4    0.002
13      ocrad            2x1    0.085
14      ocrad            2x2    0.094
15      ocrad            3x2    0.045
16      ocrad            3x3    0.050
17      ocrad            4x3    0.025
18      ocrad            4x4    0.009

I now want to draw a plot with the categories (filter_setting) on the X axis and avg.hit. on the Y axis, with one line for each ocrtool.

But when I draw a plot, the result always contains bars, even if I specify type="n".

> plot(data$filter_setting, data$avg.hit., type="n")

When I plot only the categories, without data, strange grey (but empty) boxes appear.

> plot(data$filter_setting, type="n")

How do I get a clean white box to draw the different lines in?

Thanks and regards,
Christoph

---
Christoph Krammer
Student

University of Mannheim
Laboratory for Dependable Distributed Systems
A5, 6
68131 Mannheim
Germany
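A note on why the bars and boxes show up: plot() is a generic function, and when its first argument is a factor it dispatches to plot.factor(), which draws box plots against a numeric y and a bar chart when given the factor alone. The type argument does not turn these displays into an empty canvas. The following is a minimal sketch, assuming the sample data is retyped by hand into a data frame (the names `ocr` and `x_codes` are illustrative, not from the post) and that strings are read as factors, which was the read.csv default at the time but is no longer the default since R 4.0.0.

```r
# Sample data retyped from the post; `ocr` is a hypothetical name.
# stringsAsFactors = TRUE mimics the pre-4.0 read.csv default.
ocr <- data.frame(
  ocrtool = rep(c("FineReader", "gocr", "ocrad"), each = 6),
  filter_setting = rep(c("2x1", "2x2", "3x2", "3x3", "4x3", "4x4"), times = 3),
  avg.hit. = c(0.383, 0.488, 0.268, 0.198, 0.081, 0.056,
               0.153, 0.102, 0.047, 0.052, 0.014, 0.002,
               0.085, 0.094, 0.045, 0.050, 0.025, 0.009),
  stringsAsFactors = TRUE
)

class(ocr$filter_setting)   # "factor", so plot() uses plot.factor()

# Integer codes of the factor levels (1 to 6); plot() treats these as numbers.
x_codes <- as.integer(ocr$filter_setting)

# Empty canvas: nothing is drawn, and the default x axis is suppressed.
plot(x_codes, ocr$avg.hit., type = "n", xaxt = "n",
     xlab = "Filter setting", ylab = "Average hits")

# Label the integer positions with the category names.
axis(1, at = seq_along(levels(ocr$filter_setting)),
     labels = levels(ocr$filter_setting))
```

Lines for each tool can then be added to this canvas with lines() or points().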
Perhaps this will do what you want:

library(ggplot2)
qplot(filter_setting, avg.hit., data=data, colour=ocrtool, geom="line")

Find out more about ggplot2 at http://had.co.nz/ggplot2

Hadley

On 7/1/07, Christoph Krammer <ck at altaica.de> wrote:
> [...]
> How do I get a clean white box to draw the different lines in?
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Christoph Krammer wrote:
> [...]
> How do I get a clean white box to draw the different lines in?

Hi Christoph,

How about this?

# krammer: Jim's copy of the sample data, with the value column named avg_hit
# FineReader (rows 1-6): start the plot without the default axes
plot(as.numeric(krammer$filter_setting[1:6]), krammer$avg_hit[1:6],
     type="b", col=2, ylim=c(0,0.5), main="OCR performance",
     xlab="Filter setting", ylab="Average hits", axes=FALSE)
# gocr (rows 7-12)
points(as.numeric(krammer$filter_setting[7:12]), krammer$avg_hit[7:12],
       type="b", col=3)
# ocrad (rows 13-18)
points(as.numeric(krammer$filter_setting[13:18]), krammer$avg_hit[13:18],
       type="b", col=4)
box()
# label the integer positions 1 to 6 with the filter settings
axis(1, at=1:6, labels=c("2x1","2x2","3x2","3x3","4x3","4x4"))
axis(2)

Jim
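The same base-graphics approach can be written without hard-coded row ranges, which may help when the number of tools or settings changes. This is only a sketch under stated assumptions: the data frame `ocr` and the helper names `lev`, `tools` and `rows` are illustrative and not part of the thread, and the data is retyped from the original post with strings stored as factors.

```r
# Sample data retyped from the original post; `ocr` is a hypothetical name.
ocr <- data.frame(
  ocrtool = rep(c("FineReader", "gocr", "ocrad"), each = 6),
  filter_setting = rep(c("2x1", "2x2", "3x2", "3x3", "4x3", "4x4"), times = 3),
  avg.hit. = c(0.383, 0.488, 0.268, 0.198, 0.081, 0.056,
               0.153, 0.102, 0.047, 0.052, 0.014, 0.002,
               0.085, 0.094, 0.045, 0.050, 0.025, 0.009),
  stringsAsFactors = TRUE
)

lev   <- levels(ocr$filter_setting)  # category labels for the x axis
tools <- levels(ocr$ocrtool)         # one line per OCR tool

# Empty canvas sized to the data, without the default x axis.
plot(NA, xlim = c(1, length(lev)), ylim = range(0, ocr$avg.hit.),
     xaxt = "n", xlab = "Filter setting", ylab = "Average hits",
     main = "OCR performance")
axis(1, at = seq_along(lev), labels = lev)

# Draw one line with points per tool, using the factor codes as x positions.
for (i in seq_along(tools)) {
  rows <- ocr[ocr$ocrtool == tools[i], ]
  lines(as.integer(rows$filter_setting), rows$avg.hit.,
        type = "b", col = i + 1)
}

legend("topright", legend = tools, col = seq_along(tools) + 1,
       lty = 1, pch = 1)
```

The colours 2, 3 and 4 match the ones used in the code above; the legend is an addition for readability.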
Hello Hadley,

Thanks a lot for your help. I got the plot I wanted from this package with a slightly more complicated command.

But now I have an additional problem:

In the given case, the "filtersetting" column contains letters, so R treats the values as categories. But I have other filters which have only numeric categories like "0.125", "0.25", "1", and so on. There is no real "distance" between these values, so the data is still categorical. But if I draw a plot from this data, the result has axis labels like 0.2, 0.4, 0.6, ...

How do I tell R to treat the numbers in the filtersetting column as categories?

Thanks and regards,
Christoph

-----Original Message-----
From: hadley wickham [mailto:h.wickham at gmail.com]
Sent: Sunday, 1 July 2007 12:21
To: Christoph Krammer
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Plots from categorial data

[...]
On 7/1/07, Christoph Krammer <ck at altaica.de> wrote:
> [...]
> How do I tell R to treat the numbers in the filtersetting column as
> categories?

Just make it a factor:

qplot(factor(filter_setting), avg.hit., data=data, colour=ocrtool, geom="line")

Hadley
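What factor() does to numeric categories can be checked independently of any plotting package. The sketch below uses only the three category values named in the question; the variable names `setting` and `f` are illustrative. Each distinct number becomes a level, and the underlying integer codes are equally spaced, which is why the axis stops treating the values as distances.

```r
# The numeric category values mentioned in the question.
setting <- c(0.125, 0.25, 1)

f <- factor(setting)

levels(f)       # "0.125" "0.25" "1": the labels shown on a categorical axis
as.integer(f)   # 1 2 3: equally spaced positions, one per category

# Caution: as.numeric() on a factor returns the codes, not the original
# values. Go through the labels to recover the numbers.
as.numeric(as.character(f))   # 0.125 0.250 1.000
```

If the categories should appear in a particular order, the levels argument of factor() can set it explicitly.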
On 7/1/07, Jim Lemon <jim at bitwrit.com.au> wrote:
> [...]
> Hi Christoph,
>
> How about this?
>
> plot(as.numeric(krammer$filter_setting[1:6]),krammer$avg_hit[1:6],
>   type="b",col=2,ylim=c(0,0.5),main="OCR performance",
>   xlab="Filter setting",ylab="Average hits",axes=FALSE)
> points(as.numeric(krammer$filter_setting[7:12]),krammer$avg_hit[7:12],
>   type="b",col=3)
> points(as.numeric(krammer$filter_setting[13:18]),krammer$avg_hit[13:18],
>   type="b",col=4)
> box()
> axis(1,at=1:6,labels=c("2x1","2x2","3x2","3x3","4x3","4x4"))
> axis(2)

And this is mostly equivalent to

with(krammer, interaction.plot(filter_setting, ocrtool, avg_hit))

or (with the original names)

with(data, interaction.plot(filter_setting, ocrtool, avg.hit.))

-Deepayan
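For readers who want to try the interaction.plot() call without the original CSV file, here is a self-contained sketch. The data frame name `ocr`, the axis labels and the `cell_means` check are assumptions added for illustration; the data is retyped from the original post. interaction.plot() draws a summary (the mean by default) of the response for each combination of the two factors, and since this data has exactly one row per combination, the plotted values are the raw ones.

```r
# Sample data retyped from the original post; `ocr` is a hypothetical name.
ocr <- data.frame(
  ocrtool = rep(c("FineReader", "gocr", "ocrad"), each = 6),
  filter_setting = rep(c("2x1", "2x2", "3x2", "3x3", "4x3", "4x4"), times = 3),
  avg.hit. = c(0.383, 0.488, 0.268, 0.198, 0.081, 0.056,
               0.153, 0.102, 0.047, 0.052, 0.014, 0.002,
               0.085, 0.094, 0.045, 0.050, 0.025, 0.009),
  stringsAsFactors = TRUE
)

# One trace per OCR tool, filter settings on the x axis.
with(ocr, interaction.plot(filter_setting, ocrtool, avg.hit.,
                           type = "b", trace.label = "OCR tool",
                           xlab = "Filter setting", ylab = "Average hits"))

# The values being plotted: the mean of avg.hit. in each cell.
cell_means <- with(ocr, tapply(avg.hit., list(filter_setting, ocrtool), mean))
cell_means
```

With several rows per combination, the same call would plot their means, which is one way in which it differs from drawing the raw points.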