Jan Vanvinkenroye
2014-Oct-29 16:56 UTC
[R] Fwd: Combining stacked bar charts for logfile analysis
Begin forwarded message:
From: Jan Vanvinkenroye <jan.vanvinkenroye at tik.uni-stuttgart.de>
Date: 29 October 2014 17:52:06 CET
Subject: Combining stacked bar charts for logfile analysis
To: r-help at r-project.org
Hello everyone,
in order to assess web server response time I would like to combine some
information from an Apache logfile. [1] This is my first project using R, and
I would be very grateful if someone could help me or point me in the right
direction :)
So far I have managed to read the file into a data frame and factorize the
response time (duration_microseconds) into three discrete classes ("gut",
"ok", "schlecht", i.e. good, ok, bad): <=50000, <=200000ms, <20000ms.
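The classification step is not shown in the post. A minimal sketch of how it might be done with cut() follows; the cut points of 50000 and 200000 microseconds and the sample durations are assumptions for illustration, not values confirmed by the posted data.

```r
# Hypothetical durations in microseconds (not taken from the real log)
duration_microseconds <- c(1018, 50000, 73138, 200000, 1080564)

# Assumed cut points: <= 50000 is "gut", <= 200000 is "ok",
# anything above is "schlecht"
bewertung <- cut(duration_microseconds,
                 breaks = c(-Inf, 50000, 200000, Inf),
                 labels = c("gut", "ok", "schlecht"))

table(bewertung)
```

Because cut() uses right-closed intervals by default, a value exactly on a cut point falls into the lower class.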
barplot(table(access_log$bewertung), beside = FALSE, width = 1,
xlab="Response Time", ylab="Percentage",
col=c("green", "yellow", "red"))
gives me an aggregated percentage of the response times of all requests in
the logfile, and
qplot(time, duration_microseconds, data=access_log, shape=bewertung)
returns a plot of the response times over time.
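Note that table() returns counts, so the barplot() call above plots counts even though the axis is labelled "Percentage". A small sketch of one way to obtain percentages with prop.table() follows; the ratings vector is hypothetical and only stands in for access_log$bewertung.

```r
# Hypothetical ratings standing in for access_log$bewertung
bewertung <- factor(c("gut", "gut", "gut", "ok", "schlecht"),
                    levels = c("gut", "ok", "schlecht"))

counts  <- table(bewertung)           # absolute counts per class
percent <- 100 * prop.table(counts)   # share of each class in percent

barplot(percent, xlab = "Response Time", ylab = "Percentage",
        col = c("green", "yellow", "red"))
```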
How can I combine both plots into a stacked bar chart per hour or per day?
The given example contains only 20 lines; the original log file has several
thousand. A plot of the full data came out rather crowded (mostly black).
[1] my data.frame
access_log <-
structure(list(host = structure(c(7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L),
.Label = c("126.88.69.199", "141.43.100.201", "141.58.109.90",
"141.58.110.210", "0.0.0.0", "141.58.1", "1.1.1.1"), class = "factor"),
time = c("2014-07-17 16:25:02", "2014-07-17 16:25:02",
"2014-07-17 16:25:02", "2014-07-17 16:25:08",
"2014-07-17 16:25:02", "2014-07-17 16:25:02",
"2014-07-17 16:25:12", "2014-07-17 16:25:13",
"2014-07-17 16:25:13", "2014-07-17 16:25:12",
"2014-07-17 16:25:02", "2014-07-17 16:25:02",
"2014-07-17 16:25:02", "2014-07-17 16:25:08",
"2014-07-17 16:25:02", "2014-07-17 16:25:02",
"2014-07-17 16:25:12", "2014-07-17 16:25:13",
"2014-07-17 16:25:13", "2014-07-17 16:25:12"),
time_zone = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), .Label = "+0200", class = "factor"),
request = structure(c(4L,
4L, 4L, 4L, 1L, 1L, 3L, 4L, 1L, 2L, 4L, 4L, 4L, 4L, 1L, 1L, 3L,
4L, 1L, 2L), .Label = c("GET /home/ HTTP/1.1",
"GET /home/bildergalerie/Beratung.jpg HTTP/1.1",
"GET /home/css/realm_xhtml_2.0.css HTTP/1.1",
"GET /home/r.html HTTP/1.1"
), class = "factor"),
status = structure(c(2L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L),
.Label = c("200", "302"), class = "factor"),
bytes = structure(c(1L, 1L, 1L, 1L,
3L, 3L, 4L, 1L, 3L, 2L, 1L, 1L, 1L, 1L, 3L, 3L, 4L, 1L, 3L, 2L
), .Label = c("-", "109640", "27930", "856"), class = "factor"),
referal = structure(c(1L, 1L, 2L, 1L, 2L, 1L, 3L, 1L, 1L,
3L, 1L, 1L, 2L, 1L, 2L, 1L, 3L, 1L, 1L, 3L), .Label = c("-",
"http://en.wikipedia.org/wiki/University_of_Stuttgart",
"http://www.uni-stuttgart.de/home/"
), class = "factor"),
browser = structure(c(2L, 2L, 3L, 1L,
3L, 2L, 3L, 2L, 1L, 3L, 2L, 2L, 3L, 1L, 3L, 2L, 3L, 2L, 1L,
3L), .Label = c("libwww-perl/5.805",
"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)",
"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2095.0 Safari/537.36"
), class = "factor"),
duration_seconds = c(0, 0, 0, 0, 8,
9, 0, 0, 0, 1, 0, 0, 0, 0, 8, 9, 0, 0, 0, 1),
duration_microseconds = c(11263,
2386, 1626, 1970, 8944261, 9474883, 1018, 2953, 73138, 1080564,
11263, 2386, 1626, 1970, 8944261, 9474883, 1018, 2953, 73138,
1080564),
bewertung = structure(c(1L, 1L, 1L, 1L, 3L, 3L,
1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 3L, 3L, 1L, 1L, 1L, 2L),
.Label = c("gut", "ok", "schlecht"), class = "factor")),
.Names = c("host", "time", "time_zone", "request",
"status", "bytes", "referal", "browser", "duration_seconds",
"duration_microseconds", "bewertung"
), row.names = c(NA, 20L), class = "data.frame")
---
Kind regards
Jan Vanvinkenroye
Jan Vanvinkenroye, Dipl. Päd., Evasys- / Vitero Administration, Forschung &
Evaluation
Informations- und Kommunikationszentrum der Universität Stuttgart (IZUS)
Technische Informations- und Kommunikationsdienste (TIK-Dienste, ehem. RUS)
Abteilung für Neue Medien in Forschung und Lehre (NFL)
Allmandring 30a · 70550 Stuttgart · Tel +49(0)711-685-87325 · Fax
+49(0)711-685-77325
jan.vanvinkenroye at tik.uni-stuttgart.de · http://www.izus.uni-stuttgart.de/
Richard M. Heiberger
2014-Oct-29 17:53 UTC
[R] Fwd: Combining stacked bar charts for logfile analysis
Jan,
Thank you for posting a reproducible example.
This is my first pass at providing a stacked bar chart by time. I have placed
schlecht on the negative side and both ok and gut on the positive side.
I don't know what you mean by percent from this data snippet.
I show how to produce a likert plot using either counts or percents.
You might want time at a different granularity.
There are many more options in likert; see ?likert and demo(likert) for details.
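## cross-tabulate the rating against the timestamp
Time.bewertung <- with(access_log, table(as.POSIXct(time), bewertung))
install.packages("HH") ## if you don't have it yet
library(HH)
## columns reordered to schlecht, ok, gut; ReferenceZero=1.5 puts the
## zero line between schlecht and ok
likert(Time.bewertung[,3:1], ReferenceZero=1.5)
likert(Time.bewertung[,3:1], ReferenceZero=1.5, as.percent=TRUE)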
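For the coarser time granularity mentioned above, one possible approach is to truncate each timestamp to its hour before tabulating. The sketch below uses base R only; the data frame logdata is hypothetical (its column names merely follow the posted access_log), and the UTC time zone is an assumption made to keep the example reproducible.

```r
# Hypothetical log excerpt; column names follow the posted access_log
logdata <- data.frame(
  time = c("2014-07-17 16:25:02", "2014-07-17 16:40:10",
           "2014-07-17 17:05:00", "2014-07-17 17:30:45",
           "2014-07-17 17:59:59"),
  bewertung = factor(c("gut", "schlecht", "gut", "ok", "gut"),
                     levels = c("gut", "ok", "schlecht")),
  stringsAsFactors = FALSE
)

# Truncate each timestamp to its hour (use "%Y-%m-%d" for daily bins)
stamp <- as.POSIXct(logdata$time, tz = "UTC")
logdata$hour <- format(stamp, "%Y-%m-%d %H:00")

# Rows = rating, columns = hour; barplot() stacks the rows within each bar
Hour.bewertung <- table(logdata$bewertung, logdata$hour)
barplot(Hour.bewertung, col = c("green", "yellow", "red"),
        xlab = "Hour", ylab = "Requests", legend.text = TRUE)

# Column-wise percentages, if shares per hour are wanted instead of counts
Hour.percent <- 100 * prop.table(Hour.bewertung, margin = 2)
```

With several thousand requests, each bar then summarizes one hour instead of one timestamp, which should avoid the crowded plot described in the question.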
Rich