Jan Vanvinkenroye
2014-Oct-29 16:56 UTC
[R] Fwd: Combining stacked bar charts for logfile analysis
Begin forwarded message:
From: Jan Vanvinkenroye <jan.vanvinkenroye at tik.uni-stuttgart.de>
Date: 29 October 2014 17:52:06 CET
Subject: Combining stacked bar charts for logfile analysis
To: r-help at r-project.org

Hello everyone,

To assess web server response times, I would like to combine some information from an Apache log file [1]. This is my first project using R, and I would be very grateful if someone could help me or point me in the right direction.

So far I have managed to read the file into a data frame and to convert the response time (duration_microseconds) into a factor with three discrete classes ("gut", "ok", "schlecht", i.e. good, ok, bad): <=50000, <=200000ms, <20000ms.

barplot(table(access_log$bewertung), beside = FALSE, width = 1, xlab="Response Time", ylab="Percentage", col=c("green", "yellow", "red"))

gives me an aggregated percentage of the response-time classes over every request in the log file, and

qplot(time, duration_microseconds, data=access_log, shape=bewertung)

returns a plot of the response times over time.

How can I combine both plots into a stacked bar chart per hour or per day? The given example contains only 20 lines; the original log file has several thousand. A plot of the full data came out rather crowded (mostly black).
[1] my data.frame

access_log <-
structure(list(host = structure(c(7L, 7L, 7L, 7L, 7L, 7L, 7L,
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L), .Label = c("126.88.69.199",
"141.43.100.201", "141.58.109.90", "141.58.110.210", "0.0.0.0",
"141.58.1", "1.1.1.1"), class = "factor"), time = c("2014-07-17 16:25:02",
"2014-07-17 16:25:02", "2014-07-17 16:25:02", "2014-07-17 16:25:08",
"2014-07-17 16:25:02", "2014-07-17 16:25:02", "2014-07-17 16:25:12",
"2014-07-17 16:25:13", "2014-07-17 16:25:13", "2014-07-17 16:25:12",
"2014-07-17 16:25:02", "2014-07-17 16:25:02", "2014-07-17 16:25:02",
"2014-07-17 16:25:08", "2014-07-17 16:25:02", "2014-07-17 16:25:02",
"2014-07-17 16:25:12", "2014-07-17 16:25:13", "2014-07-17 16:25:13",
"2014-07-17 16:25:12"), time_zone = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), .Label = "+0200", class = "factor"), request = structure(c(4L,
4L, 4L, 4L, 1L, 1L, 3L, 4L, 1L, 2L, 4L, 4L, 4L, 4L, 1L, 1L, 3L,
4L, 1L, 2L), .Label = c("GET /home/ HTTP/1.1", "GET /home/bildergalerie/Beratung.jpg HTTP/1.1",
"GET /home/css/realm_xhtml_2.0.css HTTP/1.1", "GET /home/r.html HTTP/1.1"
), class = "factor"), status = structure(c(2L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L), .Label = c("200",
"302"), class = "factor"), bytes = structure(c(1L, 1L, 1L, 1L,
3L, 3L, 4L, 1L, 3L, 2L, 1L, 1L, 1L, 1L, 3L, 3L, 4L, 1L, 3L, 2L
), .Label = c("-", "109640", "27930", "856"), class = "factor"),
    referal = structure(c(1L, 1L, 2L, 1L, 2L, 1L, 3L, 1L, 1L,
    3L, 1L, 1L, 2L, 1L, 2L, 1L, 3L, 1L, 1L, 3L), .Label = c("-",
    "http://en.wikipedia.org/wiki/University_of_Stuttgart", "http://www.uni-stuttgart.de/home/"
    ), class = "factor"), browser = structure(c(2L, 2L, 3L, 1L,
    3L, 2L, 3L, 2L, 1L, 3L, 2L, 2L, 3L, 1L, 3L, 2L, 3L, 2L, 1L,
    3L), .Label = c("libwww-perl/5.805", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)",
    "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2095.0 Safari/537.36"
    ), class = "factor"), duration_seconds = c(0, 0, 0, 0, 8,
    9, 0, 0, 0, 1, 0, 0, 0, 0, 8, 9, 0, 0, 0, 1), duration_microseconds = c(11263,
    2386, 1626, 1970, 8944261, 9474883, 1018, 2953, 73138, 1080564,
    11263, 2386, 1626, 1970, 8944261, 9474883, 1018, 2953, 73138,
    1080564), bewertung = structure(c(1L, 1L, 1L, 1L, 3L, 3L,
    1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 3L, 3L, 1L, 1L, 1L, 2L), .Label = c("gut",
    "ok", "schlecht"), class = "factor")), .Names = c("host",
"time", "time_zone", "request", "status", "bytes", "referal",
"browser", "duration_seconds", "duration_microseconds", "bewertung"
), row.names = c(NA, 20L), class = "data.frame")

---
Kind regards,
Jan Vanvinkenroye

Jan Vanvinkenroye, Dipl. Päd., Evasys- / Vitero Administration, Forschung & Evaluation
Informations- und Kommunikationszentrum der Universität Stuttgart (IZUS)
Technische Informations- und Kommunikationsdienste (TIK-Dienste, ehem. RUS)
Abteilung für Neue Medien in Forschung und Lehre (NFL)
Allmandring 30a · 70550 Stuttgart · Tel +49(0)711-685-87325 · Fax +49(0)711-685-77325
jan.vanvinkenroye at tik.uni-stuttgart.de · http://www.izus.uni-stuttgart.de/
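
The classification step described in the message (turning duration_microseconds into the three classes) is not shown in the post. A minimal sketch of how it might be done with cut() follows. The break points are assumptions chosen for illustration, not the poster's actual thresholds: the thresholds quoted in the message do not reproduce the labels in the sample data (for example, 73138 is labelled "gut"), whereas the hypothetical breaks below do.

```r
# Durations in microseconds, copied from the first ten rows of the posted sample
duration_microseconds <- c(11263, 2386, 1626, 1970, 8944261, 9474883,
                           1018, 2953, 73138, 1080564)

# Hypothetical break points (assumption): up to 0.5 s is "gut",
# up to 2 s is "ok", anything slower is "schlecht"
breaks <- c(-Inf, 500000, 2000000, Inf)

# cut() uses right-closed intervals by default, i.e. (lower, upper]
bewertung <- cut(duration_microseconds, breaks = breaks,
                 labels = c("gut", "ok", "schlecht"))

# Counts per class
table(bewertung)
```

Adjusting the two middle values of breaks is enough to move the class boundaries; the labels stay in the same order.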
Richard M. Heiberger
2014-Oct-29 17:53 UTC
[R] Fwd: Combining stacked bar charts for logfile analysis
Jan,

Thank you for posting a reproducible example.

This is my first pass at providing a stacked bar chart by time. I have placed schlecht on the negative side and both ok and gut on the positive side. I don't know what you mean by percent from this data snippet. I show how to produce a likert plot using either counts or percents. You might want time at a different granularity. There are many more options in likert; see ?likert and demo(likert) for details.

Time.bewertung <- with(access_log, table(as.POSIXct(access_log$time), bewertung))

install.packages("HH") ## if you don't have it yet
library(HH)
likert(Time.bewertung[,3:1], ReferenceZero=1.5)
likert(Time.bewertung[,3:1], ReferenceZero=1.5, as.percent=TRUE)

Rich
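
For the per-hour or per-day stacked bars asked about in the thread, one possible approach is to truncate each timestamp to its hour before tabulating. The sketch below uses base R only and a small made-up data frame (toy_log and its six rows are hypothetical, since the posted sample falls entirely within one minute); it is an illustration under those assumptions, not the method used in the reply above.

```r
# Hypothetical toy log (assumption): six requests spread over two hours
toy_log <- data.frame(
  time = c("2014-07-17 16:25:02", "2014-07-17 16:40:10", "2014-07-17 16:59:59",
           "2014-07-17 17:05:00", "2014-07-17 17:30:00", "2014-07-17 17:45:00"),
  bewertung = factor(c("gut", "gut", "schlecht", "gut", "ok", "ok"),
                     levels = c("gut", "ok", "schlecht"))
)

# Truncate each timestamp to the start of its hour
toy_log$hour <- format(as.POSIXct(toy_log$time, tz = "UTC"), "%Y-%m-%d %H:00")

# Rows are rating classes, columns are hours
counts <- table(toy_log$bewertung, toy_log$hour)

# Convert each column (hour) to percentages that sum to 100
shares <- prop.table(counts, margin = 2) * 100

# barplot() stacks the rows of a matrix within each column by default
barplot(shares, col = c("green", "yellow", "red"),
        xlab = "Hour", ylab = "Percentage",
        legend.text = rownames(shares))
```

Changing the format string to "%Y-%m-%d" would group by day instead of by hour. With several thousand log lines this yields one bar per hour or day rather than one point per request, which should avoid the crowded plot.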