[hmm, after thinking a bit I decided it would be more appropriate here, in stable@]

Dear colleagues,

any hints to tune rrdtool with ~30k rrd files (approx 2k target devices)?

The machine is mostly IO-bound, showing 100% disk load at only 8, or sometimes even 3, MB/s and 300-400 tps (it's 2 SATA300 disks in gmirror).

Thanks in advance.

Sincerely,
D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------
On Mon, Oct 29, 2007 at 11:13:09AM +0300, Dmitry Morozovsky wrote:
> [hmm, after thinking a bit I decided it would be more appropriate here, in
> stable@]
>
> Dear colleagues,
>
> any hints to tune rrdtool with ~30k rrd files (approx 2k target devices)?
>
> machine is mostly IO-bound, showing 100% disk load with 8 or sometimes even 3
> MB/s, 300-400 tps (it's 2 SATA300 disks in gmirror)

Store it on a memory file system and take periodic snapshots. The format is hopeless for large numbers of updates. The ganglia port's startup scripts show an example of doing this.

-- Brooks
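[Editor's sketch] The "periodic snapshots" half of this suggestion could look roughly like the following. This is not the ganglia port's script; the function name and both directory arguments are hypothetical, and it assumes the RRD files already live on a memory file system mounted at `ram_dir`.

```python
import os
import shutil
import tempfile


def snapshot_rrds(ram_dir, disk_dir):
    """Copy every .rrd file under ram_dir to disk_dir; return the file count."""
    os.makedirs(disk_dir, exist_ok=True)
    copied = 0
    for root, _dirs, files in os.walk(ram_dir):
        rel = os.path.relpath(root, ram_dir)
        target_root = os.path.normpath(os.path.join(disk_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            if not name.endswith(".rrd"):
                continue
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            # Copy to a temporary file in the same directory, then rename,
            # so a crash mid-copy never leaves a truncated snapshot behind.
            fd, tmp = tempfile.mkstemp(dir=target_root, prefix=name + ".")
            os.close(fd)
            shutil.copy2(src, tmp)
            os.replace(tmp, dst)
            copied += 1
    return copied
```

Run from cron (or a loop with a sleep), this bounds the data lost on a crash to one snapshot interval. On startup the same copy would be done in the other direction to repopulate the memory file system.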
* Dmitry Morozovsky <marck@rinet.ru> [071029 12:44] wrote:
> [hmm, after thinking a bit I decided it would be more appropriate here, in
> stable@]
>
> Dear colleagues,
>
> any hints to tune rrdtool with ~30k rrd files (approx 2k target devices)?
>
> machine is mostly IO-bound, showing 100% disk load with 8 or sometimes even 3
> MB/s, 300-400 tps (it's 2 SATA300 disks in gmirror)

More RAM? Turn off atime? Hash the data files into multiple directories to avoid having 2k files in one dir.

Not sure how rrdtool works internally, but it might make sense to see if you can use some layering library to force it to cache some open files per process or something.

Giving a better synopsis of how rrdtool records data could help us help you.

-Alfred
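[Editor's sketch] Hashing the data files into multiple directories could be done along these lines. The function name, the choice of MD5 and the two-hex-digit bucket width are illustrative assumptions, not something rrdtool or any frontend prescribes.

```python
import hashlib
import os


def hashed_path(base_dir, rrd_name, levels=1):
    """Return base_dir/<xx>/.../rrd_name, where each <xx> is two hex digits
    taken from a hash of the file name."""
    digest = hashlib.md5(rrd_name.encode("utf-8")).hexdigest()
    buckets = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return os.path.join(base_dir, *buckets, rrd_name)
```

With one level there are 256 buckets, so ~30k files come out at roughly 117 files per directory. The mapping depends only on the file name, so the poller and the grapher can both compute it without a lookup table.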
Have you tried the latest beta, which has had quite a rework in terms of IO?

Regards
Steve

----- Original Message -----
From: "Dmitry Morozovsky" <marck@rinet.ru>
> [hmm, after thinking a bit I decided it would be more appropriate here, in
> stable@]
>
> Dear colleagues,
>
> any hints to tune rrdtool with ~30k rrd files (approx 2k target devices)?
>
> machine is mostly IO-bound, showing 100% disk load with 8 or sometimes even 3
> MB/s, 300-400 tps (it's 2 SATA300 disks in gmirror)
> [hmm, after thinking a bit I decided it would be more appropriate here, in
> stable@]
>
> Dear colleagues,
>
> any hints to tune rrdtool with ~30k rrd files (approx 2k target devices)?
>
> machine is mostly IO-bound, showing 100% disk load with 8 or sometimes
> even 3 MB/s, 300-400 tps (it's 2 SATA300 disks in gmirror)

For example, the update algorithm can be changed. Try not to update the RRD files simultaneously; instead, queue the update data (with timestamps), for example in memory, and periodically do a "bulk update" using a single rrdupdate call for all queued items related to a single RRD file.

This saves a lot of I/O. For example, in my NMS (TclMon) I use a similar scheme, and it now updates >50K RRD files with 5-10 variables each. The I/O load is 100-150 tps with 1-1.5 MB/s throughput.

--
Oleg Derevenetz <oleg@vsi.ru> OOD3-RIPE
Phone: +7 4732 539880 Fax: +7 4732 531415
http://www.vsi.ru
CenterTelecom Voronezh ISP http://isp.vsi.ru
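[Editor's sketch] The queue-then-bulk-update scheme described above might be structured as follows. This is not TclMon's code; the class and method names are hypothetical. It relies only on the documented ability of `rrdtool update` to accept several `timestamp:value[:value...]` arguments in one invocation.

```python
from collections import defaultdict


class UpdateQueue:
    """Buffer samples in memory and emit one 'rrdtool update' per RRD file."""

    def __init__(self):
        self._pending = defaultdict(list)  # rrd path -> [(timestamp, "ts:v1:v2")]

    def add(self, rrd_path, timestamp, values):
        ts = int(timestamp)
        sample = ":".join([str(ts)] + [str(v) for v in values])
        self._pending[rrd_path].append((ts, sample))

    def build_commands(self, max_args=100):
        """Return argv lists, one or more per file, and empty the queue.

        Samples are sorted because rrdtool rejects updates whose timestamps
        are not increasing; max_args keeps command lines bounded.
        """
        commands = []
        for path, samples in self._pending.items():
            samples.sort()
            args = [sample for _ts, sample in samples]
            for i in range(0, len(args), max_args):
                commands.append(["rrdtool", "update", path] + args[i:i + max_args])
        self._pending.clear()
        return commands
```

A flusher would call `build_commands()` every few minutes and hand each list to `subprocess.run(cmd, check=True)`. Each file is then opened once per flush instead of once per sample, at the cost of graphs lagging by up to one flush interval. Two samples with the same timestamp for one file would still be rejected by rrdtool, so the poller has to avoid producing them.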
On Mon, Oct 29, 2007 at 11:13:09AM +0300, Dmitry Morozovsky wrote:
> any hints to tune rrdtool with ~30k rrd files (approx 2k target devices)?
>
> machine is mostly IO-bound, showing 100% disk load with 8 or sometimes even 3
> MB/s, 300-400 tps (it's 2 SATA300 disks in gmirror)

Ideas:

1) Stop using rrdtool? This comment is somewhat in jest -- sometimes you
   can't avoid using it because it's part of something like cricket,
   cacti, etc., but at the same time, rrdtool is quite atrocious as far
   as software goes. I think it's popular because MRTG has a
   long-standing track record, thus Tobi's software is well-known.

2) Consider alternative software such as:

   * http://torrus.org/
   * http://www.dynw.com/iog/
   * A suite/library on Sourceforge somewhere which I cannot remember the
     name of, but which acted as a data-over-time storage/graphing/plotting
     alternative to RRDtool. The name of the program was 3 letters, and
     we used to have a port for it, and may still, if I could remember
     the name of it.
   * Write your own (consider using SVG-based data, which you hand to a
     web browser, and let the browser render the results)

3) Consider load balancing the polling (distribute the polling tasks
   across 2-4 boxes, then store the results on an NFS share)

--
| Jeremy Chadwick                                 jdc at parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |
On Tue, Oct 30, 2007 at 05:58:00AM -0700, Jeremy Chadwick wrote:
<snip>
> 2) Consider alternative software such as:
>
> * http://torrus.org/
> * http://www.dynw.com/iog/
> * A suite/library on Sourceforge somewhere which I cannot remember the
>   name of, but acted as a data-over-time storage/graphing/plotting
>   alternative to RRDtool. The name of the program was 3 letters, and
>   we used to have a port for it, and may still, if I could remember
>   the name of it.

RTG? "http://rtg.sourceforge.net/"

--
I have made it a rule never to smoke more than one cigar at a time.
    -- Mark Twain

Mike Hall
San Juan Island, WA
System Admin - Rock Island Technology Solutions <mikeh@rockisland.com>
System Admin - riverside.org, ssdd.org <mhall@riverside.org>
Dmitry Morozovsky wrote on Monday, October 29, 2007 9:13 AM:
> any hints to tune rrdtool with ~30k rrd files (approx 2k target
> devices)?
>
> machine is mostly IO-bound, showing 100% disk load with 8 or
> sometimes even 3 MB/s, 300-400 tps (it's 2 SATA300 disks in gmirror)

This is how rrdtool behaves. The best you can do is avoid paging of the "hot" blocks of the RRD files out of buffer cache by supplying sufficient RAM. Other options such as noatime etc. merely have minor effects.

Personally, I run a 70k+ RRD file box by queueing the requests first and writing them to the database files in bulk, at the expense of an artificial delay of a couple of minutes. Disk space is some 6 GB so using a RAM disk might be an option, at the risk of losing data...

Probably about 100k RRDs is what you can get out of current hardware. Note specifically that SMP has no effect.

Helge

Atos Origin GmbH, Theodor-Althoff-Str. 47, D-45133 Essen, Postfach 100 123, D-45001 Essen
Telefon: +49 201 4305 0, Fax: +49 201 4305 689095, www.atosorigin.de
ING Bank AG, Frankfurt/Main: Konto 001 014 0937, BLZ 500 210 00, Swift / BIC INGBDEFF, IBAN DE74 5002 1000 0010 1409 37
Geschäftsführer: Wilbert Kieboom, Handelsregister Essen HRB 19354, Ust.-ID.-Nr.: DE147861238
Dmitry Morozovsky wrote:
> [hmm, after thinking a bit I decided it would be more appropriate here, in
> stable@]
>
> Dear colleagues,
>
> any hints to tune rrdtool with ~30k rrd files (approx 2k target devices)?
>
> machine is mostly IO-bound, showing 100% disk load with 8 or sometimes even 3
> MB/s, 300-400 tps (it's 2 SATA300 disks in gmirror)

If you're willing to try RELENG_7, gjournal might help you, if you put the journal on a separate drive (i.e. separate from these two).

Other than that, maybe you could put the rrd files on an async file system and then take periodic snapshots from that.
> Dear colleagues,
>
> any hints to tune rrdtool with ~30k rrd files (approx 2k target
> devices)?
>
> machine is mostly IO-bound, showing 100% disk load with 8 or sometimes
> even 3 MB/s, 300-400 tps (it's 2 SATA300 disks in gmirror)

Long and short of it, rrd sucks. I'm not sure what sucks worse though: the architecture, the code, the data format, the fact that you have to exec open seek seek seek seek close for every one of those rrd files, or the 10000 character long command line you get to throw at it to generate complex graphs. :)

I was working on a complete rewrite of the rrd update code a couple of months back, but got distracted by other bigger projects and haven't had time to finish it up. If anyone is interested, there is some alpha code (aka do not run this on any .rrd files you value) at:

http://sourceforge.net/projects/librrd/

The goal was to implement an efficient C API which wouldn't open/close the file with every cycle, would use mmap() exclusively (in the stock code, even when you say to use mmap, there is still a ton of legacy code which seeks around unnecessarily), and would implement fine-grained read/write locking for concurrency. Personally though, I suspect the vast majority of the speed improvements would come from just general code improvement rather than any specific technique (e.g. removing 5 levels of indirection which are completely unnecessary to accomplish a step, and which only exist because the code is a running hack based on small contributions that "make it work" from 100 different people who primarily write perl, all without any overall design).

Pretty sure the remaining bug is in the CDP (consolidated data points) code which creates missing data points in the event you've waited longer than a PDP interval between updates. But more to the point, everything in rrd_update_cdp() is something I haven't yet reverse engineered to figure out what the actual goal is. If you can figure this out, you can probably rewrite that entire block in 1/5th the code, like everywhere else. :)

Please let me know if there is any interest in this; I'll be happy to help provide info on what I've done so far, until I find more free time. :)

--
Richard A Steenbergen <ras@e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
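[Editor's sketch] The "don't open/close the file with every cycle, use mmap()" idea can be illustrated with a toy example. This is not the librrd code and not the RRD on-disk format: the class name is hypothetical, and the layout (a little-endian double at a caller-supplied byte offset) is an assumption made purely for illustration.

```python
import mmap
import os
import struct


class MappedStore:
    """Keep data files memory-mapped between update cycles."""

    def __init__(self):
        self._maps = {}  # path -> (fd, mmap object)

    def _get(self, path):
        entry = self._maps.get(path)
        if entry is None:
            fd = os.open(path, os.O_RDWR)
            entry = (fd, mmap.mmap(fd, 0))  # length 0 maps the whole file
            self._maps[path] = entry
        return entry[1]

    def write_double(self, path, offset, value):
        # In-place store into the mapping: no open/seek/write/close per update.
        struct.pack_into("<d", self._get(path), offset, value)

    def read_double(self, path, offset):
        return struct.unpack_from("<d", self._get(path), offset)[0]

    def close_all(self):
        for fd, mapped in self._maps.values():
            mapped.flush()
            mapped.close()
            os.close(fd)
        self._maps.clear()
```

The first access to a file pays for the open and the mapping; later updates are plain memory writes that the kernel flushes on its own schedule. With tens of thousands of files, a real implementation would also need to cap the number of simultaneously open mappings, which this sketch does not do.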