Dmitry Pryanishnikov
2006-Apr-27 14:08 UTC
RELENG_4 -> 5 -> 6: significant performance regression
Hello!

I've done a simple (yet, I hope, reality-reflecting) performance benchmark of the different STABLE branches (4 vs 5 vs 6) using the following hardware:

    CPU: Pentium II/Pentium II Xeon/Celeron (334.09-MHz 686-class CPU)
      Origin = "GenuineIntel"  Id = 0x665  Stepping = 5
      Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
    real memory  = 134152192 (127 MB)
    ...
    rl0: <RealTek 8139 10/100BaseTX> port 0xe800-0xe8ff mem 0xdc101000-0xdc1010ff irq 5 at device 20.0 on pci0
    ...
    fxp0: <Intel 82559 Pro/100 Ethernet> port 0xe400-0xe43f mem 0xdc100000-0xdc100fff,0xdc000000-0xdc0fffff irq 7 at device 19.0 on pci0
    ...
    ad0: 76351MB <SAMSUNG SP0802N TK100-24> at ata0-master UDMA33

and just restoring a precompiled 4/5/6-STABLE to the same HDD partition. I've used the following kernel config for 4-STABLE:

    ident TEST
    machine i386
    maxusers 32
    makeoptions CONF_CFLAGS=-fno-builtin
    makeoptions DEBUG=-g
    options INCLUDE_CONFIG_FILE
    cpu I686_CPU
    options COMPAT_43
    options USER_LDT
    options SYSVSHM
    options SYSVSEM
    options SYSVMSG
    options INVARIANTS
    options INVARIANT_SUPPORT
    options USERCONFIG
    options INET
    options FAST_IPSEC
    options IPSEC_FILTERGIF
    pseudo-device ether
    pseudo-device vlan 1
    pseudo-device loop
    pseudo-device bpf
    pseudo-device ppp 8
    options PPP_BSDCOMP
    options PPP_DEFLATE
    options PPP_FILTER
    options IPFIREWALL
    options IPFW2
    options IPFIREWALL_VERBOSE
    options IPFIREWALL_VERBOSE_LIMIT=100
    options IPFIREWALL_FORWARD
    options IPDIVERT
    options IPSTEALTH
    options ICMP_BANDLIM
    options DUMMYNET
    options FFS
    options FFS_ROOT
    options SOFTUPDATES
    options QUOTA
    options P1003_1B
    options _KPOSIX_PRIORITY_SCHEDULING
    options _KPOSIX_VERSION=199309L
    pseudo-device pty
    pseudo-device crypto
    device isa
    device atkbdc0 at isa? port IO_KBD
    device atkbd0 at atkbdc? irq 1
    device psm0 at atkbdc? irq 12
    device vga0 at isa?
    pseudo-device splash
    device sc0 at isa?
    options SC_HISTORY_SIZE=1000
    options SC_TWOBUTTON_MOUSE
    device npx0 at nexus? port IO_NPX flags 0x0 irq 13
    device ata
    device atadisk
    options ATA_STATIC_ID
    device fdc0 at isa? port IO_FD1 irq 6 drq 2
    device fd0 at fdc0 drive 0
    device fd1 at fdc0 drive 1
    device sio0 at isa? port IO_COM1 irq 4
    device sio1 at isa? port IO_COM2 irq 3
    device pci

and slightly modified it for 5/6-STABLE. Here is the diff ("<" = 4-only option, ">" = 5/6-only):

    > options SCHED_4BSD
    < options USER_LDT
    < options USERCONFIG
    < pseudo-device ether
    < pseudo-device vlan 1
    < pseudo-device loop
    < pseudo-device bpf
    < pseudo-device ppp 8
    > device ether
    > device loop
    > device bpf
    < options IPFW2
    > options IPFIREWALL_FORWARD_EXTENDED
    < options ICMP_BANDLIM
    < options FFS_ROOT
    < options P1003_1B
    < options _KPOSIX_VERSION=199309L
    < pseudo-device pty
    < pseudo-device crypto
    > device pty
    > device crypto
    < device atkbdc0 at isa? port IO_KBD
    < device atkbd0 at atkbdc? irq 1
    < device psm0 at atkbdc? irq 12
    < device vga0 at isa?
    < pseudo-device splash
    < device sc0 at isa?
    ---
    > device atkbdc
    > device atkbd
    > device psm
    > options KBD_INSTALL_CDEV
    > device vga
    > device splash
    > device sc
    < device npx0 at nexus? port IO_NPX flags 0x0 irq 13
    > device npx
    < device fdc0 at isa? port IO_FD1 irq 6 drq 2
    < device fd0 at fdc0 drive 0
    < device fd1 at fdc0 drive 1
    < device sio0 at isa? port IO_COM1 irq 4
    < device sio1 at isa? port IO_COM2 irq 3

Also, I've set kern.hz="100" in /boot/loader.conf for every system. I've effectively excluded ipfw from the game by using an 'add 1 pass all from any to any' rule. I hope I've compared apples with apples this way.

For every x-STABLE, I received a large ISO image via FTP in binary mode twice: once using the rl NIC and once using the fxp one, both in 10baseT mode (I got approx. 1 Mbyte/s transfer rate). I noted the CPU utilization reported by "systat -vm 1" once the numbers had stabilized.
Here are the results (average numbers; %User and %Nice are close to zero):

                       %Sys  %Intr  %Idl
    RELENG_4 + rl0      14     14    72
    RELENG_4 + fxp0     14     10    76
    RELENG_5 + rl0      40     30    30
    RELENG_5 + fxp0     35     25    40
    RELENG_6 + rl0      45     40    15
    RELENG_6 + fxp0     45     35    20

I've tried to verify these numbers by running 'md5 -t' in parallel with the download and measuring wall time: "time md5 -t". Indeed, under RELENG_4 I got 43 sec of wall clock time for this benchmark vs 2:01 for RELENG_5 and 2:05 under RELENG_6 (I don't understand why the difference between 5 and 6 is so small here).

I would call these numbers discouraging. Such high CPU usage during the relatively simple processing to HDD of _only_ 10 Mbit/s of traffic will surely prevent deployment of 6-STABLE on many not-very-powerful production servers. Am I missing something simple regarding compile-time or runtime optimization?

Sincerely, Dmitry
--
Atlantis ISP, System Administrator
e-mail: dmitry@atlantis.dp.ua
nic-hdl: LYNX-RIPE
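For a quick sense of scale, the figures above can be reduced to a "busy" percentage (%Sys + %Intr) and a wall-clock slowdown relative to RELENG_4. This is only a minimal sketch of that arithmetic: the numbers are copied from the message, while the variable and function names are made up for illustration.

```python
# CPU utilization (%Sys, %Intr, %Idl) as reported in the message above.
results = {
    ("RELENG_4", "rl0"): (14, 14, 72),
    ("RELENG_4", "fxp0"): (14, 10, 76),
    ("RELENG_5", "rl0"): (40, 30, 30),
    ("RELENG_5", "fxp0"): (35, 25, 40),
    ("RELENG_6", "rl0"): (45, 40, 15),
    ("RELENG_6", "fxp0"): (45, 35, 20),
}

# Wall-clock time of "time md5 -t" during the download, in seconds
# (43 sec, 2:01 and 2:05 in the message).
md5_wall = {"RELENG_4": 43, "RELENG_5": 2 * 60 + 1, "RELENG_6": 2 * 60 + 5}


def busy_percent(sys_pct, intr_pct):
    """CPU share spent in the kernel: system time plus interrupt time."""
    return sys_pct + intr_pct


# Busy percentage per (branch, NIC) pair.
busy = {key: busy_percent(s, i) for key, (s, i, _idle) in results.items()}

# Busy percentage relative to RELENG_4 on the same NIC.
busy_ratio = {
    (branch, nic): busy[(branch, nic)] / busy[("RELENG_4", nic)]
    for (branch, nic) in results
}

# md5 wall-clock slowdown relative to RELENG_4.
md5_slowdown = {b: t / md5_wall["RELENG_4"] for b, t in md5_wall.items()}

for (branch, nic), pct in busy.items():
    print(f"{branch} + {nic}: busy {pct}% ({busy_ratio[(branch, nic)]:.2f}x RELENG_4)")
for branch, factor in md5_slowdown.items():
    print(f"{branch}: md5 -t took {factor:.2f}x the RELENG_4 time")
```

On these numbers the kernel-side CPU share grows by roughly 2.5x to 3.3x from RELENG_4 to RELENG_5/6, and the md5 run takes a little under 3x as long.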
Kris Kennaway
2006-Apr-27 18:13 UTC
RELENG_4 -> 5 -> 6: significant performance regression
On Thu, Apr 27, 2006 at 05:08:11PM +0300, Dmitry Pryanishnikov wrote:

> makeoptions CONF_CFLAGS=-fno-builtin

Non-default option; this may conceivably affect performance.

> options INVARIANTS
> options INVARIANT_SUPPORT

These definitely affect performance, much more in 5.x and 6.x (at the 10-20% level) than in 4.x.

> options QUOTA

This definitely affects performance on 6.x since it makes your filesystem giant-locked, which may also interfere with your network processing.

Please retry without these options. Also make sure there are no other diagnostic messages at boot time about e.g. mpsafenet being forced to 0.

Kris
Kris Kennaway wrote:
> On Thu, Apr 27, 2006 at 05:08:11PM +0300, Dmitry Pryanishnikov wrote:
>
>> options QUOTA
>
> This definitely affects performance on 6.x since it makes your
> filesystem giant-locked, which may also interfere with your network
> processing.

Why would QUOTA affect performance more on 6.x than on 4.x? I would like to understand, because I think a system cannot be secure without QUOTA.

R.
Kris Kennaway
2006-Apr-27 18:36 UTC
RELENG_4 -> 5 -> 6: significant performance regression
On Thu, Apr 27, 2006 at 08:26:06PM +0200, ml@sd2i.com wrote:
> Kris Kennaway wrote:
> >On Thu, Apr 27, 2006 at 05:08:11PM +0300, Dmitry Pryanishnikov wrote:
> >
> >>options QUOTA
> >
> >This definitely affects performance on 6.x since it makes your
> >filesystem giant-locked, which may also interfere with your network
> >processing.
>
> Why would QUOTA affect performance more on 6.x than on 4.x? I would like
> to understand, because I think a system cannot be secure without QUOTA.

It makes filesystem writes acquire Giant, which blocks other kernel code that also needs to acquire Giant. When the need to acquire Giant was removed from the mainstream UFS code in 6.0, it was an enormous performance improvement.

Kris
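The blocking effect described above can be pictured with a toy model. This is only a user-space sketch in Python, not kernel code: `giant`, `network_can_run` and the two-lock variant are hypothetical names, and a plain `threading.Lock` stands in for a kernel mutex. It shows just one thing, namely that a code path needing a lock which a filesystem write already holds cannot proceed, while a path with its own lock can.

```python
import threading


def network_can_run(fs_lock, net_lock):
    """Return True if the network path can take its lock while a
    filesystem write is holding fs_lock (toy model, single thread)."""
    with fs_lock:  # a filesystem write is in progress
        # Non-blocking attempt: fails immediately if the lock is held.
        acquired = net_lock.acquire(blocking=False)
        if acquired:
            net_lock.release()
        return acquired


# Case 1: both paths need the same global lock (the "Giant" situation).
giant = threading.Lock()
shared = network_can_run(giant, giant)

# Case 2: each subsystem has its own lock (fine-grained locking).
separate = network_can_run(threading.Lock(), threading.Lock())

print("network path runs during fs write, one global lock:", shared)
print("network path runs during fs write, separate locks: ", separate)
```

In the first case the network path has to wait for the write to finish; in the second the two paths do not interfere. Real kernel locking (recursion, sleeping, priority propagation) is considerably more involved than this.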
Robert Watson
2006-May-02 17:12 UTC
RELENG_4 -> 5 -> 6: significant performance regression
On Thu, 27 Apr 2006, Dmitry Pryanishnikov wrote:

> options INVARIANTS
> options INVARIANT_SUPPORT

In FreeBSD 5.x and FreeBSD 6.x, the INVARIANTS option has been significantly expanded to test a much larger set of invariants, and also to incorporate kernel use-after-free checking, which involves memory scrubbing. This is great for catching bugs, but it will have a significant performance impact, especially for kernel-intensive loads.

Robert N M Watson