I'm tracking performance on the machine I installed yesterday.

mutt running on one Xen instance, accessing via IMAP another instance, which in turn accesses the maildir via NFS on a third instance, seems a little laggy when moving up and down the message index list.

Network latency seems low, < 30ms on average.

So I was tracking vmstat. On the mutt instance it seems reasonable:

[nic@shell:~] vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 144464 4 127748 0 0 0 0 36 6 0 0 100 0
0 0 0 144464 4 127748 0 0 0 0 87 67 1 0 99 0
0 0 0 144464 4 127748 0 0 0 0 90 83 0 0 100 0
0 0 0 144464 4 127748 0 0 0 0 27 14 0 0 100 0
0 0 0 144464 4 127748 0 0 0 0 10 7 0 0 100 0
0 0 0 144400 4 127748 0 0 0 19 77 56 0 0 100 0

However on the dom0 instance (which doesn't run any of the above services, just the bridge) it seems very high:

nic@stateless:~$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 12 10640 40 31168 0 0 17 25 94 19 0 1 99 0
0 0 12 10576 40 31168 0 0 0 17 170616 16 0 0 100 0
0 0 12 10616 40 31168 0 0 0 0 171948 10 0 0 100 0
0 0 12 10680 40 31168 0 0 0 0 171134 10 0 0 100 0
0 0 12 10680 40 31168 0 0 0 3 169175 11 0 0 100 0
0 0 12 10680 40 31168 0 0 0 15 173097 20 0 0 100 0

Is this level of interrupts reasonable?

This is currently a UP Xen 2.8 machine with 5 domX instances running without a large amount of load.

Nicholas

-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
> nic@stateless:~$ vmstat 3
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> 0 0 12 10680 40 31168 0 0 0 15 173097 20 0 0 100 0
>
> Is this level of interrupts reasonable?

55k interrupts a second on a supposedly idle machine is way too many.
Please can you post the output of 'cat /proc/interrupts' a few seconds apart.

Have you any USB devices connected?

Ian
On Tue, Mar 08, 2005 at 07:52:39AM -0000, Ian Pratt wrote:
>
> 55k interrupts a second on a supposedly idle machine is way too many.
> Please can you post the output of 'cat /proc/interrupts' a few seconds
> apart.

nic@stateless:~$ cat /proc/interrupts > interrupts.1 ; sleep 5 ; cat /proc/interrupts > interrupts.2
nic@stateless:~$ diff -u interrupts.1 interrupts.2
--- interrupts.1 2005-03-08 21:42:41.000000000 +1300
+++ interrupts.2 2005-03-08 21:42:46.000000000 +1300
@@ -3,11 +3,11 @@
...
- 22: 349736 Phys-irq ioc0
- 24: 1575524 Phys-irq eth0
+ 22: 349743 Phys-irq ioc0
+ 24: 1575580 Phys-irq eth0
...
-130: 693599538 Dynamic-irq timer
+130: 694438303 Dynamic-irq timer
....

Baseline:

CPU0
1: 1092 Phys-irq i8042
8: 4 Phys-irq rtc
11: 0 Phys-irq ohci_hcd
15: 11 Phys-irq ide1
22: 349736 Phys-irq ioc0
24: 1575524 Phys-irq eth0
128: 1 Dynamic-irq misdirect
129: 358 Dynamic-irq ctrl-if
130: 693599538 Dynamic-irq timer
131: 0 Dynamic-irq console
132: 0 Dynamic-irq net-be-dbg
133: 18922 Dynamic-irq blkif-backend
134: 15098 Dynamic-irq vif1.0
135: 10915 Dynamic-irq vif1.1
136: 115583 Dynamic-irq blkif-backend
137: 664133 Dynamic-irq vif10.0
138: 5 Dynamic-irq vif10.1
139: 17192 Dynamic-irq blkif-backend
140: 51367 Dynamic-irq vif9.0
141: 57060 Dynamic-irq vif9.1
142: 31778 Dynamic-irq blkif-backend
143: 19907 Dynamic-irq vif4.0
144: 637287 Dynamic-irq vif4.1
145: 16274 Dynamic-irq blkif-backend
146: 16437 Dynamic-irq vif7.0
147: 27508 Dynamic-irq vif7.1
NMI: 0
ERR: 0

> Have you any USB devices connected?

Nope.
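For readers following along, the per-second rates implied by the two snapshots above can be worked out with a small script. This is only a sketch: the function names are made up for illustration, and the parser assumes the single-CPU /proc/interrupts layout shown in this message (one count column per IRQ).

```python
import re

def parse_interrupts(text):
    """Map IRQ label -> (count, description) for single-CPU /proc/interrupts text."""
    counts = {}
    for line in text.splitlines():
        # Matches lines such as "130: 693599538 Dynamic-irq timer" or "NMI: 0".
        m = re.match(r"\s*(\w+):\s+(\d+)(?:\s+(.*))?$", line)
        if m:
            counts[m.group(1)] = (int(m.group(2)), (m.group(3) or "").strip())
    return counts

def interrupt_rates(before, after, seconds):
    """Per-second rate for every IRQ whose count changed between two snapshots."""
    old, new = parse_interrupts(before), parse_interrupts(after)
    rates = {}
    for irq, (count, desc) in new.items():
        delta = count - old.get(irq, (0, ""))[0]
        if delta:
            rates[irq] = (delta / seconds, desc)
    return rates

# The three lines that changed in the diff above, taken 5 seconds apart.
BEFORE = """ 22: 349736 Phys-irq ioc0
 24: 1575524 Phys-irq eth0
130: 693599538 Dynamic-irq timer
"""
AFTER = """ 22: 349743 Phys-irq ioc0
 24: 1575580 Phys-irq eth0
130: 694438303 Dynamic-irq timer
"""

rates = interrupt_rates(BEFORE, AFTER, 5)
for irq, (rate, desc) in sorted(rates.items(), key=lambda kv: -kv[1][0]):
    print(f"{irq:>4} {rate:>10.1f}/s  {desc}")
```

On these numbers the timer line accounts for essentially all of the activity (roughly 168k per second), while ioc0 and eth0 together change by only a dozen or so per second.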
On 8 Mar 2005, at 08:50, Nicholas Lee wrote:
>> 55k interrupts a second on a supposedly idle machine is way too many.
>> Please can you post the output of 'cat /proc/interrupts' a few seconds
>> apart.
>
> nic@stateless:~$ cat /proc/interrupts > interrupts.1 ; sleep 5 ; cat /proc/interrupts > interrupts.2
> nic@stateless:~$ diff -u interrupts.1 interrupts.2

This can happen if you block/unblock very frequently. The timer interrupt from Xen isn't entirely tick-based -- you also get a timer interrupt every time you are rescheduled.

So the large number of timer interrupts indicates lots of unblocking. Really we should hold off the timer interrupt if the domain was descheduled for less than a jiffy. :-)

 -- Keir
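To make the hold-off idea concrete, here is a toy model, not Xen code. It only counts the reschedule-driven timer interrupts described above and ignores the regular periodic tick. The tick rate HZ = 100 and the sample sleep durations are assumptions chosen for illustration; neither is stated in the thread.

```python
HZ = 100           # assumed guest tick rate; not stated in the thread
JIFFY = 1.0 / HZ   # one tick period, in seconds

def timer_interrupts(descheduled_for, hold_off=False):
    """Count reschedule-driven timer interrupts in a toy model.

    descheduled_for: one entry per block/unblock cycle, giving how long (in
    seconds) the domain was descheduled before it ran again.
    Without hold-off every reschedule delivers a timer interrupt; with
    hold-off only sleeps of at least one jiffy do.
    """
    if hold_off:
        return sum(1 for gap in descheduled_for if gap >= JIFFY)
    return len(descheduled_for)

# Hypothetical workload: many very short sleeps plus a few long ones.
gaps = [5e-6] * 170_000 + [0.02] * 10

print(timer_interrupts(gaps))                 # every reschedule counts
print(timer_interrupts(gaps, hold_off=True))  # only sleeps of a jiffy or more
```

Under these assumptions nearly all of the interrupts come from sleeps far shorter than a jiffy, which is the case the proposed hold-off would suppress.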
On Tue, Mar 08, 2005 at 09:14:55AM +0000, Keir Fraser wrote:
>
> This can happen if you block/unblock very frequently. The timer
> interrupt from Xen isn't entirely tick-based -- you also get a timer
> interrupt every time you are rescheduled.

By block/unblock I assume you mean context switching between different domains. Is this level normal? Is there a method to track down exactly what is causing the block/unblocking?

> So the large number of timer interrupts indicates lots of unblocking.
> Really we should hold-off the timer interrupt if the domain was
> descheduled for less than a jiffy. :-)

Is there a way to fix this at the moment?

Nicholas
> > This can happen if you block/unblock very frequently. The timer
> > interrupt from Xen isn't entirely tick-based -- you also get a timer
> > interrupt every time you are rescheduled.
>
> By block/unblock I assume you mean context switching between different
> domains. Is this level normal? Is there a method to track down exactly
> what is causing the block/unblocking?
>
> > So the large number of timer interrupts indicates lots of unblocking.
> > Really we should hold-off the timer interrupt if the domain was
> > descheduled for less than a jiffy. :-)
>
> Is there a way to fix this at the moment?

Since processing timer interrupts is cheap there's no urgent fix required.
The real question is why you are blocking/unblocking at a rate of 55k/second.

What are your domains doing? What interrupt rates do they see?

Ian
Digging into this a little more. Once I've shut down domUs and a lot of the services (including xend) on dom0, the interrupt level is unchanged and still high.

nic@stateless:~$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 98512 104 21328 0 0 53 16 71426 49 0 1 97 1
0 0 0 98520 104 21328 0 0 0 1 183415 16 0 0 100 0
0 0 0 98520 104 21328 0 0 0 0 185419 10 0 0 100 0
0 0 0 98520 104 21328 0 0 0 0 183671 12 0 0 100 0
0 0 0 97872 104 21328 0 0 0 67 170898 47 0 0 100 0
....

So obviously the domUs aren't causing the problem.

In fact on a newly rebooted machine with hardly anything running the same is occurring. Seems like some spinlock out of control.

Now if I start compiling xen (make in xen-2.0bk) the following occurs:

nic@stateless:~$ vmstat 3
....
0 1 0 3296 32 102832 0 0 2653 53 73099 658 20 17 11 53
0 1 0 3168 32 101036 0 0 839 5092 71982 558 26 25 5 45
3 1 0 3616 32 97076 0 0 5101 139 36210 811 32 37 2 29
0 1 0 26904 32 95108 0 0 1665 263 67692 832 27 25 3 45
0 1 0 8408 32 99000 0 0 1297 129 55256 399 26 13 8 53
0 1 0 4696 32 98140 0 0 396 83 22446 277 57 19 4 20
1 0 0 3416 32 98476 0 0 795 1169 88617 436 22 12 4 62
1 0 0 19800 32 91668 0 0 364 21 18907 178 65 23 0 12
1 0 0 22552 32 88176 0 0 424 470 14323 204 65 23 1 11
1 0 0 23768 32 86052 0 0 140 283 3218 181 77 19 0 4
1 0 0 35736 32 86260 0 0 28 244 215 119 71 27 0 2
1 0 0 22040 32 86572 0 0 71 475 14472 148 72 17 4 7
3 0 0 21216 32 86272 0 0 85 55 592 239 67 32 0 1
1 0 0 18528 32 86452 0 0 31 1238 19669 131 59 29 10 2
1 0 0 16816 32 86988 0 0 79 4721 9065 106 66 25 1 8
1 0 0 15664 32 81284 0 0 16 43 1786 93 79 20 0 1
1 0 0 26048 32 81452 0 0 25 53 325 81 71 29 0 0
1 0 0 17616 32 81624 0 0 33 21 1337 115 87 13 0 1
1 0 0 41056 32 77984 0 0 73 326 1898 140 79 18 0 2
1 0 0 25248 32 78144 0 0 32 233 503 87 87 13 0 1
1 0 0 33832 32 78304 0 0 19 11 116 58 78 22 0 0
1 0 0 45736 32 78532 0 0 43 114 1477 75 80 18 0 1
1 0 0 40680 32 78760 0 0 43 32 215 63 78 22 0 0
1 0 0 29224 32 78924 0 0 25 162 250 88 79 21 0 1
1 0 0 39976 32 79124 0 0 29 269 5719 90 70 26 3 0
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
2 0 0 38816 32 80432 0 0 404 24 9220 119 68 22 1 9
1 0 0 16672 32 80984 0 0 32 90 2226 82 75 23 0 2
1 0 0 23840 32 81788 0 0 213 21 4700 106 75 18 0 7
1 0 0 31136 32 82100 0 0 52 98 3469 88 74 23 0 3
1 0 0 32288 32 82356 0 0 43 251 3546 123 76 22 0 2

Interrupts drop down:

nic@stateless:~$ diff -u interrupts.3 interrupts.4
--- interrupts.3 2005-03-09 00:37:51.000000000 +1300
+++ interrupts.4 2005-03-09 00:37:56.000000000 +1300
@@ -2,26 +2,26 @@
 1: 9 Phys-irq i8042
 8: 4 Phys-irq rtc
 15: 11 Phys-irq ide1
- 22: 45384 Phys-irq ioc0
- 24: 82320 Phys-irq eth0
+ 22: 45707 Phys-irq ioc0
+ 24: 82403 Phys-irq eth0
 128: 1 Dynamic-irq misdirect
 129: 102 Dynamic-irq ctrl-if
-130: 121707669 Dynamic-irq timer
+130: 121719090 Dynamic-irq timer
 131: 0 Dynamic-irq console
 132: 0 Dynamic-irq net-be-dbg
-133: 724 Dynamic-irq blkif-backend
+133: 726 Dynamic-irq blkif-backend
 134: 25 Dynamic-irq vif1.0
 135: 177 Dynamic-irq vif1.1
-136: 2352 Dynamic-irq blkif-backend
-137: 4300 Dynamic-irq vif2.0
+136: 2586 Dynamic-irq blkif-backend
+137: 5008 Dynamic-irq vif2.0
 138: 1 Dynamic-irq vif2.1
 139: 1936 Dynamic-irq blkif-backend
 140: 2 Dynamic-irq vif5.0
-141: 4439 Dynamic-irq vif5.1
+141: 5151 Dynamic-irq vif5.1
 142: 626 Dynamic-irq blkif-backend
-143: 451 Dynamic-irq vif4.0
-144: 640 Dynamic-irq vif4.1
-145: 901 Dynamic-irq blkif-backend
+143: 452 Dynamic-irq vif4.0
+144: 641 Dynamic-irq vif4.1
+145: 904 Dynamic-irq blkif-backend
 146: 3 Dynamic-irq vif6.0
 147: 97 Dynamic-irq vif6.1
 NMI: 0

then start back up when the compile job is completed:

1 0 0 29848 56 67280 0 0 452 203 6867 397 62 34 0 4
1 0 0 29208 56 67520 0 0 451 33 11491 382 51 43 1 4
2 0 0 32472 56 67664 0 0 399 205 3197 370 53 46 0 2
1 0 0 29528 56 67908 0 0 416 64 5640 354 43 53 0 4
1 0 0 31320 56 68396 0 0 328 218 1633 326 46 53 1 1
1 0 0 22488 56 68612 0 0 192 183 2367 215 57 40 2 1
1 0 0 24984 56 68688 0 0 1987 61 2169 186 40 59 0 1
1 0 0 29976 56 69716 0 0 1805 161 3892 292 45 53 2 0
1 0 0 28136 56 70052 0 0 1972 108 11184 318 40 54 1 5
1 0 0 27504 56 70256 0 0 2363 660 4377 305 31 66 1 2
1 0 0 28144 56 70360 0 0 2396 186 2604 273 29 68 1 2
1 0 0 22256 56 70512 0 0 2277 396 1514 254 31 67 0 2
1 0 0 31664 56 70640 0 0 1860 482 11381 245 23 70 0 6
3 0 0 34736 56 70972 0 0 408 133 10282 352 36 58 3 2
1 0 0 17200 56 80856 0 0 647 566 45363 510 40 31 8 22
0 1 0 3072 56 98700 0 0 521 2509 56155 453 49 16 20 16
0 0 0 2752 56 99360 0 0 109 152 173391 94 0 1 83 16

Seems like a bug to me.

Nicholas
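One way to look at the vmstat samples in this message is to compare the `in` column for rows where the CPU is mostly busy (`us` + `sy`) against rows where it is mostly idle or waiting (`id` + `wa`). The sketch below does that for four rows copied from above; the function names and the 90% threshold are arbitrary choices for illustration, and this is a way of reading the posted numbers rather than a diagnosis.

```python
COLUMNS = "r b swpd free buff cache si so bi bo in cs us sy id wa".split()

def vmstat_rows(text):
    """Yield one dict per numeric 16-column vmstat data row, skipping headers."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            yield dict(zip(COLUMNS, map(int, fields)))

def mean_interrupts(text, threshold=90):
    """Mean 'in' value for mostly-busy rows and for mostly-idle rows."""
    busy, idle = [], []
    for row in vmstat_rows(text):
        if row["us"] + row["sy"] >= threshold:
            busy.append(row["in"])
        elif row["id"] + row["wa"] >= threshold:
            idle.append(row["in"])
    mean = lambda xs: sum(xs) / len(xs) if xs else None
    return {"busy": mean(busy), "idle": mean(idle)}

# Two idle rows and two compile-time rows copied from the output above.
SAMPLE = """r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 98520 104 21328 0 0 0 1 183415 16 0 0 100 0
0 0 0 98520 104 21328 0 0 0 0 185419 10 0 0 100 0
1 0 0 35736 32 86260 0 0 28 244 215 119 71 27 0 2
1 0 0 26048 32 81452 0 0 25 53 325 81 71 29 0 0
"""

print(mean_interrupts(SAMPLE))
```

For these four rows the idle samples average about 184k interrupts per second and the busy samples a few hundred, which is at least consistent with the earlier suggestion that the extra timer interrupts come from dom0 blocking and unblocking while it has nothing to run.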
> In fact in a new rebooted machine with hardly anything running the same
> is occuring. Seems like some spinlock out of control.

I doubt it's anything to do with spinlocks, but this issue is going to be much easier to figure out if it occurs on a freshly booted machine with just a dom0, no xend (hence no bridge).

Please can you confirm that this is the case. Is it just the timer interrupt line that's going up fast?
(BTW: what is ioc0?)

Exactly what kernel are you using? Have you modified the config?

What hardware are you using (including any USB devices)?

Thanks,
Ian
Hi Nicholas,

On Tue, Mar 08, 2005 at 07:52:26PM +1300, Nicholas Lee wrote:
> However on the dom0 instance (which doesn't run any of the above
> services, just the bridge) is seems very high:
>
> nic@stateless:~$ vmstat 3
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 0 0 12 10640 40 31168 0 0 17 25 94 19 0 1 99 0
> 0 0 12 10576 40 31168 0 0 0 17 170616 16 0 0 100 0
> 0 0 12 10616 40 31168 0 0 0 0 171948 10 0 0 100 0
> 0 0 12 10680 40 31168 0 0 0 0 171134 10 0 0 100 0
> 0 0 12 10680 40 31168 0 0 0 3 169175 11 0 0 100 0
> 0 0 12 10680 40 31168 0 0 0 15 173097 20 0 0 100 0
>
>
> Is this level of interrupts reasonable?

No.
An up-to-date x86 machine can do something like ~200000 interrupts per sec at 100% CPU load.

Regards,
--
Kurt Garloff <kurt@garloff.de> [Koeln, DE]
Physics:Plasma modeling <garloff@plasimo.phys.tue.nl> [TU Eindhoven, NL]
Linux: SUSE Labs (Director) <garloff@suse.de> [Novell Inc]
On Tue, Mar 08, 2005 at 11:07:53AM -0000, Ian Pratt wrote:
> Since processing timer interrupts are cheap there's no urgent fix
> required.
> The real question is why are you blocking/unblocking at a rate of
> 55k/second.

Seems to be affecting my interactive latency though. I was going to test a headless NX desktop install; I'll probably hold off on that for a little while.

> What are your domains doing? What interrupt rates do they see?

NFS, qmail, apache/php, imap and openvpn are the main services. Very low load. Two imap sessions, one openvpn session. Not more than 2000 emails per day. CRM114 and clamav virus/spam scanning. mutt.

Postgres and mysql, on very low loads.

vmstat intr figures for the guest domU domains are usually <30.

I'm going to reduce the dom0 kernel config to the minimum and see how that functions.

Nicholas
Nicholas Lee wrote:
> I'm going reduce a dom0 kernel configure to the minimal and see how that
> functions.

Or just stop the various services/apps one by one and monitor the difference..

thanks,
Nivedita
> On Tue, Mar 08, 2005 at 11:07:53AM -0000, Ian Pratt wrote:
> > Since processing timer interrupts are cheap there's no urgent fix
> > required.
> > The real question is why are you blocking/unblocking at a rate of
> > 55k/second.
>
> Seems to be affecting my interactive latency though. I was going to
> test a headless NX desktop install, I'll probably hold off on that for a
> little while.

55k interrupts/second will certainly make things seem sluggish. There's something bad happening on your system.

> NFS, qmail, apache/php, imap and openvpn are the main services. Very low
> load. Two imap sessions, one openvpn sessions. Not more than 2000 emails
> per day. CRM114 and clamav virus/spam scanning. mutt.
>
> Postgres and mysql, on very low loads.
>
> vmstat intr figures for the guest domU domains is usually <30.
>
> I'm going reduce a dom0 kernel configure to the minimal and see how that
> functions.

It would be very helpful if you could see whether you can reproduce this with just dom0, or with one of our stock 2.0-testing kernels.

I'd like to get to the bottom of this before announcing 2.0.5.

Ian
On Tue, Mar 08, 2005 at 03:22:19PM -0800, Nivedita Singhvi wrote:
> Or just stop the various services/apps one by one and monitor
> the difference..

Doesn't seem to make any difference. I'll try to be more systematic about it though, and do some more testing.

Nicholas
On Tue, Mar 08, 2005 at 12:50:58PM -0000, Ian Pratt wrote:
>
> I doubt its anything to do with spinlocks, but this issue is going to be
> much easier to figure out if it occurs on a freshly booted machine with
> just a dom0, no xend (hence no bridge).

I'll build a minimal config kernel and test it this evening.

> Please can you confirm that this is the case. Is it just the timer
> interrupt line that's going up fast?
> (BTW: what is ioc0?)

LSI1030 RAID controller status interface.

Jan 17 14:33:03 localhost kernel: Fusion MPT base driver 2.05.11.03
Jan 17 14:33:03 localhost kernel: Copyright (c) 1999-2003 LSI Logic Corporation
Jan 17 14:33:03 localhost kernel: mptbase: Initiating ioc0 bringup
Jan 17 14:33:03 localhost kernel: ioc0: 53C1030: Capabilities={Initiator}
Jan 17 14:33:03 localhost kernel: mptbase: 1 MPT adapter found, 1 installed.

for:

nic@stateless:/usr/src/sys/mpt-status-1.0$ sudo ./mpt-status
ioc0 vol 0 type IM, 2 phy, 136 GB, flags ENABLED, state OPTIMAL
ioc0 phy 0 IBM-ESXS ST3146807LC FN B25H, 136 GB, state ONLINE
ioc0 phy 1 IBM-ESXS ST3146807LC FN B25H, 136 GB, state ONLINE
nic@stateless:/usr/src/sys/mpt-status-1.0$ cat /proc/mpt/ioc0/summary
ioc0: LSI53C1030, FwRev=01032715h, Ports=1, MaxQ=222, IRQ=22

> Exactly what kernel are you using? Have you modified the config?

2.6.10-xen0. Yes. I'll send through the diff separately.

> What hardware are you using (including any USB devices)?

IBM X336 with hardware SCSI RAID1. In colo with the only thing attached being a spider web cable (*). I don't have figures for when it was in my lab and attached to a keyboard.

Pretty plain basic system.

(*) Breaks out keyboard, VGA and mouse ports from a special plug this series of 1U IBM x-servers have. Nothing attached to it.

Nicholas
On Wed, Mar 09, 2005 at 01:48:25PM +1300, Nicholas Lee wrote:
> On Tue, Mar 08, 2005 at 12:50:58PM -0000, Ian Pratt wrote:
> >
> > I doubt its anything to do with spinlocks, but this issue is going to be
> > much easier to figure out if it occurs on a freshly booted machine with
> > just a dom0, no xend (hence no bridge).
>
> I'll build a minimal config kernal and test it this evening.

Default kernel with the attached difference from default.

Same problem.

nic@stateless:~$ w
 15:51:08 up 1 min, 1 user, load average: 0.27, 0.11, 0.03
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
nic pts/0 mdr11-port271.je 15:51 0.00s 0.00s 0.00s w
nic@stateless:~$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 215176 104 15304 0 0 484 39 148476 135 3 1 92 4
0 0 0 215176 104 15304 0 0 0 0 190424 9 0 0 100 0
0 0 0 215192 104 15304 0 0 0 3 186709 12 0 0 100 0
0 0 0 215192 104 15304 0 0 0 0 190614 8 0 0 100 0
0 0 0 215192 104 15304 0 0 0 10 187672 15 0 0 100 0
0 0 0 215192 104 15304 0 0 0 0 188719 8 0 0 100 0

nic@stateless:~$ cat /proc/interrupts > interrupts.5 ; sleep 5 ; cat /proc/interrupts > interrupts.6
nic@stateless:~$ diff -u interrupts.5 interrupts.6
--- interrupts.5 2005-03-09 16:24:22.961649499 +1300
+++ interrupts.6 2005-03-09 16:24:27.971732310 +1300
@@ -1,24 +1,24 @@
 CPU0
 1: 8 Phys-irq i8042
 15: 11 Phys-irq ide1
- 22: 27836 Phys-irq ioc0
- 24: 31155 Phys-irq eth0
+ 22: 27846 Phys-irq ioc0
+ 24: 31214 Phys-irq eth0
 128: 1 Dynamic-irq misdirect
 129: 207 Dynamic-irq ctrl-if
-130: 232875313 Dynamic-irq timer
+130: 233870383 Dynamic-irq timer
 131: 0 Dynamic-irq console
 132: 0 Dynamic-irq net-be-dbg
 133: 1397 Dynamic-irq blkif-backend
 134: 439 Dynamic-irq vif6.0
 135: 145 Dynamic-irq vif6.1
-136: 11591 Dynamic-irq blkif-backend
+136: 11599 Dynamic-irq blkif-backend
 137: 45161 Dynamic-irq vif7.0
 138: 2 Dynamic-irq vif7.1
-139: 2425 Dynamic-irq blkif-backend
+139: 2427 Dynamic-irq blkif-backend
 140: 133 Dynamic-irq vif8.0
 141: 33852 Dynamic-irq vif8.1
 142: 1732 Dynamic-irq blkif-backend
-143: 81 Dynamic-irq vif9.0
+143: 82 Dynamic-irq vif9.0
 144: 295 Dynamic-irq vif9.1
 145: 1827 Dynamic-irq blkif-backend
 146: 1503 Dynamic-irq vif10.0

Still 200,000 interrupts per sec.

List of default processes at run time:

nic@stateless:~$ ps awx
PID TTY STAT TIME COMMAND
1 ? S 0:00 init [2]
2 ? SN 0:00 [ksoftirqd/0]
3 ? S< 0:00 [events/0]
4 ? S< 0:00 [khelper]
15 ? S< 0:00 [kblockd/0]
92 ? S 0:00 [pdflush]
93 ? S 0:00 [pdflush]
95 ? S< 0:00 [aio/0]
94 ? S 0:00 [kswapd0]
96 ? S< 0:00 [xfslogd/0]
97 ? S< 0:00 [xfsdatad/0]
98 ? S 0:00 [xfsbufd]
681 ? S 0:00 [kseriod]
729 ? S 0:00 [xenblkd]
744 ? S< 0:00 [ata/0]
751 ? S 0:00 [scsi_eh_0]
769 ? S< 0:00 [kmirrord/0]
771 ? S 0:00 [xfssyncd]
990 ? S 0:00 [kjournald]
991 ? S 0:00 [xfssyncd]
1479 ? Ss 0:00 /sbin/syslogd
1482 ? Ss 0:00 /sbin/klogd
1512 ? Ss 0:00 /usr/sbin/exim4 -bd -q30m
1518 ? Ss 0:00 /usr/sbin/inetd
1539 ? Ss 0:00 /usr/sbin/sshd
1544 ? SLs 0:00 /usr/sbin/ntpd -p /var/run/ntpd.pid
1547 ? Ss 0:00 /usr/sbin/atd
1552 ? Ss 0:00 /usr/sbin/cron
1560 tty1 Ss+ 0:00 /sbin/getty 38400 tty1
1562 tty2 Ss+ 0:00 /sbin/getty 38400 tty2
1563 tty3 Ss+ 0:00 /sbin/getty 38400 tty3
1564 tty4 Ss+ 0:00 /sbin/getty 38400 tty4
1566 tty5 Ss+ 0:00 /sbin/getty 38400 tty5
1567 tty6 Ss+ 0:00 /sbin/getty 38400 tty6
1568 ? Ss 0:00 /bin/sh /command/svscanboot
1580 ? S 0:00 svscan /service
1581 ? S 0:00 readproctitle service errors: ............................................................. ..............
1582 ? S 0:00 supervise tinydns
1583 ? S 0:00 supervise log
1584 ? S 0:00 supervise dnscache
1585 ? S 0:00 supervise log
1586 ? S 0:00 /usr/bin/dnscache
1587 ? S 0:00 multilog t ./main
1589 ? S 0:00 /usr/bin/tinydns
1588 ? S 0:00 multilog t ./main
1590 ? Ss 0:00 sshd: nic [priv]
1592 ? S 0:00 sshd: nic [priv]
1594 ? S 0:00 sshd: nic@pts/0
1595 pts/0 Ss 0:00 -bash
1600 pts/0 R+ 0:00 ps awx

Modules:

nic@stateless:~$ lsmod
Module Size Used by
ip_tables 17024 0
nic@stateless:~$ sudo rmmod ip_tables
Password:
nic@stateless:~$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
2 0 0 215056 104 15484 0 0 339 30 159736 100 2 1 94 3
0 0 0 215056 104 15484 0 0 0 0 186941 11 0 0 100 0
0 0 0 215056 104 15484 0 0 0 0 188010 8 0 0 100 0
0 0 0 215064 104 15484 0 0 0 8 188835 11 0 0 100 0

The main differences are XFS and the MPT driver. Testing the kernel without XFS would be difficult, as root is formatted with XFS. IP Tables is pretty much required for obvious reasons. (Console and XFRD listening by default on *:.)

One other piece of hardware info: this machine is currently only UP. I was intending to add a second processor at some later stage.

With Xen running and writing this message. (This is my mail host.)

nic@stateless:~/sys/iptables$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
2 0 0 159072 172 56692 0 0 219 44 18568 83 1 1 97 1
0 0 0 159072 172 56692 0 0 0 22 200525 25 0 0 100 0
0 0 0 159096 172 56692 0 0 0 18 198147 29 0 0 100 0
0 0 0 159096 172 56692 0 0 0 40 194533 29 0 0 100 0
0 0 0 159096 172 56692 0 0 0 78 202158 17 0 0 100 0
0 0 0 159096 172 56692 0 0 0 0 198422 30 0 0 100 0
0 0 0 159160 172 56692 0 0 0 13 197980 24 0 0 100 0
..

Same intr level, additional context switches as you'd expect with multiple hosts/processes running.

However, it does seem like I've removed the "Badness in local_bh_enable" problem. At least it hasn't flooded the logs at all.

Nicholas
> > > I doubt its anything to do with spinlocks, but this issue is going to be
> > > much easier to figure out if it occurs on a freshly booted machine with
> > > just a dom0, no xend (hence no bridge).
> >
> > I'll build a minimal config kernel and test it this evening.
>
> Default kernel with the attached difference from default.
>
> Same problem.

At what point does the high interrupt rate start happening?
Does it happen with just dom0 running? When you start xend? When you start dom1?

Ian
On Wed, Mar 09, 2005 at 08:03:49AM -0000, Ian Pratt wrote:
> At what point does the high interrupt rate start happening?
> Does it happen with just dom0 running? When you start xend? When you
> start dom1?

As soon as I ssh into the machine. With just the services shown in the 'ps axw' list running, i.e. no xend.

How stable is testing at the moment?

Nicholas
> On Wed, Mar 09, 2005 at 08:03:49AM -0000, Ian Pratt wrote:
> > At what point does the high interrupt rate start happening?
> > Does it happen with just dom0 running? When you start xend? When you
> > start dom1?
>
> As soon as I ssh into the machine. With just the services shown in 'ps
> axw' list running. ie. no xend.

With just a mostly idle dom0 running there's no way you should be getting 50k interrupts a second. I think it must be caused by a bad interaction with one of your hardware devices. Could you try out the beta of the graphical demo CD I posted a few days back and see if you get the same problem? Since it's running off CD it won't have a driver for your SCSI card, which might eliminate that as the candidate. I'm also very interested in the USB setup on the machine. Can you boot it with any USB modules moved out of the way so the kernel can't load them?

> How stable is testing at the moment?

I would definitely use 2.0-testing over 2.0.4 at the moment -- we're on the verge of releasing 2.0.5 but I want to understand your interrupt storm issue.

Ian
On Wed, Mar 09, 2005 at 08:47:16AM -0000, Ian Pratt wrote:
>
> With just a mostly idle dom0 running there's no way should be getting
> 50k interrupts a second. I think it must be being caused by a bad

Actually it's 200k intr per sec.

> interaction with one of your hardware devices. Could you try out the
> beta of the graphical demo CD I posted a few days back and see if you

Unfortunately not easily. It's sitting in an ISP colo across town.

> get the same problem. Since its running off CD it won't have a driver
> for your scsi card, which might eliminate that as the candidate. I'm
> also very interested in the USB setup on the machine. Can you boot it
> with any usb modules moved out of the way so the kernel can't load them?

Default kernel doesn't have any USB modules compiled in:

nic@stateless:~$ ls -lR /lib/modules/2.6.10-xen0-stateless/ | grep ko
-rw-r--r-- 1 root root 3082 2005-03-09 16:14 crc32c.ko
-rw-r--r-- 1 root root 15315 2005-03-09 16:14 des.ko
-rw-r--r-- 1 root root 5208 2005-03-09 16:14 md5.ko
-rw-r--r-- 1 root root 9917 2005-03-09 16:14 sha1.ko
-rw-r--r-- 1 root root 7030 2005-03-09 16:14 exportfs.ko
-rw-r--r-- 1 root root 43294 2005-03-09 16:14 fat.ko
-rw-r--r-- 1 root root 10080 2005-03-09 16:14 msdos.ko
-rw-r--r-- 1 root root 103391 2005-03-09 16:14 nfsd.ko
-rw-r--r-- 1 root root 14306 2005-03-09 16:14 vfat.ko
-rw-r--r-- 1 root root 7946 2005-03-09 16:14 ip_conntrack_ftp.ko
-rw-r--r-- 1 root root 48907 2005-03-09 16:14 ip_conntrack.ko
-rw-r--r-- 1 root root 5917 2005-03-09 16:14 ip_nat_ftp.ko
-rw-r--r-- 1 root root 5148 2005-03-09 16:14 iptable_filter.ko
-rw-r--r-- 1 root root 26330 2005-03-09 16:14 iptable_nat.ko
-rw-r--r-- 1 root root 21843 2005-03-09 16:14 ip_tables.ko
-rw-r--r-- 1 root root 3715 2005-03-09 16:14 ipt_conntrack.ko
-rw-r--r-- 1 root root 3140 2005-03-09 16:14 ipt_iprange.ko
-rw-r--r-- 1 root root 4761 2005-03-09 16:14 ipt_MASQUERADE.ko
-rw-r--r-- 1 root root 7924 2005-03-09 16:14 ipt_REJECT.ko
-rw-r--r-- 1 root root 3159 2005-03-09 16:14 ipt_state.ko

> > How stable is testing at the moment?
>
> I would defnitely use 2.0-testing over 2.0.4 at the moment -- we're on
> the verge of releasing 2.0.5 but I want to understand your interupt
> storm issue.

Ok, compiling this now.

Nicholas
> > get the same problem. Since it's running off CD it won't have a driver
> > for your scsi card, which might eliminate that as the candidate. I'm
> > also very interested in the USB setup on the machine. Can you boot it
> > with any usb modules moved out of the way so the kernel can't load them?
>
> Default kernel doesn't have any USB modules compiled in:

Things do rather point at your fusion mpt scsi card driver. It's odd
that it's the timer interrupt that counts fast rather than the
associated device interrupt, but this could be because it's always
setting an 'add_timer' to go off in the very near future.

Ian
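A note for anyone repeating the "which interrupt counts fast" check: two copies of /proc/interrupts taken a few seconds apart can be diffed mechanically. The following is only a minimal Python sketch. The sample snapshots and the device names in them are invented for illustration (they are not output from the machine in this thread), and the parser assumes the usual layout of one count column per CPU followed by a free-text description.

```python
def parse_interrupts(text):
    """Map each IRQ label to (count summed over CPUs, description)."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = (sum(counts), " ".join(fields[len(counts):]))
    return table


def interrupt_rates(before, after, seconds):
    """Per-IRQ rates between two snapshots, fastest first."""
    old, new = parse_interrupts(before), parse_interrupts(after)
    rates = [((new[irq][0] - old[irq][0]) / seconds, irq, new[irq][1])
             for irq in new if irq in old]
    return sorted(rates, reverse=True)


# Hypothetical snapshots taken 3 seconds apart (made-up numbers).
BEFORE = """\
           CPU0
  0:    1000000   Dynamic-irq  timer
 18:       5000   Phys-irq     ioc0
 19:       2000   Phys-irq     eth0
"""
AFTER = """\
           CPU0
  0:    1600000   Dynamic-irq  timer
 18:       5030   Phys-irq     ioc0
 19:       2090   Phys-irq     eth0
"""

if __name__ == "__main__":
    for rate, irq, desc in interrupt_rates(BEFORE, AFTER, 3):
        print(f"{rate:10.1f}/s  irq {irq:>3}  {desc}")
```

On a real machine the two snapshots would come from reading /proc/interrupts twice with a sleep in between; the line with the largest rate is the one to chase.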
On Wed, Mar 09, 2005 at 09:38:44AM -0000, Ian Pratt wrote:
> Things do rather point at your fusion mpt scsi card driver. It's odd
> that it's the timer interrupt that counts fast rather than the
> associated device interrupt, but this could be because it's always
> setting an 'add_timer' to go off in the very near future.

nic@stateless:/usr/src/xen/xen-2.0-testing.bk/linux-2.6.10-xen0/drivers/message/fusion$ grep -rs add_time .
./mptbase.c:    add_timer(&pCfg->timer);
./mptbase.c:    add_timer(&pCfg->timer);
./mptctl.c:     add_timer(&ioc->ioctl->timer);
./mptctl.c:     add_timer(&ioctl->TMtimer);
./mptctl.c:     add_timer(&ioc->ioctl->timer);
./mptscsih.c:   * and add_timer
./mptscsih.c:   add_timer(&hd->TMtimer);
./mptscsih.c:   add_timer(&hd->timer);
./mptscsih.c:   add_timer(&hd->timer);

Which is likely to be the one to look at more closely? I'm not a kernel
expert myself.

What about XFS? Could that cause this issue?

I'll see if I can get out there tomorrow afternoon and try the text
based CD.

Testing has the same problem:

ERROR: cannot use unconfigured serial port COM1
 __  __            ____    ___
 \ \/ /___ _ __   |___ \  / _ \
  \  // _ \ '_ \    __) || | | |
  /  \  __/ | | |  / __/ | |_| |
 /_/\_\___|_| |_| |_____(_)___/

 http://www.cl.cam.ac.uk/netos/xen
 University of Cambridge Computer Laboratory

 Xen version 2.0 (nic@) (gcc version 3.3.5 (Debian 1:3.3.5-8)) Wed Mar 9 21:00:54 NZDT 2005
 Latest ChangeSet: 2005/03/09 02:02:39 1.1768 422e593fP_MDJ47j5LhtS8fQOVuyAQ

nic@stateless:~$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo     in    cs us sy  id wa
 0  0      0 215240    104  15328    0    0   179    36 150999   137  4  1  92  2
 0  0      0 215240    104  15328    0    0     0     0 202363    11  0  0 100  0
 0  0      0 215248    104  15328    0    0     0   239 205301    49  0  0 100  0
 0  0      0 215248    104  15328    0    0     0     0 202289     8  0  0 100  0
 0  0      0 215248    104  15328    0    0     0     6 202663    13  0  0 100  0
 0  0      0 215248    104  15328    0    0     0   163 204469    44  0  0 100  0
 0  0      0 215248    104  15328    0    0     0     0 205706     8  0  0 100  0

This is again a clean state, with no xend, only the processes from the
previous 'ps awx' running.

MPT is in the kernel, but not mptctl.

Nicholas
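A side note on reading the numbers above: the vmstat man page describes the "in" column as interrupts per second, so the values can be read directly without dividing by the sampling interval, and the first row is an average since boot rather than a live sample. A minimal Python sketch for pulling that column out of captured output follows. The function name is made up for illustration, and the sample text is the first three rows of the vmstat run in this message.

```python
def interrupts_per_second(vmstat_output):
    """Return the 'in' column of captured vmstat output as a list of ints."""
    rows = [line.split() for line in vmstat_output.strip().splitlines()]
    # Locate the row of column names; the dashed "procs ---memory---" row
    # has no standalone "in" or "cs" token, so it is skipped automatically.
    header = next(i for i, row in enumerate(rows) if "in" in row and "cs" in row)
    column = rows[header].index("in")
    return [int(row[column]) for row in rows[header + 1:]
            if len(row) > column and row[column].isdigit()]


SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo     in    cs us sy  id wa
 0  0      0 215240    104  15328    0    0   179    36 150999   137  4  1  92  2
 0  0      0 215240    104  15328    0    0     0     0 202363    11  0  0 100  0
 0  0      0 215248    104  15328    0    0     0   239 205301    49  0  0 100  0
"""

if __name__ == "__main__":
    samples = interrupts_per_second(SAMPLE)
    live = samples[1:]                    # drop the since-boot average
    print(sum(live) / len(live))
```

Dropping the first sample before averaging keeps the since-boot figure from skewing the result.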
I just tried booting the demo CD on a machine with an MPT Fusion card (a
Sun V20z) and it worked fine. Even doing a 'find . | xargs cat >/dev/null' I
only got 10k interrupts a second.

It'll be interesting to hear what it does on your machine.

Ian

> -----Original Message-----
> From: Nicholas Lee [mailto:nic-lists@plumtree.co.nz]
> Sent: 09 March 2005 10:59
> To: Ian Pratt
> Cc: xen-devel@lists.sourceforge.net; ian.pratt@cl.cam.ac.uk
> Subject: Re: [Xen-devel] Interrupt levels
>
> On Wed, Mar 09, 2005 at 09:38:44AM -0000, Ian Pratt wrote:
> > Things do rather point at your fusion mpt scsi card driver. It's odd
> > that it's the timer interrupt that counts fast rather than the
> > associated device interrupt, but this could be because it's always
> > setting an 'add_timer' to go off in the very near future.
>
> nic@stateless:/usr/src/xen/xen-2.0-testing.bk/linux-2.6.10-xen0/drivers/message/fusion$ grep -rs add_time .
> ./mptbase.c:    add_timer(&pCfg->timer);
> ./mptbase.c:    add_timer(&pCfg->timer);
> ./mptctl.c:     add_timer(&ioc->ioctl->timer);
> ./mptctl.c:     add_timer(&ioctl->TMtimer);
> ./mptctl.c:     add_timer(&ioc->ioctl->timer);
> ./mptscsih.c:   * and add_timer
> ./mptscsih.c:   add_timer(&hd->TMtimer);
> ./mptscsih.c:   add_timer(&hd->timer);
> ./mptscsih.c:   add_timer(&hd->timer);
>
> Which is likely to be the one to look at closer? Not being a kernel
> expert myself.
>
> What about XFS? Could that cause this issue?
>
> I'll see if I can get out there tomorrow afternoon and try the text
> based CD.
>
> Testing has the same problem:
>
> ERROR: cannot use unconfigured serial port COM1
>  __  __            ____    ___
>  \ \/ /___ _ __   |___ \  / _ \
>   \  // _ \ '_ \    __) || | | |
>   /  \  __/ | | |  / __/ | |_| |
>  /_/\_\___|_| |_| |_____(_)___/
>
>  http://www.cl.cam.ac.uk/netos/xen
>  University of Cambridge Computer Laboratory
>
>  Xen version 2.0 (nic@) (gcc version 3.3.5 (Debian 1:3.3.5-8)) Wed Mar 9 21:00:54 NZDT 2005
>  Latest ChangeSet: 2005/03/09 02:02:39 1.1768 422e593fP_MDJ47j5LhtS8fQOVuyAQ
>
> nic@stateless:~$ vmstat 3
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo     in    cs us sy  id wa
>  0  0      0 215240    104  15328    0    0   179    36 150999   137  4  1  92  2
>  0  0      0 215240    104  15328    0    0     0     0 202363    11  0  0 100  0
>  0  0      0 215248    104  15328    0    0     0   239 205301    49  0  0 100  0
>  0  0      0 215248    104  15328    0    0     0     0 202289     8  0  0 100  0
>  0  0      0 215248    104  15328    0    0     0     6 202663    13  0  0 100  0
>  0  0      0 215248    104  15328    0    0     0   163 204469    44  0  0 100  0
>  0  0      0 215248    104  15328    0    0     0     0 205706     8  0  0 100  0
>
> This is again a clean state, with no xend only the processes from the
> previous 'ps awx' running.
>
> MPT is in the kernel, but not mptctl.
>
> Nicholas
On Wed, Mar 09, 2005 at 03:20:41PM -0000, Ian Pratt wrote:
>
> I just tried booting the demo CD on a machine with an MPT Fusion card (a
> sun V20z) and it worked fine.

Have you tried running Xen from the MPT card in a similar setup to mine?

I've got

  xencd-base_xen-2.0.4_20050225T220000.iso

and the latest release:

  xencd 1.0rc01.

Which one did you use?

Nicholas
> On Wed, Mar 09, 2005 at 03:20:41PM -0000, Ian Pratt wrote:
> >
> > I just tried booting the demo CD on a machine with an MPT Fusion card (a
> > sun V20z) and it worked fine.
>
> Have you tried running Xen from the MPT card in a similar setup as mine?
>
> I've got
>   xencd-base_xen-2.0.4_20050225T220000.iso
>
> and the latest release:
>
>   xencd 1.0rc01.
>
> Which one did you use?

http://www.cl.cam.ac.uk/netos/xen/downloads/xendemo-2.0-beta1.iso

Ian
On Wed, Mar 09, 2005 at 09:33:06PM -0000, Ian Pratt wrote:
> http://www.cl.cam.ac.uk/netos/xen/downloads/xendemo-2.0-beta1.iso

Since I'm not sitting on JANET at the moment, just a proxie DSL
connection, I'll have to skip downloading that.

Which kernel is it running?

Nicholas
> -----Original Message-----
> From: Nicholas Lee [mailto:nic-lists@plumtree.co.nz]
> Sent: 09 March 2005 22:13
> To: Ian Pratt
> Cc: xen-devel@lists.sourceforge.net; ian.pratt@cl.cam.ac.uk
> Subject: Re: [Xen-devel] Interrupt levels
>
> On Wed, Mar 09, 2005 at 09:33:06PM -0000, Ian Pratt wrote:
> > http://www.cl.cam.ac.uk/netos/xen/downloads/xendemo-2.0-beta1.iso
>
> Since I'm not sitting on JAnet at the moment and just a proxie DSL
> connection I'll have to skipping downloading that.
>
> Which kernel is it running?

2.0-testing from a few days ago.

Ian
I was thinking about it this morning: interrupts coming from the MPT
controller would probably have an effect on disk IO, and thus a very
noticeable effect on system performance. Then I realised that the other
non-standard config item that comes up at boot is a second bridge.

Unfortunately I had to wait until this afternoon, when I could get into
the colo, and my DSL link is down at the moment.

nic@stateless:~$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo     in    cs us sy  id wa
 2  0      0   4312    200  85008    0    0    10    13    412    17  0  0 100  0
 0  0      0   4312    200  85008    0    0     0     0 196987    14  0  0 100  0

nic@stateless:~$ sudo ifdown internal-br
nic@stateless:~$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo     in    cs us sy  id wa
 2  0      0   4776    200  85012    0    0    10    13    455    17  0  0 100  0
 0  0      0   4776    200  85012    0    0     0    51     48    26  0  0 100  0
 0  0      0   4776    200  85012    0    0     0     0     40    14  0  0 100  0

The exact problem is 'bridge hello time' being set to zero.

When I switch the hello time between '0' and '1' via:

auto internal-br
iface internal-br inet static
    address 10.8.0.254
    netmask 255.255.0.0
    network 10.8.0.0
    broadcast 10.8.255.255
    bridge_ports eth1
    bridge_fd 0
    bridge_hello 1
    bridge_stp off

the load switches:

nic@stateless:~$ sudo vi /etc/network/interfaces

[1]+  Stopped                 sudo vi /etc/network/interfaces
nic@stateless:~$ sudo ifup internal-br
Waiting for internal-br to get ready (MAXWAIT is 2 seconds).
nic@stateless:~$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo     in    cs us sy  id wa
 2  0      0   5504    176  84000    0    0    10    13      0    17  0  0 100  0
 0  0      0   5520    176  84000    0    0     0     4 176557    17  0  0 100  0
 0  0      0   5520    176  84000    0    0     0    60 171743    28  0  0 100  0

nic@stateless:~$ sudo ifdown internal-br
nic@stateless:~$ fg

[1]+  Stopped                 sudo vi /etc/network/interfaces
nic@stateless:~$ sudo ifup internal-br
Waiting for internal-br to get ready (MAXWAIT is 2 seconds).
nic@stateless:~$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo     in    cs us sy  id wa
 2  0      0   5504    176  84012    0    0    10    13     43    17  0  0 100  0
 0  0      0   5528    176  84012    0    0     0     0     68    13  0  0 100  0
 0  0      0   5544    176  84012    0    0     0    13     71    15  0  0 100  0

vmstat with some load on the machine. (Guest running mutt, loading a
folder via imap/nfs.)

nic@stateless:~/sys/xen$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo     in    cs us sy  id wa
 2  0      0   9088    208  84740    0    0    12    14     43    18  0  0 100  0
 0  0      0   9088    208  84740    0    0   527    47    949   403  0  0 100  0
 0  0      0   9096    208  84740    0    0   561     1    913   408  0  1  99  0
 0  0      0   9096    208  84740    0    0   536    15    982   402  0  1  99  0
 0  0      0   9104    208  84740    0    0   659    37    941   309  0  3  97  0
 0  0      0   9040    208  84740    0    0  2679    68   2163   412  0  3  97  0
 0  0      0   9040    208  84740    0    0  2751     5   2178   403  0  2  98  0
 0  0      0   8912    208  84740    0    0  3020    31   2389   387  0  3  97  0
 0  0      0   8912    208  84740    0    0  3592    43   3125   906  0  1  99  0
 0  0      0   8912    208  84740    0    0  2427   201   1909   356  0  3  97  0
 0  0      0   8912    208  84740    0    0  1157    89   1279   327  0  4  96  0
 0  0      0   8912    208  84740    0    0   124   497    415   128  0  3  97  0
 0  0      0   8912    208  84740    0    0     0   605    542   244  0  7  93  0
 0  0      0   8912    208  84740    0    0     0    82    415   114  0  3  97  0
 0  0      0   8912    208  84740    0    0     0   395   2596    43  0 19  81  0

Idle:

nic@stateless:~$ vmstat 3
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo     in    cs us sy  id wa
 0  0      0   7832    208  84748    0    0    13    14     45    18  0  0 100  0
 0  0      0   7832    208  84748    0    0     0    12     68    24  0  0 100  0
 0  0      0   7840    208  84748    0    0     0     3     61    15  0  0 100  0
 0  0      0   7840    208  84748    0    0     0     0     64    12  0  0 100  0

For completeness:

nic@stateless:~$ dmesg | grep eth1
eth1: Tigon3 [partno(BCM95703A30) rev 1002 PHY(5703)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:d5:66:6d
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
device eth1 entered promiscuous mode
device eth1 left promiscuous mode
internal-br: port 1(eth1) entering disabled state
device eth1 entered promiscuous mode

Note, I used a hello time of 0 with UML on a standard host kernel [1],
although I never noticed this level of interrupts previously.

xen-br0 seems to default to a non-zero hello time:

nic@stateless:/proc/sys$ sudo brctl showstp xen-br0
xen-br0
 bridge id              8000.000d60d5666c
 designated root        8000.000d60d5666c
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                2.00                 bridge hello time          2.00
 forward delay             0.00                 bridge forward delay       0.00
 ageing time             300.00
 hello timer               1.22                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  62.60
 flags

internal-br with hello set to 1 is:

nic@stateless:/proc/sys$ sudo brctl showstp internal-br
internal-br
 bridge id              8000.000d60d5666d
 designated root        8000.000d60d5666d
 root port                 0                    path cost                  0
 max age                  20.00                 bridge max age            20.00
 hello time                1.00                 bridge hello time          1.00
 forward delay             0.00                 bridge forward delay       0.00
 ageing time             300.00
 hello timer               0.44                 tcn timer                  0.00
 topology change timer     0.00                 gc timer                  36.98
 flags

After discovering this, I figured it wasn't worth trying the LiveCD.

[1] Settings I've used for UML come from
    http://edeca.net/articles/bridging/create-bridge.html

Nicholas
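Since the culprit turned out to be a zero hello time, the check can be automated by scanning `brctl showstp` output. The following is a minimal Python sketch only: the function names are invented for illustration, the regular expressions assume the two-column layout shown in this message, and the sample text is an excerpt of the internal-br output above.

```python
import re


def hello_times(showstp_text):
    """Return (hello time, bridge hello time) in seconds, or None if absent."""
    # "hello timer" is excluded because \s+ must follow "hello time" directly;
    # the lookbehind keeps the first pattern off the "bridge hello time" field.
    current = re.search(r"(?<!bridge )hello time\s+([0-9.]+)", showstp_text)
    configured = re.search(r"bridge hello time\s+([0-9.]+)", showstp_text)
    if current is None or configured is None:
        return None
    return float(current.group(1)), float(configured.group(1))


def has_zero_hello(showstp_text):
    """True when either hello time field reads as zero."""
    times = hello_times(showstp_text)
    return times is not None and min(times) == 0.0


SAMPLE = """\
internal-br
 max age                  20.00                 bridge max age            20.00
 hello time                1.00                 bridge hello time          1.00
 forward delay             0.00                 bridge forward delay       0.00
 hello timer               0.44                 tcn timer                  0.00
"""

if __name__ == "__main__":
    print(hello_times(SAMPLE), has_zero_hello(SAMPLE))
```

Feeding it the output for each bridge on a host would flag any bridge configured the way internal-br originally was.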
> The exact problem is 'bridge hello time' being set to zero.
>
> When I switch between '0' and '1' hello time via:
>
> auto internal-br
> iface internal-br inet static
>     address 10.8.0.254
>     netmask 255.255.0.0
>     network 10.8.0.0
>     broadcast 10.8.255.255
>     bridge_ports eth1
>     bridge_fd 0
>     bridge_hello 1
>     bridge_stp off

Yep, this is a problem that's cropped up several times before. I would
argue strongly that it's a bug in the bridge code to add a timer for the
current jiffies value.

On native I think you get away with it as the timer won't fire until the
next jiffy. On Xen, you'll enter Xen and then bounce straight back out
as the time has already passed. I think we may have to hack arch xen to
round to the next jiffy to match the native behaviour.

However, the bridge's behaviour is still pretty evil -- you'll still end
up executing the code HZ (100/1000) times a second, and the intention of
the user was probably to disable execution of the code altogether. It
won't slay the machine (like executing it 200k times a second), but it's
not ideal.

Ian
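The difference described above can be illustrated with a toy model. To be clear about what is assumed: this is not Linux or Xen code, the function and its parameters are hypothetical, and the 5 microsecond round-trip cost is an arbitrary figure picked only so the result lands near the 200k/s seen in this thread. The model counts how often a self-re-arming timer handler runs in one simulated second, either when expiries are only noticed on the periodic tick, or when an already-expired timer fires again as soon as control returns.

```python
def firings_per_second(hello_jiffies, hz=100, immediate=False, round_trip_us=5):
    """Count handler runs in one simulated second for a self-re-arming timer.

    immediate=False: expiries are only checked on each periodic tick.
    immediate=True:  a timer that is already due fires again after
                     round_trip_us microseconds (assumed cost).
    """
    jiffy_us = 1_000_000 // hz
    now_us = 0
    expires = 0            # expiry time, in jiffies
    fired = 0
    while now_us < 1_000_000:
        jiffies = now_us // jiffy_us
        if expires <= jiffies:
            fired += 1
            expires = jiffies + hello_jiffies    # handler re-arms itself
            if immediate and expires <= jiffies:
                now_us += round_trip_us          # due already: bounce straight back
                continue
        now_us = (jiffies + 1) * jiffy_us        # idle until the next tick
    return fired


if __name__ == "__main__":
    print(firings_per_second(0, hz=100))            # tick-driven, hello time 0
    print(firings_per_second(0, immediate=True))    # immediate re-fire, hello time 0
    print(firings_per_second(100, hz=100))          # hello time of 1 s at HZ=100
```

With a zero interval the tick-driven case is capped at HZ runs per second, while the immediate case is limited only by the assumed round-trip cost; a non-zero interval behaves the same in both.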
On Thu, Mar 10, 2005 at 02:28:46AM -0000, Ian Pratt wrote:
> Yep, this is a problem that's cropped up several times before. I would
> argue strongly that it's a bug in the bridge code to add a timer for the
> current jiffies value.

You guys need a decent community-oriented FAQ/Wiki. If this is a known
issue I probably would have been able to self-diagnose.

> However, the bridge's behaviour is still pretty evil -- you'll still end
> up executing the code HZ (100/1000) times a second, and the intention of
> the user was probably to disable execution of the code altogether. It
> won't slay the machine (like executing it 200k times a second), but it's
> not ideal.

Thanks for the help and for staying on top of this. I'm glad it's not
something major like a hardware driver bug.

Nicholas
On Thu, 2005-03-10 at 18:41 +1300, Nicholas Lee wrote:
> You guys need a decent community oriented FAQ/Wiki. If this is a known
> issue I probably would have been able to self-diagnose.

Watch this space - it will be available very, very soon.