Matt Ayres
2006-Apr-07 18:25 UTC
[Xen-devel] Slow guest network I/O when CPU is pegged - Looking for acknowledgement from developers
Ok, so we all know that guest network I/O is slow when the system CPUs are being utilized extensively whether it be from dom0 or from other guests. Lots of people have written about this and I can post concrete tests if required.

I'm just looking for one of the Xen developers to acknowledge that they have been able to replicate the problem and it is indeed being worked on or will be sometime in the near future. No one has acknowledged any of the previous threads on either list so I want to make sure it is an outstanding issue that is not being overlooked.

Thank you,
Matt Ayres

(Just ask for my tests and I'll paste the output / file a bug, but I think it's been an established bug and didn't want to make this e-mail too verbose.)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2006-Apr-08 07:56 UTC
Re: [Xen-devel] Slow guest network I/O when CPU is pegged - Looking for acknowledgement from developers
On 7 Apr 2006, at 19:25, Matt Ayres wrote:

> Ok, so we all know that guest network I/O is slow when the system
> CPUs are being utilized extensively whether it be from dom0 or from
> other guests. Lots of people have written about this and I can post
> concrete tests if required.
>
> I'm just looking for one of the Xen developers to acknowledge that
> they have been able to replicate the problem and it is indeed being
> worked on or will be sometime in the near future. No one has
> acknowledged any of the previous threads on either list so I want to
> make sure it is an outstanding issue that is not being overlooked.

It depends on the setup but poor scheduling is the main reason for poor network performance, usually. SEDF seems to have some problems with real-time domains (like domain0 with its default scheduling parameters) and gives them all the CPU they want -- this is obviously going to be bad if a client domain is scheduled on the same CPU. Since UDP has no flow control, dom0 can keep itself busy generating or forwarding UDP packets to the domU that get dropped continually in netback driver. DomU will hardly ever get scheduled. Even in the case of TCP, any drops will be interpreted as congestion and transmit rate will be cut.

Basically I think the SEDF scheduler needs cleaning up: probably by removing the mass of confusing conditionally compiled options and then focusing on the remaining code that is actually compiled in. Another option is to try specifying the BVT scheduler and see if that works better. Or try setting dom0 to have non-real-time guarantees. Or give dom0 its own hyperthread on an Intel system (strongly recommended if it's possible).

Apart from that, if you really are genuinely loading up CPUs with CPU-intensive workloads, and expecting them also to be able to process a significant amount of network traffic then something has to give. You can only run CPUs at 100%.

 -- Keir
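The starvation argument above can be illustrated with a toy model. This is only a sketch under made-up assumptions, not Xen code: one shared CPU, a bounded ring between dom0 and the domU, and arbitrary numbers for ring size and packet rates. The function name `simulate` and all of its parameters are hypothetical. It shows that when the domU is rarely scheduled, dom0 fills the ring almost at once and nearly everything it forwards afterwards is dropped.

```python
def simulate(ticks, domu_every, ring_size=256,
             produced_per_tick=32, drained_per_tick=256):
    """Toy shared-CPU model; returns (delivered, dropped) packet counts.

    The domU runs on every `domu_every`-th tick and dom0 runs on all
    other ticks. All numbers are arbitrary illustration values.
    """
    queued = delivered = dropped = 0
    for tick in range(ticks):
        if tick % domu_every == domu_every - 1:
            # domU is scheduled: it drains packets from the ring.
            taken = min(queued, drained_per_tick)
            queued -= taken
            delivered += taken
        else:
            # dom0 is scheduled: it forwards packets into the bounded ring;
            # anything that does not fit is dropped.
            accepted = min(ring_size - queued, produced_per_tick)
            queued += accepted
            dropped += produced_per_tick - accepted
    return delivered, dropped


if __name__ == "__main__":
    # domU gets every second tick: nothing is dropped.
    print(simulate(1000, 2))
    # domU gets one tick in a hundred: most packets are dropped.
    print(simulate(1000, 100))
```

With these illustration values, a domU that runs every second tick receives everything, while a domU that runs one tick in a hundred receives well under a tenth of what dom0 forwards. For TCP the real effect would be compounded, since the sender backs off after each drop, which this model does not attempt to capture.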
Pasi Kärkkäinen
2006-Apr-10 10:09 UTC
Re: [Xen-devel] Slow guest network I/O when CPU is pegged - Looking for acknowledgement from developers
On Sat, Apr 08, 2006 at 08:56:19AM +0100, Keir Fraser wrote:

> [...]
> Basically I think the SEDF scheduler needs cleaning up: probably by
> removing the mass of confusing conditionally compiled options and then
> focusing on the remaining code that is actually compiled in. Another
> option is to try specifying the BVT scheduler and see if that works
> better. Or try setting dom0 to have non-real-time guarantees. Or give
> dom0 its own hyperthread on an Intel system (strongly recommended if
> it's possible).

Has anyone already tried this? I'd like to know whether dedicating a hyperthread to dom0 helps to fix these network performance problems.

-- Pasi
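For anyone wanting to try the dedicated-hyperthread setup, it might look roughly like the fragment below. This is a sketch under assumptions and not something posted in this thread: it assumes a machine with two logical CPUs where CPU 0 is reserved for dom0, it assumes the `xm vcpu-pin` command and the `cpus` guest config option behave as in the Xen 3.0 xm tools, and the guest config file name is hypothetical.

```
# In dom0: pin dom0's VCPU 0 to logical CPU 0
# (argument order assumed to be: domain, vcpu, cpu list)
xm vcpu-pin 0 0 0

# In the guest config file, e.g. /etc/xen/guest1 (hypothetical name):
# restrict the guest to logical CPU 1 so it never shares a CPU with dom0
cpus = "1"
```

Whether this actually helps with the throughput problem is exactly the open question; the fragment only shows how the separation could be set up for a test.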
Matt Ayres
2006-Apr-13 13:52 UTC
Re: [Xen-devel] Slow guest network I/O when CPU is pegged - Looking for acknowledgement from developers
Keir Fraser wrote:

> It depends on the setup but poor scheduling is the main reason for poor
> network performance, usually. SEDF seems to have some problems with
> real-time domains (like domain0 with its default scheduling parameters)
> and gives them all the CPU they want -- this is obviously going to be
> bad if a client domain is scheduled on the same CPU.

You hit it right here. I did some thinking and informal tests and came to a conclusion. SEDF is the "new kid" on the block and it is also the default, hence everyone is using it. In many cases (such as mine) people are just using SEDF with the weight parameter. Also, extratime seems to be broken (according to Stephen in an old post) and doesn't work well with heavy I/O. It does especially badly when dom0 does anything other than provide block and network device access, even when it is tuned in proportion to the other VM weights.

Another argument is that the SEDF scheduler is just TOO good at what it does; in that case it needs some work to be more flexible. Users should consider and test both schedulers before deciding which to use. There is no clear "winner".

Why am I replying? I did my tests. BVT is nowhere near as strict as SEDF in "while 1" tests as far as allocating CPU to domains goes, but it seems to do a good enough job of providing a proportional share based on weight (duh) in a real-world production environment. It also fixed my network throughput problem.

Thanks,
Matt
Pasi Kärkkäinen
2006-Apr-19 17:15 UTC
Re: [Xen-devel] Slow guest network I/O when CPU is pegged - Looking for acknowledgement from developers
On Thu, Apr 13, 2006 at 09:52:19AM -0400, Matt Ayres wrote:

> [...]
> Why am I replying? I did my tests. BVT is nowhere near as strict as
> SEDF in it's "while 1" tests as far as allocating CPU to domains, but it
> seems to do a good enough job of providing a proportional share based on
> weight (duh) in a real world production environment. It also fixed my
> network throughput problem.
>
> Thanks,
> Matt

It seems BVT was the recommended CPU scheduler in Xen 2. I think I'll have to try it too. I hope it will fix the network throughput problems I'm seeing in Xen 3.

Are there any downsides to using the BVT scheduler in Xen 3.0? Why was the default changed from BVT -> SEDF?

-- Pasi
Anand Gupta
2006-Apr-19 19:46 UTC
Re: [Xen-devel] Slow guest network I/O when CPU is pegged - Looking for acknowledgement from developers
How does one change from SEDF -> BVT?

On 4/19/06, Pasi Kärkkäinen <pasik@iki.fi> wrote:

> It seems BVT was the recommended CPU scheduler in Xen 2. I think I'll have
> to try it too.. I hope it will fix the network throughput problems I'm
> seeing in Xen 3.
> [...]

--
regards,
Anand Gupta
Rob Gardner
2006-Apr-19 19:55 UTC
Re: [Xen-devel] Slow guest network I/O when CPU is pegged - Looking for acknowledgement from developers
Anand Gupta wrote:

> How does one change from SEDF -> BVT ?

Put sched=bvt on the Xen boot command line in grub.conf.

Rob
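As an illustration, a grub.conf entry with that option might look like the fragment below. Only `sched=bvt` comes from the advice above; the title, paths, file names and root device are hypothetical and will differ per installation. The sketch also assumes the hypervisor in use was built with the BVT scheduler.

```
# /boot/grub/grub.conf (title, paths and root device are hypothetical)
title Xen 3.0 (BVT scheduler)
    root (hd0,0)
    # sched=bvt goes on the hypervisor line, not on the dom0 kernel line
    kernel /xen.gz sched=bvt
    module /vmlinuz-2.6-xen ro root=/dev/sda1
    module /initrd-2.6-xen.img
```

Since this is a hypervisor boot parameter, the change only takes effect after a reboot.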
Anand Gupta
2006-Apr-19 20:12 UTC
Re: [Xen-devel] Slow guest network I/O when CPU is pegged - Looking for acknowledgement from developers
Thanks.

On 4/20/06, Rob Gardner <rob.gardner@hp.com> wrote:

> Anand Gupta wrote:
> > How does one change from SEDF -> BVT ?
>
> Put sched=bvt on the xen boot command line in grub.conf
>
> Rob

--
regards,
Anand Gupta
Anand Gupta
2006-Apr-19 20:14 UTC
Re: [Xen-devel] Slow guest network I/O when CPU is pegged - Looking for acknowledgement from developers
On 4/19/06, Pasi Kärkkäinen <pasik@iki.fi> wrote:

> Are there any downsides in using BVT scheduler in Xen 3.0 ? Why was the
> default changed from BVT -> SEDF ?

I would like to know that as well. From what I have seen on the mailing lists, SEDF is the most talked about and recommended one...

--
regards,
Anand Gupta