XiaYubin
2009-Oct-31 11:02 UTC
[Xen-devel] How can Xen trigger a context switch in an HVM guest domain?
Hi, all,

As I'm doing some research on cooperative scheduling between Xen and the guest domain, I want to know in how many ways Xen can trigger a context switch inside an HVM guest domain (which runs Windows in my case). Do I have to write a driver (like the balloon driver)? Is a user process enough? Or is there an even simpler way?

All your suggestions are appreciated. Thanks! :)

--
Yubin

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
George Dunlap
2009-Oct-31 15:20 UTC
Re: [Xen-devel] How can Xen trigger a context switch in an HVM guest domain?
Context switching is a choice the guest OS has to make, and how that's done will differ based on the operating system. I think if you're thinking about modifying the guest scheduler, you're probably better off starting with Linux. Even if there's a way to convince Windows to call schedule() to pick a new process, I'm not sure you'll be able to tell it *which* process to choose.

As far as the mechanism on Xen's side, it would be easy enough to allocate a "reschedule" event channel for the guest, such that whenever you want to trigger a guest reschedule, you just raise the event channel.

-George
XiaYubin
2009-Nov-01 05:54 UTC
Re: [Xen-devel] How can Xen trigger a context switch in an HVM guest domain?
Hi, George,

Thank you for your reply. Actually, I'm looking for a generic mechanism for cooperative scheduling. Independence from the guest OS would make such a mechanism more convincing and practical, just as it does for the balloon driver.

Maybe you are wondering why I asked such a weird question, so let me describe it in more detail. My current work is based on "Task-aware VM scheduling", published at VEE'09. By monitoring CR3 changes at the VMM level, Xen can get information about tasks' CPU consumption and identify CPU hogs and I/O tasks. The task-aware mechanism therefore offers a more fine-grained scheduler than the original VCPU-level one, since a single VCPU may run a mix of CPU hogs and I/O tasks.

Imagine there are n VMs. One of them, the mix-VM, runs two tasks: cpuhog and iotask (network). The other VMs, the CPU-VMs, run just cpuhog. All VMs use PV drivers (the GPLPV driver for Windows).

Here's what is supposed to happen when iotask receives a network packet: the NIC raises an IRQ, which passes to Xen; domain-0 then sends an inter-domain event to the mix-VM, which is likely to be sitting in the run-queue. Xen schedules it to run immediately and sets its state to preempting-state. Right after that, the mix-VM *should* schedule iotask to process the incoming packet, and then schedule cpuhog once processing is done. When CR3 changes to cpuhog's, Xen knows the mix-VM has finished its I/O processing (assuming cpuhog's priority is lower than iotask's, as in most OSes), and schedules the mix-VM out, ending its preempting-state. The mix-VM can thus preempt other VMs to process I/O as soon as possible, while keeping the preempting time as short as possible for fairness. The point is: cpuhog should not run in preempting-state.

However, a problem arises when the mix-VM sends packets.
When iotask sends a large amount of data (over TCP), it blocks and waits to be woken up after the guest kernel has sent all the data, which may be split into thousands of TCP packets. The mix-VM receives an ACK packet every time it sends one, which makes it enter preempting-state. Note that at this moment the mix-VM's CR3 is cpuhog's (as the only running process). After the guest kernel processes the ACK and sends the next packet, it switches back to user mode, which means cpuhog gets to run in preempting-state. The point is: since there is no CR3 change, Xen gets no chance to run.

One way is to add a hook at the user/kernel mode switch, so Xen can catch the moment cpuhog gets to run. However, that costs too much. Another way is to force the VM to reschedule when it enters preempting-state: it then traps to Xen when CR3 changes, and Xen can end the preempting-state when the VM schedules cpuhog to run. That's why I want to trigger a guest context switch from Xen. I don't really care *which* process it switches to; I just want Xen to get a chance to run. The point is: is there a better/simpler way to solve this problem?

Hope I described the problem clearly. And would you please give more details on the "reschedule event channel" idea? Thanks!

--
Yubin
James (song wei)
2009-Nov-02 09:16 UTC
Re: [Xen-devel] How can Xen trigger a context switch in an HVM guest domain?
Could you add a timer alongside the CR3 watch, to prevent one task from consuming too much CPU time without ever changing CR3?

- James (Song Wei)
George Dunlap
2009-Nov-02 16:05 UTC
Re: [Xen-devel] How can Xen trigger a context switch in an HVM guest domain?
OK, so you want to allow a VM to run so that it can do packet processing in the kernel, but once it's done in the kernel you want to preempt the VM again.

An idea I was going to try out: if a VM receives an interrupt (possibly only certain interrupts, like network), let it run for a very short amount of time (say, 1ms or 500us). That should be enough for it to do its basic packet processing (or audio processing, video processing, whatever). True, you're going to run the "cpu hog" during that time, but that will be debited against time it'll run later. (I haven't tested this idea yet. It may work better with some credit algorithms than others.)

The problems with inducing a guest to call schedule():
* It may not have any other runnable processes, or it may choose the same process to run again; so it may not switch CR3 anyway.
* The only reliable way to do it without some kind of paravirtualization (even just a kernel driver) would be to give it a timer interrupt, which may mess up other things on the system, such as the system time.

If you're really keen to preempt on return to userspace, you could try something like the following. Before delivering the interrupt, note the EIP the guest is at. If it's in user space, set a hardware breakpoint at that address. Then deliver the interrupt. If the guest calls schedule(), you can catch the CR3 switch; if it returns to the same process, it will hit the breakpoint.

Two possible problems:
* For reasons of ancient history, the iret instruction may set the RF flag in the EFLAGS register, which will cause the breakpoint not to fire after the guest iret. You may need to decode the instruction and set the breakpoint at the following one, or something like that.
* I believe Windows doesn't do a CR3 switch on a *thread* switch. If so, on a thread switch you'll get neither the CR3 switch nor the breakpoint (since the other thread is probably running somewhere else).
Peace,
-George
XiaYubin
2009-Nov-03 01:43 UTC
Re: [Xen-devel] How can Xen trigger a context switch in an HVM guest domain?
James and George, thank you both! The breakpoint approach is interesting; I hadn't even thought of it :)

OK, I'm going to use a simpler way to verify my idea first. Before the preempting-state VM runs, I will set a timer so that Xen gets to run every 100us (maybe longer for the first iteration). The timer handler will check whether the preempting VM is in kernel mode or user mode. If it is in user mode with cpuhog's CR3, it will be scheduled out. Meanwhile, if the iteration count goes beyond some threshold (say 5), the VM will also be scheduled out. This seems much simpler than the breakpoint approach, and more accurate than the 1ms timer. It may bring some overhead, but preemption is not supposed to occur frequently, and fairness matters more.

The thread problem also exists on Linux. Currently I have no good way to identify different threads from the hypervisor's perspective. I have a dream that one day those OS guys will export this information to the VMM, a dream that one day our children will live in a world where virtualization rules. I have a dream today :)

Thanks!

--
Yubin
George Dunlap
2009-Nov-03 11:51 UTC
Re: [Xen-devel] How can Xen trigger a context switch in an HVM guest domain?
When I first started doing performance analysis, the sedf scheduler was using a 500us timeslice, which (by my estimates) caused the first-generation VMX-capable processors to spend at least 5% of their time handling vmenters and vmexits. Performance has obviously improved since then, but they're still not free. :-)

-George