Stefano Stabellini
2012-Jan-27 12:26 UTC
[PATCH v2 0/5] prevent Qemu from waking up needlessly
Hi all,
this small patch series prevents Qemu from waking up needlessly on Xen several times a second in order to check some timers.
The first patch stops Qemu from emulating the PIT on Xen, the second patch disables the rtc_clock entirely.
The third patch makes use of a new mechanism to receive buffered io event notifications from Xen, so that Qemu doesn't need to check the buffered io page for data 10 times a sec for the entire life of the VM.
The fourth patch changes qemu_next_alarm_deadline to check the expire time of a clock only if it is enabled.
Finally the last patch increases the default select timeout to 1h: nothing should rely on the select timeout being 1 sec, so we might as well increase it to 1h.

Shortlog and diffstat follow:

Stefano Stabellini (5):
      xen: do not initialize the interval timer emulator
      xen: disable rtc_clock
      xen: introduce an event channel for buffered io event notifications
      qemu_next_alarm_deadline: check the expire time of a clock only if it is enabled
      qemu_calculate_timeout: increase minimum timeout to 1h

 hw/pc.c      |    7 +++++--
 qemu-timer.c |   12 +++++-------
 xen-all.c    |   43 +++++++++++++++++++++++++++++++++++++------
 3 files changed, 47 insertions(+), 15 deletions(-)

A git tree is available here:

git://xenbits.xen.org/people/sstabellini/qemu-dm.git timers-2

Cheers,

Stefano
Stefano Stabellini
2012-Jan-27 12:26 UTC
[PATCH v2 1/5] xen: do not initialize the interval timer emulator
PIT is emulated by the hypervisor so we don't need to emulate it in Qemu: this patch prevents Qemu from waking up needlessly at PIT_FREQ on Xen.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 hw/pc.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 85304cf..7a7ce98 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -43,6 +43,7 @@
 #include "ui/qemu-spice.h"
 #include "memory.h"
 #include "exec-memory.h"
+#include "arch_init.h"

 /* output Bochs bios info messages */
 //#define DEBUG_BIOS
@@ -1130,7 +1131,7 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
     DriveInfo *fd[MAX_FD];
     qemu_irq rtc_irq = NULL;
     qemu_irq *a20_line;
-    ISADevice *i8042, *port92, *vmmouse, *pit;
+    ISADevice *i8042, *port92, *vmmouse, *pit = NULL;
     qemu_irq *cpu_exit_irq;

     register_ioport_write(0x80, 1, 1, ioport80_write, NULL);
@@ -1151,7 +1152,9 @@ void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,

     qemu_register_boot_set(pc_boot_set, *rtc_state);

-    pit = pit_init(isa_bus, 0x40, 0);
+    if (!xen_available()) {
+        pit = pit_init(isa_bus, 0x40, 0);
+    }
     pcspk_init(pit);

     for(i = 0; i < MAX_SERIAL_PORTS; i++) {
-- 
1.7.2.5
rtc_clock is only used by the RTC emulator (mc146818rtc.c), however Xen has its own RTC emulator in the hypervisor so we can disable it.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen-all.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/xen-all.c b/xen-all.c
index d1fc597..bf183f7 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -513,6 +513,7 @@ void xen_vcpu_init(void)
         qemu_register_reset(xen_reset_vcpu, first_cpu);
         xen_reset_vcpu(first_cpu);
     }
+    qemu_clock_enable(rtc_clock, false);
 }

 /* get the ioreq packets from share mem */
-- 
1.7.2.5
Stefano Stabellini
2012-Jan-27 12:26 UTC
[PATCH v2 3/5] xen: introduce an event channel for buffered io event notifications
Use the newly introduced HVM_PARAM_BUFIOREQ_EVTCHN to receive notifications for buffered io events.
After the first notification is received leave the event channel masked and setup a timer to process the rest of the batch.
Once we have completed processing the batch, unmask the event channel and delete the timer.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 xen-all.c |   42 ++++++++++++++++++++++++++++++++++++------
 1 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/xen-all.c b/xen-all.c
index bf183f7..bfec4c1 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -77,6 +77,8 @@ typedef struct XenIOState {
     QEMUTimer *buffered_io_timer;
     /* the evtchn port for polling the notification, */
     evtchn_port_t *ioreq_local_port;
+    /* evtchn local port for buffered io */
+    evtchn_port_t bufioreq_local_port;
     /* the evtchn fd for polling */
     XenEvtchn xce_handle;
     /* which vcpu we are serving */
@@ -545,6 +547,12 @@ static ioreq_t *cpu_get_ioreq(XenIOState *state)
     evtchn_port_t port;

     port = xc_evtchn_pending(state->xce_handle);
+    if (port == state->bufioreq_local_port) {
+        qemu_mod_timer(state->buffered_io_timer,
+                BUFFER_IO_MAX_DELAY + qemu_get_clock_ms(rt_clock));
+        return NULL;
+    }
+
     if (port != -1) {
         for (i = 0; i < smp_cpus; i++) {
             if (state->ioreq_local_port[i] == port) {
@@ -693,16 +701,18 @@ static void handle_ioreq(ioreq_t *req)
     }
 }

-static void handle_buffered_iopage(XenIOState *state)
+static int handle_buffered_iopage(XenIOState *state)
 {
     buf_ioreq_t *buf_req = NULL;
     ioreq_t req;
     int qw;

     if (!state->buffered_io_page) {
-        return;
+        return 0;
     }

+    memset(&req, 0x00, sizeof(req));
+
     while (state->buffered_io_page->read_pointer != state->buffered_io_page->write_pointer) {
         buf_req = &state->buffered_io_page->buf_ioreq[
             state->buffered_io_page->read_pointer % IOREQ_BUFFER_SLOT_NUM];
@@ -727,15 +737,21 @@ static void handle_buffered_iopage(XenIOState *state)
         xen_mb();
         state->buffered_io_page->read_pointer += qw ? 2 : 1;
     }
+
+    return req.count;
 }

 static void handle_buffered_io(void *opaque)
 {
     XenIOState *state = opaque;

-    handle_buffered_iopage(state);
-    qemu_mod_timer(state->buffered_io_timer,
-                   BUFFER_IO_MAX_DELAY + qemu_get_clock_ms(rt_clock));
+    if (handle_buffered_iopage(state)) {
+        qemu_mod_timer(state->buffered_io_timer,
+                BUFFER_IO_MAX_DELAY + qemu_get_clock_ms(rt_clock));
+    } else {
+        qemu_del_timer(state->buffered_io_timer);
+        xc_evtchn_unmask(state->xce_handle, state->bufioreq_local_port);
+    }
 }

 static void cpu_handle_ioreq(void *opaque)
@@ -865,7 +881,6 @@ static void xen_main_loop_prepare(XenIOState *state)

     state->buffered_io_timer = qemu_new_timer_ms(rt_clock, handle_buffered_io,
                                                  state);
-    qemu_mod_timer(state->buffered_io_timer, qemu_get_clock_ms(rt_clock));

     if (evtchn_fd != -1) {
         qemu_set_fd_handler(evtchn_fd, cpu_handle_ioreq, NULL, state);
@@ -917,6 +932,7 @@ int xen_hvm_init(void)
 {
     int i, rc;
     unsigned long ioreq_pfn;
+    unsigned long bufioreq_evtchn;
     XenIOState *state;

     state = g_malloc0(sizeof (XenIOState));
@@ -966,6 +982,20 @@ int xen_hvm_init(void)
         state->ioreq_local_port[i] = rc;
     }

+    rc = xc_get_hvm_param(xen_xc, xen_domid, HVM_PARAM_BUFIOREQ_EVTCHN,
+            &bufioreq_evtchn);
+    if (rc < 0) {
+        fprintf(stderr, "failed to get HVM_PARAM_BUFIOREQ_EVTCHN\n");
+        return -1;
+    }
+    rc = xc_evtchn_bind_interdomain(state->xce_handle, xen_domid,
+            (uint32_t)bufioreq_evtchn);
+    if (rc == -1) {
+        fprintf(stderr, "bind interdomain ioctl error %d\n", errno);
+        return -1;
+    }
+    state->bufioreq_local_port = rc;
+
     /* Init RAM management */
     xen_map_cache_init();
     xen_ram_init(ram_size);
-- 
1.7.2.5
Stefano Stabellini
2012-Jan-27 12:26 UTC
[PATCH v2 4/5] qemu_next_alarm_deadline: check the expire time of a clock only if it is enabled
Also delta in qemu_next_alarm_deadline is a 64 bit value so set the default to INT64_MAX instead of INT32_MAX.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 qemu-timer.c |   10 ++++------
 1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index cd026c6..648db1d 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -106,23 +106,21 @@ static inline int alarm_has_dynticks(struct qemu_alarm_timer *t)

 static int64_t qemu_next_alarm_deadline(void)
 {
-    int64_t delta;
+    int64_t delta = INT64_MAX;
     int64_t rtdelta;

-    if (!use_icount && vm_clock->active_timers) {
+    if (!use_icount && vm_clock->enabled && vm_clock->active_timers) {
         delta = vm_clock->active_timers->expire_time -
                      qemu_get_clock_ns(vm_clock);
-    } else {
-        delta = INT32_MAX;
     }
-    if (host_clock->active_timers) {
+    if (host_clock->enabled && host_clock->active_timers) {
         int64_t hdelta = host_clock->active_timers->expire_time -
                          qemu_get_clock_ns(host_clock);
         if (hdelta < delta) {
             delta = hdelta;
         }
     }
-    if (rt_clock->active_timers) {
+    if (rt_clock->enabled && rt_clock->active_timers) {
         rtdelta = (rt_clock->active_timers->expire_time -
                  qemu_get_clock_ns(rt_clock));
         if (rtdelta < delta) {
-- 
1.7.2.5
Stefano Stabellini
2012-Jan-27 12:26 UTC
[PATCH v2 5/5] qemu_calculate_timeout: increase minimum timeout to 1h
There is no reason why the minimum timeout should be 1sec, it could easily be 1h and we would save lots of cpu cycles.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 qemu-timer.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 648db1d..b792a32 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -844,6 +844,6 @@ fail:

 int qemu_calculate_timeout(void)
 {
-    return 1000;
+    return 1000*60*60;
 }
-- 
1.7.2.5
On 01/27/2012 01:26 PM, Stefano Stabellini wrote:
> rtc_clock is only used by the RTC emulator (mc146818rtc.c), however Xen
> has its own RTC emulator in the hypervisor so we can disable it.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  xen-all.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/xen-all.c b/xen-all.c
> index d1fc597..bf183f7 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -513,6 +513,7 @@ void xen_vcpu_init(void)
>          qemu_register_reset(xen_reset_vcpu, first_cpu);
>          xen_reset_vcpu(first_cpu);
>      }
> +    qemu_clock_enable(rtc_clock, false);
>  }
>
>  /* get the ioreq packets from share mem */

Depending on "-rtc clock=vm" or "-rtc clock=rt", this may not be true.

Why do you need to instantiate an RTC at all?

Paolo
Paolo Bonzini
2012-Jan-27 14:42 UTC
Re: [PATCH v2 4/5] qemu_next_alarm_deadline: check the expire time of a clock only if it is enabled
On 01/27/2012 01:26 PM, Stefano Stabellini wrote:
> Also delta in qemu_next_alarm_deadline is a 64 bit value so set the
> default to INT64_MAX instead of INT32_MAX.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  qemu-timer.c |   10 ++++------
>  1 files changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/qemu-timer.c b/qemu-timer.c
> index cd026c6..648db1d 100644
> --- a/qemu-timer.c
> +++ b/qemu-timer.c
> @@ -106,23 +106,21 @@ static inline int alarm_has_dynticks(struct qemu_alarm_timer *t)
>
>  static int64_t qemu_next_alarm_deadline(void)
>  {
> -    int64_t delta;
> +    int64_t delta = INT64_MAX;

I'm worried of overflows elsewhere...

>      int64_t rtdelta;
>
> -    if (!use_icount && vm_clock->active_timers) {
> +    if (!use_icount && vm_clock->enabled && vm_clock->active_timers) {
>          delta = vm_clock->active_timers->expire_time -
>                       qemu_get_clock_ns(vm_clock);
> -    } else {
> -        delta = INT32_MAX;
>      }
> -    if (host_clock->active_timers) {
> +    if (host_clock->enabled && host_clock->active_timers) {
>          int64_t hdelta = host_clock->active_timers->expire_time -
>                           qemu_get_clock_ns(host_clock);
>          if (hdelta < delta) {
>              delta = hdelta;
>          }
>      }
> -    if (rt_clock->active_timers) {
> +    if (rt_clock->enabled && rt_clock->active_timers) {
>          rtdelta = (rt_clock->active_timers->expire_time -
>                  qemu_get_clock_ns(rt_clock));
>          if (rtdelta < delta) {

The patch is otherwise okay, but without looking more closely at the callers I'm not quite ready to give my Reviewed-by.

Paolo
Paolo Bonzini
2012-Jan-27 14:43 UTC
Re: [PATCH v2 5/5] qemu_calculate_timeout: increase minimum timeout to 1h
On 01/27/2012 01:26 PM, Stefano Stabellini wrote:
> There is no reason why the minimum timeout should be 1sec, it could
> easily be 1h and we would save lots of cpu cycles.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
>  qemu-timer.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/qemu-timer.c b/qemu-timer.c
> index 648db1d..b792a32 100644
> --- a/qemu-timer.c
> +++ b/qemu-timer.c
> @@ -844,6 +844,6 @@ fail:
>
>  int qemu_calculate_timeout(void)
>  {
> -    return 1000;
> +    return 1000*60*60;
>  }
>

This might break Windows networking, but I'm going to fix it before 1.1 anyway.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo
Stefano Stabellini
2012-Jan-27 15:43 UTC
Re: [PATCH v2 4/5] qemu_next_alarm_deadline: check the expire time of a clock only if it is enabled
On Fri, 27 Jan 2012, Paolo Bonzini wrote:
> On 01/27/2012 01:26 PM, Stefano Stabellini wrote:
> > Also delta in qemu_next_alarm_deadline is a 64 bit value so set the
> > default to INT64_MAX instead of INT32_MAX.
> >
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > ---
> >  qemu-timer.c |   10 ++++------
> >  1 files changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/qemu-timer.c b/qemu-timer.c
> > index cd026c6..648db1d 100644
> > --- a/qemu-timer.c
> > +++ b/qemu-timer.c
> > @@ -106,23 +106,21 @@ static inline int alarm_has_dynticks(struct qemu_alarm_timer *t)
> >
> >  static int64_t qemu_next_alarm_deadline(void)
> >  {
> > -    int64_t delta;
> > +    int64_t delta = INT64_MAX;
>
> I'm worried of overflows elsewhere...

I think that you are right: mm_rearm_timer and win32_rearm_timer would overflow.
I'll just repost the patch using INT32_MAX.
Stefano Stabellini
2012-Jan-27 15:44 UTC
Re: [PATCH v2 5/5] qemu_calculate_timeout: increase minimum timeout to 1h
On Fri, 27 Jan 2012, Paolo Bonzini wrote:
> On 01/27/2012 01:26 PM, Stefano Stabellini wrote:
> > There is no reason why the minimum timeout should be 1sec, it could
> > easily be 1h and we would save lots of cpu cycles.
> >
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > ---
> >  qemu-timer.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/qemu-timer.c b/qemu-timer.c
> > index 648db1d..b792a32 100644
> > --- a/qemu-timer.c
> > +++ b/qemu-timer.c
> > @@ -844,6 +844,6 @@ fail:
> >
> >  int qemu_calculate_timeout(void)
> >  {
> > -    return 1000;
> > +    return 1000*60*60;
> >  }
> >
>
> This might break Windows networking, but I'm going to fix it before 1.1
> anyway.
>
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Thanks!
Stefano Stabellini
2012-Jan-27 15:53 UTC
Re: [PATCH v2 4/5] qemu_next_alarm_deadline: check the expire time of a clock only if it is enabled
On Fri, 27 Jan 2012, Stefano Stabellini wrote:
> On Fri, 27 Jan 2012, Paolo Bonzini wrote:
> > On 01/27/2012 01:26 PM, Stefano Stabellini wrote:
> > > Also delta in qemu_next_alarm_deadline is a 64 bit value so set the
> > > default to INT64_MAX instead of INT32_MAX.
> > >
> > > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > > ---
> > >  qemu-timer.c |   10 ++++------
> > >  1 files changed, 4 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/qemu-timer.c b/qemu-timer.c
> > > index cd026c6..648db1d 100644
> > > --- a/qemu-timer.c
> > > +++ b/qemu-timer.c
> > > @@ -106,23 +106,21 @@ static inline int alarm_has_dynticks(struct qemu_alarm_timer *t)
> > >
> > >  static int64_t qemu_next_alarm_deadline(void)
> > >  {
> > > -    int64_t delta;
> > > +    int64_t delta = INT64_MAX;
> >
> > I'm worried of overflows elsewhere...
>
> I think that you are right: mm_rearm_timer and win32_rearm_timer would
> overflow.
> I'll just repost the patch using INT32_MAX.

Actually it is better to fix mm_rearm_timer and win32_rearm_timer so that they won't overflow anymore.
I'll add a patch to do that.
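[A minimal sketch of the clamping idea under discussion; the helper name and where it would be called from in qemu-timer.c are assumptions, not the actual follow-up patch.]

    #include <stdint.h>

    /* Illustrative only: clamp a 64-bit nanosecond deadline to a
     * millisecond value that still fits the 32-bit Windows timer APIs,
     * so a very large deadline cannot overflow or truncate. */
    static int64_t clamp_alarm_delta_ms(int64_t delta_ns)
    {
        int64_t ms = delta_ns / 1000000;    /* ns -> ms */
        if (ms < 1) {
            ms = 1;                         /* fire as soon as possible */
        }
        if (ms > INT32_MAX) {
            ms = INT32_MAX;                 /* avoid 32-bit overflow */
        }
        return ms;
    }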
On Fri, 27 Jan 2012, Paolo Bonzini wrote:
> On 01/27/2012 01:26 PM, Stefano Stabellini wrote:
> > rtc_clock is only used by the RTC emulator (mc146818rtc.c), however Xen
> > has its own RTC emulator in the hypervisor so we can disable it.
> >
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > ---
> >  xen-all.c |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/xen-all.c b/xen-all.c
> > index d1fc597..bf183f7 100644
> > --- a/xen-all.c
> > +++ b/xen-all.c
> > @@ -513,6 +513,7 @@ void xen_vcpu_init(void)
> >          qemu_register_reset(xen_reset_vcpu, first_cpu);
> >          xen_reset_vcpu(first_cpu);
> >      }
> > +    qemu_clock_enable(rtc_clock, false);
> >  }
> >
> >  /* get the ioreq packets from share mem */
>
> Depending on "-rtc clock=vm" or "-rtc clock=rt", this may not be true.

Good point.
I should check for rtc_clock == host_clock.

> Why do you need to instantiate an RTC at all?

I don't, in fact in previous versions of this patch series I tried to do that but it is difficult and makes the common code worse, so I followed Anthony's suggestion of just disabling the rtc_clock.
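[An illustrative sketch of the check mentioned above; whether it ends up in xen_vcpu_init and in exactly this form is an assumption, not the reposted patch.]

    /* Only disable rtc_clock when the guest RTC follows the host clock
     * (the case Xen emulates in the hypervisor); with "-rtc clock=vm" or
     * "-rtc clock=rt" QEMU's emulation still drives the RTC, so leave
     * the clock enabled. */
    if (rtc_clock == host_clock) {
        qemu_clock_enable(rtc_clock, false);
    }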
On 01/30/2012 12:56 PM, Stefano Stabellini wrote:
>> > Depending on "-rtc clock=vm" or "-rtc clock=rt", this may not be true.
>
> Good point.
> I should check for rtc_clock == host_clock.
>
>> > Why do you need to instantiate an RTC at all?
>
> I don't, in fact in previous versions of this patch series I tried to do
> that but it is difficult and makes the common code worse, so I followed
> Anthony's suggestion of just disabling the rtc_clock.

I see. It shouldn't surprise you that I completely disagree with Anthony on this. :)

Paolo
On Mon, 30 Jan 2012, Paolo Bonzini wrote:
> On 01/30/2012 12:56 PM, Stefano Stabellini wrote:
> >> > Depending on "-rtc clock=vm" or "-rtc clock=rt", this may not be true.
> >
> > Good point.
> > I should check for rtc_clock == host_clock.
> >
> >> > Why do you need to instantiate an RTC at all?
> >
> > I don't, in fact in previous versions of this patch series I tried to do
> > that but it is difficult and makes the common code worse, so I followed
> > Anthony's suggestion of just disabling the rtc_clock.
>
> I see. It shouldn't surprise you that I completely disagree with
> Anthony on this. :)

I am OK with both approaches, but I can see the benefits of reducing this patch to (almost) a one-liner.
On 01/30/2012 05:59 AM, Paolo Bonzini wrote:
> On 01/30/2012 12:56 PM, Stefano Stabellini wrote:
>>> > Depending on "-rtc clock=vm" or "-rtc clock=rt", this may not be true.
>>
>> Good point.
>> I should check for rtc_clock == host_clock.
>>
>>> > Why do you need to instantiate an RTC at all?
>> I don't, in fact in previous versions of this patch series I tried to do
>> that but it is difficult and makes the common code worse, so I followed
>> Anthony's suggestion of just disabling the rtc_clock.
>
> I see. It shouldn't surprise you that I completely disagree with Anthony on
> this. :)

Give me some credit at least...

The original patches didn't disable the RTC, it introduced a half-neutered Xen specific RTC.

My original suggestion (which I still think is the best approach) is to change the RTC emulation to not use a second timer at all.

Regards,

Anthony Liguori

> Paolo
>
On 01/30/2012 08:28 PM, Anthony Liguori wrote:
>>
>> I see. It shouldn't surprise you that I completely disagree with
>> Anthony on this. :)
>
> Give me some credit at least...
>
> The original patches didn't disable the RTC, it introduced a
> half-neutered Xen specific RTC.

Ah. :)

> My original suggestion (which I still think is the best approach) is to
> change the RTC emulation to not use a second timer at all.

Yes, that would be really the best approach. I tried and it's pretty hard, but we can give it a second try.

Paolo
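[A rough sketch of the "no second timer" idea being discussed here, with invented names and structure; it is not the mc146818rtc.c code. The point is to keep a base value and derive the RTC time lazily on register reads instead of advancing a counter with a periodic per-second timer.]

    #include <stdint.h>
    #include <time.h>

    typedef struct {
        time_t  base_rtc;      /* guest RTC time captured at base_host_ns */
        int64_t base_host_ns;  /* host clock, in ns, when base_rtc was captured */
    } RTCLazyState;

    /* Compute the current guest RTC time on demand, so no timer has to
     * fire every second just to keep the seconds counter moving. */
    static time_t rtc_current_time(const RTCLazyState *s, int64_t now_host_ns)
    {
        return s->base_rtc + (now_host_ns - s->base_host_ns) / 1000000000LL;
    }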