This patch improves the performance of Standard VGA, the mode used during
Windows boot and by the Linux splash screen.

It does so by buffering all the stdvga programmed output ops and memory
mapped ops (both reads and writes) that are sent to QEMU.

We maintain essential VGA state locally so we can respond immediately to
input and read ops without waiting for QEMU. We snoop output and write ops
to keep our state up to date.

PIO input ops are satisfied from cached state without bothering QEMU.

PIO output and mmio ops are passed through to QEMU, including mmio read ops.
This is necessary because mmio reads can have side effects.

I have changed the format of the buffered_iopage. It used to contain 80
elements of type ioreq_t (48 bytes each). Now it contains 672 elements of
type buf_ioreq_t (6 bytes each). Being able to pipeline 8 times as many ops
improves VGA performance by a factor of 8.

I changed hvm_buffered_io_intercept to use the same registration and
callback mechanism as hvm_portio_intercept rather than the hacky hardcoding
it used before.

In platform.c, I fixed send_timeoffset_req() to set its ioreq size to 8
(rather than 4), and its count to 1 (which was missing).

Signed-off-by: Ben Guthro <bguthro@virtualiron.com>
Signed-off-by: Robert Phillips <rphillips@virtualiron.com>
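For concreteness, here is a sketch of the arithmetic behind those numbers.
The structure below is an illustrative stand-in, not a copy of the real
buf_ioreq_t/buffered_iopage definitions in ioreq.h; only the slot count and
slot sizes come from the description above.

#include <assert.h>
#include <stdint.h>

/* Illustrative only: 672 six-byte slots plus the two ring cursors still fit
 * in one 4 KiB shared page, whereas the old layout spent 80 * 48 = 3840
 * bytes to queue only 80 ops, so roughly 8x as many ops can be pipelined in
 * the same memory. */
#define BUF_IOREQ_SIZE          6      /* bytes per buffered slot (new)   */
#define IOREQ_BUFFER_SLOT_NUM 672
#define XEN_PAGE_SIZE        4096

struct buffered_iopage_sketch {
    uint32_t read_pointer;             /* consumer (qemu) cursor          */
    uint32_t write_pointer;            /* producer (hypervisor) cursor    */
    uint8_t  slots[IOREQ_BUFFER_SLOT_NUM][BUF_IOREQ_SIZE];
};

int main(void)
{
    /* 672 * 6 + 2 * 4 = 4040 bytes <= 4096 */
    assert(sizeof(struct buffered_iopage_sketch) <= XEN_PAGE_SIZE);
    return 0;
}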
On 24/10/07 22:36, "Ben Guthro" <bguthro@virtualiron.com> wrote:

> This patch improves the performance of Standard VGA,
> the mode used during Windows boot and by the Linux
> splash screen.
>
> It does so by buffering all the stdvga programmed output ops
> and memory mapped ops (both reads and writes) that are sent to QEMU.

How much benefit comes from immediate servicing of PIO input ops versus the
massive increase in buffered-io slots? Removing the former optimisation
would certainly make the patch a lot smaller!

What happens across save/restore? The hypervisor's state cache will go away,
won't it? I suppose it's okay if the guest is in SVGA LFB mode at that point
(actually, that's another thing: do you correctly handle hand-off between
VGA and SVGA modes?), but I don't know that we want to rely on that.

 -- Keir
Good questions, Keir. Answers below:

On 10/25/07, Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:
>
> How much benefit comes from immediate servicing of PIO input ops versus the
> massive increase in buffered-io slots? Removing the former optimisation
> would certainly make the patch a lot smaller!

Subjectively, the performance improvement appears substantial. We have
tested the code with the stdvga emulation, and with and without the
increased number of slots. With more slots the screen painting goes from
being fast to very fast.

As you've noticed, the increase in the number of slots is compensated by
the decrease in slot size (so there is no increase in memory use), at the
cost of packing (and unpacking) ioreqs as they are written to (and read
from) the buffer.

> What happens across save/restore? The hypervisor's state cache will go
> away, won't it? I suppose it's okay if the guest is in SVGA LFB mode at
> that point (actually, that's another thing: do you correctly handle
> hand-off between VGA and SVGA modes?), but I don't know that we want to
> rely on that.

This hasn't been a problem in practice. The guest quickly switches from VGA
to SVGA mode, causing the stdvga code to be largely inactive, and we have
only seen it switch back when the guest blue-screens. The stdvga code
detects that transition correctly and paints the blue screen quickly.

After a restore, the code assumes it is not in standard VGA mode, so it is
largely inactive. That conservative assumption might not be optimal, but it
is correct. I don't believe the failure to save/restore the stdvga cache
will prove problematic, but if it becomes so I will add corrective code.

One might ask (and we did) what the point of all this VGA emulation code is
when it is only active during the boot process (or during blue-screen
painting). The answer is that one wants the user's first experience with
Xen to be positive, and watching an excruciatingly slow Windows boot screen
or Linux splash panel is not.

-- rsp
On 25/10/07 16:28, "Robert Phillips" <rsp.vi.xen@gmail.com> wrote:

>> How much benefit comes from immediate servicing of PIO input ops versus
>> the massive increase in buffered-io slots? Removing the former
>> optimisation would certainly make the patch a lot smaller!
>
> Subjectively, the performance improvement appears substantial. We have
> tested the code with the stdvga emulation, and with and without the
> increased number of slots. With more slots the screen painting goes from
> being fast to very fast.
>
> As you've noticed, the increase in the number of slots is compensated by
> the decrease in slot size (so there is no increase in memory use), at the
> cost of packing (and unpacking) ioreqs as they are written to (and read
> from) the buffer.

I guess what I'm really interested in is the performance /with/ the
increased number of slots and with versus without the stdvga emulation,
since it's the stdvga emulation that really adds the complexity.

 -- Keir
The performance is poor with increased slots and without emulation.

As presented, the emulation code treats the buffered-io slots as an
asynchronous queue. The stdvga emulator pushes ioreqs into the queue but
need not wait for any response, because it can satisfy read requests
locally. (The only time it must wait is when the queue becomes full.)

Without the emulation, the code must block on each read (of which there are
many), waiting for QEMU to provide an answer. This really slows things down
and renders the buffer largely useless. I don't believe it ever gets full;
there are never enough consecutive writes to fill it.

With both increased slots and emulation, the performance feels very much
better. Like taking a stone out of your shoe. :-)

-- rsp

On 10/25/07, Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:
>
> I guess what I'm really interested in is the performance /with/ the
> increased number of slots and with versus without the stdvga emulation,
> since it's the stdvga emulation that really adds the complexity.
>
> -- Keir
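To make the read-servicing idea concrete, the sketch below shows the general
shape of snooping guest OUTs into a local register latch and answering INs
from it. The port numbers are the standard VGA sequencer and graphics
controller index/data ports; the structure, function names and the amount of
state tracked are illustrative, not the actual stdvga emulation code.

#include <stdint.h>

struct stdvga_cache {
    uint8_t sr_index;          /* last index written to port 0x3c4          */
    uint8_t sr[8];             /* sequencer registers                        */
    uint8_t gr_index;          /* last index written to port 0x3ce           */
    uint8_t gr[16];            /* graphics controller registers              */
};

/* Snoop a guest OUT so the cache stays current; the write itself is still
 * forwarded to QEMU through the buffered-io ring. */
void stdvga_snoop_out(struct stdvga_cache *s, uint16_t port, uint8_t val)
{
    switch (port) {
    case 0x3c4: s->sr_index = val; break;
    case 0x3c5: s->sr[s->sr_index & 7] = val; break;
    case 0x3ce: s->gr_index = val; break;
    case 0x3cf: s->gr[s->gr_index & 15] = val; break;
    }
}

/* Satisfy a guest IN from the cache so the vcpu never waits for QEMU.
 * Returns 1 if handled, 0 to fall back to the slow (synchronous) path. */
int stdvga_cached_in(const struct stdvga_cache *s, uint16_t port, uint8_t *val)
{
    switch (port) {
    case 0x3c4: *val = s->sr_index;              return 1;
    case 0x3c5: *val = s->sr[s->sr_index & 7];   return 1;
    case 0x3ce: *val = s->gr_index;              return 1;
    case 0x3cf: *val = s->gr[s->gr_index & 15];  return 1;
    }
    return 0;
}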
Alex Williamson
2007-Oct-29 18:48 UTC
[Xen-ia64-devel] Re: [Xen-devel] [PATCH] Std VGA Performance
On Wed, 2007-10-24 at 17:36 -0400, Ben Guthro wrote:
> This patch improves the performance of Standard VGA,
> the mode used during Windows boot and by the Linux
> splash screen.

Hi,

   ia64 uses VGA too. I've been able to regain some functionality with the
patch below, but the VGA modes used by our firmware still have significant
issues (once we boot to Linux userspace, VGA text mode gets readable). It
seems like perhaps we've lost support for some basic text VGA modes. I
haven't tried to understand the changes in qemu yet, but are we sacrificing
compatibility for performance? Patch and screen shot below. Thanks,

	Alex

PS - for xen-ia64-devel, both the Open Source and Intel GFW have issues with
EFI text mode w/ this patch (use EFI shell to see it on Intel GFW).

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

diff -r 4034317507de xen/arch/ia64/vmx/mmio.c
--- a/xen/arch/ia64/vmx/mmio.c  Mon Oct 29 16:49:02 2007 +0000
+++ b/xen/arch/ia64/vmx/mmio.c  Mon Oct 29 12:29:18 2007 -0600
@@ -56,10 +56,12 @@ static int hvm_buffered_io_intercept(ior
 {
     struct vcpu *v = current;
     spinlock_t *buffered_io_lock;
-    buffered_iopage_t *buffered_iopage =
+    buffered_iopage_t *pg =
         (buffered_iopage_t *)(v->domain->arch.hvm_domain.buffered_io_va);
-    unsigned long tmp_write_pointer = 0;
     int i;
+    buf_ioreq_t bp;
+    /* Timeoffset sends 64b data, but no address. Use two consecutive slots. */
+    int qw = 0;
 
     /* ignore READ ioreq_t! */
     if ( p->dir == IOREQ_READ )
@@ -75,11 +77,41 @@ static int hvm_buffered_io_intercept(ior
     if ( i == HVM_BUFFERED_IO_RANGE_NR )
         return 0;
 
+    /* Return 0 for the cases we can't deal with. */
+    if ( p->addr > 0xffffful || p->data_is_ptr || p->df || p->count != 1 )
+        return 0;
+
+    bp.type = p->type;
+    bp.dir = p->dir;
+    switch (p->size) {
+    case 1:
+        bp.size = 0;
+        break;
+    case 2:
+        bp.size = 1;
+        break;
+    case 4:
+        bp.size = 2;
+        break;
+    case 8:
+        bp.size = 3;
+        qw = 1;
+        gdprintk(XENLOG_INFO, "quadword ioreq type:%d data:%"PRIx64"\n",
+                 p->type, p->data);
+        break;
+    default:
+        gdprintk(XENLOG_WARNING, "unexpected ioreq size:%"PRId64"\n", p->size);
+        return 0;
+    }
+
+    bp.data = p->data;
+    bp.addr = qw ? ((p->data >> 16) & 0xfffful) : (p->addr & 0xffffful);
+
     buffered_io_lock = &v->domain->arch.hvm_domain.buffered_io_lock;
     spin_lock(buffered_io_lock);
 
-    if ( buffered_iopage->write_pointer - buffered_iopage->read_pointer ==
-         (unsigned long)IOREQ_BUFFER_SLOT_NUM ) {
+    if ( pg->write_pointer - pg->read_pointer >=
+         (unsigned long)IOREQ_BUFFER_SLOT_NUM - (qw ? 1 : 0) ) {
         /* the queue is full.
          * send the iopacket through the normal path.
          * NOTE: The arithimetic operation could handle the situation for
@@ -89,13 +121,19 @@ static int hvm_buffered_io_intercept(ior
         return 0;
     }
 
-    tmp_write_pointer = buffered_iopage->write_pointer % IOREQ_BUFFER_SLOT_NUM;
-
-    memcpy(&buffered_iopage->ioreq[tmp_write_pointer], p, sizeof(ioreq_t));
+    memcpy(&pg->buf_ioreq[pg->write_pointer % IOREQ_BUFFER_SLOT_NUM],
+           &bp, sizeof(bp));
+
+    if (qw) {
+        bp.data = p->data >> 32;
+        bp.addr = (p->data >> 48) & 0xfffful;
+        memcpy(&pg->buf_ioreq[(pg->write_pointer + 1) % IOREQ_BUFFER_SLOT_NUM],
+               &bp, sizeof(bp));
+    }
 
     /*make the ioreq_t visible before write_pointer*/
     wmb();
-    buffered_iopage->write_pointer++;
+    pg->write_pointer += qw ? 2 : 1;
 
     spin_unlock(buffered_io_lock);
On 29/10/07 18:48, "Alex Williamson" <alex.williamson@hp.com> wrote:

> On Wed, 2007-10-24 at 17:36 -0400, Ben Guthro wrote:
>> This patch improves the performance of Standard VGA,
>> the mode used during Windows boot and by the Linux
>> splash screen.
>
> Hi,
>
> ia64 uses VGA too. I've been able to regain some functionality with the
> patch below, but the VGA modes used by our firmware still have significant
> issues (once we boot to Linux userspace, VGA text mode gets readable). It
> seems like perhaps we've lost support for some basic text VGA modes. I
> haven't tried to understand the changes in qemu yet, but are we
> sacrificing compatibility for performance? Patch and screen shot below.
> Thanks,

All that's changed for ia64 is the definition of the buffered ioreq
structure, which has become more densely packed. All the rest of the
acceleration is (currently) x86-specific. So this shouldn't be too hard to
track down...

 -- Keir
On Mon, 2007-10-29 at 19:17 +0000, Keir Fraser wrote:
> All that's changed for ia64 is the definition of the buffered ioreq
> structure, which has become more densely packed. All the rest of the
> acceleration is (currently) x86-specific. So this shouldn't be too hard to
> track down...

Yes, you're right, but easy to overlook, and I'm not sure how it works on
x86. I copied the x86 code for filling in the buffered ioreq, but failed to
notice that it attempts to store 4 bytes of data into a 2 byte field... The
comment for the size entry in buf_ioreq could be interpreted to mean that
only 1, 2, and 8 bytes are expected, but I definitely see 4 bytes on
occasion. I'd guess x86 has a bug here that's simply not exposed because of
the 16-bit code that's probably being used to initialize VGA. I also
question the 8 byte support, which is why I skipped it in the patch below.
Wouldn't an 8 byte MMIO access that isn't a timeoffset be possible? Keir,
please apply this to the staging tree. Thanks,

	Alex

Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---

diff -r 4034317507de xen/arch/ia64/vmx/mmio.c
--- a/xen/arch/ia64/vmx/mmio.c  Mon Oct 29 16:49:02 2007 +0000
+++ b/xen/arch/ia64/vmx/mmio.c  Tue Oct 30 10:03:42 2007 -0600
@@ -55,53 +55,68 @@ static int hvm_buffered_io_intercept(ior
 static int hvm_buffered_io_intercept(ioreq_t *p)
 {
     struct vcpu *v = current;
-    spinlock_t *buffered_io_lock;
-    buffered_iopage_t *buffered_iopage =
+    buffered_iopage_t *pg =
         (buffered_iopage_t *)(v->domain->arch.hvm_domain.buffered_io_va);
-    unsigned long tmp_write_pointer = 0;
+    buf_ioreq_t bp;
     int i;
 
+    /* Ensure buffered_iopage fits in a page */
+    BUILD_BUG_ON(sizeof(buffered_iopage_t) > PAGE_SIZE);
+
     /* ignore READ ioreq_t! */
-    if ( p->dir == IOREQ_READ )
-        return 0;
-
-    for ( i = 0; i < HVM_BUFFERED_IO_RANGE_NR; i++ ) {
-        if ( p->addr >= hvm_buffered_io_ranges[i]->start_addr &&
-             p->addr + p->size - 1 < hvm_buffered_io_ranges[i]->start_addr +
-                                     hvm_buffered_io_ranges[i]->length )
+    if (p->dir == IOREQ_READ)
+        return 0;
+
+    for (i = 0; i < HVM_BUFFERED_IO_RANGE_NR; i++) {
+        if (p->addr >= hvm_buffered_io_ranges[i]->start_addr &&
+            p->addr + p->size - 1 < hvm_buffered_io_ranges[i]->start_addr +
+                                    hvm_buffered_io_ranges[i]->length)
             break;
     }
 
-    if ( i == HVM_BUFFERED_IO_RANGE_NR )
-        return 0;
-
-    buffered_io_lock = &v->domain->arch.hvm_domain.buffered_io_lock;
-    spin_lock(buffered_io_lock);
-
-    if ( buffered_iopage->write_pointer - buffered_iopage->read_pointer ==
-         (unsigned long)IOREQ_BUFFER_SLOT_NUM ) {
+    if (i == HVM_BUFFERED_IO_RANGE_NR)
+        return 0;
+
+    bp.type = p->type;
+    bp.dir = p->dir;
+    switch (p->size) {
+    case 1:
+        bp.size = 0;
+        break;
+    case 2:
+        bp.size = 1;
+        break;
+    default:
+        /* Could use quad word semantics, but it only appears
+         * to be useful for timeoffset data. */
+        return 0;
+    }
+    bp.data = (uint16_t)p->data;
+    bp.addr = (uint32_t)p->addr;
+
+    spin_lock(&v->domain->arch.hvm_domain.buffered_io_lock);
+
+    if (pg->write_pointer - pg->read_pointer == IOREQ_BUFFER_SLOT_NUM) {
         /* the queue is full.
          * send the iopacket through the normal path.
          * NOTE: The arithimetic operation could handle the situation for
          * write_pointer overflow.
          */
-        spin_unlock(buffered_io_lock);
-        return 0;
-    }
-
-    tmp_write_pointer = buffered_iopage->write_pointer % IOREQ_BUFFER_SLOT_NUM;
-
-    memcpy(&buffered_iopage->ioreq[tmp_write_pointer], p, sizeof(ioreq_t));
-
-    /*make the ioreq_t visible before write_pointer*/
+        spin_unlock(&v->domain->arch.hvm_domain.buffered_io_lock);
+        return 0;
+    }
+
+    memcpy(&pg->buf_ioreq[pg->write_pointer % IOREQ_BUFFER_SLOT_NUM],
+           &bp, sizeof(bp));
+
+    /* Make the ioreq_t visible before write_pointer */
     wmb();
-    buffered_iopage->write_pointer++;
-
-    spin_unlock(buffered_io_lock);
+    pg->write_pointer++;
+
+    spin_unlock(&v->domain->arch.hvm_domain.buffered_io_lock);
 
     return 1;
 }
 
-
 static void low_mmio_access(VCPU *vcpu, u64 pa, u64 *val, size_t s, int dir)
 {
@@ -110,32 +125,36 @@ static void low_mmio_access(VCPU *vcpu,
     ioreq_t *p;
 
     vio = get_vio(v->domain, v->vcpu_id);
-    if (vio == 0) {
-        panic_domain(NULL,"bad shared page: %lx", (unsigned long)vio);
-    }
+    if (!vio)
+        panic_domain(NULL, "bad shared page");
+
     p = &vio->vp_ioreq;
+
     p->addr = pa;
     p->size = s;
     p->count = 1;
+    if (dir == IOREQ_WRITE)
+        p->data = *val;
+    else
+        p->data = 0;
+    p->data_is_ptr = 0;
     p->dir = dir;
-    if (dir==IOREQ_WRITE)     // write;
-        p->data = *val;
-    else if (dir == IOREQ_READ)
-        p->data = 0;          // clear all bits
-    p->data_is_ptr = 0;
+    p->df = 0;
     p->type = 1;
-    p->df = 0;
 
     p->io_count++;
+
     if (hvm_buffered_io_intercept(p)) {
         p->state = STATE_IORESP_READY;
         vmx_io_assist(v);
-        return;
-    } else
-        vmx_send_assist_req(v);
-    if (dir == IOREQ_READ) { // read
+        if (dir != IOREQ_READ)
+            return;
+    }
+
+    vmx_send_assist_req(v);
+    if (dir == IOREQ_READ)
         *val = p->data;
-    }
+
     return;
 }
@@ -227,16 +246,18 @@ static void legacy_io_access(VCPU *vcpu,
     ioreq_t *p;
 
     vio = get_vio(v->domain, v->vcpu_id);
-    if (vio == 0) {
-        panic_domain(NULL,"bad shared page\n");
-    }
+    if (!vio)
+        panic_domain(NULL, "bad shared page\n");
+
     p = &vio->vp_ioreq;
-    p->addr = TO_LEGACY_IO(pa&0x3ffffffUL);
+    p->addr = TO_LEGACY_IO(pa & 0x3ffffffUL);
     p->size = s;
     p->count = 1;
     p->dir = dir;
-    if (dir == IOREQ_WRITE)     // write;
+    if (dir == IOREQ_WRITE)
         p->data = *val;
+    else
+        p->data = 0;
     p->data_is_ptr = 0;
     p->type = 0;
     p->df = 0;
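A tiny illustration of the truncation Alex describes above, assuming the
6-byte slot carries only a 16-bit data field as discussed in this thread;
the value is made up.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t guest_write = 0x00ff00ffu;           /* a 32-bit MMIO write    */
    uint16_t buffered    = (uint16_t)guest_write; /* what lands in the slot */

    /* Prints "guest wrote 0xff00ff, replay sees 0xff": the upper 16 bits
     * of the 4-byte access are silently lost when the op is buffered. */
    printf("guest wrote %#x, replay sees %#x\n",
           (unsigned)guest_write, (unsigned)buffered);
    return 0;
}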
Yeah, hopefully that's a bug in the comment. I would expect 4-byte accesses
to be possible and be handled. As for 8-byte accesses, they can certainly
happen, why not? Unlikely at start of day, but once we're in x86/64 mode
there's no reason why not.

 -- Keir

On 30/10/07 16:19, "Alex Williamson" <alex.williamson@hp.com> wrote:

> Yes, you're right, but easy to overlook, and I'm not sure how it works on
> x86. I copied the x86 code for filling in the buffered ioreq, but failed
> to notice that it attempts to store 4 bytes of data into a 2 byte field...
> The comment for the size entry in buf_ioreq could be interpreted to mean
> that only 1, 2, and 8 bytes are expected, but I definitely see 4 bytes on
> occasion. I'd guess x86 has a bug here that's simply not exposed because
> of the 16-bit code that's probably being used to initialize VGA. I also
> question the 8 byte support, which is why I skipped it in the patch below.
> Wouldn't an 8 byte MMIO access that isn't a timeoffset be possible? Keir,
> please apply this to the staging tree. Thanks,
Alex Williamson
2007-Oct-30 16:40 UTC
[Xen-ia64-devel] Re: [PATCH] Re: [Xen-devel] [PATCH] Std VGA Performance
On Tue, 2007-10-30 at 16:24 +0000, Keir Fraser wrote:
> Yeah, hopefully that's a bug in the comment. I would expect 4-byte
> accesses to be possible and be handled. As for 8-byte accesses, they can
> certainly happen, why not? Unlikely at start of day, but once we're in
> x86/64 mode there's no reason why not.

Right, and it seems that the "quadword" handling is quite specific to
timeoffset. It takes advantage of the fact that there's no address and
stuffs some of the data in there. So, I think both 4 & 8 byte buffered mmio
is likely broken right now on x86. Thanks,

	Alex

-- 
Alex Williamson                             HP Open Source & Linux Org.
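For reference, this is the encoding the ia64 patch earlier in the thread
mirrors from x86: with no real guest address to record, the 64-bit
timeoffset payload is spread across the data and addr fields of two
consecutive slots. The struct and helper names below are illustrative, not
the actual Xen or QEMU code.

#include <stdint.h>

struct slot {
    uint16_t data;   /* 16-bit data field of the 6-byte slot              */
    uint32_t addr;   /* really a 20-bit address field; uint32_t for clarity */
};

/* Pack a 64-bit value into two slots, as the quadword (size==3) case does. */
void encode_qword(uint64_t val, struct slot s[2])
{
    s[0].data = (uint16_t)val;           /* bits  0..15                    */
    s[0].addr = (val >> 16) & 0xfffff;   /* bits 16..35                    */
    s[1].data = (uint16_t)(val >> 32);   /* bits 32..47                    */
    s[1].addr = (val >> 48) & 0xffff;    /* bits 48..63                    */
}

/* Reassemble on the consumer side; bits 32..35 stored in s[0].addr are
 * redundant and ignored here. */
uint64_t decode_qword(const struct slot s[2])
{
    return  (uint64_t)s[0].data
          | ((uint64_t)(s[0].addr & 0xffff) << 16)
          | ((uint64_t)s[1].data << 32)
          | ((uint64_t)s[1].addr << 48);
}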
On 30/10/07 16:40, "Alex Williamson" <alex.williamson@hp.com> wrote:

> On Tue, 2007-10-30 at 16:24 +0000, Keir Fraser wrote:
>> Yeah, hopefully that's a bug in the comment. I would expect 4-byte
>> accesses to be possible and be handled. As for 8-byte accesses, they can
>> certainly happen, why not? Unlikely at start of day, but once we're in
>> x86/64 mode there's no reason why not.
>
> Right, and it seems that the "quadword" handling is quite specific to
> timeoffset. It takes advantage of the fact that there's no address and
> stuffs some of the data in there. So, I think both 4 & 8 byte buffered
> mmio is likely broken right now on x86. Thanks,

I guess we'll see how testing goes over the next little while. If the
bufioreq changes prove to be broken we can back them out before 3.2.0, or
better yet fix them ;-).

 -- Keir
Robert Phillips
2007-Oct-31 19:28 UTC
Re: [PATCH] Re: [Xen-devel] [PATCH] Std VGA Performance
Alex is correct; there is a bug with size=32 (i.e. 4-byte) operations. The
fix is simple. We'll submit an updated patch very soon.

-- rsp

On 10/30/07, Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote:
>
> I guess we'll see how testing goes over the next little while. If the
> bufioreq changes prove to be broken we can back them out before 3.2.0, or
> better yet fix them ;-).
>
> -- Keir
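One obvious shape for such a fix, sketched here purely for illustration:
widen the slot's data field to a full 32 bits so 4-byte ops no longer lose
their upper half, accept that each slot grows to 8 bytes (with the usual
bit-field packing), and shrink the slot count so the page still holds the
two ring cursors. Whether this matches the follow-up patch that was actually
submitted is not shown in this thread.

#include <stdint.h>

/* Hypothetical widened slot; names and widths chosen to mirror the fields
 * discussed above, not copied from any particular ioreq.h revision. */
struct buf_ioreq_fixed {
    uint8_t  type;      /* I/O request type                               */
    uint8_t  pad:1;
    uint8_t  dir:1;     /* 1=read, 0=write                                */
    uint8_t  size:2;    /* 0=>1, 1=>2, 2=>4, 3=>8; if 3, use two slots    */
    uint32_t addr:20;   /* physical address / port                        */
    uint32_t data;      /* full 32 bits, so 4-byte ops fit                */
};

/* 511 * 8 + 2 * 4 = 4096: the ring still occupies exactly one page. */
#define IOREQ_BUFFER_SLOT_NUM_FIXED 511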