This patch series is the result of some analysis to improve the boot performance of HVM guests for XenServer. Tests were performed with Win7 as the HVM guest, but the fixes apply to all HVM guests.

All improvements come from reducing the number of PIO traps from rombios, through Xen, to Qemu. Some of the traps are completely needless; others can be avoided by knowing how Qemu is going to react.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper
2012-Jul-30 19:47 UTC
[PATCH 1 of 5] rombios/keyboard: Don't needlessly poll the status register
_______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Andrew Cooper
2012-Jul-30 19:47 UTC
[PATCH 2 of 5] rombios/ata: Do not wait for BSY to be set
Andrew Cooper
2012-Jul-30 19:47 UTC
[PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects
Andrew Cooper
2012-Jul-30 19:47 UTC
[PATCH 4 of 5] rombios/ata Remove more needless traps from the int 0x13 path
Andrew Cooper
2012-Jul-30 19:47 UTC
[PATCH 5 of 5] rombios/debug: Reduce verbosity of rombios
Pasi Kärkkäinen
2012-Jul-30 20:18 UTC
Re: [PATCH 2 of 5] rombios/ata: Do not wait for BSY to be set
On Mon, Jul 30, 2012 at 08:47:21PM +0100, Andrew Cooper wrote:
> I can't find any guarantee in the ATA specification that this will happen, and it
> certainly does not with Qemu. SeaBIOS has replaced it with a call to udelay(5)
> instead.
>
> As rombios does not have an equivalent udelay(), so replace the wait with a write
> to port 0x80 which is whilelisted by Xen for 'a small delay'.
>                       ^

a small typo probably.. I think it should say "whitelisted".

-- Pasi
Alan Cox
2012-Jul-30 21:57 UTC
Re: [PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects
On Mon, 30 Jul 2012 20:47:22 +0100 Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> So taking two traps when one will do is pointless. This codepath is on the int
> 0x13 hot path, and removing it has about a 30% reduction in the number of traps
> to Qemu during Win7 boot.

You can't read the status for 400nS after a command issue, so throwing one away is a typical way to handle that.

All of this is optimising the wrong thing.

The problem is that neither kvm nor xen have the most basic prediction handlers in the kernel side exception code so keep hitting qemu.

For 99% of the ATA transfers you can predict the next few ins and outs and pre-load them into your trap handler, avoiding bouncing into qemu. On a miss you go back into qemu and load the next prediction block (or tree even).

That's the kind of optimisation that will really make it fly.

Alan
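A minimal sketch of the prediction idea described in the message above, under stated assumptions: the device model hands the trap handler a short block of expected port accesses, the handler satisfies matching accesses itself, and any mismatch counts as a miss that falls back to the device model. Every name here (`pio_pred`, `pio_predictor`, `pred_load`, `pred_handle`) is hypothetical and is not taken from Xen, KVM or Qemu. The sketch also ignores the question of how writes absorbed by the handler would later be replayed to the device model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One predicted port access (hypothetical structure, for illustration only). */
struct pio_pred {
    uint16_t port;      /* I/O port the guest is expected to touch next */
    bool     is_write;  /* true for OUT, false for IN */
    uint8_t  value;     /* IN: value to return; OUT: value expected from the guest */
};

#define PRED_MAX 8

struct pio_predictor {
    struct pio_pred block[PRED_MAX];
    size_t   len;       /* number of valid predictions in block[] */
    size_t   next;      /* index of the next expected access */
    unsigned hits;      /* accesses satisfied without the device model */
    unsigned misses;    /* accesses that had to go to the device model */
};

/* Called after a miss has been serviced: install a fresh prediction block. */
static void pred_load(struct pio_predictor *p,
                      const struct pio_pred *blk, size_t n)
{
    assert(n <= PRED_MAX);
    for (size_t i = 0; i < n; i++)
        p->block[i] = blk[i];
    p->len = n;
    p->next = 0;
}

/*
 * Trap handler entry point. Returns true when the access matched the next
 * prediction; for an IN, *value is filled with the predicted byte. Returns
 * false on a miss, in which case the remaining predictions are discarded
 * and the caller is expected to forward the access to the device model.
 */
static bool pred_handle(struct pio_predictor *p, uint16_t port,
                        bool is_write, uint8_t *value)
{
    if (p->next < p->len) {
        const struct pio_pred *e = &p->block[p->next];

        if (e->port == port && e->is_write == is_write &&
            (!is_write || e->value == *value)) {
            if (!is_write)
                *value = e->value;
            p->next++;
            p->hits++;
            return true;
        }
    }

    /* Miss: the prediction block is stale, drop it. */
    p->len = 0;
    p->next = 0;
    p->misses++;
    return false;
}
```

A caller would zero-initialise a `struct pio_predictor`, call `pred_load()` whenever the device model returns from a miss, and call `pred_handle()` on every trapped access. The `hits` and `misses` counters give a rough measure of how many round trips to the device model were avoided.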
Andrew Cooper
2012-Jul-31 09:37 UTC
Re: [PATCH 2 of 5] (V2) rombios/ata: Do not wait for BSY to be set
On 30/07/12 21:18, Pasi Kärkkäinen wrote:
> On Mon, Jul 30, 2012 at 08:47:21PM +0100, Andrew Cooper wrote:
>> I can't find any guarantee in the ATA specification that this will happen, and it
>> certainly does not with Qemu. SeaBIOS has replaced it with a call to udelay(5)
>> instead.
>>
>> As rombios does not have an equivalent udelay(), so replace the wait with a write
>> to port 0x80 which is whilelisted by Xen for 'a small delay'.
>>                       ^
> a small typo probably.. I think it should say "whitelisted".
>
> -- Pasi
>

D'oh - yes. Attached is version 2 which corrects this, and another grammatical error in the sentence.

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com

[Attachment: rombios-ata_reset.patch]

# HG changeset patch
# Parent 2c92985dc53fbc62d1a2975aed968a8bb021c8ef
rombios/ata: Do not wait for BSY to be set

I can't find any guarantee in the ATA specification that this will happen, and it
certainly does not with Qemu. SeaBIOS has replaced it with a call to udelay(5)
instead.

As rombios does not have an equivalent udelay(), replace the wait with a write
to port 0x80 which is whitelisted by Xen for 'a small delay'.

This causes roughly 42k fewer traps to Qemu, which is very roughly 10% of the
number of traps during boot of a Win7 guest.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

diff -r 2c92985dc53f tools/firmware/rombios/rombios.c
--- a/tools/firmware/rombios/rombios.c
+++ b/tools/firmware/rombios/rombios.c
@@ -2914,8 +2914,8 @@ Bit16u device;
 // 8.2.1 (a) -- set SRST in DC
   outb(iobase2+ATA_CB_DC, ATA_CB_DC_HD15 | ATA_CB_DC_NIEN | ATA_CB_DC_SRST);

-// 8.2.1 (b) -- wait for BSY
-  await_ide(BSY, iobase1, 20);
+// 8.2.1 (b) -- wait
+  outb(0x80, 0x00);

 // 8.2.1 (f) -- clear SRST
   outb(iobase2+ATA_CB_DC, ATA_CB_DC_HD15 | ATA_CB_DC_NIEN);
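To illustrate why the change above saves traps, here is a toy model, not the rombios code. It assumes that every status-register read is forwarded to the device model, that the emulated device never reports BSY after SRST is set (as the commit message says of Qemu), and that a write to port 0x80 is not forwarded to the device model. The names `trap_counts`, `wait_for_bsy` and `small_delay` are hypothetical, and `max_polls` stands in for whatever bound the real polling loop applies.

```c
#include <assert.h>
#include <stdint.h>

#define STATUS_BSY 0x80u   /* BSY is bit 7 of the ATA status register */

/* Counters for the toy model (hypothetical, for illustration only). */
struct trap_counts {
    unsigned to_qemu;        /* accesses forwarded to the device model */
    unsigned not_forwarded;  /* accesses assumed to be handled without it */
};

/* Toy status read: always forwarded, and BSY is never observed set. */
static uint8_t model_inb_status(struct trap_counts *c)
{
    c->to_qemu++;
    return 0x00;
}

/* Toy port 0x80 write: assumed not to reach the device model. */
static void model_outb_port80(struct trap_counts *c)
{
    c->not_forwarded++;
}

/*
 * Old behaviour, modelled: poll the status register until BSY is seen or
 * max_polls reads have been made. Returns 1 if BSY was seen, 0 otherwise.
 */
static int wait_for_bsy(struct trap_counts *c, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if (model_inb_status(c) & STATUS_BSY)
            return 1;
    }
    return 0;
}

/* New behaviour, modelled: a single delay write replaces the poll loop. */
static void small_delay(struct trap_counts *c)
{
    model_outb_port80(c);
}
```

In this model a poll that never succeeds costs `max_polls` round trips to the device model per reset, while the delay write costs none, which is the shape of the saving the commit message reports.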
David Vrabel
2012-Jul-31 10:08 UTC
Re: [PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects
On 30/07/12 22:57, Alan Cox wrote:
> On Mon, 30 Jul 2012 20:47:22 +0100
> Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
>> So taking two traps when one will do is pointless. This codepath is on the int
>> 0x13 hot path, and removing it has about a 30% reduction in the number of traps
>> to Qemu during Win7 boot.
>
> You can't read the status for 400nS after a command issue, so throwing
> one away is a typical way to handle that.

This is only relevant when talking to real hardware; the qemu model has no such requirement. Also, I think you mean 400 ns not 400 nanosiemens.

> All of this is optimising the wrong thing.
>
> The problem is that neither kvm nor xen have the most basic prediction
> handlers in the kernel side exception code so keep hitting qemu.

I'd be interested in seeing how you think this will work without knowledge of the emulated device in the hypervisor. How does the predictor know whether accesses have side effects?

A better solution would be to avoid most I/O accesses by the BIOS by using PV drivers instead.

David
Andrew Cooper
2012-Jul-31 10:28 UTC
Re: [PATCH 3 of 5] rombios/ata: Reading this status register has no relevant side effects
On 30/07/12 22:57, Alan Cox wrote:
> On Mon, 30 Jul 2012 20:47:22 +0100
> Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>
>> So taking two traps when one will do is pointless. This codepath is on the int
>> 0x13 hot path, and removing it has about a 30% reduction in the number of traps
>> to Qemu during Win7 boot.
> You can't read the status for 400nS after a command issue, so throwing
> one away is a typical way to handle that.

That is true on real hardware, but virtual hardware has no such restriction. This version of rombios is never going to be running on real hardware.

>
> All of this is optimising the wrong thing.
>
> The problem is that neither kvm nor xen have the most basic prediction
> handlers in the kernel side exception code so keep hitting qemu.
>
> For 99% of the ATA transfers you can predict the next few ins and outs
> and pre-load them into your trap handler, avoiding bouncing into qemu. On
> a miss you go back into qemu and load the next prediction block (or tree
> even).
>
> That's the kind of optimisation that will really make it fly.

Quite likely, but that is a substantial architectural change, whereas these patches are the result of a few hours of work and many hours of testing.

>
> Alan
>

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com