I plan to tag -rc1 later this week. If you have any outstanding patches,
please send them to the list now.

 -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jan Beulich
2010-Jan-05 08:56 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate
>>> Keir Fraser <keir.fraser@eu.citrix.com> 05.01.10 07:42 >>>
> I plan to tag -rc1 later this week. If you have any outstanding patches,
> please send them to the list now.

While I have a kernel-side draft patch implementing a replacement privcmd
mmap-batch with proper error indication close to ready, I don't have the
libxc and qemu ones even started yet. To fully finish up the kernel side
I wanted to wait for your and possibly others' opinions on the lifted
single-shot mapping I suggested earlier today.

Plus, the way errors are to be propagated may be controversial: other
than originally planned, after the paging patches went in, using a simple
bit field won't do anymore, as we now need at least two bits to indicate
all possible states. Right now I'm simply using an array of ints
(returning the actual error codes):

typedef struct privcmd_mmap_batch {
    unsigned int num;              /* number of pages to populate */
    domid_t dom;                   /* target domain */
    __u64 addr;                    /* virtual address */
    const xen_pfn_t __user *arr;   /* array of mfns */
    int __user *err;               /* array of error codes */
} privcmd_mmap_batch_t;

but that could be considered overkill. A non-extensible alternative would
be two bit fields (one for error indications, the other for paged-out
pages), and another possibility would be to at least use __s16 instead of
int for the array, to reduce the virtual address space needed. A third
possibility, helping in those cases where the caller doesn't need the MFN
array for anything other than passing it to the ioctl, could be to
explicitly allow the two pointers to hold the same address (i.e.
documenting that the output will never overwrite unconsumed input).

In any case I'm of the opinion that the tools' limitations with the old
ioctl should be eliminated before 4.0 gets released - I had hoped that
someone with better knowledge of the tools than I have would approach
this, but since no one showed up I'll try to.
Jan
Keir Fraser
2010-Jan-05 09:06 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate
On 05/01/2010 08:56, "Jan Beulich" <JBeulich@novell.com> wrote:

>>>> Keir Fraser <keir.fraser@eu.citrix.com> 05.01.10 07:42 >>>
>> I plan to tag -rc1 later this week. If you have any outstanding patches,
>> please send them to the list now.
>
> While I have a kernel-side draft patch implementing a replacement
> privcmd mmap-batch with proper error indication close to ready, I
> don't have the libxc and qemu ones even started yet. To fully
> finish up the kernel side I wanted to wait for your and possibly
> others' opinions on the lifted single-shot mapping I suggested
> earlier today.

That sounded okay to me.

> In any case I'm of the opinion that the tools' limitations with the old
> ioctl should be eliminated before 4.0 gets released - I had hoped
> that someone with better knowledge of the tools than I have
> would approach this, but since no one showed up I'll try to.

It's a bit late for 4.0.0, really. The lack of interest probably reflects
the lack of people hitting the 43-bit limitation built into the current
interface; just about no one is anywhere near it.

 -- Keir
Alex Williamson
2010-Jan-05 16:00 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate
On Tue, Jan 5, 2010 at 2:06 AM, Keir Fraser <keir.fraser@eu.citrix.com> wrote:
> On 05/01/2010 08:56, "Jan Beulich" <JBeulich@novell.com> wrote:
>
>> In any case I'm of the opinion that the tools' limitations with the old
>> ioctl should be eliminated before 4.0 gets released - I had hoped
>> that someone with better knowledge of the tools than I have
>> would approach this, but since no one showed up I'll try to.
>
> It's a bit late for 4.0.0, really. The lack of interest is probably the
> lack of people hitting the 43-bit limitation built into the current
> interface. Just about no one is anywhere near it.

Perhaps that's due to the lack of x86 processors on the market today that
support more than 40 bits of physical address space. But we know that's
going to change fairly soon, and x86 will finally get support for more
than 1TB. That opens the door for hardware vendors to create interesting
configurations, and perhaps not worry so much about compressing the
address space into a contiguous block. If Xen can't support at least a
44-bit physical address space within the 4.x lifetime, it could become a
serious limiting factor. It seems shortsighted not to prepare for it now,
especially given the opportunity we have at a major version break.

Thanks,

Alex
Keir Fraser
2010-Jan-05 16:06 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate
On 05/01/2010 16:00, "Alex Williamson" <alex.williamson@hp.com> wrote:

>> It's a bit late for 4.0.0, really. The lack of interest is probably the
>> lack of people hitting the 43-bit limitation built into the current
>> interface. Just about no one is anywhere near it.
>
> Perhaps due to the lack of x86 processors supporting more than 40 bits
> of physical address space that are currently on the market. But we
> know that's going to change fairly shortly and x86 will finally get
> support for more than 1TB. That opens the doors for hardware vendors
> to create interesting configurations and maybe not worry so much about
> compressing the address space into a contiguous block. If Xen can't
> support at least a 44-bit physical address space within the 4.x
> lifetime, it could become a serious limiting factor. It seems rather
> shortsighted not to prepare for it now, especially given the
> opportunity we have at a major version break. Thanks,

If it's considered important we can hold up until next week.

 -- Keir
Pasi Kärkkäinen
2010-Jan-06 13:50 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> I plan to tag -rc1 later this week. If you have any outstanding patches,
> please send them to the list now.
>

Hmm.. I just remembered this pygrub bug:
https://bugzilla.redhat.com/show_bug.cgi?id=466681

pygrub doesn't use O_DIRECT, so sometimes it gets old information from
the dom0 kernel cache - and fails to use the updated domU grub.conf.

Red Hat seems to have patches available for testing.. though not for
xen-unstable.

I've personally hit this bug many times.

 -- Pasi
Christian Tramnitz
2010-Jan-06 15:00 UTC
[Xen-devel] Re: Tagging Xen 4.0.0 first release candidate
Was this a little too late for the 4.0 discussion, or was simply no one
interested?

http://permalink.gmane.org/gmane.comp.emulators.xen.devel/75902

Best regards,
Christian
Ian Pratt
2010-Jan-06 21:42 UTC
RE: [Xen-devel] Re: Tagging Xen 4.0.0 first release candidate
> Was this a little too late for the 4.0 discussion, or was simply no one
> interested?
> http://permalink.gmane.org/gmane.comp.emulators.xen.devel/75902

XCP contains a small ISO image with the Citrix PV drivers on it. I
believe they've been made to work on xen-unstable; however, the binaries
are freely distributable but not open source.

Ian
Pasi Kärkkäinen
2010-Jan-06 21:52 UTC
Re: [Xen-devel] Re: Tagging Xen 4.0.0 first release candidate
On Wed, Jan 06, 2010 at 09:42:19PM +0000, Ian Pratt wrote:
> > Was this a little too late for the 4.0 discussion, or was simply no
> > one interested?
> > http://permalink.gmane.org/gmane.comp.emulators.xen.devel/75902
>
> XCP contains a small ISO image with the Citrix PV drivers on it. I
> believe they've been made to work on xen-unstable; however, the
> binaries are freely distributable but not open source.
>

xen-3.4-testing.hg also has XCP Windows PV driver support in it, so the
upcoming Xen 3.4.3 will support them as well.

 -- Pasi
Pasi Kärkkäinen
2010-Jan-07 11:51 UTC
Re: [Xen-devel] Re: Tagging Xen 4.0.0 first release candidate
On Wed, Jan 06, 2010 at 04:00:34PM +0100, Christian Tramnitz wrote:
> Was this a little too late for the 4.0 discussion, or was simply no one
> interested?
> http://permalink.gmane.org/gmane.comp.emulators.xen.devel/75902
>

What exactly do you mean by "bundling" the GPLPV drivers with a Xen
release? What would it help with? Xen is distributed as a source tarball
anyway, and the GPLPV drivers are already available as binaries from the
author's website.

I think it's more flexible when the GPLPV drivers are distributed as a
separate package, and not as part of Xen.

 -- Pasi
Pasi Kärkkäinen
2010-Jan-19 13:57 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate, 44-bit address space support
On Tue, Jan 05, 2010 at 04:06:50PM +0000, Keir Fraser wrote:
> On 05/01/2010 16:00, "Alex Williamson" <alex.williamson@hp.com> wrote:
>
> >> It's a bit late for 4.0.0, really. The lack of interest is probably
> >> the lack of people hitting the 43-bit limitation built into the
> >> current interface. Just about no one is anywhere near it.
> >
> > Perhaps due to the lack of x86 processors supporting more than 40 bits
> > of physical address space that are currently on the market. But we
> > know that's going to change fairly shortly and x86 will finally get
> > support for more than 1TB. That opens the doors for hardware vendors
> > to create interesting configurations and maybe not worry so much
> > about compressing the address space into a contiguous block. If Xen
> > can't support at least a 44-bit physical address space within the 4.x
> > lifetime, it could become a serious limiting factor. It seems rather
> > shortsighted not to prepare for it now, especially given the
> > opportunity we have at a major version break. Thanks,
>
> If it's considered important we can hold up until next week.
>

Any progress with this 44-bit address space support?

 -- Pasi
Keir Fraser
2010-Jan-19 14:02 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate, 44-bit address space support
On 19/01/2010 13:57, "Pasi Kärkkäinen" <pasik@iki.fi> wrote:

>>> [...]
>>
>> If it's considered important we can hold up until next week.
>
> Any progress with this 44-bit address space support?

Yes, it's checked in now, ahead of 4.0.0-rc2.

 -- Keir
Pasi Kärkkäinen
2010-Jan-19 14:22 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate, 44-bit address space support
On Tue, Jan 19, 2010 at 02:02:46PM +0000, Keir Fraser wrote:
> On 19/01/2010 13:57, "Pasi Kärkkäinen" <pasik@iki.fi> wrote:
>
> >>> [...]
> >>
> >> If it's considered important we can hold up until next week.
> >
> > Any progress with this 44-bit address space support?
>
> Yes, it's checked in now, ahead of 4.0.0-rc2.
>

Oh, nice - I missed that. Thanks!

 -- Pasi
Pasi Kärkkäinen
2010-Jan-21 12:28 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
On Wed, Jan 06, 2010 at 03:50:05PM +0200, Pasi Kärkkäinen wrote:
> On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> > I plan to tag -rc1 later this week. If you have any outstanding
> > patches, please send them to the list now.
>
> Hmm.. I just remembered this pygrub bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=466681
>
> pygrub doesn't use O_DIRECT, so sometimes it gets old information from
> the dom0 kernel cache - and fails to use the updated domU grub.conf.
>
> I've personally hit this bug many times.
>

It seems the Red Hat guys have a fix available: they fixed the problem by
patching the dom0 kernel's blkback. More details about the fix here:
https://bugzilla.redhat.com/show_bug.cgi?id=466681

Should this be applied to the 2.6.18-xen and pv_ops dom0 kernels as well?

 -- Pasi
Jan Beulich
2010-Jan-21 15:39 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
>>> Pasi Kärkkäinen <pasik@iki.fi> 21.01.10 13:28 >>>
> It seems the Red Hat guys have a fix available: they fixed the problem
> by patching the dom0 kernel's blkback.
>
> More details about the fix here:
> https://bugzilla.redhat.com/show_bug.cgi?id=466681
>
> Should this be applied to the 2.6.18-xen and pv_ops dom0 kernels as
> well?

Yes, please.

Jan
Daniel Stodden
2010-Jan-21 18:44 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
On Thu, 2010-01-21 at 07:28 -0500, Pasi Kärkkäinen wrote:
> It seems the Red Hat guys have a fix available: they fixed the problem
> by patching the dom0 kernel's blkback.
>
> More details about the fix here:
> https://bugzilla.redhat.com/show_bug.cgi?id=466681
>
> Should this be applied to the 2.6.18-xen and pv_ops dom0 kernels as
> well?

Only to 2.6.18. It's obsolete after 2.6.27: O_DIRECT gained page cache
invalidation in the meantime.

Cheers,
Daniel
Daniel Stodden
2010-Jan-21 19:16 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
On Thu, 2010-01-21 at 13:44 -0500, Daniel Stodden wrote:
> On Thu, 2010-01-21 at 07:28 -0500, Pasi Kärkkäinen wrote:
> > Should this be applied to the 2.6.18-xen and pv_ops dom0 kernels as
> > well?
>
> Only to 2.6.18. It's obsolete after 2.6.27: O_DIRECT gained page cache
> invalidation in the meantime.

Aiiee, sorry - I guess that one only applies to tapdisks. The page cache
invalidation only covers the filemap; it obviously won't fix blkback bios
on raw devices.

Ian Campbell recently noted he came across a different fix, which adds
direct I/O to e2fsprogs:

http://www.spinics.net/lists/linux-ext4/msg16992.html

Any opinions on the tradeoff?

Daniel
Ian Campbell
2010-Jan-21 19:37 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
On Thu, 2010-01-21 at 19:16 +0000, Daniel Stodden wrote:
> Aiiee, sorry - I guess that one only applies to tapdisks. The page
> cache invalidation only covers the filemap; it obviously won't fix
> blkback bios on raw devices.
>
> Ian Campbell recently noted he came across a different fix, which adds
> direct I/O to e2fsprogs:
>
> http://www.spinics.net/lists/linux-ext4/msg16992.html

I noted the thread because the root problem seemed interesting and worthy
of investigation, but I should have made it clear that I didn't think
messing with direct I/O in e2fsprogs was the correct solution. I think
the majority of the participants in the thread thought that too.

The biggest problem is that it only solves the issue in the one specific
case of things which use e2fsprogs, and not in general; we can't go round
adding O_DIRECT to everything which might be used to access these disks.

Ian.
Pasi Kärkkäinen
2010-Jan-21 21:01 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
On Thu, Jan 21, 2010 at 07:37:27PM +0000, Ian Campbell wrote:
> On Thu, 2010-01-21 at 19:16 +0000, Daniel Stodden wrote:
> > Ian Campbell recently noted he came across a different fix, which
> > adds direct I/O to e2fsprogs.
>
> I noted the thread because the root problem seemed interesting and
> worthy of investigation, but I should have made it clear that I didn't
> think messing with direct I/O in e2fsprogs was the correct solution. I
> think the majority of the participants in the thread thought that too.
> The biggest problem is that it only solves the issue in the one
> specific case of things which use e2fsprogs, and not in general; we
> can't go round adding O_DIRECT to everything which might be used to
> access these disks.
>

Yeah, it should be fixed in blkback.. who knows, some users might be
using other tools in dom0 as well, not just pygrub.

 -- Pasi
Daniel Stodden
2010-Jan-21 21:53 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
On Thu, 2010-01-21 at 16:01 -0500, Pasi Kärkkäinen wrote:
> On Thu, Jan 21, 2010 at 07:37:27PM +0000, Ian Campbell wrote:
> > The biggest problem is that it only solves the issue in the one
> > specific case of things which use e2fsprogs, and not in general; we
> > can't go round adding O_DIRECT to everything which might be used to
> > access these disks.
>
> Yeah, it should be fixed in blkback.. who knows, some users might be
> using other tools in dom0 as well, not just pygrub.

Fully agreed.

But one thing about the RHEL patch isn't immediately clear to me: the
invalidate step apparently goes into the VBD creation. I don't see why
that is sufficient - my understanding was that pygrub would read stale
data after boot, then run, then shutdown, then reboot. Which rather
suggests flushing during shutdown(?). Or rather on both ends, because
1) installing a guest by copying a VDI image, 2) failing to properly
close the raw device so the caches get flushed, before 3) booting the VM
is another potential problem.

We used to see the latter become an issue in the past.

Thanks,
Daniel
Pasi Kärkkäinen
2010-Jan-27 09:27 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
On Thu, Jan 21, 2010 at 01:53:21PM -0800, Daniel Stodden wrote:
> Fully agreed.
>
> But one thing about the RHEL patch isn't immediately clear to me: the
> invalidate step apparently goes into the VBD creation.
>

With the Red Hat blkback kernel patch/fix:

1) xm create domU
2) pygrub runs, caching stuff in the dom0 kernel cache
3) the domU is started; the patched blkback driver flushes the dom0
   kernel cache when the disk backend is created
4) grub.conf is modified in the guest
5) the domU shuts down
6) xm create domU
7) pygrub runs and gets the new, updated grub.conf, since there's
   nothing in the dom0 kernel cache - it was flushed in 3)
8) the domU is started; blkback again flushes the dom0 kernel cache to
   prevent future problems

That's how I understood it..

> I don't see why this is sufficient; my understanding was that pygrub
> would rather read stale data after boot, then run, then shutdown, then
> reboot.
>
> Which rather suggests flushing during shutdown (?).
>

Disk I/O from the domU blkfront is not cached in dom0, so it's enough to
flush during disk backend creation? pygrub is the only player here who
gets stuff into the dom0 cache.

> Or rather on both ends, because 1) installing a guest by copying a VDI
> image, 2) failing to properly close the raw device to get the caches
> flushed, before 3) booting the VM is another potential problem.
>
> We used to see the latter become an issue in the past.
>

I guess it wouldn't hurt to also flush during shutdown.. ?

 -- Pasi
Daniel Stodden
2010-Jan-28 19:34 UTC
Re: [Xen-devel] Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
On Wed, 2010-01-27 at 04:27 -0500, Pasi Kärkkäinen wrote:
> On Thu, Jan 21, 2010 at 01:53:21PM -0800, Daniel Stodden wrote:
> > On Thu, 2010-01-21 at 16:01 -0500, Pasi Kärkkäinen wrote:
> > > On Thu, Jan 21, 2010 at 07:37:27PM +0000, Ian Campbell wrote:
> > > > On Thu, 2010-01-21 at 19:16 +0000, Daniel Stodden wrote:
> > > > > On Thu, 2010-01-21 at 13:44 -0500, Daniel Stodden wrote:
> > > > > > On Thu, 2010-01-21 at 07:28 -0500, Pasi Kärkkäinen wrote:
> > > > > > > On Wed, Jan 06, 2010 at 03:50:05PM +0200, Pasi Kärkkäinen wrote:
> > > > > > > > On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> > > > > > > > > I plan to tag -rc1 later this week. If you have any
> > > > > > > > > outstanding patches, please send them to the list now.
> > > > > > > >
> > > > > > > > Hmm.. I just remembered this pygrub bug:
> > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > > > > > > >
> > > > > > > > pygrub doesn't use O_DIRECT so sometimes it gets old
> > > > > > > > information from the dom0 kernel cache - and fails to use
> > > > > > > > the updated domU grub.conf.
> > > > > > > >
> > > > > > > > Redhat seems to have patches available for testing.. not
> > > > > > > > for xen-unstable though.
> > > > > > > >
> > > > > > > > I've personally hit this bug many times.
> > > > > > >
> > > > > > > It seems the Redhat guys have a fix available.. they fixed
> > > > > > > the problem by patching the dom0 kernel blkback.
> > > > > > >
> > > > > > > More details about the fix here:
> > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > > > > > >
> > > > > > > Should this be applied to the 2.6.18-xen and pv_ops dom0
> > > > > > > kernels as well?
> > > > > >
> > > > > > Only to 2.6.18.
> > > > > >
> > > > > > It's obsolete after 2.6.27.
> > > > > > O_DIRECT gained page cache invalidation in the meantime.
> > > > >
> > > > > Aiiee, sorry. I guess this one only applies to tapdisks. The
> > > > > page cache invalidation only covers the filemap. That obviously
> > > > > won't fix blkback bios on raw devices.
> > > > >
> > > > > Ian Campbell recently noted he came across a different fix,
> > > > > which adds direct-io to e2fsprogs.
> > > >
> > > > I noted the thread because the root problem seemed interesting and
> > > > worthy of investigation, but I should have made it clear that I
> > > > didn't think messing with direct-io in e2fsprogs was the correct
> > > > solution. I think the majority of the participants in the thread
> > > > thought that too. The biggest problem is that it only solves the
> > > > issue in the one specific case of things which use e2fsprogs and
> > > > not in general; we can't go round adding O_DIRECT to everything
> > > > which might be used to access these disks.
> > >
> > > Yeah, it should be fixed in blkback.. who knows, some users might
> > > be using other tools in dom0 as well, not just pygrub.
> >
> > Fully agreed.
> >
> > But: One thing about the rhel patch isn't immediately clear to me.
> > The invalidate step apparently goes into the VBD creation.
>
> With the RH kernel blkback patch/fix:
>
> 1) xm create domU
> 2) pygrub runs, caching stuff in the dom0 kernel cache
> 3) domU is started; the patched blkback driver flushes the dom0 kernel
>    cache when the disk backend is created
> 4) grub.conf is modified in the guest
> 5) domU shuts down
> 6) xm create domU
> 7) pygrub runs and gets the updated grub.conf, since there's nothing in
>    the dom0 kernel cache - it was flushed in 3)
> 8) domU is started; blkback again flushes the dom0 kernel cache to
>    prevent future problems
>
> That's how I understood it..
>
> > I don't see why this is sufficient. My understanding was that pygrub
> > would rather read stale data after boot, then run, then shutdown,
> > then reboot.
> >
> > Which rather suggests flushing during shutdown (?).
>
> disk IO from the domU blkfront is not cached in dom0, so it's enough
> to flush during the disk backend creation?

Yes.
By flush I meant discarding the cache entries left from 2), not some
writeback. You're right, it doesn't matter as long as the disk is not
buffered somewhere while still opened by the backend.

> pygrub is the only player here who gets stuff in the dom0 cache.

> > Or rather on both ends. Because 1) installing a guest by copying a
> > VDI image, 2) failing to properly close the raw device to get the
> > caches flushed before 3) booting the VM is another potential problem.
> >
> > We used to see the latter becoming an issue in the past.
>
> I guess it wouldn't hurt to also flush during shutdown.. ?

I don't really mind. Xenserver accesses the disks only by attaching them
to dom0; right now that's actually my preferred alternative.

The blkback thing can hardly hurt, so let's pull it in.

I'd mainly wonder whether a little barrier utility for the control stack
wouldn't be a way more flexible solution. So instead of patching some
particular backend to fix the world of e2fsprogs, or (worse) patching
e2fsprogs itself, let pygrub call some program which in turn does some
ioctl(barrier) magic on the node to discard potentially stale mappings.

Daniel
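For the file-backed case, the "little barrier utility" idea can be sketched with posix_fadvise(POSIX_FADV_DONTNEED), which asks the kernel to drop its cached pages for a regular file. The helper name below is hypothetical, and a raw block device would need the BLKFLSBUF ioctl rather than fadvise:

```python
import os

def barrier(image_path):
    """Drop the kernel's cached pages for a file-backed disk image so
    the next read (e.g. by pygrub) goes to the backing store.

    Hypothetical pre-pygrub 'barrier' step; only meaningful for regular
    files, since fadvise does not apply to raw block devices.
    """
    fd = os.open(image_path, os.O_RDONLY)
    try:
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

The control stack (or pygrub itself) would invoke such a helper on the VBD's backing image right before parsing grub.conf, instead of relying on any particular backend to have flushed the cache earlier.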