Ian Campbell
2009-Dec-02 19:28 UTC
[Xen-devel] [PATCH] xen: reduce severity of message about using v1 grant tables.
From: Ian Campbell <ijc@hellion.org.uk> It''s hardly the end of the world... Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Steven Smith <Steven.Smith@eu.citrix.com> --- drivers/xen/grant-table.c | 19 ++++++++++--------- 1 files changed, 10 insertions(+), 9 deletions(-) diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c index 2240adf..85ce951 100644 --- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -699,21 +699,22 @@ static void gnttab_request_version(void) gsv.version = 2; rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1); - if (rc == 0) { + if (rc == 0) grant_table_version = 2; - printk(KERN_NOTICE "Using V2 grant tables.\n"); - } else { + else { if (grant_table_version == 2) { - /* If we''ve already used version 2 features, - but then suddenly discover that they''re not - available (e.g. migrating to an older - version of Xen), almost unbounded badness - can happen. */ + /* + * If we''ve already used version 2 features, + * but then suddenly discover that they''re not + * available (e.g. migrating to an older + * version of Xen), almost unbounded badness + * can happen. + */ panic("we need grant tables version 2, but only version 1 is available"); } grant_table_version = 1; - printk(KERN_WARNING "Using legacy V1 grant tables; upgrade to a newer version of Xen.\n"); } + printk(KERN_INFO "Grant tables using version %d layout.\n", grant_table_version); } static int gnttab_map(unsigned int start_idx, unsigned int end_idx) -- 1.6.5.3 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Dec-02 19:33 UTC
[Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
On 12/02/09 11:28, Ian Campbell wrote:> From: Ian Campbell <ijc@hellion.org.uk> > > It''s hardly the end of the world... > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > Cc: Jeremy Fitzhardinge <jeremy@goop.org> > Cc: Steven Smith <Steven.Smith@eu.citrix.com> > --- > drivers/xen/grant-table.c | 19 ++++++++++--------- > 1 files changed, 10 insertions(+), 9 deletions(-) > > diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c > index 2240adf..85ce951 100644 > --- a/drivers/xen/grant-table.c > +++ b/drivers/xen/grant-table.c > @@ -699,21 +699,22 @@ static void gnttab_request_version(void) > > gsv.version = 2; > rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1); > - if (rc == 0) { > + if (rc == 0) > grant_table_version = 2; > - printk(KERN_NOTICE "Using V2 grant tables.\n"); > - } else { > + else { > if (grant_table_version == 2) { > - /* If we''ve already used version 2 features, > - but then suddenly discover that they''re not > - available (e.g. migrating to an older > - version of Xen), almost unbounded badness > - can happen. */ > + /* > + * If we''ve already used version 2 features, > + * but then suddenly discover that they''re not > + * available (e.g. migrating to an older > + * version of Xen), almost unbounded badness > + * can happen. > + */ > panic("we need grant tables version 2, but only version 1 is available"); >Does it really need to be a panic? Can''t we just start failing all future operations? Seems bad to take out the whole machine if we can just get away with crippling one device (especially if it can be recovered by downing it and re-upping a new one with nc1 and/or gt1). J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2009-Dec-02 19:54 UTC
[Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
On Wed, 2009-12-02 at 19:33 +0000, Jeremy Fitzhardinge wrote:> On 12/02/09 11:28, Ian Campbell wrote: > > From: Ian Campbell <ijc@hellion.org.uk> > > > > It''s hardly the end of the world... > > > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > > Cc: Jeremy Fitzhardinge <jeremy@goop.org> > > Cc: Steven Smith <Steven.Smith@eu.citrix.com> > > --- > > drivers/xen/grant-table.c | 19 ++++++++++--------- > > 1 files changed, 10 insertions(+), 9 deletions(-) > > > > diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c > > index 2240adf..85ce951 100644 > > --- a/drivers/xen/grant-table.c > > +++ b/drivers/xen/grant-table.c > > @@ -699,21 +699,22 @@ static void gnttab_request_version(void) > > > > gsv.version = 2; > > rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1); > > - if (rc == 0) { > > + if (rc == 0) > > grant_table_version = 2; > > - printk(KERN_NOTICE "Using V2 grant tables.\n"); > > - } else { > > + else { > > if (grant_table_version == 2) { > > - /* If we''ve already used version 2 features, > > - but then suddenly discover that they''re not > > - available (e.g. migrating to an older > > - version of Xen), almost unbounded badness > > - can happen. */ > > + /* > > + * If we''ve already used version 2 features, > > + * but then suddenly discover that they''re not > > + * available (e.g. migrating to an older > > + * version of Xen), almost unbounded badness > > + * can happen. > > + */ > > panic("we need grant tables version 2, but only version 1 is available"); > > > > Does it really need to be a panic? Can''t we just start failing all > future operations? Seems bad to take out the whole machine if we can > just get away with crippling one device (especially if it can be > recovered by downing it and re-upping a new one with nc1 and/or gt1).Wouldn''t there be (failing) grant table ops on the down path? In any case doesn''t it effect all devices since they all use the same grant table? Ian.> > J_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2009-Dec-03 10:28 UTC
Re: [Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
On Wed, 2009-12-02 at 19:54 +0000, Ian Campbell wrote:> > > Does it really need to be a panic? Can''t we just start failing all > > future operations? Seems bad to take out the whole machine if we > can > > just get away with crippling one device (especially if it can be > > recovered by downing it and re-upping a new one with nc1 and/or > gt1). > > Wouldn''t there be (failing) grant table ops on the down path? > > In any case doesn''t it effect all devices since they all use the same > grant table?Oh, I see what you meant... in the proper resume case (as opposed to the cancelled suspend/checkpoint case I was thinking of) there should be no grant tables in use at this point so most devices should, in theory, be able to reconnect using v1 grants, any drivers which require v2 grant tables need to check for them in their resume hook as well as at start of day. Unfortunately frontend devices tear down their grant entries after the resume rather than before the suspend (I presume this has to do with faster checkpointing?) which means they could be trying to clear an entry of the wrong layout, leading the unbounded badness that the comment refers to. I think the choices are basically: * Always latch to either v1 or v2 at start of day, if we can''t get the version we want then panic (this is a stronger restriction than the current code which will try to upgrade to v2 on resume) * Write v1<->v2 layout transformations called on gnttab resume before the devices get a chance to try and unmap their old entries. Would need to handle v2 entries sing feature which are not expressible in v1. I''m tempted to go with the former for simplicity, it enables migration to a newer version of Xen (the guest will just keep using v1) but will not allow migration back to an older version of Xen, which is not something we generally support anyway. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Steven Smith
2009-Dec-03 12:01 UTC
Re: [Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
> Oh, I see what you meant... in the proper resume case (as opposed to the > cancelled suspend/checkpoint case I was thinking of) there should be no > grant tables in use at this point so most devices should, in theory, be > able to reconnect using v1 grants, any drivers which require v2 grant > tables need to check for them in their resume hook as well as at start > of day. > > Unfortunately frontend devices tear down their grant entries after the > resume rather than before the suspend (I presume this has to do with > faster checkpointing?) which means they could be trying to clear an > entry of the wrong layout, leading the unbounded badness that the > comment refers to.Actually, I think that''d be okay. Drivers should be clearing grant table entries through the gnttab_end_* functions, which always check grant_table_version and do the right thing. gnttab_grant_foreign_access_ref_{subpage,trans} would BUG(), but that could be turned into an error return easily enough. Handling it everywhere would be a pain, though. The only other potential subtlety is handling a suspend while some other vcpu is doing grant table operations, but I think the stop_machine() is sufficient protection against that. So I think a downgrade might actually be safe (despite what I said before). It can only happen if you migrate from a new Xen to an older one, which I didn''t think we supported anyway, but it could be made to work.> I think the choices are basically: > * Always latch to either v1 or v2 at start of day, if we can''t get > the version we want then panic (this is a stronger restriction > than the current code which will try to upgrade to v2 on resume)Yeah, that probably counts as a bug in the current code. If we boot on a Xen which only supports v1 tabled then we should probably stick with v1 until we shut down. panic()ing when v2 isn''t available sounds like overkill, though. Would something like this work? Allow downgrades from v2 grant tables to v1. Completely untested, beyond checking that it actually compiles. Signed-off-by: Steven Smith <sos22@cam.ac.uk.> diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c index fd619b4..fa971e2 100644 --- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -210,36 +210,40 @@ int gnttab_grant_foreign_access(domid_t domid, unsigned long frame, } EXPORT_SYMBOL_GPL(gnttab_grant_foreign_access); -void gnttab_grant_foreign_access_ref_subpage(grant_ref_t ref, domid_t domid, - unsigned long frame, int flags, - unsigned page_off, - unsigned length) +int gnttab_grant_foreign_access_ref_subpage(grant_ref_t ref, domid_t domid, + unsigned long frame, int flags, + unsigned page_off, + unsigned length) { BUG_ON(flags & (GTF_accept_transfer | GTF_reading | GTF_writing | GTF_sub_page | GTF_permit_access)); - BUG_ON(grant_table_version == 1); + if (grant_table_version == 1) + return -EOPNOTSUPP; shared.v2[ref].sub_page.frame = frame; shared.v2[ref].sub_page.page_off = page_off; shared.v2[ref].sub_page.length = length; shared.v2[ref].hdr.domid = domid; wmb(); shared.v2[ref].hdr.flags = GTF_permit_access | GTF_sub_page | flags; + return 0; } EXPORT_SYMBOL_GPL(gnttab_grant_foreign_access_ref_subpage); -void gnttab_grant_foreign_access_ref_trans(grant_ref_t ref, domid_t domid, - int flags, - domid_t trans_domid, - grant_ref_t trans_gref) +int gnttab_grant_foreign_access_ref_trans(grant_ref_t ref, domid_t domid, + int flags, + domid_t trans_domid, + grant_ref_t trans_gref) { BUG_ON(flags & (GTF_accept_transfer | GTF_reading | GTF_writing | GTF_sub_page | GTF_permit_access)); - BUG_ON(grant_table_version == 1); + if (grant_table_version == 1) + return -EOPNOTSUPP; shared.v2[ref].transitive.trans_domid = trans_domid; shared.v2[ref].transitive.gref = trans_gref; shared.v2[ref].hdr.domid = domid; wmb(); shared.v2[ref].hdr.flags = GTF_permit_access | GTF_transitive | flags; + return 0; } EXPORT_SYMBOL_GPL(gnttab_grant_foreign_access_ref_trans); @@ -588,25 +592,40 @@ static inline unsigned max_nr_grant_status_frames(void) return nr_status_frames(max_nr_grant_frames()); } +/* Called whenever there''s some possibility that the hypervisor + disagrees with us about what grant table version we''re using + (i.e. start of day and following a resume). */ static void gnttab_request_version(void) { int rc; struct gnttab_set_version gsv; - gsv.version = 2; - rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1); - if (rc == 0) { - grant_table_version = 2; - } else { - if (grant_table_version == 2) { - /* If we''ve already used version 2 features, - but then suddenly discover that they''re not - available (e.g. migrating to an older - version of Xen), almost unbounded badness - can happen. */ - panic("we need grant tables version 2, but only version 1 is available"); + if (grant_table_version == 0 || grant_table_version == 2) { + /* Try to get v2 grant tables. */ + gsv.version = 2; + rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1); + if (rc == 0) { + /* Success */ + grant_table_version = 2; + } else { + if (grant_table_version == 2) { + /* We''ve seen v2 features before, but + they''re no longer available. + That''s not disastrous, but it is + pretty bad. */ + printk(KERN_WARNING "Downgrading from v2 grant tables to v1; expect some device misbehaviour.\n"); + } + grant_table_version = 1; } - grant_table_version = 1; + } + if (grant_table_version == 1) { + /* This shouldn''t be necessary (Xen should default to + v1, and we''ve not asked for anything else), but do + it anyway for sanity. */ + gsv.version = 1; + /* If this fails, it probably just means the + hypervisor only supports v1, so we''re fine. */ + (void)HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1); } } diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h index cf316d7..85f2d52 100644 --- a/include/xen/grant_table.h +++ b/include/xen/grant_table.h @@ -65,14 +65,14 @@ int gnttab_resume(void); int gnttab_grant_foreign_access(domid_t domid, unsigned long frame, int flags); -void gnttab_grant_foreign_access_ref_subpage(grant_ref_t ref, domid_t domid, - unsigned long frame, int flags, - unsigned page_off, - unsigned length); -void gnttab_grant_foreign_access_ref_trans(grant_ref_t ref, domid_t domid, - int flags, - domid_t trans_domid, - grant_ref_t trans_gref); +int gnttab_grant_foreign_access_ref_subpage(grant_ref_t ref, domid_t domid, + unsigned long frame, int flags, + unsigned page_off, + unsigned length); +int gnttab_grant_foreign_access_ref_trans(grant_ref_t ref, domid_t domid, + int flags, + domid_t trans_domid, + grant_ref_t trans_gref); /* * Are sub-page grants available on this version of Xen? Returns 1 if _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2009-Dec-03 12:15 UTC
Re: [Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
On Thu, 2009-12-03 at 12:01 +0000, Steven Smith wrote:> > Oh, I see what you meant... in the proper resume case (as opposed to the > > cancelled suspend/checkpoint case I was thinking of) there should be no > > grant tables in use at this point so most devices should, in theory, be > > able to reconnect using v1 grants, any drivers which require v2 grant > > tables need to check for them in their resume hook as well as at start > > of day. > > > > Unfortunately frontend devices tear down their grant entries after the > > resume rather than before the suspend (I presume this has to do with > > faster checkpointing?) which means they could be trying to clear an > > entry of the wrong layout, leading the unbounded badness that the > > comment refers to. > Actually, I think that''d be okay. Drivers should be clearing grant > table entries through the gnttab_end_* functions, which always check > grant_table_version and do the right thing.In the case where you have gone v1->v2 then the entries would have been setup while grante_table_version==1 layout but by the time gnttab_end_* is called grant_table_version==2 and the wrong thing happens.> > > I think the choices are basically: > > * Always latch to either v1 or v2 at start of day, if we can''t get > > the version we want then panic (this is a stronger restriction > > than the current code which will try to upgrade to v2 on resume) > Yeah, that probably counts as a bug in the current code. If we boot > on a Xen which only supports v1 tabled then we should probably stick > with v1 until we shut down. > > panic()ing when v2 isn''t available sounds like overkill, though.It''s only if you''ve already used v2 in this guest.> Would something like this work?I don''t think so, because I don''t think changing the grant table layout is safe. I''m experimenting with this: Subject: xen: latch grant-table layout at start of day. It is not possible to switch grant table layout over a save restore since entries setup before with the old layout cannot be torn down under the new layout. Also add a grant_table.version parameter to allow the user to force a particular layout (e.g. if they know they need to migrate to v1 only hosts) Signed-off-by: Ian Campbell <ian.campbell@citrix.com> diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c index c653970..a449ef4 100644 --- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -77,6 +77,7 @@ static grant_status_t *grstatus; static struct gnttab_free_callback *gnttab_free_callback_list; static int grant_table_version; +module_param_named(version, grant_table_version, int, 0); static int gnttab_expand(unsigned int req_entries); @@ -697,24 +698,29 @@ static void gnttab_request_version(void) int rc; struct gnttab_set_version gsv; - gsv.version = 2; + /* + * If we have already used a particular layout (e.g. before + * suspend/resume) then we must continue to use that + * version. Otherwise try and use v2. + */ + gsv.version = grant_table_version ? : 2; rc = HYPERVISOR_grant_table_op(GNTTABOP_set_version, &gsv, 1); - if (rc == 0) - grant_table_version = 2; - else { - if (grant_table_version == 2) { - /* - * If we''ve already used version 2 features, - * but then suddenly discover that they''re not - * available (e.g. migrating to an older - * version of Xen), almost unbounded badness - * can happen. - */ - panic("we need grant tables version 2, but only version 1 is available"); - } - grant_table_version = 1; - } - printk(KERN_INFO "Grant tables using version %d layout.\n", grant_table_version); + BUG_ON(rc == -EFAULT); + + /* + * Compatibility with hypervisors which do not support >= v2 + * grant tables. + */ + if (rc == -ENOSYS) + gsv.version = 1; + else if (rc != 0) + printk(KERN_WARNING "Error %d requesting v%d grant table layout, got v%d\n", + rc, grant_table_version ? : 2, gsv.version); + + if (grant_table_version && grant_table_version != gsv.version) + panic("unable to set required v%d grant table layout", grant_table_version); + + grant_table_version = gsv.version; } static int gnttab_map(unsigned int start_idx, unsigned int end_idx) @@ -964,7 +970,7 @@ static int __devinit gnttab_init(void) gnttab_free_count = nr_init_grefs - NR_RESERVED_ENTRIES; gnttab_free_head = NR_RESERVED_ENTRIES; - printk("Grant table initialized\n"); + printk("Grant table initialized using v%d layout\n", grant_table_version); return 0; ini_nomem: _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Steven Smith
2009-Dec-03 14:47 UTC
Re: [Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
> > > Oh, I see what you meant... in the proper resume case (as opposed to the > > > cancelled suspend/checkpoint case I was thinking of) there should be no > > > grant tables in use at this point so most devices should, in theory, be > > > able to reconnect using v1 grants, any drivers which require v2 grant > > > tables need to check for them in their resume hook as well as at start > > > of day. > > > > > > Unfortunately frontend devices tear down their grant entries after the > > > resume rather than before the suspend (I presume this has to do with > > > faster checkpointing?) which means they could be trying to clear an > > > entry of the wrong layout, leading the unbounded badness that the > > > comment refers to. > > Actually, I think that''d be okay. Drivers should be clearing grant > > table entries through the gnttab_end_* functions, which always check > > grant_table_version and do the right thing. > In the case where you have gone v1->v2 then the entries would have been > setup while grante_table_version==1 layout but by the time gnttab_end_* > is called grant_table_version==2 and the wrong thing happens.Hmm... I still think we''re probably okay (in the narrowest possible sense). Changing grant table version memset()s the shared tables to zero, which effectively revokes all the grants, so all we need is for the gnttab_end_*_ref operations to be no-ops when applied to zeroed grant table slots, and a quick check of the code suggests that they are. But, yeah, it''s not something we want to be doing anyway, whether it''s safe or not.> > > I think the choices are basically: > > > * Always latch to either v1 or v2 at start of day, if we can''t get > > > the version we want then panic (this is a stronger restriction > > > than the current code which will try to upgrade to v2 on resume) > > Yeah, that probably counts as a bug in the current code. If we boot > > on a Xen which only supports v1 tabled then we should probably stick > > with v1 until we shut down. > > > > panic()ing when v2 isn''t available sounds like overkill, though. > It''s only if you''ve already used v2 in this guest.Ah, okay.> > Would something like this work? > > I don''t think so, because I don''t think changing the grant table layout > is safe. > > I''m experimenting with this: > > Subject: xen: latch grant-table layout at start of day. > > It is not possible to switch grant table layout over a save restore > since entries setup before with the old layout cannot be torn down under > the new layout. > > Also add a grant_table.version parameter to allow the user to force a > particular layout (e.g. if they know they need to migrate to v1 only > hosts) > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com>Acked-by: Steven Smith <sos22@cam.ac.uk> Steven. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Dec-03 20:10 UTC
Re: [Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
On 12/03/09 02:28, Ian Campbell wrote:> On Wed, 2009-12-02 at 19:54 +0000, Ian Campbell wrote: > >> >>> Does it really need to be a panic? Can''t we just start failing all >>> future operations? Seems bad to take out the whole machine if we >>> >> can >> >>> just get away with crippling one device (especially if it can be >>> recovered by downing it and re-upping a new one with nc1 and/or >>> >> gt1). >> >> Wouldn''t there be (failing) grant table ops on the down path? >> >> In any case doesn''t it effect all devices since they all use the same >> grant table? >> > Oh, I see what you meant... in the proper resume case (as opposed to the > cancelled suspend/checkpoint case I was thinking of) there should be no > grant tables in use at this point so most devices should, in theory, be > able to reconnect using v1 grants, any drivers which require v2 grant > tables need to check for them in their resume hook as well as at start > of day. > > Unfortunately frontend devices tear down their grant entries after the > resume rather than before the suspend (I presume this has to do with > faster checkpointing?) which means they could be trying to clear an > entry of the wrong layout, leading the unbounded badness that the > comment refers to. >I think the reason frontends don''t do anything before suspend is because they need to cope with backends going away at any moment, and a suspend/migrate is just a special case of that. But a normal backend restart won''t change the grant table format, whereas a resume/migrate can, so it does make sense to take advantage of the suspend callback. Also I think originally there wasn''t a suspend callback, but there is now that we''re using the device model. I don''t know how it affects performance, but I guess it would require checkpoints to do a full teardown/reconnect so that the checkpointed image can cope. On the other hand, on resume, there are no existing grants, so the device can just ignore any grant state it currently has established and do it all afresh with the current grant mechanism, no?> I think the choices are basically: > * Always latch to either v1 or v2 at start of day, if we can''t get > the version we want then panic (this is a stronger restriction > than the current code which will try to upgrade to v2 on resume) > * Write v1<->v2 layout transformations called on gnttab resume > before the devices get a chance to try and unmap their old > entries. Would need to handle v2 entries sing feature which are > not expressible in v1. > > I''m tempted to go with the former for simplicity, it enables migration > to a newer version of Xen (the guest will just keep using v1) but will > not allow migration back to an older version of Xen, which is not > something we generally support anyway. >Yeah, given the "no downgrade" rule we don''t need to solve it in the most general way. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Dec-03 20:13 UTC
Re: [Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
On 12/03/09 04:01, Steven Smith wrote:>> Oh, I see what you meant... in the proper resume case (as opposed to the >> cancelled suspend/checkpoint case I was thinking of) there should be no >> grant tables in use at this point so most devices should, in theory, be >> able to reconnect using v1 grants, any drivers which require v2 grant >> tables need to check for them in their resume hook as well as at start >> of day. >> >> Unfortunately frontend devices tear down their grant entries after the >> resume rather than before the suspend (I presume this has to do with >> faster checkpointing?) which means they could be trying to clear an >> entry of the wrong layout, leading the unbounded badness that the >> comment refers to. >> > Actually, I think that''d be okay. Drivers should be clearing grant > table entries through the gnttab_end_* functions, which always check > grant_table_version and do the right thing. > gnttab_grant_foreign_access_ref_{subpage,trans} would BUG(), but that > could be turned into an error return easily enough. Handling it > everywhere would be a pain, though. > > The only other potential subtlety is handling a suspend while some > other vcpu is doing grant table operations, but I think the > stop_machine() is sufficient protection against that. >Yes. We already have to be careful to prevent pte updates from being preempted by suspend, so grant operations are similar. (When CONFIG_PREEMPT is enabled, it is freeze/thaw which provides this guarantee.)> BUG_ON(flags & (GTF_accept_transfer | GTF_reading | > GTF_writing | GTF_sub_page | GTF_permit_access)); > - BUG_ON(grant_table_version == 1); > + if (grant_table_version == 1) >Rather than having all these version tests, would it make sense to have a grant_ops vector? J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2009-Dec-03 20:15 UTC
Re: [Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
On 12/03/09 04:15, Ian Campbell wrote:> I''m experimenting with this: > > Subject: xen: latch grant-table layout at start of day. > > It is not possible to switch grant table layout over a save restore > since entries setup before with the old layout cannot be torn down under > the new layout. >Doesn''t resume implicitly tear them all down? J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2009-Dec-03 21:22 UTC
Re: [Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
On Thu, 2009-12-03 at 20:15 +0000, Jeremy Fitzhardinge wrote:> On 12/03/09 04:15, Ian Campbell wrote: > > I''m experimenting with this: > > > > Subject: xen: latch grant-table layout at start of day. > > > > It is not possible to switch grant table layout over a save restore > > since entries setup before with the old layout cannot be torn down under > > the new layout. > > > > Doesn''t resume implicitly tear them all down?Yes, but the devices will try and tear them down again in their resume handlers. I think the conclusion was (see Steven''s last mail) that since changing gnttab version memsets the array to zero and gnttab_end_* happen to be safe (by inspection) when used on a zeroed slot it would be ok. I think it''s just safest to avoid the whole issue, we don''t care about the downgrade case and in the upgrade case the guest isn''t going to have any devices running which need v2 so can just as well stick with v1. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2009-Dec-03 21:23 UTC
Re: [Xen-devel] Re: [PATCH] xen: reduce severity of message about using v1 grant tables.
On Thu, 2009-12-03 at 20:13 +0000, Jeremy Fitzhardinge wrote:> > > BUG_ON(flags & (GTF_accept_transfer | GTF_reading | > > GTF_writing | GTF_sub_page | > GTF_permit_access)); > > - BUG_ON(grant_table_version == 1); > > + if (grant_table_version == 1) > > > > Rather than having all these version tests, would it make sense to > have a grant_ops vector?I was looking at these today and reckon you could go a long way with a couple of accessor macros towards avoiding most of the version checks. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel