Dan Magenheimer
2010-Sep-03 15:47 UTC
[Xen-devel] [PATCH] tmem (tools-side): ABI v1 to handle long object-ids (XEN-UNSTABLE)
tmem (tools-side): move to new ABI version to handle long object-ids Please apply to XEN-UNSTABLE ... see separate post for the essentially identical xen-4.0-testing equivalent. (Note to Keir/Ian: These patches should be applied together, but I''m not clear on how to submit patches that cross MAINTAINERS boundaries as this one does.) After a great deal of discussion and review with linux kernel developers, it appears there are "next-generation" filesystems (such as btrfs, xfs, Lustre) that will not be able to use tmem due to an ABI limitation... a field that represents a unique file identifier is 64-bits in the tmem ABI and may need to be as large as 192-bits. So to support these guest filesystems, the tmem ABI must be revised, from "v0" to "v1". I *think* it is still the case that tmem is experimental and is not used anywhere yet in production. The tmem ABI is designed to support multiple revisions, so the Xen tmem implementation could be updated to handle both v0 and v1. However this is a bit messy and would require data structures for both v0 and v1 to appear in public Xen header files. I am inclined to update the Xen tmem implementation to only support v1 and gracefully fail v0. This would result in only a performance loss (as if tmem were disabled) for newly launched tmem-v0-enabled guests, but live-migration between old tmem-v0 Xen and new tmem-v1 Xen machines would fail, and saved tmem-v0 guests will not be able to be restored on a tmem-v1 Xen machine. I would plan to update both pre-4.0.2 and unstable (future 4.1) to only support v1. I believe these restrictions are reasonable at this point in the tmem lifecycle, though they may not be reasonable in the near future; should the tmem ABI need to be revised from v1 to v2, I understand backwards compatibility will be required. Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com> diff -r 3c4c3d48a835 tools/libxc/xc_tmem.c --- a/tools/libxc/xc_tmem.c Thu Aug 26 11:16:56 2010 +0100 +++ b/tools/libxc/xc_tmem.c Fri Sep 03 09:41:16 2010 -0600 @@ -63,7 +63,56 @@ int xc_tmem_control(xc_interface *xch, set_xen_guest_handle(op.u.ctrl.buf,buf); op.u.ctrl.arg1 = arg1; op.u.ctrl.arg2 = arg2; - op.u.ctrl.arg3 = arg3; + /* use xc_tmem_control_oid if arg3 is required */ + op.u.ctrl.oid[0] = 0; + op.u.ctrl.oid[1] = 0; + op.u.ctrl.oid[2] = 0; + + if (subop == TMEMC_LIST) { + if ((arg1 != 0) && (lock_pages(buf, arg1) != 0)) + { + PERROR("Could not lock memory for Xen hypercall"); + return -ENOMEM; + } + } + +#ifdef VALGRIND + if (arg1 != 0) + memset(buf, 0, arg1); +#endif + + rc = do_tmem_op(xc, &op); + + if (subop == TMEMC_LIST) { + if (arg1 != 0) + unlock_pages(buf, arg1); + } + + return rc; +} + +int xc_tmem_control_oid(int xc, + int32_t pool_id, + uint32_t subop, + uint32_t cli_id, + uint32_t arg1, + uint32_t arg2, + struct tmem_oid oid, + void *buf) +{ + tmem_op_t op; + int rc; + + op.cmd = TMEM_CONTROL; + op.pool_id = pool_id; + op.u.ctrl.subop = subop; + op.u.ctrl.cli_id = cli_id; + set_xen_guest_handle(op.u.ctrl.buf,buf); + op.u.ctrl.arg1 = arg1; + op.u.ctrl.arg2 = arg2; + op.u.ctrl.oid[0] = oid.oid[0]; + op.u.ctrl.oid[1] = oid.oid[1]; + op.u.ctrl.oid[2] = oid.oid[2]; if (subop == TMEMC_LIST) { if ((arg1 != 0) && (lock_pages(buf, arg1) != 0)) @@ -254,7 +303,7 @@ int xc_tmem_save(xc_interface *xch, } else { /* page list terminator */ h = (struct tmem_handle *)buf; - h->oid = -1; + h->oid[0] = h->oid[1] = h->oid[2] = -1L; if ( write_exact(io_fd, &h->oid, sizeof(h->oid)) ) return -1; break; @@ -291,7 +340,8 @@ int xc_tmem_save_extra(xc_interface *xch if ( write_exact(io_fd, &handle.index, sizeof(handle.index)) ) return -1; count++; - checksum += handle.pool_id + handle.oid + handle.index; + checksum += handle.pool_id + handle.oid[0] + handle.oid[1] + + handle.oid[2] + handle.index; } if ( count ) DPRINTF("needed %d tmem invalidates, check=%d\n",count,checksum); @@ -401,20 +451,21 @@ int xc_tmem_restore(xc_interface *xch, i } for ( j = n_pages; j > 0; j-- ) { - uint64_t oid; + struct tmem_oid oid; uint32_t index; int rc; if ( read_exact(io_fd, &oid, sizeof(oid)) ) return -1; - if ( oid == -1 ) + if ( oid.oid[0] == -1L && oid.oid[1] == -1L && oid.oid[2] == -1L ) break; if ( read_exact(io_fd, &index, sizeof(index)) ) return -1; if ( read_exact(io_fd, buf, pagesize) ) return -1; checksum += *buf; - if ( (rc = xc_tmem_control(xch, pool_id, TMEMC_RESTORE_PUT_PAGE, - dom, bufsize, index, oid, buf)) <= 0 ) + if ( (rc = xc_tmem_control_oid(xc, pool_id, + TMEMC_RESTORE_PUT_PAGE, dom, + bufsize, index, oid, buf)) <= 0 ) { DPRINTF("xc_tmem_restore: putting page failed, rc=%d\n",rc); return -1; @@ -434,7 +485,7 @@ int xc_tmem_restore_extra(xc_interface * int xc_tmem_restore_extra(xc_interface *xch, int dom, int io_fd) { uint32_t pool_id; - uint64_t oid; + struct tmem_oid oid; uint32_t index; int count = 0; int checksum = 0; @@ -445,11 +496,11 @@ int xc_tmem_restore_extra(xc_interface * return -1; if ( read_exact(io_fd, &index, sizeof(index)) ) return -1; - if ( xc_tmem_control(xch, pool_id, TMEMC_RESTORE_FLUSH_PAGE, dom, + if ( xc_tmem_control_oid(xch, pool_id, TMEMC_RESTORE_FLUSH_PAGE, dom, 0,index,oid,NULL) <= 0 ) return -1; count++; - checksum += pool_id + oid + index; + checksum += pool_id + oid.oid[0] + oid.oid[1] + oid.oid[2] + index; } if ( pool_id != -1 ) return -1; diff -r 3c4c3d48a835 tools/libxc/xenctrl.h --- a/tools/libxc/xenctrl.h Thu Aug 26 11:16:56 2010 +0100 +++ b/tools/libxc/xenctrl.h Fri Sep 03 09:41:16 2010 -0600 @@ -1453,6 +1453,14 @@ int xc_disable_turbo(xc_interface *xch, /** * tmem operations */ + +struct tmem_oid { + uint64_t oid[3]; +}; + +int xc_tmem_control_oid(int xc, int32_t pool_id, uint32_t subop, + uint32_t cli_id, uint32_t arg1, uint32_t arg2, + struct tmem_oid oid, void *buf); int xc_tmem_control(xc_interface *xch, int32_t pool_id, uint32_t subop, uint32_t cli_id, uint32_t arg1, uint32_t arg2, uint64_t arg3, void *buf); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2010-Sep-03 17:50 UTC
Re: [Xen-devel] [PATCH] tmem (tools-side): ABI v1 to handle long object-ids (XEN-UNSTABLE)
Dan Magenheimer writes ("[Xen-devel] [PATCH] tmem (tools-side): ABI v1 to handle long object-ids (XEN-UNSTABLE)"):> tmem (tools-side): move to new ABI version to handle > long object-idsThanks for this patch. I''m sad to say that this appears to confirm that the decision by the world in general not to use tmem yet was correct. After all if they _were_ using it now they''d have some pain. I don''t have a strong opinion about this patch. The tools part seems reasonably well-contained so I guess I''m happy to see it applied.> Please apply to XEN-UNSTABLE ... see separate post for > the essentially identical xen-4.0-testing equivalent.It would probably be best if Keir applied both patches in a single changeset, to avoid build breakage. It would be easier to find your messages if your patchbomb script (or your mailer, if you''re doing it by hand) could be persuaded to post all the portions of a series in a single thread, with appropriate In-Reply-To and References headers. Thanks, Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Magenheimer
2010-Sep-03 19:45 UTC
RE: [Xen-devel] [PATCH] tmem (tools-side): ABI v1 to handle long object-ids (XEN-UNSTABLE)
Hi Ian --> From: Ian Jackson [mailto:Ian.Jackson@eu.citrix.com] > > Dan Magenheimer writes ("[Xen-devel] [PATCH] tmem (tools-side): ABI v1 > to handle long object-ids (XEN-UNSTABLE)"): > > tmem (tools-side): move to new ABI version to handle > > long object-ids > > Thanks for this patch. > > I''m sad to say that this appears to confirm that the decision by the > world in general not to use tmem yet was correct. After all if they > _were_ using it now they''d have some pain.1) To be fair, I''ve never yet advertised tmem as anything but experimental or "technology preview". But if tmem WAS widely deployed today, backwards compatibility was certainly possible, just a pain (for me and for Xen code maintainability) that, since it was not widely deployed, I chose to avoid. 2) As one wag put it, that''s the risk one takes when releasing a technology before it is accepted mainstream in "the kernel". Tmem requires some coordination between Linux and Xen so the tmem ABI was designed to be extremely flexible and this was the first time in >18 months that an ABI rev was "necessary". 3) The problem that resulted in the ABI change could have fairly easily been handled on the Linux side*, but would have required changes to multiple filesystems. As other Xen developers have discovered, tails don''t easily wag dogs and (to mix metaphors) that was a rathole I chose not to explore. (* One would think that 64 bits would be sufficient to uniquely represent a file in a filesystem :-)> I don''t have a strong opinion about this patch. The tools part seems > reasonably well-contained so I guess I''m happy to see it applied.Thanks. All of tmem was designed to be well-contained with very little impact on other subsystems. I was pleased that a rather major change like this had as low impact as it did on libxc.> > Please apply to XEN-UNSTABLE ... see separate post for > > the essentially identical xen-4.0-testing equivalent. > > It would probably be best if Keir applied both patches in a single > changeset, to avoid build breakage.However the two of you want to arrange it is OK with me as long as both patches are applied temporally relatively near. (And preferably both for both 4.0-testing and xen-unstable.)> It would be easier to find your messages if your patchbomb script (or > your mailer, if you''re doing it by hand) could be persuaded to post > all the portions of a series in a single thread, with appropriate > In-Reply-To and References headers.This is the first patch I''ve submitted since Keir''s request for tools patches to go to you, and the first patch in a LONG time that crossed the boundary between hypervisor and tools. I can submit tmem patches however you like... I was just using the same process Keir has cheerfully accepted in the past (and tried to split tools from hypervisor as best I could... in the past, a monolithic patch was sufficient since there was little interest in reviewing details of tmem-related patches anyway). Sheepishly, I admit I use Outlook for most mundane email tasks including Xen patches but can (and have, for lkml) submitted patch series via a Linux mailer. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Magenheimer
2010-Sep-17 18:43 UTC
RE: [Xen-devel] [PATCH] tmem (tools-side): ABI v1 to handle long object-ids (XEN-UNSTABLE)
For the record, I''ve figured out a way on the Linux-side tmem patch to support backwards compatibility to tmem v0. Guests with this patch will work for Xen tmem v0 and tmem v1. It''s ugly enough that I won''t be proposing it upstream but simple enough that it''s a reasonable patch for Linux distros that support tmem, e.g. to be backwards compatible with Xen 4.0.0 and/or other Xen distros that have shipped with tmem v0. Vice-versa won''t be supported however (e.g. tmem v0 guests on tmem v1 Xen) but, if necessary, that is resolvable with a simple in-guest kernel upgrade. Also, migrating of running tmem guests between Xen hosts with different tmem versions will not be supported.> -----Original Message----- > From: Dan Magenheimer > Sent: Friday, September 03, 2010 1:46 PM > To: Ian Jackson > Cc: Xen-Devel (xen-devel@lists.xensource.com); Keir Fraser > Subject: RE: [Xen-devel] [PATCH] tmem (tools-side): ABI v1 to handle > long object-ids (XEN-UNSTABLE) > > Hi Ian -- > > > From: Ian Jackson [mailto:Ian.Jackson@eu.citrix.com] > > > > Dan Magenheimer writes ("[Xen-devel] [PATCH] tmem (tools-side): ABI > v1 > > to handle long object-ids (XEN-UNSTABLE)"): > > > tmem (tools-side): move to new ABI version to handle > > > long object-ids > > > > Thanks for this patch. > > > > I''m sad to say that this appears to confirm that the decision by the > > world in general not to use tmem yet was correct. After all if they > > _were_ using it now they''d have some pain. > > 1) To be fair, I''ve never yet advertised tmem as anything but > experimental or "technology preview". But if tmem WAS widely > deployed today, backwards compatibility was certainly possible, > just a pain (for me and for Xen code maintainability) that, > since it was not widely deployed, I chose to avoid. > > 2) As one wag put it, that''s the risk one takes when releasing > a technology before it is accepted mainstream in "the kernel". > Tmem requires some coordination between Linux and Xen so > the tmem ABI was designed to be extremely flexible and this was > the first time in >18 months that an ABI rev was "necessary". > > 3) The problem that resulted in the ABI change could have fairly > easily been handled on the Linux side*, but would have required > changes to multiple filesystems. As other Xen developers have > discovered, tails don''t easily wag dogs and (to mix metaphors) > that was a rathole I chose not to explore. (* One would think > that 64 bits would be sufficient to uniquely represent a file > in a filesystem :-) > > > I don''t have a strong opinion about this patch. The tools part seems > > reasonably well-contained so I guess I''m happy to see it applied. > > Thanks. All of tmem was designed to be well-contained with very little > impact on other subsystems. I was pleased that a rather major > change like this had as low impact as it did on libxc. > > > > Please apply to XEN-UNSTABLE ... see separate post for > > > the essentially identical xen-4.0-testing equivalent. > > > > It would probably be best if Keir applied both patches in a single > > changeset, to avoid build breakage. > > However the two of you want to arrange it is OK with me as > long as both patches are applied temporally relatively near. > (And preferably both for both 4.0-testing and xen-unstable.) > > > It would be easier to find your messages if your patchbomb script (or > > your mailer, if you''re doing it by hand) could be persuaded to post > > all the portions of a series in a single thread, with appropriate > > In-Reply-To and References headers. > > This is the first patch I''ve submitted since Keir''s request > for tools patches to go to you, and the first patch in a LONG > time that crossed the boundary between hypervisor and tools. > I can submit tmem patches however you like... I was just using > the same process Keir has cheerfully accepted in the past > (and tried to split tools from hypervisor as best I could... > in the past, a monolithic patch was sufficient since there was > little interest in reviewing details of tmem-related patches > anyway). > > Sheepishly, I admit I use Outlook for most mundane email tasks > including Xen patches but can (and have, for lkml) submitted patch > series via a Linux mailer._______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel