Brendan Cully
2007-May-09 00:01 UTC
[Xen-devel] [RFC] use event channel to improve suspend speed
Hi, I've been doing a little work on improving the latency of guest domain suspends. I've added a couple of printfs into xc_domain_save around the last round, and hooked up a harness to loop over the last-round code every couple of seconds. Here are some numbers for a run of 100 last rounds (from just before the suspend callback to just before it would exit), on a 3.2 GHz P4 with 1 GB of RAM, 128 MB of which goes to a guest. This approximates the best-case downtime for live migration, I think.

current code:
avg: 133.57 ms, min: 82.53, max: 559.86, median: 135.63

with the attached patch:
avg: 36.05 ms, min: 33.99, max: 52.14, median: 35.51

The patch creates an event channel in the guest that fires the suspend code. xc_save can use this to suspend the domain directly, instead of calling back to xend, which then writes a xenstore entry, which in turn causes a watch to fire in the guest. The xenstore interaction appears to be fairly slow and very jittery.

This isn't intended for 3.1, but I thought I'd put it out just in case anyone else finds it interesting. I'd appreciate comments about the approach.

There's also a fair amount of latency involved in xend receiving the notification that the domain has suspended and passing that back on to xc_save. A quick hack to let xc_save simply loop on xc_domain_getinfo until the domain suspends indicates that it should be fairly easy to cut the suspend latency in half again, to about 15 ms. I'll see about finding a clean equivalent of this...

Comments?
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
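[For readers reproducing these measurements: the summary lines quoted in this thread (avg/min/max/median over 100 last rounds) can be generated with a small helper like the sketch below. This is hypothetical tooling, not part of the posted patch -- the actual harness just printfs timestamps from xc_domain_save.]

```python
def summarize(samples_ms):
    """Summarize per-round latencies (in ms) in the format used in
    this thread: avg / min / max / median."""
    s = sorted(samples_ms)
    n = len(s)
    avg = sum(s) / n
    # median of the sorted samples (mean of the two middle values
    # when n is even)
    median = s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2.0
    return ("avg: %.2f ms, min: %.2f, max: %.2f, median: %.2f"
            % (avg, s[0], s[-1], median))

# example with made-up timings:
print(summarize([82.53, 135.63, 140.01, 559.86]))
```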
Brendan Cully
2007-May-10 22:13 UTC
Re: [Xen-devel] [RFC] use event channel to improve suspend speed
The posted patch was a fairly conservative approach (backward compatible, equivalent to the existing semantics). I've done some more experimental work that reduces the time for the final round to about 5 ms. Here are the stats for 100 checkpoints:

avg: 5.62 ms, min: 3.96, max: 13.70, median: 4.86

It turns out the biggest remaining delay is (surprise!) xenstored. To get the above numbers I unwired xenstored from VIRQ_DOM_EXC and let xc_save bind to it.

Obviously this isn't a practical approach. I'd love to hear any ideas about the right way to avoid the xenstore penalty, though. My current thought is that it might be possible to arrange to register a dynamic virq from xc_save into Xen for a target domain, and then have Xen fire it on suspend instead of DOM_EXC (iff it's installed, otherwise use the normal path). Any advice would be welcome.

On Tuesday, 08 May 2007 at 17:01, Brendan Cully wrote:
> Hi,
>
> I've been doing a little work on improving the latency of guest domain
> suspends. I've added a couple of printfs into xc_domain_save around
> the last round, and hooked up a harness to loop over the last round
> code every couple of seconds. Here are some numbers for a run of 100
> last rounds (from just before the suspend callback to just before it
> would exit), on a 3.2 GHz P4 with 1 GB of RAM, 128 MB of which goes to
> a guest. This approximates the best-case downtime for live migration,
> I think.
>
> current code:
> avg: 133.57 ms, min: 82.53, max: 559.86, median: 135.63
>
> with the attached patch:
> avg: 36.05 ms, min: 33.99, max: 52.14, median: 35.51
>
> The patch creates an event channel in the guest that fires the suspend
> code. xc_save can use this to suspend the domain instead of calling
> back to xend, which then writes a xenstore entry, which then causes a
> watch to fire in the guest. It seems the xenstore interaction is
> fairly slow and very jittery.
>
> This isn't intended for 3.1, but I thought I'd put it out just in case
> anyone else finds it interesting. I'd appreciate comments about the
> approach.
>
> There's also a fair amount of latency involved in xend receiving the
> notification that the domain has suspended and passing that back on to
> xc_save. A quick hack to let xc_save simply loop on xc_domain_getinfo
> until the domain suspends indicates that it should be fairly easy to
> cut the suspend latency in half again, to about 15 ms. I'll see about
> finding a clean equivalent of this...
>
> Comments?
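[The "fire a dynamic virq iff one is installed, otherwise take the normal path" idea proposed above can be sketched as plain control flow. This is not Xen code -- plain Python stands in for the hypervisor side, and all the names here are made up for illustration:]

```python
# Hypothetical dispatch for the proposed dynamic-virq scheme.
# xc_save would register a per-domain notification; when the domain
# suspends, Xen fires it directly if present, otherwise it falls back
# to the usual VIRQ_DOM_EXC -> xenstored -> xend path.
suspend_handlers = {}  # domid -> callback registered by xc_save

def register_suspend_virq(domid, callback):
    suspend_handlers[domid] = callback

def on_domain_suspend(domid, dom_exc_path):
    cb = suspend_handlers.get(domid)
    if cb is not None:
        cb(domid)            # direct notification, xenstore skipped entirely
        return "virq"
    dom_exc_path(domid)      # normal path via VIRQ_DOM_EXC
    return "dom_exc"
```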
Daniel P. Berrange
2007-May-10 23:00 UTC
Re: [Xen-devel] [RFC] use event channel to improve suspend speed
On Thu, May 10, 2007 at 03:13:10PM -0700, Brendan Cully wrote:
> The posted patch was a fairly conservative approach (backward
> compatible, equivalent to existing semantics). I've done some
> more experimental work that reduces the time for the final round to
> about 5 ms. Here are the stats for 100 checkpoints:
>
> avg: 5.62 ms, min: 3.96, max: 13.70, median: 4.86
>
> It turns out the biggest remaining delay is (surprise!) xenstored. To
> get the above numbers I unwired xenstored from VIRQ_DOM_EXC and let
> xc_save bind to it.
>
> Obviously this isn't a practical approach. I'd love to hear any ideas
> about the right way to avoid the xenstore penalty though. My current
> thought is that it might be possible to arrange to register a dynamic
> virq from xc_save into Xen for a target domain, and then have Xen fire
> it on suspend instead of DOM_EXC (iff it's installed, otherwise use
> the normal path).

It would be interesting to know what aspect of the xenstore interaction is responsible for the slowdown. In particular, whether it is a fundamental architectural constraint, or whether it is merely due to the poor performance of the current implementation. We already know from previous tests that the XenD implementation of transactions absolutely kills the performance of various XenD operations due to the vast amount of unnecessary I/O it does.

If fixing the xenstored transaction code were to help suspend performance too, it might be a better option than rewriting all the code which touches xenstore. A quick test of putting /var/lib/xenstored on a ramdisk would be a way of testing whether it's the I/O which is hurting suspend time.

Dan
--
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
Brendan Cully
2007-May-11 00:06 UTC
Re: [Xen-devel] [RFC] use event channel to improve suspend speed
On Friday, 11 May 2007 at 00:00, Daniel P. Berrange wrote:
> On Thu, May 10, 2007 at 03:13:10PM -0700, Brendan Cully wrote:
> > The posted patch was a fairly conservative approach (backward
> > compatible, equivalent to existing semantics). I've done some
> > more experimental work that reduces the time for the final round to
> > about 5 ms. Here are the stats for 100 checkpoints:
> >
> > avg: 5.62 ms, min: 3.96, max: 13.70, median: 4.86
> >
> > It turns out the biggest remaining delay is (surprise!) xenstored. To
> > get the above numbers I unwired xenstored from VIRQ_DOM_EXC and let
> > xc_save bind to it.
> >
> > Obviously this isn't a practical approach. I'd love to hear any ideas
> > about the right way to avoid the xenstore penalty though. My current
> > thought is that it might be possible to arrange to register a dynamic
> > virq from xc_save into Xen for a target domain, and then have Xen fire
> > it on suspend instead of DOM_EXC (iff it's installed, otherwise use
> > the normal path).
>
> It would be interesting to know what aspect of the xenstore interaction
> is responsible for the slowdown. In particular, whether it is a fundamental
> architectural constraint, or whether it is merely due to the poor performance
> of the current implementation. We already know from previous tests that the
> XenD implementation of transactions absolutely kills the performance of
> various XenD operations due to the vast amount of unnecessary I/O it does.
>
> If fixing the xenstored transaction code were to help suspend performance
> too, it might be a better option than rewriting all the code which touches
> xenstore. A quick test of putting /var/lib/xenstored on a ramdisk would
> be a way of testing whether it's the I/O which is hurting suspend time.

That's certainly part of it. If I rewrite xc_save to set up a watch on @releaseDomain, then select on the xs handle (deferring actually reading the watch until after the checkpoint), I get the following timings:

/var/lib/xenstored on ext3:
avg: 29.41 ms, min: 27.65, max: 40.33, median: 29.30

on tmpfs:
avg: 17.58 ms, min: 7.05, max: 43.88, median: 16.57

It's still awfully jittery though, and significantly slower. I'd guess that the watch mechanism is the problem. I haven't looked very closely at its internals, but I wonder if it's just delivering synchronous notifications to the watcher list in order (in this case, making xc_save wait until xend has handled the watch).
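[The "select on the xs handle, defer reading the watch" trick can be sketched roughly as below. A socketpair stands in for the xenstore handle's file descriptor here, since this is just an illustration; the real xc_save change would use xs_watch() on @releaseDomain, select on xs_fileno(), and drain the event with xs_read_watch() only after the checkpoint.]

```python
import select
import socket

# A socketpair simulates the xenstore connection: writing to one end
# plays the role of xenstored delivering a watch event.
xs_fd, xenstored = socket.socketpair()

def wait_for_release(timeout):
    # Block only until the fd becomes readable. The watch event itself
    # is deliberately NOT read yet, so the checkpoint can start at once
    # instead of waiting on the xenstore protocol round trip.
    readable, _, _ = select.select([xs_fd], [], [], timeout)
    return bool(readable)

xenstored.send(b"@releaseDomain\x00token\x00")  # simulated watch firing
if wait_for_release(1.0):
    pass  # ... take the checkpoint here ...
# only now drain the pending watch notification
event = xs_fd.recv(4096)
```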
Keir Fraser
2007-May-11 06:55 UTC
Re: [Xen-devel] [RFC] use event channel to improve suspend speed
On 11/5/07 00:00, "Daniel P. Berrange" <berrange@redhat.com> wrote:
> It would be interesting to know what aspect of the xenstore interaction
> is responsible for the slowdown. In particular, whether it is a fundamental
> architectural constraint, or whether it is merely due to the poor performance
> of the current implementation. We already know from previous tests that the
> XenD implementation of transactions absolutely kills the performance of
> various XenD operations due to the vast amount of unnecessary I/O it does.
>
> If fixing the xenstored transaction code were to help suspend performance
> too, it might be a better option than rewriting all the code which touches
> xenstore. A quick test of putting /var/lib/xenstored on a ramdisk would
> be a way of testing whether it's the I/O which is hurting suspend time.

Yes. We could go either way -- it wouldn't be too bad to add support via a dynamic VIRQ_DOM_EXC, for example, or add other things to get xenstore off the critical path for save/restore. But if the problem is that xenstored sucks, it is probably worth investing a bit of time to tackle the problem directly and see where the time is going. We could end up with optimisations which have benefits beyond just save/restore.

-- Keir
Brendan Cully
2007-May-25 00:06 UTC
Re: [Xen-devel] [RFC] use event channel to improve suspend speed
On Friday, 11 May 2007 at 07:55, Keir Fraser wrote:
> On 11/5/07 00:00, "Daniel P. Berrange" <berrange@redhat.com> wrote:
>
> > It would be interesting to know what aspect of the xenstore interaction
> > is responsible for the slowdown. In particular, whether it is a fundamental
> > architectural constraint, or whether it is merely due to the poor performance
> > of the current implementation. We already know from previous tests that the
> > XenD implementation of transactions absolutely kills the performance of
> > various XenD operations due to the vast amount of unnecessary I/O it does.
> >
> > If fixing the xenstored transaction code were to help suspend performance
> > too, it might be a better option than rewriting all the code which touches
> > xenstore. A quick test of putting /var/lib/xenstored on a ramdisk would
> > be a way of testing whether it's the I/O which is hurting suspend time.
>
> Yes. We could go either way -- it wouldn't be too bad to add support via a
> dynamic VIRQ_DOM_EXC, for example, or add other things to get xenstore off
> the critical path for save/restore. But if the problem is that xenstored
> sucks, it is probably worth investing a bit of time to tackle the problem
> directly and see where the time is going. We could end up with optimisations
> which have benefits beyond just save/restore.

I'm sure xenstore could be made significantly faster, but barring a redesign maybe it's better just to use it for low-frequency transactions with pretty loose latency expectations? Running the suspend notification through xenstore, to xend, and finally back to xc_save (as the current code does) seems convoluted, and bound to create opportunities for bad scheduling compared to directly notifying xc_save.

In case there's interest, I'll attach the two patches I'm using to speed up checkpointing (and reduce live migration downtime). As I mentioned earlier, the first patch should be semantically equivalent to the existing code, and cuts downtime to about 30-35 ms. The second notifies xend that the domain has been suspended asynchronously, so that final-round memory copying may begin before device migration stage 2. This is a semantic change, but I can't think of a concrete drawback. It's a little rough-and-ready -- suggestions for improvement are welcome.

Here are some stats on final round time (100 runs):

xen 3.1:
avg: 93.40 ms, min: 72.59, max: 432.46, median: 85.10

patch 1 (trigger suspend via event channel):
avg: 43.69 ms, min: 35.21, max: 409.50, median: 37.21

patch 1, /var/lib/xenstored on tmpfs:
avg: 33.88 ms, min: 27.01, max: 369.21, median: 28.34

patch 2 (receive suspended notification via event channel):
avg: 4.95 ms, min: 3.46, max: 14.73, median: 4.63
Keir Fraser
2007-May-25 06:46 UTC
Re: [Xen-devel] [RFC] use event channel to improve suspend speed
On 25/5/07 01:06, "Brendan Cully" <brendan@cs.ubc.ca> wrote:
> In case there's interest, I'll attach the two patches I'm using to
> speed up checkpointing (and live migration downtime). As I mentioned
> earlier, the first patch should be semantically equivalent to existing
> code, and cuts downtime to about 30-35 ms. The second notifies xend
> that the domain has been suspended asynchronously, so that final round
> memory copying may begin before device migration stage 2. This is a
> semantic change, but I can't think of a concrete drawback. It's a
> little rough-and-ready -- suggestions for improvement are welcome.

Can patch 2 be used without patch 1? The fact that it doesn't need to change the guest interface again is a big advantage. And it seems to provide by far the larger proportional speedup.

-- Keir
Brendan Cully
2007-May-25 23:41 UTC
Re: [Xen-devel] [RFC] use event channel to improve suspend speed
On Friday, 25 May 2007 at 07:46, Keir Fraser wrote:
> On 25/5/07 01:06, "Brendan Cully" <brendan@cs.ubc.ca> wrote:
>
> > In case there's interest, I'll attach the two patches I'm using to
> > speed up checkpointing (and live migration downtime). As I mentioned
> > earlier, the first patch should be semantically equivalent to existing
> > code, and cuts downtime to about 30-35 ms. The second notifies xend
> > that the domain has been suspended asynchronously, so that final round
> > memory copying may begin before device migration stage 2. This is a
> > semantic change, but I can't think of a concrete drawback. It's a
> > little rough-and-ready -- suggestions for improvement are welcome.
>
> Can patch 2 be used without patch 1? The fact that it doesn't need to change
> the guest interface again is a big advantage. And it seems to provide by far
> the larger proportional speedup.

Yes, it's possible to separate them out. But they work best together, since that's the only case where xenstore is completely removed from the suspend path.

I've done another version of patch 1 in which xc_save triggers the suspend by writing to xenstore (instead of asking xend to do it), with a vanilla guest kernel. With patch 2 also applied, I get this:

avg: 23.17 ms, min: 9.08, max: 545.51, median: 13.66

implying the base penalty for using xenstore is about 5-10 ms, and we still suffer some pretty serious jitter. The same test with /var/lib/xenstored mounted on tmpfs is much less jittery:

avg: 14.62 ms, min: 8.27, max: 32.67, median: 11.71

but there's still a 5-20 ms xenstore penalty.