I noticed a possible problem when restoring a machine.

In xc_domain_restore (xc_domain_restore.c), if it's not the last checkpoint we set the O_NONBLOCK flag (search for fcntl) so that we can call pagebuf_get or just load other pages (see the following "goto loadpages;" line).
Now we could end up calling xc_tmem_restore/xc_tmem_restore_extra (xc_tmem.c), which call read_exact (xc_private.c) on the same non-blocking socket/file, but read_exact does not handle EAGAIN/EWOULDBLOCK (either can be returned on a non-blocking socket, depending on file type and Unix/Linux version), leading to a failure.
Does this make sense, or is it impossible?

Also note that rdexact (xc_domain_restore.c) handles data timeouts, but we can still block in read_exact called by xc_tmem_restore/xc_tmem_restore_extra.

Last note on rdexact: isn't 1 second (HEARTBEAT_MS) too small if there are network problems?

Frediano
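To make the failure mode concrete, here is a minimal sketch of a read loop that tolerates EAGAIN/EWOULDBLOCK on a non-blocking fd by waiting in poll() before retrying. The names set_nonblocking and read_exact_nb are hypothetical and this is not the libxc code; it only assumes a POSIX system.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode, similar in spirit
 * to what the restore path does between checkpoints. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper (not part of libxc): read exactly `size` bytes from
 * `fd`, whether or not O_NONBLOCK is set on it.
 * Returns 0 on success, -1 on error or premature EOF.
 */
static int read_exact_nb(int fd, void *data, size_t size)
{
    unsigned char *buf = data;
    size_t offset = 0;

    assert(data != NULL || size == 0);

    while (offset < size) {
        ssize_t len = read(fd, buf + offset, size - offset);

        if (len > 0) {
            offset += (size_t)len;
            continue;
        }
        if (len == 0) {
            /* EOF before all requested bytes arrived. */
            errno = 0;
            return -1;
        }
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            struct pollfd pfd = { .fd = fd, .events = POLLIN };

            /* No data yet: sleep until readable instead of failing. */
            if (poll(&pfd, 1, -1) == -1 && errno != EINTR)
                return -1;
            continue;
        }
        return -1; /* real I/O error */
    }
    return 0;
}
```

Waiting in poll() rather than immediately retrying read() avoids spinning on the CPU while the sender is idle. Note that this sketch waits indefinitely; it does not address the heartbeat timeout question.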
CCing the Remus maintainer since all this non-blocking stuff is for Remus/checkpointing.

On Wed, 2012-05-23 at 10:39 +0100, Frediano Ziglio wrote:
> I noted a possible problem restoring a machine.
>
> In xc_domain_restore (xc_domain_restore.c) if it's not the last
> checkpoint we set O_NONBLOCK flag (search for fcntl) that we can call
> pagebuf_get or just load other pages (see following "goto loadpages;"
> line).
> Now we could ending up calling xc_tmem_restore/xc_tmem_restore_extra
> (xc_tmem.c) which call read_exact (xc_private.c) on the same non
> blocking socket/file

There's a bunch of such places in that function; the RDEXACT macro is also == rdexact except on MiniOS.

> but read_exact does not handle EAGAIN/EWOULDBLOCK
> (both can be returned on non blocking socket depending on file type and
> Unix/Linux version) leading to a failure.
> Does this make sense or is it impossible ??

Isn't this what the if line:

    len = read(fd, buf + offset, size - offset);
    if ( (len == -1) && ((errno == EINTR) || (errno == EAGAIN)) )
        continue;

is doing?

> Also note that rdexact (xc_domain_restore.c) handle data timeout but we
> can still block in read_exact called by
> xc_tmem_restore/xc_tmem_restore_extra.

Oh, wait! read_exact != rdexact -- ouch! Those are confusingly similar!

I suspect we need either to pull xc_tmem_{save,restore} into the appropriate file and use the non-blocking-capable versions, or to export the non-blocking function, with an improved name, so it can be used from xc_tmem.c.

Shriram, any thoughts?

>
> Last note on rdexact, isn't 1 second (HEARTBEAT_MS) too small if there
> are network problems?
>
> Frediano
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
On Wed, 2012-05-23 at 11:25 +0100, Ian Campbell wrote:
> CCing the Remus maintainer since all this non-blocking stuff is for
> remus/checkpointing.
>
> On Wed, 2012-05-23 at 10:39 +0100, Frediano Ziglio wrote:
> > I noted a possible problem restoring a machine.
> >
> > In xc_domain_restore (xc_domain_restore.c) if it's not the last
> > checkpoint we set O_NONBLOCK flag (search for fcntl) that we can call
> > pagebuf_get or just load other pages (see following "goto loadpages;"
> > line).
> > Now we could ending up calling xc_tmem_restore/xc_tmem_restore_extra
> > (xc_tmem.c) which call read_exact (xc_private.c) on the same non
> > blocking socket/file
>
> There's a bunch of such places in that function, the RDEXACT macro is
> also == rdexact except on Minios.
>
> > but read_exact does not handle EAGAIN/EWOULDBLOCK
> > (both can be returned on non blocking socket depending on file type and
> > Unix/Linux version) leading to a failure.
> > Does this make sense or is it impossible ??
>
> Isn't this what the if line:
>     len = read(fd, buf + offset, size - offset);
>     if ( (len == -1) && ((errno == EINTR) || (errno == EAGAIN)) )
>         continue;
>
> is doing?
>
> > Also note that rdexact (xc_domain_restore.c) handle data timeout but we
> > can still block in read_exact called by
> > xc_tmem_restore/xc_tmem_restore_extra.
>
> Oh, wait! read_exact != rdexact -- ouch! Those are confusingly similar!
>
> I suspect we need to pull the xc_tmem_{save,restore} into the
> appropriate file and use the non-blocking capable versions or to export
> the non-blocking function, with an improved name, so it can be used from
> xc_tmem.c.
>

I was working on a patch to reduce CPU usage and the number of read calls by buffering io_fd. It currently works, but it is not yet good enough to post.

> Shriram, any thoughts?
>
> >
> > Last note on rdexact, isn't 1 second (HEARTBEAT_MS) too small if there
> > are network problems?
> >

Frediano
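The patch mentioned above was not posted in this thread, so the following is only a rough sketch of what buffering reads on a descriptor such as io_fd could look like. The struct iobuf type, the function names, and the 4096-byte buffer size are all assumptions made for illustration, and the sketch assumes a blocking fd.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define IOBUF_SIZE 4096 /* arbitrary size chosen for this sketch */

/* Hypothetical buffered reader; none of these names exist in libxc. */
struct iobuf {
    int fd;
    size_t pos;                     /* next unread byte in data[] */
    size_t len;                     /* number of valid bytes in data[] */
    unsigned char data[IOBUF_SIZE];
};

static void iobuf_init(struct iobuf *b, int fd)
{
    b->fd = fd;
    b->pos = 0;
    b->len = 0;
}

/*
 * Copy exactly `size` bytes to `out`, refilling the buffer from the fd
 * only when it is empty. Returns 0 on success, -1 on error or EOF.
 */
static int iobuf_read_exact(struct iobuf *b, void *out, size_t size)
{
    unsigned char *dst = out;

    assert(b != NULL && (out != NULL || size == 0));

    while (size > 0) {
        size_t chunk;

        if (b->pos == b->len) {
            /* Buffer drained: one read() may satisfy many later requests. */
            ssize_t n = read(b->fd, b->data, sizeof(b->data));

            if (n == -1 && errno == EINTR)
                continue;
            if (n <= 0)
                return -1;
            b->pos = 0;
            b->len = (size_t)n;
        }

        chunk = b->len - b->pos;
        if (chunk > size)
            chunk = size;
        memcpy(dst, b->data + b->pos, chunk);
        b->pos += chunk;
        dst += chunk;
        size -= chunk;
    }
    return 0;
}
```

One design consequence: once a descriptor is read through such a buffer, any code that calls read() on the same fd directly (for example a read_exact call from another file) would miss bytes already pulled into the buffer, so every reader has to go through the same buffered path.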
On Wed, May 23, 2012 at 5:39 AM, Frediano Ziglio <frediano.ziglio@citrix.com> wrote:
> I noted a possible problem restoring a machine.
>
> In xc_domain_restore (xc_domain_restore.c) if it's not the last
> checkpoint we set O_NONBLOCK flag (search for fcntl) that we can call
> pagebuf_get or just load other pages (see following "goto loadpages;"
> line).
> Now we could ending up calling xc_tmem_restore/xc_tmem_restore_extra
> (xc_tmem.c) which call read_exact (xc_private.c) on the same non
> blocking socket/file but read_exact does not handle EAGAIN/EWOULDBLOCK
> (both can be returned on non blocking socket depending on file type and
> Unix/Linux version) leading to a failure.
> Does this make sense or is it impossible ??
>

It certainly is possible. But again, I have never seen anyone use tmem with Remus. I don't even know if it would work properly, even if we fix the read_exact code to handle non-blocking fds.

For the normal live-migration scenario, the O_NONBLOCK change does not happen, so RDEXACT == rdexact == read_exact, output-wise.

> Also note that rdexact (xc_domain_restore.c) handle data timeout but we
> can still block in read_exact called by
> xc_tmem_restore/xc_tmem_restore_extra.
>

Yep, but only in the Remus case. As stated above, I haven't come across anyone using Remus + tmem, and I don't know if it would work properly. I don't know the semantics of tmem well enough to comment on whether Remus + tmem makes sense.

> Last note on rdexact, isn't 1 second (HEARTBEAT_MS) too small if there
> are network problems?
>

This won't be a problem for live migration, because that timeout code is within the if (ctx->completed) { } block. It only becomes active when Remus is enabled, i.e. ctx->last_checkpoint = 0. Otherwise, the read call is still blocking.

> Frediano
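As an illustration of the behaviour described above (a timeout that is only armed in the Remus case, and a plain wait otherwise), here is a minimal sketch of an exact read with an optional per-wait timeout. The name read_exact_timed and its timeout_ms parameter are assumptions for this sketch, not the libxc rdexact implementation; a negative timeout_ms means wait indefinitely, following poll() semantics.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly `size` bytes from `fd`.
 * timeout_ms is the longest time to wait for *each* chunk of data,
 * not for the whole transfer; pass -1 to wait forever.
 * Returns 0 on success, -1 on error, EOF, or timeout (errno = ETIMEDOUT).
 */
static int read_exact_timed(int fd, void *data, size_t size, int timeout_ms)
{
    unsigned char *buf = data;
    size_t offset = 0;

    assert(data != NULL || size == 0);

    while (offset < size) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        ssize_t len;
        int rc = poll(&pfd, 1, timeout_ms);

        if (rc == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (rc == 0) {
            /* Nothing arrived within the heartbeat window. */
            errno = ETIMEDOUT;
            return -1;
        }

        len = read(fd, buf + offset, size - offset);
        if (len == -1) {
            if (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK)
                continue;
            return -1;
        }
        if (len == 0) {
            /* EOF before all requested bytes arrived. */
            errno = 0;
            return -1;
        }
        offset += (size_t)len;
    }
    return 0;
}
```

A caller mirroring the logic described above might pass a heartbeat value only once the first checkpoint has completed and -1 before that. Making the timeout a parameter also leaves room to pick something larger than 1 second on lossy networks.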
> From: Shriram Rajagopalan [mailto:rshriram@cs.ubc.ca]
> Subject: Re: [Xen-devel] Possible error restoring machine
>
> Yep. Only in Remus case. As stated above, havent come across anyone
> using Remus + tmem and/or dont know if it would work properly. I dont
> know the semantics of tmem enough to comment on remus+tmem, whether
> it makes sense or not, etc..

An interesting question... from what I remember about Remus (it's been a few years now since I looked at it), I think they can't co-exist. To Remus, tmem is like a hidden, hypervisor-private local disk, and the writes to it don't get captured/replicated by Remus. I think this is fixable, but I don't think the fix would be easy. But this is just a few seconds of thought, so I may be all wrong.

Dan