Hi,

I don't see any obvious flush to disk taking place for vbds on the source host in XendCheckpoint.py before the domain is started on the new host. Is there a guarantee that all written data is on disk somewhere else, or is something needed?

Thanks,

John Byrne
It's slightly more than a flush that's required. The migration protocol needs to be extended so that execution on the target host doesn't start until all of the outstanding (i.e. issued by the backend) block requests have been either cancelled or acknowledged. This should be pretty straightforward, given that the backend driver refcounts a blkif's state based on pending requests and won't tear down the backend directory in xenstore until all the outstanding requests have cleared. All that is likely required is to have the migration code register watches on the backend vbd directories and wait for them to disappear before giving the all-clear to the new host.

We've talked about this enough to know how to fix it, but haven't had a chance to hack it up. (I think Julian has looked into the problem a bit for blktap, but not yet done a general fix.) Patches would certainly be welcome though. ;)

a.

On 7/31/06, John Byrne <john.l.byrne@hp.com> wrote:
> Is there a guarantee that all written data is on disk somewhere
> else or is something needed?
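[For concreteness, a minimal sketch of the wait described above. It polls rather than registering a xenstore watch, assumes the xenstore-ls command-line tool is available in dom0 and exits nonzero when the requested path does not exist, and assumes the usual backend layout of /local/domain/0/backend/vbd/<domid>; the function names are illustrative and not part of xend.]

    # Sketch: poll xenstore until blkback has torn down every vbd backend
    # directory for the migrating domain, then (and only then) allow the
    # migration code to give the all-clear to the new host.
    import os
    import subprocess
    import time

    DEVNULL = open(os.devnull, "w")

    def backend_vbds_gone(domid, backend_domid=0):
        """True once no vbd backend directory is left in xenstore for domid."""
        path = "/local/domain/%d/backend/vbd/%d" % (backend_domid, domid)
        rc = subprocess.call(["xenstore-ls", path],
                             stdout=DEVNULL, stderr=DEVNULL)
        return rc != 0  # assumed: xenstore-ls fails once the directory is gone

    def wait_for_vbd_teardown(domid, timeout=30.0, poll=0.1):
        """Block until the source host's blkback has released the domain's vbds."""
        deadline = time.time() + timeout
        while time.time() < deadline:
            if backend_vbds_gone(domid):
                return True
            time.sleep(poll)
        return False  # caller should not give the all-clear to the new host

A watch-based version would avoid the polling loop, but the ordering constraint is the same: the destination host must not be unpaused until this wait succeeds.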
It would be a bit ugly, but mostly straightforward, to watch for the destruction of the vbds (or all devices) after the destroyDomain() is done and then send an all-clear. (The last time I looked there wasn't a waitForDomainDestroy() anywhere, so it would probably be best to write one.) This would guarantee correctness, which is the most important thing.

The problem I see with that strategy is the effect on downtime during a live move. Ideally you'd like to start the vbd cleanup when the final suspend is done and hope to parallelize any final device operations with the final pass of the live move. How to do that, play nice with domain destruction on the normal path, and handle errors seems a lot less clear to me.

So, are you just ignoring the notion of minimizing downtime for the moment, or is there something I'm missing?

John

Andrew Warfield wrote:
> All that is likely required is to have the migration code register
> watches on the backend vbd directories, and wait for them to
> disappear before giving the all-clear to the new host.
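[A hedged sketch of the overlap being asked for here, showing only the ordering: kick off the device wait in a thread as soon as the final suspend completes, push the last pass of dirty pages, and join the wait before giving the all-clear. send_last_pass, give_all_clear and wait_for_vbd_teardown are hypothetical stand-ins for the real migration steps, not existing xend functions.]

    # Sketch: overlap vbd teardown on the source with the final
    # stop-and-copy pass so the wait adds little to the downtime.
    import threading

    def migrate_final_phase(domid, send_last_pass, give_all_clear,
                            wait_for_vbd_teardown):
        result = {}

        def waiter():
            result["clean"] = wait_for_vbd_teardown(domid)

        # Start waiting for the devices as soon as the domain is suspended...
        t = threading.Thread(target=waiter)
        t.start()

        # ...while the last round of dirty pages goes over the wire.
        send_last_pass()

        # Only give the destination the all-clear once both have finished.
        t.join()
        if result.get("clean"):
            give_all_clear()
        else:
            raise RuntimeError("vbd backends did not shut down cleanly")

Error handling on the normal destruction path is exactly the part this sketch glosses over, which is the open question in the message above.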
> So, are you just ignoring the notion of minimizing downtime for the
> moment or is there something I'm missing?

That's exactly what I'm suggesting. The current risk is a (very slim) write-after-write error case. Basically, you have a number of in-flight write requests on the original machine that are somewhere between the backend and the physical disk at the time of migration. Currently, you migrate and the shadow request ring reissues these on the new host -- which is the right thing to do given that requests are idempotent. The problem is that the original in-flight requests can still hit the disk some time later and cause problems. The WAW is if you write an update to a block that had an in-flight request immediately on arriving at the new host, and it then gets overwritten by the original request.

Note that for sane block devices this is extremely unlikely, as the aperture we are talking about is basically whatever is in the disk's request queue -- it's only really a problem for things like NFS+loopback and other instances of buffered I/O behind blockback (which is generally a really bad idea!) where you could see a large window of outstanding requests that haven't actually hit the disk. These situations probably need more than just waiting for blkback to clear pending reqs, as loopback will acknowledge requests before they hit the disk in some cases.

So, I think the short-term correctness-preserving approach is to (a) modify the migration process to add an interlock on block backends on the source physical machine to go to a closed state -- indicating that all the outstanding requests have cleared -- and (b) not to use loopback, or buffered I/O generally, behind blkback when you intend to do migration. The blktap code in the tree is much safer for this sort of thing and we're happy to sort out migration problems if/when they come up.

If this winds up adding a big overhead to migration switching time (I don't think it should; block shutdown can be parallelized with the stop-and-copy round of migration -- you'll be busy transferring all the dirty pages that you've queued for DMA anyway) we can probably speed it up. One option would be to look into whether the Linux block layer will let you abort submitted requests. Another would be to modify the block frontend driver to realize that it's just been migrated and queue all requests to blocks that were in its shadow ring until it receives notification that those writes have cleared from the original host. As you point out -- these are probably best left as a second step. ;)

I'd be interested to know if anyone on the list is solving this sort of thing already using some sort of storage fencing fanciness to just sink any pending requests on the original host after migration has happened.

a.
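[A sketch of the interlock in (a), under assumptions similar to the earlier sketch: the xenstore-list and xenstore-read CLI tools are available, the backends live under /local/domain/0/backend/vbd/<domid>, and the xenbus Closed state is the value 6. The helper names are illustrative, not xend's.]

    # Sketch: hold the all-clear to the destination host until every vbd
    # backend for the migrating domain reports the xenbus Closed state,
    # i.e. blkback has finished or cancelled all outstanding requests.
    import subprocess
    import time

    XENBUS_CLOSED = "6"  # assumed value of XenbusStateClosed

    def vbd_devids(domid):
        """Device ids of the vbd backends the source host still holds for domid."""
        path = "/local/domain/0/backend/vbd/%d" % domid
        try:
            out = subprocess.check_output(["xenstore-list", path])
        except subprocess.CalledProcessError:
            return []  # directory already gone: nothing left to wait for
        return out.decode().split()

    def all_vbds_closed(domid):
        """True when every remaining vbd backend reports the Closed state."""
        for devid in vbd_devids(domid):
            state_path = "/local/domain/0/backend/vbd/%d/%s/state" % (domid, devid)
            try:
                state = subprocess.check_output(["xenstore-read", state_path])
            except subprocess.CalledProcessError:
                continue  # state node removed: treat as closed
            if state.decode().strip() != XENBUS_CLOSED:
                return False
        return True

    def block_interlock(domid, timeout=30.0):
        """Return True only when it is safe to give the destination the all-clear."""
        deadline = time.time() + timeout
        while time.time() < deadline:
            if all_vbds_closed(domid):
                return True
            time.sleep(0.1)
        return False

As the message notes, this only covers what blkback itself has seen; with loopback or other buffered I/O behind the backend, reaching Closed still does not guarantee the data is on stable storage.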
I've got a patch in our tree that does (basically) what John is describing. The exact bug we hit was that "xm shutdown -w vm" did not wait until the vbds were cleared out before returning. So now I wait until the backend/vbd nodes go away before returning.

This could probably be done more cleanly with watches, and should be abstracted out to be sure it applies equally to migration, and so forth. But for the sake of discussion, the patch is attached.

-Charles

>>> On Mon, Jul 31, 2006 at 4:26 PM, in message <44CE83B1.1090605@hp.com>, John Byrne <john.l.byrne@hp.com> wrote:
> It would be a bit ugly, but mostly straightforward to watch for the
> destruction of the vbds (or all devices) after the destroyDomain() is
> done and then send an all-clear. (The last time I looked there wasn't
> a waitForDomainDestroy() anywhere, so it would probably be best to
> write one.) This would guarantee correctness: which is the most
> important thing.