Antibozo
2008-Jul-31 21:30 UTC
[Xen-users] drbd 8 primary/primary and xen migration on RHEL 5
Greetings. I've reviewed the list archives, particularly the posts from Zakk, on this subject, and found results similar to his. drbd provides a block-drbd script, but with full virtualization, at least on RHEL 5, this does not work; by the time the block script is run, the qemu-dm has already been started.

Instead I've simply been considering the possibility of keeping the drbd devices in primary/primary state at all times. I'm concerned about a race condition, however, and want to ask whether others have examined this alternative.

I am thinking of a scenario where the vm is running on node A and has a process that is writing to disk at full speed, so that the drbd device on node B is lagging. If I perform a live migration from node A to B under this condition, the local device on node B might not be in sync at the time the vm is started on that node. Maybe.

If I use drbd protocol C, theoretically at least, a sync on the device on node A shouldn't return until node B is fully in sync. So I guess my main question is: during migration, does xend force a device sync on node A before the vm is started on node B?

A secondary question I have (and this may be a question for the drbd folks as well) is: why is the block-drbd script necessary? I.e. why not simply leave the drbd devices primary/primary at all times--what benefit is there to marking the device secondary on the standby node?

Or am I just very confused? Does anyone else have thoughts or experience on this matter? All responses are appreciated.
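For concreteness, the two styles of disk line I mean look roughly like this (resource and device names are made up, and the drbd: syntax only works if the block-drbd script is installed under /etc/xen/scripts):

    # block-drbd style: xend invokes the block-drbd helper, which is supposed
    # to promote the resource to primary before the domU sees the disk; with
    # HVM on RHEL 5, qemu-dm is already running by the time that happens.
    disk = [ 'drbd:vm01,hda,w' ]

    # plain phy: style against a device kept primary/primary; no helper
    # script runs, so nothing ever promotes or demotes the resource.
    disk = [ 'phy:/dev/drbd0,hda,w' ]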
nathan@robotics.net
2008-Jul-31 21:58 UTC
Re: [Xen-users] drbd 8 primary/primary and xen migration on RHEL 5
I am running DRBD primary/primary on Centos 5.2 with CLVM and GFS with no problems. The only issue I have with live migration is that the arp takes 10-15 sec to get refreshed, so you lose connectivity during that time. I have the problem with 3.0ish xen on Centos 5.2 as well as xen 3.2.1.

Anyway, other than the ARP issue, I have this working in production with about two dozen DomUs.

Note: If you want to use LVM for xen rather than files on GFS/LVM/DRBD, you need to run the latest DRBD that supports max-bio-bvecs.

><>
Nathan Stratton
CTO, BlinkMind, Inc.
nathan at robotics.net          nathan at blinkmind.com
http://www.robotics.net        http://www.blinkmind.com
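For reference, that option goes in the disk section of drbd.conf, something like the following (the resource name is made up, and 1 is just the value usually suggested; check the drbd.conf man page for your version):

    resource vm01 {
      disk {
        max-bio-bvecs 1;   # limit bios to a single bvec; works around LVM/Xen stacking issues
      }
      # ... usual net, syncer, and on <host> sections omitted ...
    }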
Antibozo
2008-Jul-31 23:24 UTC
Re: [Xen-users] drbd 8 primary/primary and xen migration on RHEL 5
On 2008-07-31 21:58, nathan@robotics.net wrote:
> I am running DRBD primary/primary on Centos 5.2 with CLVM and GFS with
> no problems. The only issue I have with live migration is that the arp
> takes 10-15 sec to get refreshed, so you lose connectivity during that
> time. I have the problem with 3.0ish xen on Centos 5.2 as well as xen
> 3.2.1.

One can run a job on the vm to generate a packet every second or two to resolve this; ping in a loop should do it.

My scenario doesn't involve any clustered filesystem. I'm using phy: drbd devices as the backing for the vm, not files. As far as I understand things, a clustered filesystem shouldn't be necessary, as long as the drbd devices are in sync at the moment migration occurs. But the question remains whether that condition is guaranteed, and I hope to hear from someone who knows the answer to that question...

> Anyway, other than the ARP issue, I have this working in production with
> about two dozen DomUs.
>
> Note: If you want to use LVM for xen rather than files on GFS/LVM/DRBD,
> you need to run the latest DRBD that supports max-bio-bvecs.

I'm actually running drbd on top of LVM, but I'll look into the max-bio-bvecs thing anyway out of curiosity. Thanks for the reply.
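By "ping in a loop" I mean nothing fancier than something like the following, run on the vm (the target address is a placeholder; the default gateway is a fine choice):

    while true; do ping -c 1 -w 1 10.0.0.1 >/dev/null 2>&1; sleep 1; done &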
Antibozo
2008-Jul-31 23:43 UTC
Re: [Xen-users] drbd 8 primary/primary and xen migration on RHEL 5
On 2008-07-31 23:24, Antibozo wrote:
> One can run a job on the vm to generate a packet every second or two to
> resolve this; ping in a loop should do it.

Quick follow-up:

    arping -b -A *ip-address-of-vm*

seems to work pretty well. I get a 2-3 second dropout if I'm running this during live migration.

--
Jefferson Ogata : Internetworker, Antibozo <ogata@antibozo.net>
http://www.antibozo.net/ogata/
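(Note for anyone copying this: depending on which arping you have, you may need to name the interface explicitly, and you can bound the run with a count; the interface and address below are placeholders.)

    arping -I eth0 -c 10 -b -A 10.0.0.5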
nathan@robotics.net
2008-Jul-31 23:58 UTC
Re: [Xen-users] drbd 8 primary/primary and xen migration on RHEL 5
On Thu, 31 Jul 2008, Antibozo wrote:
> Quick follow-up:
>
>     arping -b -A *ip-address-of-vm*
>
> seems to work pretty well. I get a 2-3 second dropout if I'm running this
> during live migration.

Odd, I am seeing 15-20 sec. How on earth does someone get the 165 ms that is talked about? I get the same delay whether I am migrating over gig-e or 20-gig InfiniBand. It also should not be my I/O subsystem, as I am getting 400 MB/s writes and 550 MB/s reads.

><>
Nathan Stratton
CTO, BlinkMind, Inc.
nathan at robotics.net          nathan at blinkmind.com
http://www.robotics.net        http://www.blinkmind.com
Antibozo
2008-Aug-03 21:55 UTC
[Xen-users] Re: drbd 8 primary/primary and xen migration on RHEL 5
On 2008-07-31 21:30, Antibozo wrote:
> I've reviewed the list archives, particularly the posts from Zakk, on
> this subject, and found results similar to his. drbd provides a
> block-drbd script, but with full virtualization, at least on RHEL 5,
> this does not work; by the time the block script is run, the qemu-dm
> has already been started.

I've developed a workaround for all of this, in the form of a wrapper script for qemu-dm. This is trickier than it might seem at first blush, because of the way that xend uses signals to communicate with qemu-dm. The wrapper script can be used in the "model =" line of a vm definition, and will take care of assuring consistency of the drbd resource(s) for a vm during reboots, migration, etc.

The script can be found here:

    http://www.antibozo.net/xen/qemu-dm.drbd

The strategy is detailed in the script comments; please review those if you want details. The principal objective is prevention of split brain. If you want to use Xen on top of drbd for high availability, this is a decent first cut, as far as I can tell. Feedback is welcome.

> Instead I've simply been considering the possibility of keeping the drbd
> devices in primary/primary state at all times. I'm concerned about a
> race condition, however, and want to ask whether others have examined
> this alternative.

I've moved away from this strategy, and am keeping resources secondary when a vm isn't using them. This lets the remote node tell whether a vm is already running on a drbd resource by inspecting the peer's primary/secondary status (the wrapper script does this), which makes it difficult, though not impossible, to accidentally fire up a vm using a resource that is already in use by a vm on the remote node.

I've also discovered that primary/primary mode is not actually needed, at least for HVM vms using Xen 3.0.3 as shipped on RHEL 5. The conventional wisdom was that primary/primary is necessary during migration, but with the appropriate wrapper around qemu-dm, we can wait for the peer to go secondary before going primary on the local node.

One way you can still get yourself pretty hosed (if you're determined to do so) is the following:

- Start the vm on node A. The wrapper makes the drbd resource primary, and the vm starts running.
- Start the vm on node B. This creates the vm instance, but the wrapper blocks waiting for the drbd resource on node A to go secondary.
- Start a migration from node A to B. This freaks xend out, since node B already has a vm with the same name (even though it isn't actually running yet).

In this scenario, you may end up having to reboot node B because the xen store gets crufty. But you still should never end up with a split-brain condition.

Obviously you could also get hosed if your nodes can't talk to one another and you start the same vm on both nodes. This is classic split brain. In that case, drbd should refuse to resync when drbd connectivity is restored, and you'll have to kill one of the vm instances, invalidate the local drbd resource, and resync, after which things should be fine. I haven't tested this scenario yet, so YMMV.

> I am thinking of a scenario where the vm is running on node A and has a
> process that is writing to disk at full speed, so that the drbd device
> on node B is lagging. If I perform a live migration from node A to B
> under this condition, the local device on node B might not be in sync
> at the time the vm is started on that node. Maybe.

I have done some testing of heavy disk i/o during live migration, and things appear to remain fully consistent. Note that an i/o stack of filesystem on top of LVM volume, on top of xen, on top of drbd, on top of LVM volume is not super fast: I see 10-20 MB/s with new block allocation on a 4-core PowerEdge 1950 using SAS disks (with one CPU allocated to the vm). So don't plan on that particular architecture for your heavily used RDBMS.

> If I use drbd protocol C, theoretically at least, a sync on the device
> on node A shouldn't return until node B is fully in sync. So I guess my
> main question is: during migration, does xend force a device sync on
> node A before the vm is started on node B?

By all appearances (empirically), yes. And since this qemu-dm wrapper also waits for secondary state on the peer, and UpToDate state on the local copy, before actually invoking the real qemu-dm, I believe we are covered.

--
Jefferson Ogata : Internetworker, Antibozo
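For anyone who doesn't want to read the whole script, the core idea is roughly the following. This is only a sketch, not the actual script: resource discovery, signal forwarding from xend, demotion back to secondary after qemu-dm exits, and error handling are all glossed over, and the qemu-dm path may differ on your system.

    #!/bin/sh
    # Sketch only -- the real wrapper linked above handles xend's signals,
    # multiple resources, and demotion after qemu-dm exits.
    RES=r0                                 # drbd resource backing this vm (name made up)
    REAL_QEMU=/usr/lib64/xen/bin/qemu-dm   # adjust for your distribution

    # Wait until the peer is no longer primary and the local disk is UpToDate.
    while :; do
        role=$(drbdadm role "$RES")        # e.g. "Secondary/Secondary"
        dstate=$(drbdadm dstate "$RES")    # e.g. "UpToDate/UpToDate"
        case "$role" in
            */Secondary) ;;                # peer has let go of the resource
            *) sleep 1; continue ;;
        esac
        case "$dstate" in
            UpToDate/*) break ;;           # local copy is good; safe to promote
            *) sleep 1 ;;
        esac
    done

    drbdadm primary "$RES" || exit 1       # go primary here, then hand off
    exec "$REAL_QEMU" "$@"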
Antibozo
2008-Aug-03 22:00 UTC
Re: [Xen-users] drbd 8 primary/primary and xen migration on RHEL 5
On 2008-07-31 23:58, nathan@robotics.net wrote:
> Odd, I am seeing 15-20 sec. How on earth does someone get the 165 ms
> that is talked about? I get the same delay whether I am migrating over
> gig-e or 20-gig InfiniBand. It also should not be my I/O subsystem, as
> I am getting 400 MB/s writes and 550 MB/s reads.

I assume the 165 ms statistic is actual downtime of the vm during live migration, and that delays in the network switching paths are a completely external matter. Are you sure the 15-20 seconds you're seeing are actual downtime? If you run "while true ; do date ; sleep 1 ; done" on the vm while it's migrating, do you see a 15-20 second gap in the output (once it's visible to you)?

I wonder if your IP switches have some sort of arp flap limiting going on. You might try disabling spanning tree if that's enabled.

--
Jefferson Ogata : Internetworker, Antibozo
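For what it's worth, if you want finer resolution than one second, a couple of variants (the address is a placeholder, and nothing here is Xen-specific):

    # on the vm: any gap in the timestamps is actual vm downtime
    while true; do date '+%H:%M:%S.%N'; sleep 0.1; done

    # from a third host: gaps here also include ARP/switch-path convergence
    ping -i 0.2 10.0.0.5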