We're looking at using Xen on SLES11 SP1 servers in production at remote sites. We've been testing OCFS2 over dual-primary DRBD as one of the storage choices. It runs great, and is certainly more cost-effective than putting SANs at our sites.

Our big concern is split brain and how to handle it when it happens. If you have a large shared storage volume over DRBD with VMs running on either host, how do you handle a split-brain situation from a recovery standpoint?

One idea we had is to run multiple OCFS2/DRBD volumes, one for each VM, so we can pick and choose which way to recover after a split brain. That seems like it makes things a lot more complex, and I'm not sure how successful it would even be.

Are others using DRBD in production? What has been your experience?

Any suggestions are appreciated. Our company standard is SLES, so we have to use tools in that distro.

Thanks,
James

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Hi,

> Our big concern is split brain and how to handle it when it happens.
> If you have a large shared storage volume over DRBD with VMs running on
> either host, how do you handle a split-brain situation from a recovery
> standpoint?

Use the SLE HA extension. With Pacemaker and a hardware STONITH device, you'll be able to fence the dead node. To prevent data loss on the DRBD volume, you can choose between three replication modes:
http://www.drbd.org/users-guide-emb/s-replication-protocols.html

> One idea we had is to run multiple OCFS2/DRBD volumes, one for each VM,
> so we can pick and choose which way to recover after a split brain. That
> seems like it makes things a lot more complex, and I'm not sure how
> successful it would even be.

DRBD can be used both as a layer under LVM2 and on top of it. I think I'd go with one LV per DomU plus DRBD on top of that. This way you don't even need cLVM.

Regards,
Ervin
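For reference, a dual-primary DRBD resource using protocol C and the Pacemaker fence-peer handlers might look roughly like the sketch below (DRBD 8.3-era syntax, as shipped with SLES11 SP1; the resource name, hostnames, backing devices, and addresses are all placeholders):

```
# /etc/drbd.d/vmstore.res -- names and addresses are illustrative only
resource vmstore {
  protocol C;                # synchronous: writes are acked by both nodes
  net {
    allow-two-primaries;     # required for dual-primary OCFS2
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;  # two primaries split-brained: manual recovery
  }
  disk {
    fencing resource-and-stonith;
  }
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  on nodeA {
    device    /dev/drbd0;
    disk      /dev/vg0/vmstore;
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on nodeB {
    device    /dev/drbd0;
    disk      /dev/vg0/vmstore;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```

The `after-sb-*pri` policies only auto-resolve the safe cases; with two primaries DRBD disconnects and leaves the decision to the administrator, which is exactly why STONITH matters in this setup.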
> We're looking at using Xen on SLES11 SP1 servers in production at remote
> sites. We've been testing OCFS2 over dual-primary DRBD as one of the
> storage choices. It runs great, and is certainly more cost-effective
> than putting SANs at our sites.
>
> Our big concern is split brain and how to handle it when it happens.
> If you have a large shared storage volume over DRBD with VMs running on
> either host, how do you handle a split-brain situation from a recovery
> standpoint?
>
> One idea we had is to run multiple OCFS2/DRBD volumes, one for each VM,
> so we can pick and choose which way to recover after a split brain. That
> seems like it makes things a lot more complex, and I'm not sure how
> successful it would even be.
>
> Are others using DRBD in production?
> What has been your experience?
>
> Any suggestions are appreciated. Our company standard is SLES, so we
> have to use tools in that distro.

I'm using DRBD. I was using LVM2 on a multiple-primary DRBD (e.g. one big DRBD volume cut into slices with LVM), and when it worked it was fine, but it would split brain occasionally (normally on startup after a crash, not just spontaneously), and the cLVM daemon would hang on occasion for no good reason.

Now I'm using DRBD on LVM on RAID0, and multiple-primary only where necessary. Each DRBD is formed from an LV on each node. It's extra work to create a new DRBD volume (create the LV on both nodes, then set up the DRBD), but it's much less likely to go wrong during normal use - it hasn't gone wrong yet after months of use!

A better setup, though, would be a SAN consisting of iSCSI on DRBD in single-primary mode (using HA to handle failover if the primary fails), with all the hosts using iSCSI. Unfortunately I don't have enough hardware to make that work.

James
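The per-VM DRBD-on-LV procedure described above might look roughly like this on DRBD 8.3 (a sketch only: the volume group `vg0`, resource name `vm_web`, and size are made-up examples, and a matching resource section must already exist in the DRBD configuration on both nodes):

```shell
# On BOTH nodes: carve an identically sized LV for the new DomU
lvcreate -L 20G -n vm_web vg0

# On BOTH nodes: initialise the metadata and bring the resource up
drbdadm create-md vm_web
drbdadm up vm_web

# On ONE node only: declare its data authoritative and start the initial sync
drbdadm -- --overwrite-data-of-peer primary vm_web
```

Once the initial sync completes, the DomU config points at `/dev/drbd/by-res/vm_web` (or the matching `/dev/drbdN` device) instead of the LV directly.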
> I'm using DRBD. I was using LVM2 on a multiple-primary DRBD (e.g. one big
> DRBD volume cut into slices with LVM), and when it worked it was fine, but
> it would split brain occasionally (normally on startup after a crash, not
> just spontaneously), and the cLVM daemon would hang on occasion for no
> good reason.
>
> Now I'm using DRBD on LVM on RAID0, and multiple-primary only where
> necessary. Each DRBD is formed from an LV on each node. It's extra work
> to create a new DRBD volume (create the LV on both nodes, then set up
> the DRBD), but it's much less likely to go wrong during normal use - it
> hasn't gone wrong yet after months of use!

OK, so:
- You create the same-size LV on each Xen host.
- You set up a DRBD using that LV.
- Each VM would use that DRBD as its storage?

And in a split brain you then choose which way to recover the data for each separate LV/DRBD set. Do I have all that straight?

How difficult/complex is it for you to add a VM this way? I guess once you get the procedure down it's probably not that difficult...

Any experience with OCFS2 over DRBD? In our testing it has actually been quite stable, and at times even tough to force a split-brain situation, but you never know when it's going to happen!

> A better setup, though, would be a SAN consisting of iSCSI on DRBD in
> single-primary mode (using HA to handle failover if the primary fails),
> with all the hosts using iSCSI. Unfortunately I don't have enough
> hardware to make that work.

Using a SAN would be our first choice; unfortunately the costs, even for a low-end SAN, make it not possible.

Thanks,
James
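With one DRBD resource per VM, the per-resource recovery being discussed reduces to DRBD's documented manual split-brain procedure; a sketch, again using the hypothetical resource name `vm_web`:

```shell
# On the node whose changes you have decided to DISCARD (the "victim"):
drbdadm secondary vm_web
drbdadm -- --discard-my-data connect vm_web

# On the node whose data you are keeping (the "survivor"), if it has
# already dropped the connection and gone StandAlone:
drbdadm connect vm_web
```

The victim then resynchronises from the survivor. The point of splitting storage per VM is visible here: the choice of victim can differ per resource, so each VM keeps the copy that was actually being written on the node where it last ran.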
> > I'm using DRBD. I was using LVM2 on a multiple-primary DRBD (e.g. one
> > big DRBD volume cut into slices with LVM), and when it worked it was
> > fine, but it would split brain occasionally (normally on startup after
> > a crash, not just spontaneously), and the cLVM daemon would hang on
> > occasion for no good reason.
> >
> > Now I'm using DRBD on LVM on RAID0, and multiple-primary only where
> > necessary. Each DRBD is formed from an LV on each node. It's extra
> > work to create a new DRBD volume (create the LV on both nodes, then
> > set up the DRBD), but it's much less likely to go wrong during normal
> > use - it hasn't gone wrong yet after months of use!
>
> OK, so:
> - You create the same-size LV on each Xen host.
> - You set up a DRBD using that LV.
> - Each VM would use that DRBD as its storage?

Correct. It might be a bit of a pain to resize an LV, but I haven't had to try yet.

> And in a split brain you then choose which way to recover the data for
> each separate LV/DRBD set.

I'm only using a single-primary model now, so the problem hasn't come up.

> How difficult/complex is it for you to add a VM this way? I guess once
> you get the procedure down it's probably not that difficult...

It's trickier than having LVM on top of DRBD, but it's not so bad.

> Any experience with OCFS2 over DRBD? In our testing it has actually been
> quite stable, and at times even tough to force a split-brain situation,
> but you never know when it's going to happen!

No. And in fact you'd be dealing with the possibility of split brain on both DRBD and OCFS2, so it becomes even trickier.

> > A better setup, though, would be a SAN consisting of iSCSI on DRBD in
> > single-primary mode (using HA to handle failover if the primary
> > fails), with all the hosts using iSCSI. Unfortunately I don't have
> > enough hardware to make that work.
>
> Using a SAN would be our first choice; unfortunately the costs, even for
> a low-end SAN, make it not possible.

I've not yet done any performance tests to see whether what I can build out of low-end equipment would be fast enough...

James
On 25/08/2010, at 7:32 AM, James Harper wrote:

> A better setup, though, would be a SAN consisting of iSCSI on DRBD in
> single-primary mode (using HA to handle failover if the primary fails),
> with all the hosts using iSCSI. Unfortunately I don't have enough
> hardware to make that work.

I've had similar thoughts. You wouldn't need a SAN though, would you? DRBD with a single primary and an iSCSI target: use HA to fail over the DRBD primary and the iSCSI target together on failure. My only concern was how much extra load the iSCSI would place on the Dom0s. I haven't got as far as testing this, though.

Jeff
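The failover arrangement Jeff describes could be expressed in the Pacemaker `crm` shell along these lines: promote the DRBD resource on one node, then start a floating IP and the target there, always together (a sketch only; resource names, the DRBD resource `vmstore`, the IP, and the use of the distro's iSCSI target init script are all assumptions):

```
# crm configure sketch: single-primary DRBD + floating IP + iSCSI target
primitive p_drbd ocf:linbit:drbd params drbd_resource="vmstore"
ms ms_drbd p_drbd meta master-max="1" clone-max="2" notify="true"
primitive p_ip ocf:heartbeat:IPaddr2 params ip="10.0.0.100"
primitive p_target lsb:iscsitarget
group g_san p_ip p_target
colocation col_san inf: g_san ms_drbd:Master
order ord_san inf: ms_drbd:promote g_san:start
```

The colocation and order constraints are what make it safe: the target and its IP only ever run on the node where DRBD is primary, so initiators never see two targets or stale data.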