I currently have a few CentOS 5.2 based Xen clusters at different sites. These are built around a group of 3 or more Xen nodes (blades) and some sort of shared storage (FC or iSCSI) carved up by LVM and allocated to the domUs.

I am "managing" the shared storage (from the dom0 perspective) using cman+clvmd, so that changes to the LVs (rename/resize/create/delete/etc) are automatically and immediately visible to all the Xen hosts.

However, this combination seems to be horribly unreliable. Any network hiccup more than a lost ping or two results in cman losing contact with the rest of the machines, which it frequently does not regain. For example, failing one of the bonded NICs usually takes a few seconds for everything to 'stabilise' again on the network, but in that time cman has lost contact with all the other nodes and often killed itself (or bits of itself) in the process. Further, I have found I often have to reboot the machine completely to convince everything to start talking nicely again (why this is so, I have no idea, but on more than one occasion I've spent an hour stopping/starting cman and manually killing processes trying to get it working again, with no luck, then had a reboot fix everything instantly).

I am not interested in any fancy failover of domUs, or anything else cluster-related on the Xen host side - we address HA and redundancy requirements within the VMs themselves. The only reason I have this stuff set up at all is to be able to use clvmd. Is there any alternative out there that will let me keep using clvmd (or achieve similar functionality), without having to worry about cman constantly falling over?

Cheers,
CS

--
Christopher Smith
UNIX Team Leader, Nighthawk Radiology Services
csmith@nighthawkrad.net
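For context, a minimal sketch of the clustered-LVM workflow described above, assuming clvmd is already running on each dom0, locking_type = 3 is set in /etc/lvm/lvm.conf, and the shared LUN appears as /dev/sdb (all names are placeholders):

# On one dom0: put the shared LUN under LVM as a *clustered* VG.
pvcreate /dev/sdb
vgcreate -c y xen_vg /dev/sdb          # -c y marks the VG as clustered

# Carve out a disk for a guest; clvmd propagates the metadata change,
# so the new LV shows up on every node without a rescan.
lvcreate -L 20G -n vm01_disk0 xen_vg

# Later changes work the same way:
lvextend -L +10G /dev/xen_vg/vm01_disk0
lvrename xen_vg vm01_disk0 vm01_root

# Verify from any other dom0:
lvs xen_vg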
Thiago Camargo Martins Cordeiro
2009-Mar-12 13:54 UTC
Re: [Xen-users] Alternatives to cman+clvmd ?
What about creating a ZFS pool with all your disks and then sharing it with your Xen hypervisors through iSCSI, NFS or even CIFS? I'm trying it right now...

2009/3/12 Christopher Smith <csmith@nighthawkrad.net>:
> I am "managing" the shared storage (from the dom0 perspective) using
> cman+clvmd, so that changes to the LVs (rename/resize/create/delete/etc)
> are automatically and immediately visible to all the Xen hosts. [...]
> Is there any alternative out there that will let me keep using clvmd (or
> achieve similar functionality), without having to worry about cman
> constantly falling over?
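A rough sketch of that suggestion, assuming an OpenSolaris/Solaris 10 storage box of that era and made-up pool, volume and share names (the domUs would then live on the exported zvols, or on file-backed images over NFS):

# On the storage server: build one pool from all the disks.
zpool create tank mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0

# Option 1: one zvol per guest, exported as an iSCSI target.
zfs create -V 20G tank/vm01_disk0
zfs set shareiscsi=on tank/vm01_disk0

# Option 2: one filesystem exported over NFS for file-backed disks.
zfs create tank/xen_images
zfs set sharenfs=rw tank/xen_images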
Thiago Camargo Martins Cordeiro wrote:
> What about creating a ZFS pool with all your disks and then sharing it
> with your Xen hypervisors through iSCSI, NFS or even CIFS? I'm trying
> it right now...

1. I'm not using Solaris, so ZFS itself is a non-starter.

2. Using iSCSI off a ZFS pool will result in the same problem.

3. I'm not aware of any reliable/supported way to get HA out of iSCSI, NFS or CIFS storage exported from "ZFS servers" (without buying something like a 7000 series).

CS
On Thu, Mar 12, 2009 at 8:54 AM, Thiago Camargo Martins Cordeiro <thiagocmartinsc@gmail.com> wrote:
> What about creating a ZFS pool with all your disks and then sharing it
> with your Xen hypervisors through iSCSI, NFS or even CIFS? I'm trying
> it right now...

AFAICT, that means doing all the volume management/resizing/snapshotting/etc on the storage unit and exporting the 'managed' block devices. It's a nice way to do it, but there are also a couple of cons relative to a shared-VG SAN like the one with CLVM:

- you need to have *one* storage controller, which could become a bottleneck since all data flows through its (ethernet) ports
- resizing a shared block device isn't automatically noticed by all layers

Granted, the second one is shared by several other schemes. One of the main culprits is Xen itself, because there's no way to tell it to recheck a block device's size and pass that on to the guest; usually the DomU has to be restarted.

--
Javier
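To illustrate the resize point, a hedged sketch of the dom0-side rescan with open-iscsi (session and device names are placeholders); even after this, a running DomU keeps the size it saw at attach time until the virtual block device is detached/reattached or the guest is restarted:

# Grow the LUN on the storage controller first, then on each dom0:
iscsiadm -m session -R             # rescan all open-iscsi sessions
blockdev --getsize64 /dev/sdc      # dom0 now reports the new size

# The running guest still sees the old capacity; check its attached
# devices and plan a detach/reattach or a reboot of the DomU.
xm block-list vm01                 # hypothetical guest name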
Christopher Smith <csmith@nighthawkrad.net> writes:

> I am "managing" the shared storage [...] using cman+clvmd [...]
> However, this combination seems to be horribly unreliable. Any
> network hiccup more than a lost ping or two results in cman losing
> contact with the rest of the machines, which it frequently does not
> regain.

This isn't inherent to cman; I've been using it for years without much problem. You have to handle fencing correctly, otherwise it will bite you no matter what timeouts you configure. But otherwise, it's OK.

> For example, failing one of the bonded NICs usually takes a few
> seconds for everything to 'stabilise' again on the network, but in
> that time cman has lost contact with all the other nodes and often
> killed itself (or bits of itself) in the process.

This shouldn't happen. Either you misconfigured your bonding (though a few seconds of failover time doesn't sound gross), or more likely you misconfigured cman: its defaults aren't even this strict.

Having said all this, managing the cluster infrastructure for so little (clvm only) feels excessive indeed. But I don't know any better way (other than doing volume management on the storage side).

--
Cheers,
Feri.
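As a concrete (and assumed, not tested) illustration of loosening those defaults on CentOS 5: the openais token timeout and proper fence devices are both set in /etc/cluster/cluster.conf. The fragment below is purely illustrative, with made-up node names, addresses and values:

# Merge something like this into /etc/cluster/cluster.conf on every node,
# bump config_version, then propagate it (e.g. with ccs_tool update).
cat <<'EOF' > /tmp/cluster.conf.example
<cluster name="xencluster" config_version="3">
  <!-- Raise the totem token timeout (ms) so a ~30s switch/bonding blip
       does not immediately declare nodes dead. Value is illustrative. -->
  <totem token="60000"/>
  <clusternodes>
    <clusternode name="xen01" nodeid="1">
      <fence><method name="1"><device name="ipmi-xen01"/></method></fence>
    </clusternode>
    <!-- ...remaining nodes... -->
  </clusternodes>
  <fencedevices>
    <fencedevice name="ipmi-xen01" agent="fence_ipmilan"
                 ipaddr="10.0.0.11" login="admin" passwd="secret"/>
  </fencedevices>
</cluster>
EOF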
Ferenc Wagner wrote:
> Christopher Smith <csmith@nighthawkrad.net> writes:
>
>> I am "managing" the shared storage [...] using cman+clvmd [...]
>> However, this combination seems to be horribly unreliable. Any
>> network hiccup more than a lost ping or two results in cman losing
>> contact with the rest of the machines, which it frequently does not
>> regain.
>
> This isn't inherent to cman; I've been using it for years without much
> problem. You have to handle fencing correctly, otherwise it will bite
> you no matter what timeouts you configure. But otherwise, it's OK.

Well, I use manual fencing... I realise this is less than ideal, but to me it's far preferable to a Xen host getting fenced and taking out twenty VMs. Especially given how frequently it seems to get to the point where it would have fenced something.

>> For example, failing one of the bonded NICs usually takes a few
>> seconds for everything to 'stabilise' again on the network, but in
>> that time cman has lost contact with all the other nodes and often
>> killed itself (or bits of itself) in the process.
>
> This shouldn't happen. Either you misconfigured your bonding (though
> a few seconds of failover time doesn't sound gross), or more likely you
> misconfigured cman: its defaults aren't even this strict.

This was actually a problem on the switches, something to do with STP - the bonding failovers were actually taking a bit over 30 seconds, so that's probably why cman was falling over.

> Having said all this, managing the cluster infrastructure for so
> little (clvm only) feels excessive indeed. But I don't know any
> better way (other than doing volume management on the storage side).

And volume management on the storage side brings along its own set of complications. :(

--
Christopher Smith
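For reference, a sketch of checking the Linux side of that failover (interface names and module options are the usual CentOS 5 conventions, but treat them as assumptions); the ~30s delay itself is normally addressed on the switch, e.g. by enabling an edge-port/portfast-style setting so STP doesn't block the port while it reconverges:

# Inspect the current bond state and link-monitoring interval on a dom0.
cat /proc/net/bonding/bond0

# Typical active-backup bonding setup in /etc/modprobe.conf on CentOS 5;
# miimon=100 polls link state every 100 ms.
#   alias bond0 bonding
#   options bond0 mode=active-backup miimon=100

# Watch whether cman actually loses members/quorum during a failover test.
cman_tool status
cman_tool nodes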