Hi everybody,

what are the current options for making an Asterisk system highly available?

Running two servers as active/passive with DRBD and Pacemaker/Corosync works very well: there are no voice quality issues, even on heavily loaded servers, and no problems with a lot of small packets. But this requires two machines for every Asterisk system, which is not "economic" in any way.

Using (para-)virtualization with Xen could be another option. On systems with low load this works reliably, but what happens on systems with high load? Are there any known issues with real-time behaviour, packet loss etc. caused by running in a VM?

The idea would be an HA cluster of two Xen servers, each running one Asterisk instance in a single VM; on a failure, the VM would be restarted on the other node. This might result in a much higher load on that node, because it then runs two VMs, but for a short period, until the other node comes back, that might be tolerable.

Are there other options for running two Asterisk instances in parallel on one system, each bound to its own IP, maybe something with chroot or similar?

Thanks a lot,
--
kind regards,

Thorolf
Some food for thought:

If you use DRBD, you will mirror corruption from one system to the other. You also cannot selectively pick files in a folder to mirror (you will mirror a lot!). As well, DRBD struggles as the peers are set further apart (latency) or as the number of changes increases.

A lot of HA tools don't look deeper into Asterisk to see if/how it has failed (they only detect catastrophic failures). What happens when the Asterisk process is alive but no longer bridging calls? If Asterisk or host processes mess up and consume huge amounts of system resources, most HA tools cannot respond.

As a biased recommendation, take a look at HAAst at www.generationd.com. It takes care of moving a shared IP between hosts, as well as other features.

Michelle
(I work for Generationd :)

________________________________________
From: asterisk-users-bounces at lists.digium.com on behalf of Thorolf Godawa <nospam at godawa.de>
Sent: Thursday, March 6, 2014 10:21 AM
To: Asterisk Users List
Subject: [asterisk-users] High Availability with Asterisk
On 6/3/14 3:21 pm, Thorolf Godawa wrote:
> The idea would be having a HA-cluster of two servers with Xen, each of
> them runs one instance of an Asterisk-system in a single VM and on a
> failure the VM will be restarted on the other node.
> This might result in a much higher load on this node, because it runs
> two VMs, but for a short period, until the other node comes back again,
> it might be tolerable.

This is basically what we do, though in our case we use KVM rather than Xen; we found KVM behaved a great deal better at managing timing than Xen, but YMMV, and Xen may well have come along a great deal since we last looked at it.

In fact, it could be argued that even without any need for HA, there's still an advantage to running a server in a VM: hardware portability. If the machine dies, you can quickly redeploy the VM to a new host without having to recompile things and so on because the hardware has changed.

> Are there other options running two Asterisk-instances parallel on one
> system, each bound to its own IP, maybe s.th. with chroot or similar?

You might be able to do something interesting with containers (LXC), but given the ease of setting up KVM and the (relatively) small performance overhead, we've tended to just stick with that.

On 6/3/14 3:46 pm, Michelle Dupuis wrote:
> A lot of HA tools don't look deeper into Asterisk to see if/how it has
> failed (they only detected catastrophic failures). What happens when
> the Asterisk process is alive but no longer bridging calls?

In fairness, the tools the OP mentioned (pacemaker/corosync) can be set up to detect failures other than whether the asterisk process is alive. A simple one to set up is to try connecting on 5060 UDP and make sure you get an acknowledgement. Likewise, you could even set up a call to a dummy extension using the manager interface and make sure it completes successfully.

FWIW, we tend to use pacemaker with heartbeat rather than corosync, but both perform a pretty similar function.
Kind regards,

Chris

--
This email is made from 100% recycled electrons
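The "connect on 5060 UDP and make sure you get an acknowledgement" check from the reply above could look roughly like the following. This is only a sketch under stated assumptions: the function names (build_options, parse_status, sip_alive) and the "probe" user are made up for illustration, any SIP status line in the reply is treated as "the SIP stack is answering", and there is no retransmission or authentication handling.

```python
import socket
import uuid


def build_options(target_host, target_port, local_host, local_port):
    """Build a minimal SIP OPTIONS request (RFC 3261 style) as bytes."""
    branch = "z9hG4bK" + uuid.uuid4().hex  # RFC 3261 magic cookie prefix
    lines = [
        f"OPTIONS sip:{target_host}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        # "probe" is a hypothetical user name, not something Asterisk requires
        f"From: <sip:probe@{local_host}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{target_host}:{target_port}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status(data):
    """Return the numeric status code of a SIP response, or None."""
    first = data.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = first.split(" ", 2)
    if len(parts) >= 2 and parts[0] == "SIP/2.0" and parts[1].isdigit():
        return int(parts[1])
    return None


def sip_alive(host, port=5060, timeout=2.0):
    """Send one OPTIONS request and report whether any SIP response arrived."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))  # UDP connect only fixes the peer address
            local_host, local_port = sock.getsockname()
            sock.send(build_options(host, port, local_host, local_port))
            data = sock.recv(4096)
        except OSError:  # covers timeouts and ICMP "port unreachable"
            return False
    return parse_status(data) is not None
```

A cluster monitor script could call sip_alive() periodically and treat a False result as a failed check. Note that this only shows the SIP stack is responding; it does not prove that calls are still being bridged, which is what the dummy-extension test call would cover.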
Hi Thorolf,

On 06.03.2014 16:21, Thorolf Godawa wrote:
> Using (para-)virtualization with Xen could be another option, on
> systems with low load this works reliable, but what happens on systems
> with high load? Are there any issues known about problems with the
> realtime, packet loss etc. because it runs in a VM?

Hmm, all my Asterisk instances run in (KVM) VMs, no issues there. But how is this related to high availability? I think it's not. :)

I think the way to go for high availability (and scalability) is Kamailio! Run it in a redundant setup on 2 separate physical machines (maybe in a VM, doesn't matter), and make the pair failsafe using whatever tool(s) are available. Then you can set up 1, 2, 10 or 100 Asterisk servers "behind" Kamailio, and any of them (but one :-) ) could fail and you will still be online.

If you want to take the high availability thought further, you could use CephFS, which will give you self-healing, 100% available storage across multiple physical storage servers. There you could store your Asterisk config files, or the MySQL database used by all the Asterisk servers for CDRs, SIP registrations etc. It's kinda slow, but I think fast enough for Asterisk / MySQL. :)

And, to scale and to make the Asterisk nodes redundant (redundancy is not really needed anymore, since Kamailio takes care of that, but this way you also get VM/physical redundancy), you could look into OpenNebula, which provides a nice auto-scaling feature out of the box. If there's load on your Asterisk VMs, OpenNebula will detect this and spawn new Asterisk VMs (probably on different physical servers, otherwise it doesn't make much sense performance-wise), which will automagically receive requests/calls from Kamailio. If the load goes down, the VMs can be automagically stopped again to free resources for other VMs/applications.
OpenNebula is less popular than OpenStack, which seems to be the first choice for cloud stuff today, but what I liked about OpenNebula is that, unlike OpenStack, it provides the auto-scaling feature in the customer-facing web frontend out of the box. So you could offer your customers a self-managed, redundant Asterisk cloud or something like that. :)

In theory, this combination should give you a 100% redundant, auto-healing, auto-scaling VoIP setup. :)

Regards
Markus
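The core idea of putting several Asterisk servers "behind" a front end, so that any of them can fail without taking the service down, can be illustrated with a small round-robin selector that skips backends marked as down. This is a conceptual sketch only, not how Kamailio is configured or implemented; the Dispatcher class and the backend names are hypothetical.

```python
class Dispatcher:
    """Round-robin over a fixed list of backends, skipping unhealthy ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)  # all backends start as healthy
        self._next = 0                     # index of the next candidate

    def mark_down(self, backend):
        """Stop routing new calls to this backend."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Resume routing to a known backend after it recovers."""
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, or raise if none is left."""
        n = len(self.backends)
        for offset in range(n):
            idx = (self._next + offset) % n
            candidate = self.backends[idx]
            if candidate in self.healthy:
                self._next = (idx + 1) % n
                return candidate
        raise RuntimeError("no healthy Asterisk backend available")


if __name__ == "__main__":
    d = Dispatcher(["ast1", "ast2", "ast3"])  # hypothetical backend names
    d.mark_down("ast2")
    print([d.pick() for _ in range(4)])
```

With one of three backends marked down, new calls simply alternate between the remaining two; calls already in progress on the failed box are a separate problem that this sketch does not address.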
My approach (in theory only, so please correct me if I'm wrong) would be to run asterisk on multiple boxes (one instance each). A dedicated monitoring box (nagios? custom scripts?) would perform frequent checks against the boxes (in one of my previous projects, one asterisk box used call files to demonstrate its health to another one).

If a box fails, I would simply redirect/reroute its traffic to another one using network solutions, such as shutting down the production interface of a box suspected to have failed and having an idle one pick up its IP address, or using load balancing / routing / NAT to redirect the clients' traffic to a standby box.

My approach is based on the experience that linux-based HA tools are often not free, or don't scale well, or recover from an error rather slowly (e.g. booting a second VM takes too much time). In the network world, however, there are well-known protocols that were designed to take over in a matter of milliseconds.

I do understand that this would not preserve 'session' data, so failing over to a different box would mean clients need to re-register, calls could drop etc. This might be unacceptable for you. As I said in the beginning, I haven't built such systems, but in my experience a dropped call is not that big of a deal if it happens because the network cuts over to a different box.

This could be handled with a pair of frontend load balancers, behind which the number of asterisk boxes is transparent to clients.

hope this helps
adam
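The monitoring-box logic described above can be sketched as a small loop that counts consecutive failed checks per box and triggers a takeover action once a threshold is reached. Everything here is an assumption for illustration: the FailoverMonitor name, the threshold of 3, and the two callbacks. The actual health check and the actual rerouting (moving an IP, changing a NAT rule, and so on) are left to the caller.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class FailoverMonitor:
    check: Callable[[str], bool]    # returns True if the box looks healthy
    on_fail: Callable[[str], None]  # reroutes traffic away from the box
    threshold: int = 3              # consecutive failures before failover
    failures: Dict[str, int] = field(default_factory=dict)
    failed: List[str] = field(default_factory=list)

    def poll(self, boxes: Iterable[str]) -> None:
        """Run one round of checks; fire on_fail once per failed box."""
        for box in boxes:
            if box in self.failed:
                continue  # already failed over, nothing more to do
            if self.check(box):
                self.failures[box] = 0  # a good check resets the counter
                continue
            self.failures[box] = self.failures.get(box, 0) + 1
            if self.failures[box] >= self.threshold:
                self.failed.append(box)
                self.on_fail(box)
```

Requiring several consecutive failures trades a little detection time for fewer false failovers caused by a single lost probe. Calling poll() from a timer every second or so would give a detection time of roughly threshold times the polling interval.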