This page outlines the various options available for making a VoIP PBX highly available. Some are generic solutions while others are PBX specific. Some are complete HA solutions while others are half-baked scripts that do some things but not others.
Before you select an HA solution, carefully read the page on creating and selecting a high availability solution (see Asterisk High Availability Design).
PBX Specific Solutions
These solutions provide clusters that are PBX specific. As noted on the Asterisk High Availability Design wiki page, these solutions create clusters at the application level and are deeply PBX aware, environmentally aware, trunk aware, etc. They are usually deployed in mission-critical call center environments (e.g. 911/PSAP) and high-uptime commercial environments. The benefit of these solutions is complete peer autonomy, extensive failure detection, and all features in one integrated package (heartbeat, data synchronization, failure detection, IP sharing, etc.). The downside is that they are PBX specific, so if your PBX software (e.g. Asterisk, 3CX, FreeSWITCH) is not listed below you can’t use them, and they require more OS and PBX skills to install.
- HAAst (High Availability for Asterisk) from Telium adds high availability / clustering to any pair of Asterisk servers. The add-on offers rapid automatic failover of a failed peer, total peer autonomy, IP sharing, advanced peer health detection, intelligent synchronization of files and databases, etc. HAAst also supports manual promote/demote for maintenance, a command line interface, a telnet interface, a web-based interface, and a developer API. Installation is straightforward: no additional hardware is required, and neither is any additional or complex heartbeat/cluster software. HAAst is available in Free and Commercial editions. It is targeted at large commercial installations and is in use at call centers, hospitals, and other high-uptime environments, but the Free edition is available to anyone. HAAst operates at the OS level and is compatible with all Asterisk variations (FreePBX, Elastix, Thirdlane, Digium). See High Availability Asterisk (HAAst) for more information.
Distribution Specific HA Modules/Platforms
These solutions are either distributions that bundle proprietary/custom code together with Asterisk, or modules that extend such a distribution. The benefit is that, because they are part of a distribution, they are simple to install. The downside is that they lack peer autonomy, advanced failure detection, and other features listed on the Asterisk High Availability Design page (e.g. they use DRBD, or detect failure with a simplistic check that the Asterisk process ID is alive). These solutions are best suited to home office / small office scenarios with low demands and expectations of HA, and where the installer has limited skills.
- Thirdlane Multi Tenant PBX platform is a Unified Communications software platform for service providers. It can be deployed anywhere. Uses DRBD and Heartbeat for data replication and failover.
- Thirdlane Business PBX platform is a Unified Communications software platform for businesses and service providers. It can be deployed anywhere and is simple to set up, configure and manage. Uses DRBD and Heartbeat for data replication and failover.
- SARK-HA from Aelintra Telecom offers High Availability Asterisk out of the box. The Sark 200 is a complete PBX-in-a-box solution, using a low-power ARM processor, all in the size of a deck of cards. Real-time failover takes less than 20 seconds to complete and includes support for ISDN PRI circuits. The servers are kept in sync using rsync (so no shared DRBD disk!). Wiki pages HERE. The system also includes multi-tenant support and a fully integrated provisioning system with zero-touch, DHCP-free set-up for multicast-capable phones… see HERE.
- Elastix HA is a module that integrates with Elastix and Asterisk. The HA module uses DRBD to share a disk between peers and uses Heartbeat to check whether the Asterisk process is alive and to fail over if it is not. More information and documentation are available at 3CX Forum Archives.
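To illustrate why a process-ID check is considered simplistic, the sketch below contrasts it with an application-level probe that sends a SIP OPTIONS request and waits for any SIP reply. This is only a rough illustration, not how any product listed on this page implements detection. The function names (`pid_alive`, `build_options_request`, `sip_responds`) are hypothetical, the PID check assumes a POSIX system, and the probe assumes the PBX answers SIP over UDP on port 5060.

```python
import os
import socket
import uuid


def pid_alive(pid: int) -> bool:
    """Process-level check (POSIX only): True if a process with this PID exists.

    A hung or deadlocked PBX still passes this check.
    """
    try:
        os.kill(pid, 0)  # signal 0 only performs error checking, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def build_options_request(host: str, port: int, local_ip: str, local_port: int) -> bytes:
    """Build a minimal SIP OPTIONS request (the user 'probe' is a made-up name)."""
    lines = [
        f"OPTIONS sip:{host}:{port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_ip}:{local_port};branch=z9hG4bK{uuid.uuid4().hex}",
        "Max-Forwards: 70",
        f"From: <sip:probe@{local_ip}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{host}:{port}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_ip}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def sip_responds(host: str, port: int = 5060, timeout: float = 2.0) -> bool:
    """Application-level check: True if the peer sends back any SIP response."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
            local_ip, local_port = sock.getsockname()
            sock.send(build_options_request(host, port, local_ip, local_port))
            reply = sock.recv(4096)
        except OSError:  # covers timeouts and ICMP "port unreachable"
            return False
    # Any status line (even 401 or 404) shows the SIP stack is answering.
    return reply.startswith(b"SIP/2.0 ")
```

A monitor built only on `pid_alive` reports a frozen PBX as healthy, while `sip_responds` at least confirms that the SIP stack is answering requests. Neither check says anything about trunks, databases, or the network path to the phones.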
Add-Ons, Hardware, and Scripts (Not Clustering)
This is a collection of scripts and tools that can help with mirroring data, routing traffic, etc. They are useful for a do-it-yourself approach to high availability, but they don’t contribute to the creation of a cluster and are not suitable for production environments.
- LoDi is a load balancer, call distribution manager, and protocol converter that can balance calls over numerous devices / PBXs. For environments that don’t want to or can’t cluster their PBXs, LoDi can ensure calls always get sent to an available PBX, and that down or damaged PBXs are removed from rotation. See Load Distributor (LoDi).
- DRBD + Heartbeat. If you are considering paying for the FreePBX HA ‘module’, consider building the same thing yourself for free (if you are a Linux admin). Using the free DRBD package for the shared disk and the free Heartbeat package to detect failure, you end up with the same thing that FreePBX sells. This type of high availability (including FreePBX’s ‘module’) is very simplistic in terms of detection and failover, and using a shared DRBD disk is undesirable, so you may not want to spend thousands of dollars on something you can assemble yourself for free. Elastix is very honest in saying that this is exactly what you get when you install their HA module (thumbs up for them!). For a home user or small office, this may be sufficient. As noted on the Asterisk High Availability Design page, when devices share a disk or use simple detection this is not really a ‘cluster’, but it is more ‘high availability’ than nothing.
- DNS SRV on the CPE side, although not all phones handle this. It allows phones to register with a different PBX if one fails.
- Failover switches to automatically switch connections (T1, Ethernet, etc.) to a backup system.
- Network monitoring to detect failures.
- Remote console and power control to remotely reboot and diagnose problems.
- Q-Suite offers high availability and call survival based on U.S. Patent and Trademark Office issued patent US20110310773 A1 (Method and system for fail-safe call survival). This patent covers the technology to recover calls and continue ongoing calls and conversations in the event of a single point of failure within an IP-based phone and contact center system. A component of the call survival mechanism, the High Availability SIP proxy, also provides the load balancing necessary for scaling to multiple Asterisk servers in a cluster. The call center ACD within Q-Suite can manage multiple servers in the Asterisk cluster to handle very large call volumes while still maintaining the sequence and order of calls coming to each individual queue, irrespective of the Asterisk (media) server they land on in the cluster.
- Ultra Monkey: The current solution I have uses UltraMonkey ( http://www.ultramonkey.org ) for load-balancing and failover and it works like a champ. There are obviously a lot of details there, and I’d be happy to detail them if people are interested. There is also a site that has two clusters with uniform reachability for all phones and PRIs. None of this requires a lot of dial plan tuning on a day-to-day basis.
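The DNS SRV item in the list above depends on the phone trying SRV targets in a defined order: lowest priority value first, with weights used to spread load among targets of equal priority. The sketch below is a simplified take on the ordering described in RFC 2782, meant only to show how a backup PBX ends up after the primary. Real phones do this in firmware, and the hostnames, the `SrvRecord` type, and the `order_srv_records` function are hypothetical names chosen for illustration.

```python
import random
from typing import List, NamedTuple, Optional


class SrvRecord(NamedTuple):
    # Mirrors the data of a record such as:
    #   _sip._udp.example.com. 3600 IN SRV 10 0 5060 pbx1.example.com.
    priority: int
    weight: int
    port: int
    target: str


def order_srv_records(records: List[SrvRecord],
                      rng: Optional[random.Random] = None) -> List[SrvRecord]:
    """Return the records in the order a client should try them."""
    rng = rng or random.Random()
    ordered: List[SrvRecord] = []
    # Lower priority values are always tried before higher ones.
    for priority in sorted({r.priority for r in records}):
        # Within one priority, zero-weight records are placed first in the
        # running-sum table, as RFC 2782 suggests.
        group = sorted((r for r in records if r.priority == priority),
                       key=lambda r: r.weight != 0)
        while group:
            total = sum(r.weight for r in group)
            pick_at = rng.randint(0, total)
            running = 0
            for index, record in enumerate(group):
                running += record.weight
                if running >= pick_at:
                    # Larger weights cover more of the range, so they are
                    # picked proportionally more often.
                    ordered.append(group.pop(index))
                    break
    return ordered


if __name__ == "__main__":
    records = [
        SrvRecord(20, 0, 5060, "pbx-backup.example.com"),
        SrvRecord(10, 0, 5060, "pbx-primary.example.com"),
    ]
    for record in order_srv_records(records):
        print(record.priority, record.target)
```

With one record per priority, as in the usage example, the order is deterministic: the phone registers with the primary and falls back to the backup only when the primary does not answer. How quickly that fallback happens depends on the phone's own timers, which is one reason behavior varies between handsets.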
See Also
- Asterisk High Availability Design: High Availability Design
- Mailing list for Asterisk High-Availability
- Asterisk administration
- Asterisk at large: Pairing Asterisk with the SIP proxy SER
- Asterisk failover case1
- TrixBox High Availability cluster using drbd
- HA, call recovery and call survival – system and method of fail-safe call recovery and survival for asterisk based contact centers