Asterisk High Availability Solutions
Ways to increase system availability and load balancing:
- DNS SRV records on the CPE side, although not all phones support them.
- SARK-HA from Aelintra Telecom offers High Availability Asterisk out of the box. It runs Aelintra's SARK UCS MVP Asterisk implementation on a pair of servers. Real-time failover takes less than 20 seconds to complete. Setup requires only 4 additional data fields to be filled out in the SARK globals panel. An illustrated set-up guide is available.
- Ranch Networks offers a hardware-based High Availability solution for Asterisk (see White_Paper_one_one_HA.pdf). It supports only two Asterisk boxes.
- Flip1405 manages a virtual IP between two Asterisk servers and queries UDP port 5060 for state changes:
  - Downtime less than 30 seconds
  - Only 2 dependencies (nmap and arping)
  - Incredibly easy to set up
- SERVERware: a fault-tolerant, high-availability solution with unlimited scalability (commercial).
- Failover switches to automatically switch connections (T1, Ethernet, etc.) to a backup system.
- CSS: You can make load-balancing with failover with multiple asterisk
- Altéon: a better tool that can load-balance RTP, but there are problems if you use qualify=yes with NATed phones
- Big-IP: You can make load-balancing with failover with multiple asterisk (coming soon the real SIP proxy functionalities)
- Ask me if you have questions about layer 7 switches
- Vovida has a SIP load balancer. This allows several Asterisk servers to be set up and appear to users as a single server. Other load balancing approaches involve the SER SIP proxy, UltraMonkey (see below) or simple DNS round-robin. There is also app_distributor, a third-party application, or app_random.
- there are a lot of bugs and the last version was released in 2002
- Voxcom has a Turn-key IP PBX (Infinity) and Soft Switch (CallDirector). Australian integrator of high availability enterprise IP PBX and carrier soft switch solutions. Full support for local number porting (LNP), Emergency 000, Integrated Public Number Database (IPND).
- Use the Linux-HA software to provide high-availability (HA) failover on programmed conditions - by default node hang or crash. Linux-HA also has many telephony-oriented HA APIs as defined by the Service Availability Forum (SAF). It also provides sub-second failover, and works well with shared disk or without. It is commonly used with the DRBD package to provide HA with no single point of failure, and no special hardware requirements.
- Stratus, which has been making high-end continuous processing systems for 20 years, has just added a Linux-based continuous processing solution for under $10,000: Stratus ftServer T Series Systems
- Network Monitoring to detect failures
- Remote Console and Power Control to remotely reboot and diagnose problems
- QueueMetrics is able to monitor clustered call centers with the load distributed over a number of Asterisk servers as if they were one big single box.
- OrderlyStats: a dedicated real-time call centre management and statistics package that can monitor single or clustered Asterisk servers from a single page.
Asterisk High Availability HOWTO with Heartbeat and Redfone fonebridge
Overview
The following is a brief HOWTO for installing High-Availability Asterisk using Open Source tools combined with fail-over capable & intelligent hardware (the fonebridge).
The heartbeat utility is used in a 'Passive-Active' scenario but could easily be modified to do 'Active-Active'.
Background
Some of our more demanding customers in the call center and banking industries are loath to accept an implementation with no mechanism for fail-over and high availability, so this is the hardware/software combination we use to meet their demands.
Client Background
The following scenario was used for a medium sized call center operation with about 60 analog stations, and a single T1 PRI.
Hardware
- 2 x 1U Supermicro Servers (P4, 512 MB RAM, Dual Gig Eth, Dual SATA with RAID 0)
- 1 x Redfone Quad T1 fonebridge to terminate PRI connectivity, power channel banks and provide fail-over capability between the two Supermicros.
- 1 x T1 PRI
- 3 x Adtran 750 FXS channel banks to drive analog phones
- 2 x UPS/Surge Protectors
Software
- Fedora Core 4
- Asterisk, zaptel, libpri from CVS head
- Linux HA software suite from Ultramonkey. They have RPMs for RHEL 3 that install fine on Fedora Core 4
- Each server is a mirror image of the other in terms of Asterisk configs and software.
Software Install
After a standard install of FC4, Asterisk, zaptel, libpri we installed all of the packages from Ultramonkey pretty much following their guidelines: http://www.ultramonkey.org/3/installation-rh.el.3.html
You may hit a few dependency issues, mainly Perl libs, but we were able to satisfy all of them by using Yum. If you are running Apt you should be able to accomplish the same thing.
Configuring Heartbeat
After installing heartbeat there are only three files that need to be modified for your environment. They are ha.cf, haresources and authkeys. They should all be placed in the /etc/ha.d/ directory. The files should be absolutely identical on all machines that are part of your Asterisk high-availability cluster. We only have two servers running but you could easily scale to more using the exact same configurations. These are our config files. All comment lines have been removed but as you can see they are short and simple.
ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 200ms
deadtime 2
warntime 1
initdead 120
udpport 694
bcast eth0
node asterisk1
node asterisk2
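The timing directives above only make sense in a particular order: keepalive must be shorter than warntime, which must be shorter than deadtime. As a minimal sketch, the following shell functions check that ordering in a ha.cf-style file. The function names are made up for this example, only whole-number values are handled, and the "initdead at least twice deadtime" test is a rule of thumb that we are assuming here, not something heartbeat enforces.

```shell
#!/bin/sh
# Sketch: sanity-check the timing directives in a ha.cf-style file.
# Assumes whole-number values; a bare number means seconds, "ms" means milliseconds.

# Convert a heartbeat time value to milliseconds.
to_ms() {
    case "$1" in
        *ms) echo "${1%ms}" ;;
        *)   echo $(( $1 * 1000 )) ;;
    esac
}

# Print the value of one directive ($2) from the file ($1); first match wins.
hacf_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Succeed only if keepalive < warntime < deadtime and initdead >= 2 * deadtime.
check_hacf_timing() {
    ka=$(hacf_value "$1" keepalive)
    wt=$(hacf_value "$1" warntime)
    dt=$(hacf_value "$1" deadtime)
    init=$(hacf_value "$1" initdead)
    if [ -z "$ka" ] || [ -z "$wt" ] || [ -z "$dt" ] || [ -z "$init" ]; then
        echo "missing timing directive in $1" >&2
        return 2
    fi
    ka=$(to_ms "$ka"); wt=$(to_ms "$wt"); dt=$(to_ms "$dt"); init=$(to_ms "$init")
    [ "$ka" -lt "$wt" ] && [ "$wt" -lt "$dt" ] && [ "$init" -ge $(( dt * 2 )) ]
}
```

For example, `check_hacf_timing /etc/ha.d/ha.cf && echo "timing looks sane"` could be run on each node before starting heartbeat.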
haresources
asterisk1 10.10.10.110 fonulator asterisk
authkeys
auth 1
1 sha1 SuPerS&cretP@$$werd
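Heartbeat expects authkeys to be readable by root only (mode 600), and a random key is safer than a hand-typed password. As a sketch, the hypothetical helper below writes a file in the same two-line format with a random 40-character hex key. The function name and the default output path are assumptions for illustration; copy the resulting file to /etc/ha.d/authkeys on every node yourself.

```shell
#!/bin/sh
# Sketch: write a heartbeat authkeys file with a random sha1 key.
# $1 is the output path; it defaults to ./authkeys so that nothing
# under /etc is overwritten by accident.
make_authkeys() {
    out=${1:-./authkeys}
    # 20 random bytes rendered as 40 hex characters.
    key=$(od -An -N20 -tx1 /dev/urandom | tr -d ' \n')
    # Create the file with a restrictive umask, in a subshell so the
    # caller's umask is left alone.
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$out" )
    chmod 600 "$out"
}
```

Run `make_authkeys` once, then distribute the same file to both servers, since the key must match on all nodes.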
Operation
Each Asterisk server has a unique IP address on the LAN segment. This could be a NATed network or an Internet-facing one with public IP addresses. Heartbeat monitors the hardware state of each machine over Ethernet, a serial port, or a combination of both (recommended), and assigns the virtual IP to the Asterisk server that is currently active. Example:
Asterisk1= 10.10.10.100
Asterisk2= 10.10.10.120
Virtual IP= 10.10.10.110 (see haresources)
With Heartbeat it is important that your node names are identical to the host names reported by uname -n. You may also need to manually add IP/host entries to your /etc/hosts file so that each machine can resolve the other's name to its IP address.
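A mismatch between the node lines in ha.cf and the real host name is an easy mistake to make, so it is worth checking on each machine. The function below is a small sketch of such a check; its name is invented for this example, and it only looks at "node" lines in the file you give it.

```shell
#!/bin/sh
# Sketch: succeed if a host name is listed as a "node" in a ha.cf-style file.
# $1: path to ha.cf
# $2: host name to look for (defaults to the output of `uname -n`)
node_listed() {
    name=${2:-$(uname -n)}
    awk -v want="$name" '
        # A node line may name one or more hosts after the keyword.
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END { exit found ? 0 : 1 }
    ' "$1"
}
```

For example: `node_listed /etc/ha.d/ha.cf || echo "$(uname -n) is not a node in ha.cf"`.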
Following the rules in haresources, Heartbeat makes asterisk1 the primary server when both systems start up. It then starts the following scripts: fonulator (the small script that configures the fonebridge) and asterisk, which starts the Asterisk server. Both are standard startup scripts placed in /etc/init.d/.
If the primary server suffers a hardware fault or simply stops responding to the heartbeats exchanged between the two nodes, asterisk2 executes /etc/init.d/fonulator start to reconfigure the fonebridge on the fly and redirect traffic to asterisk2, followed by /etc/init.d/asterisk start to start the Asterisk server.
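The order of the entries in haresources matters: resources are brought up from left to right on takeover and shut down from right to left on release. The sketch below only prints the commands implied by one haresources line so the order can be seen at a glance. It is a simplified illustration, not heartbeat's own logic: the function name is made up, a bare IP is shown as an IPaddr call, every other resource is assumed to live in /etc/init.d/, and the longer IPaddr::address/mask/interface syntax is not handled.

```shell
#!/bin/sh
# Sketch: print the commands implied by one haresources line, in order.
# $1: "start" or "stop"
# $2: the haresources line, e.g. "asterisk1 10.10.10.110 fonulator asterisk"
resource_plan() {
    action=$1
    set -- $2          # split the line on whitespace
    shift              # drop the preferred node name
    plan=""
    for res in "$@"; do
        case "$res" in
            [0-9]*.[0-9]*.[0-9]*.[0-9]*) cmd="IPaddr $res $action" ;;
            *)                           cmd="/etc/init.d/$res $action" ;;
        esac
        if [ "$action" = stop ]; then
            # Release in reverse: the newest entry goes to the front.
            plan="$cmd
$plan"
        else
            plan="$plan$cmd
"
        fi
    done
    printf '%s' "$plan"
}
```

Calling `resource_plan start "asterisk1 10.10.10.110 fonulator asterisk"` lists the IP takeover first, then fonulator, then asterisk, which matches the sequence described above.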
Results
With heartbeat, IP takeover occurs in under a second, and the fonulator utility reconfigures the fonebridge in about the same amount of time. Depending on your hardware platform and the complexity of the apps running in Asterisk, it can then take 5-15 seconds for Asterisk to start up on your secondary server, load all config files, clear alarms and be ready to process calls. Total fail-over time is about 15-20 seconds.
Resources
Ultramonkey http://www.ultramonkey.org (High Avail software packages)
Linux HA http://www.linux-ha.org (The High Availability Linux Project)
Redfone http://www.red-fone.com (Maker of the Quad T1/E1 fonebridge)
Asterisk+Heartbeat+SIP+Multi-homed on Debian/Ubuntu
Overview
Use standard Ubuntu/Debian packages to create an Active/Passive high-availability solution for Asterisk 1.4 using heartbeat 1.0 (and FreePBX) over SIP (not Redfone/PRI/analog/etc.). Note: use Debian server; do not use Ubuntu server until its RAID-1 issues are solved (perhaps in Ubuntu Intrepid?).
Background
Many ISPs now provide "Dynamic T1" instead of (or in addition to) standard T1-PRI service. "Dynamic T1" just means that they provide highly prioritized VoIP/SIP between your customer site and their network across a T1 (or other high-speed connection). It is therefore increasingly possible to get cheaper service using VoIP only, without a T1-PRI, at very similar call quality. This solution covers Debian/Ubuntu, and also the special issues that heartbeat raises when connecting to the upstream provider via SIP. Many clients want failover support to "seal the deal".
Issues
Heartbeat "takes over" an IP address by adding an "alias" to an interface IN ADDITION to the IP that must always be there so that heartbeat can communicate. For a PBX-type install that is not behind NAT and has no upstream SIP proxy (OpenSer), an alias will be added to BOTH the WAN interface and the LAN interface, and Asterisk will need to bind to both the LAN and WAN to operate. Unless you apply the routing fix outlined in this solution, you will run into trouble because Asterisk will put the wrong source/Via address in IP/SIP packets. This causes problems upstream, because your ISP/SIP provider may authenticate based on IP and you will appear to be sending packets from the wrong IP. It causes problems in the LAN for similar reasons.
Software Install
apt-get install asterisk
apt-get install heartbeat
Heartbeat Config Generally
For general heartbeat configuration, see the "Redfone" HOWTO above this one. I'm using the 10.10.10.0 addresses from above and 77.77.77.0 as the WAN network in my examples. I'm assuming that the shared LAN address is 10.10.10.110 and the shared WAN address is 77.77.77.110. The asterisk1 server's "other" WAN IP is 77.77.77.100 and, for the sake of example, the asterisk2 machine has 77.77.77.120.
haresources
asterisk1 10.10.10.110 77.77.77.110 fixrouting asterisk
Routing fixes
For each interface to which Asterisk binds, it gets the IP address by doing a routing lookup. If you run 'ip route show' and look after the word 'src', you will see which IP will be used for that interface (see also 'ip route get'). Asterisk puts this IP into Via headers and sends all IP/UDP/SIP packets from it. When this server is primary, we need to fix the routing so that all packets on the LAN appear to come from the two servers' 'shared' LAN IP, and (for multi-homed servers) we need to fix the routing for the WAN interface as well.
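To see which source address the kernel will pick, you can pull the value after 'src' out of the ip output. The helper below is a small sketch of that; the function name is invented for this example, and it simply prints the first address that follows the word src in whatever it reads on standard input.

```shell
#!/bin/sh
# Sketch: print the address that follows "src" in `ip route get` or
# `ip route show` output read from standard input (first match only).
route_src() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}
```

For example, `ip route get 10.10.10.50 | route_src` should print 10.10.10.110 on the active server once the routing fix is in place, and the server's own address otherwise (10.10.10.50 is just an arbitrary LAN host chosen for illustration).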
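The 'fixrouting' script below must be installed as /etc/init.d/fixrouting. The stop) branch restores this server's own addresses, so on asterisk2 use 10.10.10.120 and 77.77.77.120 there.
#!/bin/sh
# Point the source address of each connected route at the shared IP (start)
# or back at this server's own IP (stop).
set -e
case "$1" in
    start)
        ip route change 10.10.10.0/24 src 10.10.10.110 dev eth0
        ip route change 77.77.77.0/24 src 77.77.77.110 dev eth1
        ;;
    stop)
        # asterisk1's own addresses; use .120 on asterisk2
        ip route change 10.10.10.0/24 src 10.10.10.100 dev eth0
        ip route change 77.77.77.0/24 src 77.77.77.100 dev eth1
        ;;
    force-reload|restart)
        "$0" stop
        "$0" start
        ;;
    *)
        echo "Usage: /etc/init.d/fixrouting {start|stop|restart|force-reload}"
        exit 1
        ;;
esac
exit 0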
Results
When a failover makes this server primary, the "shared" IPs are taken over and the routing fix then makes sure that all packets from Asterisk appear to come from those IPs. When this server fails or becomes secondary, the IPs are released and the routing fix sets things back to the Passive state, so that the Active machine can still communicate with it (and IP conflicts are avoided).
Ultra Monkey
The current solution I have uses UltraMonkey ( http://www.ultramonkey.org ) for load balancing and failover and it works like a champ. There are obviously a lot of details there, and I'd be happy to detail them if people are interested. There is also a site that has two clusters with uniform reachability for all phones and PRIs. None of this requires a lot of dialplan tuning on a day-to-day basis.
See also
- Mailing list for Asterisk High-Availability
- Load balancing with DUNDi
- Asterisk administration
- Asterisk at large: Pairing Asterisk with the SIP proxy SER
- Asterisk failover case1
- Asterisk failover discussion (using ipvsadm)
- Failover switches
- Linux High Availability Project
- OpenSSI cluster for Linux
- Asterisk cluster for SIP users using Dundi (discussion)
- IAX routing/balancing discussion
- TrixBox High Availability cluster using drbd
Comments
Re: Asterisk with Ultramonkey load balancing; bug in real server health check
I'm very much interested in how to make Ultramonkey work with Asterisk. Currently I have Ultramonkey set up to load-balance my web traffic (ports 80 and 443) and it's working fine.
I tried adding a SIP service; my ldirectord.cf looks like this:
virtual=12.13.14.155:5060
    real=12.13.14.130:5060 gate
    real=12.13.14.131:5060 gate
    service=sip
    scheduler=rr
    persistent=600
    protocol=udp
    checktype=connect
but it does not seem to work when I configure my phone to register on 12.13.14.155.
Any help would be really appreciated.
Asking for info
Asterisk with Ultramonkey load balancing; bug in real server health check
The bug:
Asterisk real server health checks do not work reliably from Ultramonkey. ldirectord from Ultramonkey sends a SIP OPTIONS request as the real server health check. Asterisk often sends the "200 OK" response to this request on the wrong port, so the real server is deactivated.
Here are the details:
- Ultramonkey can be set up to use a SIP OPTIONS request for the Asterisk real server health check. When you do that, the script /etc/ha.d/resource.d/ldirectord uses the same Call-ID for all the OPTIONS requests it sends.
- In Asterisk, in chan_sip.c, when a new SIP request arrives it checks whether an existing dialog is already set up for this request. If it does not find one, it sets up a new dialog. Since the Call-ID, To, From and CSeq are the same for every request sent from ldirectord, it sometimes picks up the wrong, earlier dialog and sends the response on the wrong port.
- ldirectord never receives the response in that case and marks the real server down.
Solution:
Modify ldirectord to generate a new Call-ID for each request. After this change there is no problem with the real server health check.
Here is a quick modification to the check_sip subroutine in ldirectord. You can use any method to generate a different Call-ID; I used the following.
my $range = 100000000000;
my $callid = int(rand($range));
my $request =
"OPTIONS sip:" . $$v{login} . " SIP/2.0\r\n" .
"Via: SIP/2.0/UDP $sip_s_addr_str:$sip_s_port;" . "rport;" .
"branch=z9hG4bKhjhs8ass877\r\n" .
"Max-Forwards: 70\r\n" .
"To: <sip:" . $$v{login} . ">\r\n" .
"From: <sip:" . $$v{login} . ">;tag=1928301774\r\n" .
"Call-ID: $callid\r\n" .
"CSeq: 63104 OPTIONS\r\n" .
"Contact: <sip:" . $$v{login} . ">\r\n" .
"Accept: application/sdp\r\n" .
"Content-Length: 0\r\n\r\n";
If anybody wants full details of how to get Asterisk working with ultramonkey load balancing and heartbeat let me know.
CSS load balancing
It is written: "CSS: You can make load-balancing with failover with multiple asterisk"
But how do you do it with registration on the CSS IP address?
Thank you for your help!
foneBRIDGE2 setup
I've published my Asterisk/foneBRIDGE2/heartbeat setup: config files, scripts... along with a brief description of the architecture and working of the cluster. It's available here:
Asterisk clusters with a foneBRIDGE2
Hope somebody finds it useful. :)
Regards
My question is:
How can I synchronize the contents of asterisk1 (the "master") in real time, to guarantee an exact copy on asterisk2 (the "slave")?
And how do I configure the physical access to ISDN?
Thanks
Asterisk High Availability Solutions
Morning,
I am trying to implement a High Availability solution using UltraMonkey.
Please can you describe all the details?
Thanks