- OpenSER solution Fail-Over and Load-Balancing configuration sample
- DNS SRV records on the CPE side, although not all phones support them.
- BioCluster is a peer-to-peer clustering platform for Asterisk, available under a dual license: GPL and a commercial license from Atelis PLC. (The site appears to have been dead for some time now.)
- Ranch Networks offers High Availability (white paper: White_Paper_one_one_HA.pdf)
-> Only for two Asterisk boxes
- SERVERware. Fault tolerant and high availability solution with unlimited scalability. Commercial
- Failover switches to automatically switch connections (T1, Ethernet, etc.) to a backup system.
-> Altéon: a better tool that can load-balance RTP, but there are problems if you use qualify=yes with NATed phones
-> Big-IP: can do load balancing with failover across multiple Asterisk servers (real SIP proxy functionality is coming soon)
Ask me if you have questions about layer 7 switches.
- Vovida has a SIP load balancer. This allows several Asterisk servers to be set up so that they appear to users as a single server. Other load-balancing approaches involve the SER SIP proxy, UltraMonkey (see below) or simple DNS round-robin. There is also app_distributor, a third-party application, and app_random.
- Use the Linux-HA software to provide high-availability (HA) failover on programmed conditions (by default, a node hang or crash). Linux-HA also has many telephony-oriented HA APIs as defined by the Service Availability Forum (SAF). It provides sub-second failover and works well with or without shared disk. It is commonly used with the DRBD package to provide HA with no single point of failure and no special hardware requirements.
- Stratus, which has been making high-end continuous processing systems for 20 years, has just added an under-$10,000 Linux-based continuous processing solution: Stratus ftServer T Series Systems
- Network Monitoring to detect failures
- Remote Console and Power Control to remotely reboot and diagnose problems
- QueueMetrics is able to monitor clustered call centers with the load distributed over a number of Asterisk servers as if they were one big single box.
- OrderlyStats - Dedicated Real Time Call Centre Management and Statistics Package, can monitor single or clustered Asterisk servers from a single page.
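The DNS SRV option in the list above depends on the phone ordering the returned targets itself. As a rough illustration of what a compliant client does, here is a minimal Python sketch that loosely follows the RFC 2782 rules: lowest priority value first, weighted random choice within the same priority. The host names and record values are hypothetical, and the weight handling is simplified compared with the RFC's exact algorithm.

```python
import random
from collections import defaultdict

def order_srv_targets(records, rng=random):
    """Return targets in the order a client would try them.

    records: iterable of (priority, weight, target) tuples.
    Lower priority values are tried first; within one priority,
    targets are drawn at random in proportion to their weight.
    """
    by_priority = defaultdict(list)
    for priority, weight, target in records:
        by_priority[priority].append((weight, target))

    ordered = []
    for priority in sorted(by_priority):
        pool = list(by_priority[priority])
        while pool:
            weights = [w for w, _ in pool]
            if sum(weights) == 0:
                # All weights are zero: pick uniformly.
                index = rng.randrange(len(pool))
            else:
                index = rng.choices(range(len(pool)), weights=weights)[0]
            ordered.append(pool.pop(index)[1])
    return ordered

# Hypothetical SRV records for _sip._udp.example.com
RECORDS = [
    (10, 60, "asterisk1.example.com"),
    (10, 40, "asterisk2.example.com"),
    (20, 0, "backup.example.com"),
]

print(order_srv_targets(RECORDS))
```

With these records the two priority-10 servers share the load roughly 60/40, and the priority-20 server is only tried when both are unreachable. A phone that ignores SRV records sees none of this, which is the limitation noted in the list.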
Asterisk High Availability HOWTO with Heartbeat and Redfone fonebridge
Overview
The following is a brief HOWTO for installing High-Availability Asterisk using Open Source tools combined with fail-over capable & intelligent hardware (the fonebridge).
The heartbeat utility is used in a 'Passive-Active' scenario but could easily be modified to do 'Active-Active'.
Background
Some of our more demanding customers in the call center and banking industries are loath to accept an implementation with no mechanism for fail-over and high availability, so this is the hardware/software combination we use to meet their demands.
Client Background
The following scenario was used for a medium sized call center operation with about 60 analog stations, and a single T1 PRI.
Hardware
- 2 x 1U Supermicro Servers (P4, 512 MB, Dual Gig Eth, Dual SATA with RAID 0)
- 1 x Redfone Quad T1 fonebridge to terminate PRI connectivity, power channel banks and provide fail-over capability between the two Supermicros.
- 1 x T1 PRI
- 3 x Adtran 750 FXS channel banks to drive analog phones
- 2 x UPS/Surge Protectors
Software
- Fedora Core 4
- Asterisk, zaptel, libpri from CVS head
- Linux HA software suite from Ultramonkey. They have RPMs for RHEL 3 that install fine on Fedora Core 4
- Each server is a mirror image of the other in terms of Asterisk configs and software.
Software Install
After a standard install of FC4, Asterisk, zaptel, libpri we installed all of the packages from Ultramonkey pretty much following their guidelines: http://www.ultramonkey.org/3/installation-rh.el.3.html
You may run into a few dependency issues, mainly Perl libraries, but we were able to satisfy all of them using Yum. If you are running Apt you should be able to accomplish the same thing.
Configuring Heartbeat
After installing heartbeat there are only three files that need to be modified for your environment. They are ha.cf, haresources and authkeys. They should all be placed in the /etc/ha.d/ directory. The files should be absolutely identical on all machines that are part of your Asterisk high-availability cluster. We only have two servers running but you could easily scale to more using the exact same configurations. These are our config files. All comment lines have been removed but as you can see they are short and simple.
ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 200ms
deadtime 2
warntime 1
initdead 120
udpport 694
bcast eth0
node asterisk1
node asterisk2
haresources
asterisk1 10.10.10.110 fonulator asterisk
authkeys
auth 1
1 sha1 SuPerS&cretP@$$werd
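To make the timing directives in ha.cf above easier to read, here is a small Python sketch that converts them to seconds and works out roughly how many heartbeat intervals fit into the dead time. It assumes Heartbeat's convention that time values are in seconds unless they carry an "ms" suffix; the helper names are made up for this example.

```python
# Timing directives copied from the ha.cf shown above.
HA_CF = """\
keepalive 200ms
deadtime 2
warntime 1
initdead 120
"""

TIMER_DIRECTIVES = ("keepalive", "deadtime", "warntime", "initdead")

def to_seconds(value):
    """Convert a Heartbeat time value to seconds ('200ms' -> 0.2, '2' -> 2.0)."""
    if value.endswith("ms"):
        return float(value[:-2]) / 1000.0
    return float(value)

def parse_timers(text):
    """Extract the timing directives from ha.cf text as {name: seconds}."""
    timers = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in TIMER_DIRECTIVES:
            timers[parts[0]] = to_seconds(parts[1])
    return timers

timers = parse_timers(HA_CF)
# Approximate number of heartbeat intervals that pass before a peer is declared dead.
intervals_before_dead = round(timers["deadtime"] / timers["keepalive"])

print("heartbeat interval (s):", timers["keepalive"])
print("intervals before a node is declared dead:", intervals_before_dead)
print("grace period at startup (s):", timers["initdead"])
```

With the values above, a peer is declared dead after about ten consecutive missed heartbeats, and a "late heartbeat" warning is logged after about half that time.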
Operation
Each Asterisk server has a unique IP address on the LAN segment. This can be a NATed network or an Internet-facing one with public IP addresses. Heartbeat monitors the state of each machine over Ethernet, a serial port, or a combination of both (recommended), and assigns the virtual IP to whichever Asterisk server is currently active. Example:
Asterisk1= 10.10.10.100
Asterisk2= 10.10.10.120
Virtual IP= 10.10.10.110 (see haresources)
With Heartbeat it is important that your node names are identical to the host names reported by uname -n. You may also need to manually add IP/host entries to your /etc/hosts file so that each machine knows how to reach the other by name.
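A quick way to catch a node-name mismatch before starting Heartbeat is to compare the local host name with the node lines in ha.cf. The Python sketch below is only an illustration: the function names are invented here, and it assumes that platform.node() returns the same name as uname -n, which is normally the case on Linux.

```python
import platform

# Node lines copied from the ha.cf shown earlier.
HA_CF = """\
bcast eth0
node asterisk1
node asterisk2
"""

def configured_nodes(ha_cf_text):
    """Return every node name listed on 'node' lines in ha.cf text."""
    nodes = []
    for line in ha_cf_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "node":
            nodes.extend(parts[1:])
    return nodes

def hostname_is_configured(ha_cf_text, hostname=None):
    """True if the given (or local) host name appears as a node in ha.cf."""
    if hostname is None:
        hostname = platform.node()  # assumed to match `uname -n`
    return hostname in configured_nodes(ha_cf_text)

print("configured nodes:", configured_nodes(HA_CF))
print("local host name is configured:", hostname_is_configured(HA_CF))
```

On a real node you would read /etc/ha.d/ha.cf instead of the inline string. Note that the comparison is exact, so a fully qualified name such as asterisk1.example.com does not match asterisk1.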
Following the rules in haresources, Heartbeat assigns the machine named asterisk1 as the primary server when both systems start up. It then starts the following scripts: fonulator (the little script that configures the fonebridge) and asterisk, which starts the Asterisk server. Both are standard startup scripts placed in /etc/init.d/.
If the primary server suffers a hardware fault or simply stops responding to the heartbeats going between the two nodes, asterisk2 executes /etc/init.d/fonulator start to reconfigure the fonebridge on the fly and redirect traffic to asterisk2, followed by /etc/init.d/asterisk start to start the Asterisk server.
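The takeover sequence described above follows directly from the order of the fields in the haresources line. As a hedged illustration (not Heartbeat's own code), the Python sketch below parses that line and lists the actions in the order Heartbeat is generally understood to run them: resources are started left to right and stopped right to left, and a bare IP address stands for the virtual IP resource. The function names and the wording of the printed actions are made up for this example.

```python
import ipaddress

# The haresources line shown earlier.
HARESOURCES_LINE = "asterisk1 10.10.10.110 fonulator asterisk"

def parse_haresources_line(line):
    """Split a haresources line into (preferred_node, [resources])."""
    fields = line.split()
    return fields[0], fields[1:]

def is_ip_resource(resource):
    """True if the resource is a bare IP address (optionally with /netmask...)."""
    try:
        ipaddress.ip_address(resource.split("/")[0])
        return True
    except ValueError:
        return False

def describe(resource, action):
    """Describe what starting or stopping a resource amounts to."""
    if is_ip_resource(resource):
        verb = "bring up" if action == "start" else "release"
        return f"{verb} virtual IP {resource}"
    return f"/etc/init.d/{resource} {action}"

def start_sequence(resources):
    """Resources are acquired left to right."""
    return [describe(r, "start") for r in resources]

def stop_sequence(resources):
    """Resources are released in the reverse order."""
    return [describe(r, "stop") for r in reversed(resources)]

node, resources = parse_haresources_line(HARESOURCES_LINE)
print("preferred node:", node)
for step in start_sequence(resources):
    print("takeover:", step)
for step in stop_sequence(resources):
    print("release: ", step)
```

The ordering matters here: the virtual IP and the fonebridge configuration are in place before Asterisk starts, so Asterisk comes up with its spans already pointed at the surviving server.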
Results
With heartbeat, IP takeover occurs in under a second, and the fonulator utility reconfigures the fonebridge in about the same amount of time. Depending on your hardware platform and the complexity of the applications running in Asterisk, it can then take 5-15 seconds for Asterisk to start on the secondary server, load all config files, clear alarms and be ready to process calls. Total fail-over time is about 15-20 seconds.
Resources
Ultramonkey http://www.ultramonkey.org (High Avail software packages)
Linux HA http://www.linux-ha.org (The High Availability Linux Project)
Redfone http://www.red-fone.com (Maker of the Quad T1/E1 fonebridge)
Ultra Monkey
The current solution I have uses UltraMonkey ( http://www.ultramonkey.org ) for load-balancing and failover and it works like a champ. There are obviously a lot of details there, and I'd be happy to detail them if people are interested. There is also a site that has two clusters with uniform reachability for all phones and PRIs. None of this requires a lot of dialplan tuning on a day-to-day basis.
See also
- Load balancing with DUNDi
- Asterisk administration
- Asterisk at large: Pairing Asterisk with the SIP proxy SER
- Asterisk failover case1
- Asterisk failover discussion (using ipvsadm)
- Failover switches
- Linux High Availability Project
- OpenSSI cluster for Linux
- Asterisk cluster for SIP users using Dundi (discussion)
- IAX routing/balancing discussion
Asterisk with Ultramonkey load balancing; bug in real server health check
The bug:
The Asterisk real server health check does not work reliably from Ultramonkey. ldirectord from Ultramonkey sends a SIP OPTIONS request as the real server health check. Asterisk often sends its "200 OK" response to this request on the wrong port, so the real server is deactivated.
Here are the details:
- Ultramonkey can be set up to use a SIP OPTIONS request for the Asterisk real server health check. When you do that, the script /etc/ha.d/resource.d/ldirectord uses the same Call-ID for all the OPTIONS requests it sends.
- In Asterisk, in chan_sip.c, when a new SIP request arrives, it checks whether a dialog already exists for this request. If it doesn't find an existing dialog, it sets up a new one. Since Call-ID, To, From and CSeq are the same for every request sent from ldirectord, it sometimes picks up an earlier dialog and sends the response on the wrong port.
- In that case ldirectord never receives the response and marks the real server down.
Solution:
Modify ldirectord to generate a new Call-ID for each request. After this change the real server health check works without problems.
Here is a quick modification to the check_sip subroutine of ldirectord. You can use any method to generate a different Call-ID; I used the following.
# Pick a pseudo-random Call-ID so that each OPTIONS probe looks like a new dialog to Asterisk
my $range = 100000000000;
my $callid = int(rand($range));
# Build the OPTIONS request; only the Call-ID header changes between probes
my $request =
"OPTIONS sip:" . $$v{login} . " SIP/2.0\r\n" .
"Via: SIP/2.0/UDP $sip_s_addr_str:$sip_s_port;" . "rport;" .
"branch=z9hG4bKhjhs8ass877\r\n" .
"Max-Forwards: 70\r\n" .
"To: <sip:" . $$v{login} . ">\r\n" .
"From: <sip:" . $$v{login} . ">;tag=1928301774\r\n" .
"Call-ID: $callid\r\n" .
"CSeq: 63104 OPTIONS\r\n" .
"Contact: <sip:" . $$v{login} . ">\r\n" .
"Accept: application/sdp\r\n" .
"Content-Length: 0\r\n\r\n";
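For readers who want to experiment with the same idea outside ldirectord, here is a minimal Python sketch of a health-check request builder that uses a fresh Call-ID for every probe. It is not part of ldirectord. The function name and parameters are hypothetical, and generating a new Via branch per request is an extra assumption of this sketch (the fix above changes only the Call-ID).

```python
import uuid

def build_options_request(login, local_addr, local_port):
    """Build a SIP OPTIONS request with a unique Call-ID.

    login:      the SIP user@host being probed (hypothetical parameter name)
    local_addr: source address placed in the Via header
    local_port: source port placed in the Via header
    """
    call_id = uuid.uuid4().hex                     # unique per request
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]     # assumption: also unique per request
    lines = [
        f"OPTIONS sip:{login} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_addr}:{local_port};rport;branch={branch}",
        "Max-Forwards: 70",
        f"To: <sip:{login}>",
        f"From: <sip:{login}>;tag=1928301774",
        f"Call-ID: {call_id}",
        "CSeq: 63104 OPTIONS",
        f"Contact: <sip:{login}>",
        "Accept: application/sdp",
        "Content-Length: 0",
    ]
    # SIP headers are CRLF-terminated, with an empty line ending the header block.
    return "\r\n".join(lines) + "\r\n\r\n"

def call_id_of(request):
    """Extract the Call-ID header value from a request string."""
    for line in request.split("\r\n"):
        if line.startswith("Call-ID:"):
            return line.split(":", 1)[1].strip()
    return None

first = build_options_request("healthcheck@10.10.10.100", "10.10.10.1", 5060)
second = build_options_request("healthcheck@10.10.10.100", "10.10.10.1", 5060)
print(call_id_of(first), call_id_of(second))
```

Because the two Call-IDs differ, each probe starts its own dialog on the Asterisk side, which is the behaviour the ldirectord change relies on. The addresses and the healthcheck login are placeholders.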
If anybody wants full details of how to get Asterisk working with Ultramonkey load balancing and heartbeat, let me know.
CSS load balancing
The page says: "CSS: You can make load-balancing with failover with multiple asterisk"
But how do you do this when the phones register to the CSS IP address?
Thank you for your help!
foneBRIDGE2 setup
I've published my Asterisk/foneBRIDGE2/heartbeat setup: config files, scripts... along with a brief description of the architecture and working of the cluster. It's available here:
Asterisk clusters with a foneBRIDGE2
Hope somebody finds it useful. :)
Regards
My question is
How can I synchronize the contents of asterisk1 (the "master") in real time, so that asterisk2 (the "slave") is guaranteed to hold an exact copy?
And how do I configure the physical access to ISDN?
Thanks
Asterisk High Availability Solutions
Morning
I am trying to implement a high-availability solution using UltraMonkey.
Could you please describe all the details?
Thanks
what about the details?!
Please describe those details... I don't know how to get LVS working with SIP...