Problems with T1s and PRIs

grymlock

I am posting in the hope that you may be able to assist us with some issues we are having with two PiaF servers. They are located at the same organization. We are using 2 T1/PRIs in the main PBX and 2 straight T1s in the call center PBX. The issue is that Asterisk is sending the provider (Paetec/Verizon) a signal to shut down the lines, and this all seems to point back to a Q.931 issue. We also noted that PiaF installs LIBPRI 1.4.8, and we have seen that 1.4.4 and higher has an issue with Q.931, as posted in the article http://bugs.digium.com/view.php?id=12655&nbn=10 . Has this been fixed in 1.4.8? There are about 85 users that get cut off of their calls every 15 minutes. We are using Sangoma A102 cards. What really has me stumped is that both servers reset the lines at the exact same time, to the second, and there is no connection between them. They are on separate network subnets as well.

Our telco ran extensive testing and worked with us for 36 hours straight and could find nothing wrong with the circuits. Sangoma checked everything out and said that this is an Asterisk issue with the LIBPRI file.

Here is what the Telco is seeing on their end:

The switch shows 4 different codes for protocol errors coming from the equipment on site.

Codes 2010 / 3160 / 3133 / 3143
2010 - Cause - The switch received an unexpected LAPD frame from the far end. The frame was unexpected in the current link state. The frame type was either a link establishment frame, a disconnect mode (DM) frame, or an unnumbered acknowledgement (UA) frame.

Corrective action - Determine why the unexpected frame was transmitted by the far end.

3160 - Cause - The switch sent a Q.931 disconnect message to the customer premises equipment on a PRI D-channel and timer T305 expired while waiting for a Q.931 release from the PRI CPE.

Corrective action - Determine why the PRI CPE is not responding to the Q.931 disconnect message that was sent by the switch. This could be caused by a loss of layer 2 signaling on the D-channel.

3133 - Cause - The switch received an unexpected Q.931 release complete message from the customer premises equipment on a PRI D-channel. The call was in the N4 or U7 state when it received the message from the CPE. Given the state of the call, the message was unexpected.

Corrective action - Determine why the PRI CPE is sending an unexpected Q.931 release complete message to the switch.

3143 - Cause - The switch received an unexpected Q.931 release complete message from the customer premises equipment on a PRI D-channel. The call was in the N10 or U10 state when it received the message from the CPE. Given the state of the call, the message was unexpected.

Corrective action - Determine why the PRI CPE is sending an unexpected Q.931 release complete message to the switch.

Here is what shows in the messages log when the circuit burps:
Apr 11 07:53:01 pwsdlr kernel: wanpipe1: OOF alarm is ON
Apr 11 07:53:01 pwsdlr kernel: wanpipe1: RED alarm is OFF
Apr 11 07:53:01 pwsdlr kernel: wanpipe1: T1 disconnected!
Apr 11 07:53:01 pwsdlr kernel: Zaptel: Master changed to WPT1/1
Apr 11 07:53:02 pwsdlr kernel: wanpipe1: OOF alarm is OFF
Apr 11 07:53:02 pwsdlr kernel: wanpipe1: AFT communications disabled!
Apr 11 07:53:02 pwsdlr kernel: wanpipe1: Starting TDMV 1ms Timer
Apr 11 07:53:08 pwsdlr kernel: wanpipe1: T1 connected!
Apr 11 07:53:08 pwsdlr kernel: Zaptel: Master changed to WPT1/0
Apr 11 07:53:08 pwsdlr kernel: wanpipe1: AFT communications enabled!
Apr 11 07:53:08 pwsdlr kernel: wanpipe1: AFT Global TDM Intr
Apr 11 07:53:08 pwsdlr kernel: ADDRCONF(NETDEV_CHANGE): w1g1: link becomes ready
Apr 11 07:53:08 pwsdlr kernel: wanpipe1: Global TDM Ring Resync
Apr 11 07:53:08 pwsdlr kernel: wanpipe1: Card TDM Rsync Rx=0 Tx=2
Apr 11 07:53:08 pwsdlr kernel: wanpipe2: Card TDM Rsync Rx=1 Tx=3
Apr 11 07:53:09 pwsdlr kernel: wanpipe1: RAI alarm is OFF
Apr 11 07:53:09 pwsdlr kernel: wanpipe1: OOF alarm is OFF
Apr 11 07:53:09 pwsdlr kernel: wanpipe1: RED alarm is OFF

In the asterisk log it shows a Red Alarm on all the channels and then it shows a Cleared state for all the channels.

This down and up happens within a matter of seconds, and when it does happen it cuts off all calls, and then users can almost immediately start calling again.

Does anyone have any idea why this is happening?
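
A minimal sketch, assuming the syslog format shown in the excerpt above, of how the length of each drop could be measured by pairing "T1 disconnected!" with the next "T1 connected!" for the same device. The function names and the year are illustrative assumptions (syslog does not record the year); the sample lines are copied from the log above.

```python
from datetime import datetime

# Sample lines copied from the messages log quoted in the post.
LOG = """\
Apr 11 07:53:01 pwsdlr kernel: wanpipe1: OOF alarm is ON
Apr 11 07:53:01 pwsdlr kernel: wanpipe1: T1 disconnected!
Apr 11 07:53:08 pwsdlr kernel: wanpipe1: T1 connected!
Apr 11 07:53:09 pwsdlr kernel: wanpipe1: OOF alarm is OFF
"""


def parse_time(line, year=2009):
    """Parse the leading 'Mon DD HH:MM:SS' stamp; the year is an assumption."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def outages(log_text):
    """Return (device, down_time, up_time, seconds) per disconnect/connect pair."""
    down = {}      # device -> time it went down
    result = []
    for line in log_text.splitlines():
        parts = line.split("kernel: ", 1)
        if len(parts) != 2 or ": " not in parts[1]:
            continue  # not a kernel message in the expected shape
        device, message = parts[1].split(": ", 1)
        when = parse_time(line)
        if message == "T1 disconnected!":
            down[device] = when
        elif message == "T1 connected!" and device in down:
            start = down.pop(device)
            result.append((device, start, when, (when - start).total_seconds()))
    return result


for device, start, end, seconds in outages(LOG):
    print(f"{device}: down {start:%H:%M:%S}, up {end:%H:%M:%S}, {seconds:.0f} s")
```

Run against the full messages file on each server, the same pairing would give a list of drops that can be handed to the telco alongside their switch codes.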
 
Post your zaptel.conf & zapata-channels.conf.

Sounds like a timing problem. What is the timing source?

Bart

 
Timing source is the carrier

PRIs:
zaptel.conf
#autogenerated by /usr/sbin/wancfg_zaptel do not hand edit
#autogenrated on 2009-04-11
#Zaptel Channels Configurations
#For detailed Zaptel options, view /etc/zaptel.conf.bak
loadzone=us
defaultzone=us

#Sangoma A102 port 1 [slot:4 bus:32 span:1] <wanpipe1>
span=1,0,0,esf,b8zs
bchan=1-23
hardhdlc=24

#Sangoma A102 port 2 [slot:4 bus:32 span:2] <wanpipe2>
span=2,0,0,esf,b8zs
bchan=25-47
hardhdlc=48

zapata-channels.conf
; Autogenerated by /usr/local/sbin/genzaptelconf -- do not hand edit
; Zaptel Channels Configurations (zapata.conf)
;
; This is not intended to be a complete zapata.conf. Rather, it is intended
; to be #include-d by /etc/zapata.conf that will include the global settings
;

; Span 1: WPT1/0 "wanpipe1 card 0" (MASTER) B8ZS/ESF
group=0,11
context=from-pstn
switchtype = national
signalling = pri_cpe
channel => 1-23
group=
context=default

; Span 2: WPT1/1 "wanpipe2 card 1" B8ZS/ESF
group=0,12
context=from-pstn
switchtype = national
signalling = pri_cpe
channel => 25-47
group=
context=default

zapata.conf
;autogenerated by /usr/sbin/wancfg_zaptel do not hand edit
;autogenrated on 2009-04-11
;Zaptel Channels Configurations
;For detailed Zaptel options, view /etc/asterisk/zapata.conf.bak

[trunkgroups]

[channels]
context=default
usecallerid=yes
hidecallerid=no
callwaiting=yes
usecallingpres=yes
callwaitingcallerid=yes
threewaycalling=yes
transfer=yes
canpark=yes
cancallforward=yes
callreturn=yes
echocancel=yes
echocancelwhenbridged=yes
relaxdtmf=yes
rxgain=0.0
txgain=0.0
group=1
callgroup=1
pickupgroup=1
immediate=no

;Sangoma A102 port 1 [slot:4 bus:32 span:1] <wanpipe1>
switchtype=5ess
context=from-pstn
group=0
signalling=pri_cpe
channel =>1-23

;Sangoma A102 port 2 [slot:4 bus:32 span:2] <wanpipe2>
switchtype=5ess
context=from-pstn
group=0
signalling=pri_cpe
channel =>25-47


T1s:
zaptel.conf
# Autogenerated by /usr/local/sbin/sangoma/setup-sangoma -- do not hand edit
# Zaptel Channels Configurations (zaptel.conf)
#
loadzone=us
defaultzone=us

#Sangoma A102 port 1 [slot:4 bus:32 span:1] <wanpipe1>
span=1,0,0,d4,ami
e&m=1-24

#Sangoma A102 port 2 [slot:4 bus:32 span:2] <wanpipe2>
span=2,0,0,d4,ami
e&m=25-48

zapata-channels.conf
; Autogenerated by /usr/local/sbin/genzaptelconf -- do not hand edit
; Zaptel Channels Configurations (zapata.conf)
;
; This is not intended to be a complete zapata.conf. Rather, it is intended
; to be #include-d by /etc/zapata.conf that will include the global settings
;

; Span 1: WPT1/0 "wanpipe1 card 0" (MASTER) AMI/D4
group=0,11
context=from-pstn
switchtype = national
signalling = pri_cpe
channel => 1-23
group=
context=default

; Span 2: WPT1/1 "wanpipe2 card 1" AMI/D4
group=0,12
context=from-pstn
switchtype = national
signalling = pri_cpe
channel => 25-47
group=
context=default

zapata.conf
;autogenerated by /usr/local/sbin/config-zaptel do not hand edit
;Zaptel Channels Configurations (zapata.conf)
;
;For detailed zapata options, view /etc/asterisk/zapata.conf.orig

[trunkgroups]

[channels]
context=default
usecallerid=yes
hidecallerid=no
callwaiting=yes
usecallingpres=yes
callwaitingcallerid=yes
threewaycalling=yes
transfer=yes
canpark=yes
cancallforward=yes
callreturn=yes
echocancel=yes
echocancelwhenbridged=yes
relaxdtmf=yes
rxgain=0.0
txgain=0.0
group=1
callgroup=1
pickupgroup=1

immediate=no

;Sangoma A102 port 1 [slot:4 bus:32 span:1] <wanpipe1>
context=from-pstn
group=0
signalling=em_w
channel => 1-24

;Sangoma A102 port 2 [slot:4 bus:32 span:2] <wanpipe2>
context=from-pstn
group=0
signalling=em_w
channel => 25-48
 
Try changing span=1,0,0,esf,b8zs to span=1,1,0,esf,b8zs

Maybe change span=2,0,0,esf,b8zs to span=2,2,0,esf,b8zs

Make sure you are not sharing IRQs (see http://astrecipes.net/?n=107)

Bart
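
For readers following the suggested change, here is a small sketch of how the span line breaks down. It assumes the usual zaptel.conf layout span=<number>,<timing>,<LBO>,<framing>,<coding>, where a timing value of 0 means the span is not used as a clock source and 1, 2, ... give the priority order of spans to sync from. The class and function names are made up for illustration, and on Sangoma cards the wanpipe configuration may also affect clocking, which this sketch does not look at.

```python
from typing import NamedTuple


class Span(NamedTuple):
    number: int
    timing: int    # 0 = not a clock source; 1, 2, ... = sync priority
    lbo: int       # line build-out
    framing: str   # e.g. esf or d4
    coding: str    # e.g. b8zs or ami


def parse_span(line):
    """Parse one 'span=...' line from zaptel.conf into a Span."""
    key, _, value = line.partition("=")
    if key.strip() != "span":
        raise ValueError(f"not a span line: {line!r}")
    fields = [f.strip() for f in value.split(",")]
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields: {line!r}")
    number, timing, lbo = (int(f) for f in fields[:3])
    return Span(number, timing, lbo, fields[3], fields[4])


def describe_timing(span):
    """Summarize what the timing field asks for."""
    if span.timing == 0:
        return f"span {span.number}: not used as a clock source"
    return f"span {span.number}: clock source, priority {span.timing}"


for line in ("span=1,0,0,esf,b8zs", "span=1,1,0,esf,b8zs", "span=2,2,0,esf,b8zs"):
    print(describe_timing(parse_span(line)))
```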
 
The other item I noticed is that the channel configs appear to be duplicated between zapata.conf and zapata-channels.conf. If you have the include for zapata-channels.conf, this might get a bit ugly. I'd comment out or remove the duplicates in zapata.conf and use zapata-channels.conf only.

Bart
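
One way to spot the duplication described above is to compare what each file declares for the same channel range. The sketch below is a rough illustration, not a full parser for the zapata.conf format: it only tracks the signalling and switchtype values in effect when a "channel =>" line is reached, and it matches ranges by exact text (so 1-23 and 1-24 are treated as different). The function names are hypothetical; the sample text is trimmed from the PRI system's files posted earlier in the thread.

```python
CHANNELS_CONF = """\
; Span 1: WPT1/0 "wanpipe1 card 0" (MASTER) B8ZS/ESF
switchtype = national
signalling = pri_cpe
channel => 1-23
"""

ZAPATA_CONF = """\
;Sangoma A102 port 1 [slot:4 bus:32 span:1] <wanpipe1>
switchtype=5ess
signalling=pri_cpe
channel =>1-23
"""


def channel_settings(conf_text):
    """Map each declared channel range to the settings in effect at that point."""
    current = {}
    found = {}
    for raw in conf_text.splitlines():
        line = raw.split(";", 1)[0].strip()   # drop ';' comments
        if not line or line.startswith("["):
            continue                          # skip blanks and [section] headers
        if "=>" in line:
            key, value = (p.strip() for p in line.split("=>", 1))
        elif "=" in line:
            key, value = (p.strip() for p in line.split("=", 1))
        else:
            continue
        if key == "channel":
            found[value] = {k: current.get(k) for k in ("signalling", "switchtype")}
        else:
            current[key] = value
    return found


def conflicts(a_text, b_text):
    """Return ranges declared in both texts whose settings differ."""
    a, b = channel_settings(a_text), channel_settings(b_text)
    return {rng: (a[rng], b[rng]) for rng in a.keys() & b.keys() if a[rng] != b[rng]}


for rng, (first, second) in conflicts(CHANNELS_CONF, ZAPATA_CONF).items():
    print(f"channels {rng}: {first} vs {second}")
```

On the sample above this flags channels 1-23, where one file says switchtype national and the other says 5ess.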
 
Bart,

I made the changes you suggested, had everyone get off the phones, and restarted the PBX. Let's see if your changes work. The include statement for zapata-channels is not there. I am assuming that when I ran the configurator for the Sangoma card it removed it.

Should these same changes apply to the other system with just straight T1s in it also?
 
In the asterisk log it shows a Red Alarm on all the channels

For me, this is where I would start looking first. This is considered a Major Alarm. Since you have two T1s and the failures happen at precisely the same time (are they at random times of day?), use this information as valuable clues.

Red alarms indicate a serious failure 'somewhere', and usually 'suggest' that the problem is physical, especially since you have two simultaneous failures. I would begin by double-checking ALL physical connections, particularly where the wiring is common, such as the wiring closet, or the relay rack where any common equipment might be located. It really sounds like an intermittent connection, or a wire shorting across terminals at the cable entrance. The problem may not be at "your end", and could be anywhere along the facility, all the way back to the service provider, including their Terminal equipment.

Once you are absolutely sure that your wiring is clean, tight, and doesn't arouse suspicion, then I would request that Verizon perform an extended Line Test. They can install a T1 Test Set at their end and monitor your line, and record any Red Alarms. This can be performed "In-Service", so your T1 stays up during the test. However, testing the Bit Error Rate (BER) requires taking your T1 out of service for the testing period.

I guess I am assuming this is a recent failure, and the T1s were once working? If not, then perhaps you still have a basic span issue that is allowing the T1 signal to occasionally lose sync. But, simultaneously? Wow!
 
Without the clock setting, Asterisk will use its own timing source for T1s, which is not as stable as the Telco's. This applies to T1 spans only.

Setting the clock option on the second span just tells Asterisk to use span 2's clock should span 1 fail.

Good luck

Bart
 
MGD4me, We did the testing for 36 hours and they said that all circuits are good...

I think Bart is right and that it is a timing issue....
 
Bart,
When I ran the Sangoma wancfg_zaptel configuration program I told it to use the Telco's timing....

Are you seeing something that says otherwise?
 
Bart, thanks for the help. The suggested changes have been made and implemented, and so far, three hours since the change, I have not had an out of frame error. I also noticed in the logs that the cards are now initializing a bit differently and the log output is different when the channels are initialized.

Thanks again.....
 
The problem has returned. Both systems were running fine for approximately 8 hours with no error and then the Out of Frame alarms started again.

Here is what the log shows:

System with T1s:
Apr 12 06:26:15 pwsdlr kernel: wanpipe1: OOF alarm is ON
Apr 12 06:26:15 pwsdlr kernel: wanpipe1: RED alarm is OFF
Apr 12 06:26:15 pwsdlr kernel: wanpipe1: T1 disconnected!
Apr 12 06:26:15 pwsdlr kernel: Zaptel: Master changed to WPT1/1
Apr 12 06:26:15 pwsdlr kernel: wanpipe1: OOF alarm is OFF
Apr 12 06:26:15 pwsdlr kernel: wanpipe1: AFT communications disabled!
Apr 12 06:26:15 pwsdlr kernel: wanpipe1: Starting TDMV 1ms Timer
Apr 12 06:26:21 pwsdlr kernel: wanpipe1: T1 connected!
Apr 12 06:26:21 pwsdlr kernel: Zaptel: Master changed to WPT1/0
Apr 12 06:26:21 pwsdlr kernel: wanpipe1: AFT communications enabled!
Apr 12 06:26:21 pwsdlr kernel: wanpipe1: AFT Global TDM Intr
Apr 12 06:26:21 pwsdlr kernel: ADDRCONF(NETDEV_CHANGE): w1g1: link becomes ready
Apr 12 06:26:21 pwsdlr kernel: wanpipe1: Global TDM Ring Resync
Apr 12 06:26:21 pwsdlr kernel: wanpipe1: Card TDM Rsync Rx=0 Tx=2
Apr 12 06:26:21 pwsdlr kernel: wanpipe2: Card TDM Rsync Rx=1 Tx=3
Apr 12 06:26:22 pwsdlr kernel: wanpipe1: RAI alarm is OFF
Apr 12 06:26:22 pwsdlr kernel: wanpipe1: OOF alarm is OFF
Apr 12 06:26:22 pwsdlr kernel: wanpipe1: RED alarm is OFF

System with PRIs:
Apr 12 05:56:32 pbx kernel: wanpipe1: OOF alarm is ON
Apr 12 05:56:32 pbx kernel: wanpipe1: T1 disconnected!
Apr 12 05:56:32 pbx kernel: Zaptel: Master changed to WPT1/1
Apr 12 05:56:33 pbx kernel: wanpipe1: OOF : OFF
Apr 12 05:56:33 pbx kernel: wanpipe1: AFT communications disabled! (Dev Cnt: 2 Cause: Link Down)
Apr 12 05:56:33 pbx kernel: wanpipe1: Starting TDMV 1ms Timer
Apr 12 05:56:39 pbx kernel: wanpipe1: T1 connected!
Apr 12 05:56:39 pbx kernel: Zaptel: Master changed to WPT1/0
Apr 12 05:56:39 pbx kernel: wanpipe1: AFT communications enabled!
Apr 12 05:56:39 pbx kernel: wanpipe1: AFT Global TDM Intr
Apr 12 05:56:39 pbx kernel: ADDRCONF(NETDEV_CHANGE): w1g1: link becomes ready
Apr 12 05:56:39 pbx kernel: wanpipe1: Global TDM Ring Resync
Apr 12 05:56:39 pbx kernel: wanpipe1: Card TDM Rsync Rx=0 Tx=2
Apr 12 05:56:39 pbx kernel: wanpipe2: Card TDM Rsync Rx=3 Tx=1
Apr 12 05:56:40 pbx kernel: wanpipe1: OOF alarm is OFF
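
As a quick check on the "exact same time" observation, the two disconnect timestamps in the excerpts above can be compared directly. This is only arithmetic on the posted log lines; the year is an assumption, and the two servers' clocks may not be synchronized, so the result is not proof that the drops were unrelated.

```python
from datetime import datetime


def stamp(text, year=2009):
    """Parse a syslog-style 'Mon DD HH:MM:SS' stamp; the year is assumed."""
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")


t1_host_down = stamp("Apr 12 06:26:15")    # "T1 disconnected!" on pwsdlr
pri_host_down = stamp("Apr 12 05:56:32")   # "T1 disconnected!" on pbx

offset = t1_host_down - pri_host_down
print(f"logged disconnects are {offset} apart")
```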
 
Red Alarm

Perhaps I spoke too soon...

Your comment on the log suggested that the Red Alarm was 'on'; however, the portion of the log that you posted does not support that conclusion. I just re-read your first post where the log shows:

RED alarm is OFF

So, there is NO red alarm indication, unless of course it occurs beyond the portion of log posted. However, there is clearly an OOF alarm. Since you are using a Sangoma device, try some tips offered here:

http://wiki.sangoma.com/wanpipe-linux-asterisk-debugging#pri_span_debugging
 
The problems still exist. I got Sangoma involved and they were a great help. They pointed out that it has to be a problem with the Telco, so I had them dispatch a tech for the third time. The tech showed up this morning and listened to what I had to say. He plugged his tester into the fiber mux and it showed the OOF errors coming from the Telco. Gee, what a surprise :banghead:. So now I am waiting for the Telco to get it fixed......:mad5:

Thanks to all for your assistance....:smile5:
 
Gee... they could have found that out by performing a loopback, and done all the testing from their end. They must have convinced themselves the problem was "yours", or they could have saved a truck roll. Interesting!
 
Loopback will not always find this

FWIW, I have seen this problem before, albeit not with asterisk. A problem with the smart jack itself may not be detected with a loopback test.

For those that are not T1/E1 savvy: in a loopback test, the telco (on its own initiative or at your request) takes down the physical circuit and puts particular patterns of 1s and 0s on the line to detect physical failures (it's more complicated than that); it will also check the voltage on the line, etc. This verifies the low-level functionality of the circuit itself, and the circuit cannot be in use while intrusive testing is taking place.
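
To make the idea concrete, here is a toy sketch of the comparison a loopback test boils down to: send a known pattern, read back what returns, and count the bits that differ. Real test sets run standardized stress patterns in hardware and measure far more than this; the pattern, the simulated error, and the function names below are all illustrative assumptions.

```python
def bit_errors(sent: bytes, received: bytes) -> int:
    """Count the bits that differ between the sent and looped-back data."""
    if len(sent) != len(received):
        raise ValueError("sent and received must be the same length")
    return sum(bin(a ^ b).count("1") for a, b in zip(sent, received))


def bit_error_rate(sent: bytes, received: bytes) -> float:
    """Errored bits divided by total bits sent."""
    return bit_errors(sent, received) / (len(sent) * 8)


# Alternating all-ones and all-zeros octets as a stand-in test pattern.
pattern = bytes([0xFF, 0x00] * 4)

# Simulate a line that flips one bit on the way back.
looped_back = bytearray(pattern)
looped_back[1] ^= 0x01

print(bit_errors(pattern, bytes(looped_back)))
print(bit_error_rate(pattern, bytes(looped_back)))
```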

Somewhat ironically, most new installs of T1 lines (not sure about E1) are delivered using HDSL on the circuit (only uses two wires; this type of DSL is symmetrical, 1.5 Mbps up & down) and are then "output" as a 4-wire T1. In my experience I have seen more and more smart jack failures with this type of circuit (widely deployed since the late 90s) than previously. The telco will rarely, if ever, want to roll a truck and put a test on the circuit itself. While much CPE gear (Digium cards, etc.) does support loopback testing (which would catch this type of fault), I haven't found a vendor that will 'play' with CPE gear.

Best of Luck,

Liam
 
