- Different Levels for Different Purposes
- Telephony Hardware
- Core Software Debugging
- HowTo Debug a DeadLock in Asterisk
Different Levels for Different Purposes
There are (at least) three different levels or types of debugging:
1) Telephony hardware configuration and fault finding in the physical equipment, connections and cabling
2) Configuration based on the configuration files and call handling
3) Core software, looking at the C code and modules
Asterisk does not require any telephony hardware to operate VOIP, but it is flexible enough to connect to and from standard telephone lines and handsets, either instead of or in addition to VOIP terminals. Standard telephone lines (FXO, FXS, T1, E1 etc) are all handled via the Zaptel drivers.
To confirm what devices are available and configured for use in the system, type ZAP SHOW CHANNELS at the console prompt. For more specific information about any individual channel, type ZAP SHOW CHANNEL X (where X is the channel number). Remember that channels will not work if they are not physically connected to a real, working telephone line - the Dial command will return all channels busy/congested. If channels are not shown, then it is time to revisit both zapata.conf and zaptel.conf, to ensure that the modprobe(s) have been done and that ztcfg has been run.
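For reference, a minimal single-FXO configuration might look like the sketch below (illustrative only - your card, signalling type and tone zone will differ; note that an FXO port uses FXS signalling and vice versa):

```
; /etc/zaptel.conf - hedged sketch for one FXO port on channel 1
fxsks=1
loadzone=us
defaultzone=us

; /etc/asterisk/zapata.conf
[channels]
signalling=fxs_ks
context=incoming
channel => 1
```

After editing both files, run ztcfg and restart Asterisk, then check ZAP SHOW CHANNELS again.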
SIP Channels are only shown if registered
SIP SHOW INUSE will list all SIP extensions defined in SIP.CONF
SIP SHOW CHANNELS will list all SIP extensions registered at that time
Asterisk will normally grab the sound card and use it as the console voice I/O. This means that running a SIP softphone on the same server is at best difficult. Asterisk is designed to operate standalone, without an X Windows GUI, for best performance - expect poor audio quality if you try this.
Where inspection of the configuration files fails, the simplest thing is to turn debugging/trace output on. This is done by editing the logger.conf file (in /etc/asterisk). To add this output to the console screen, simply add ",debug" at the end of the console configuration line - the sample files have an alternative line with this text already in place. Then type "LOGGER ROTATE" at the console, and all significant telephony events will now pour out. These can also be written to a log file for more detailed analysis.
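For reference, the relevant logger.conf line would end up looking something like this (a sketch; your file may list other levels):

```
; /etc/asterisk/logger.conf
console => notice,warning,error,debug
```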
Asterisk by default writes to the system log a complete call record, for example:
> cd /var/log/asterisk/cdr-csv
> tail -f Master.csv
"","5032273698","9714042975","sip","""DORGAN M"" <5032273698>","SIP/220.127.116.11-08100358","","AGI","","2004-12-09 13:14:44","2004-12-09 13:14:53",12,9,"ANSWERED","DOCUMENTATION"
Or, if you are having problems catching intermittent problems on your system, consider adding more information to the Asterisk message log. If you look in logger.conf you will see something like:
messages => notice,warning,error
consider changing this to:
messages => notice,warning,error,debug,verbose
(for short periods of time anyway — it can really eat disk space). Note you will need to restart Asterisk or type LOGGER ROTATE at the CLI to get this change to take effect (reload doesn't do it)
Core Software Debugging
Where the behaviour of the application is not as expected, more serious program debugging may be required.
"strace" might give you some helpful info; see man strace. It records system calls and signals.
"viewcore" is a useful tool for analysing coredumps: http://adimitrov.net/main/viewcore/
I'd run Asterisk in a debugger; I like "DDD". Compile everything with the -g gcc option first so the debugger can display source code lines. When the program stops, the cursor will be left on the last line executed, and you can "run backwards" and see how you got there.
An Important Note on Submitting Backtraces
When getting a backtrace to have developers help you debug a problem, it is VERY important that you build asterisk with "make clean; make valgrind; make install". This will turn off optimizations so that we can get a readable backtrace.
Tips for Asterisk 1.4
Two things you need to make sure you enable when you compile Asterisk: the DONT_OPTIMIZE and DEBUG_THREADS compiler flags.
These can be toggled under the "Compiler Flags" options when you do a "make menuselect". Make sure that you have them turned on (i.e. an asterisk next to them).
Then, follow the guidelines for obtaining a backtrace with GDB. These can be found in the file "asterisk/doc/backtrace.txt".
I would suggest that you issue the following, to make your life easier:
script backtrace.txt <- This will start a typescript session, which captures the input/output from the session into a file called "backtrace.txt"
Once you get into gdb, issue "set pagination off" to turn off paging.
Then, get your backtrace, exit out of GDB and find a related issue on the bugtracker (or create a new issue) to post your backtrace to. Do NOT post your backtrace to #asterisk-dev or #asterisk-bugs on Freenode, as this will only serve to annoy the developers who are typically busy helping others.
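If you prefer, the same capture can be scripted: put the gdb commands in a small command file (the name "gdb_cmds" here is just an example) and run it non-interactively against the binary and core file:

```
# gdb_cmds - run as: gdb -batch -x gdb_cmds /usr/sbin/asterisk core.xxxx > backtrace.txt
set pagination off
bt full
thread apply all bt
```

This avoids the script/typescript dance entirely and leaves the full backtrace in backtrace.txt, ready to attach to a bugtracker issue.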
Backtracing a core dump file in /tmp
- start Asterisk with safe_asterisk
- enter "gdb asterisk core.xxxx"
- enter "bt" while in gdb (or do a "bt full")
- enter "thread apply all bt"
100% CPU
Are you sure the Asterisk process died and is not just hung, either in an infinite loop or waiting for something that will never happen? What does "ps" say?
CONSOLE=no
Are you running safe_asterisk? If so, try modifying safe_asterisk to change CONSOLE=yes to CONSOLE=no.
Debugging a running Asterisk
List all the Asterisk threads with ps axum -C asterisk to find the thread that takes the most CPU. Now connect with gdb:
gdb /usr/sbin/asterisk <pid>
and do "bt" and post the last few lines to the mailing list ...
ulimit
If Asterisk is crashing, as in exiting, issue the command
ulimit -c unlimited
and this should allow asterisk to drop a core file if it can.
HowTo Debug a DeadLock in Asterisk
1) In the Asterisk Makefile you need to uncomment the lines under
- Optional debugging parameters
- NB: the DO_CRASH flag will force a core dump on certain conditions that indicate a possible deadlock
For testing this is a useful option, because the core will show what was happening between the threads.
Grep the sources for the CRASH macro to see under what conditions this might occur.
2) Turn on Verbose logging
Current CVS as of Feb 1 2004 allows verbose messages to be logged; see logger.conf and add verbose to the messages line.
This will allow you to log ast_verbose messages to the logs so we can
- see what the backtraced threads are doing in time-sequence order
- re-create the situation that led to the core or deadlock
3) When you deadlock don't restart the box or restart asterisk
Instead take the 5 mins while everyone is freaking out to attach gdb to the running asterisk process and do
gdb /usr/sbin/asterisk <pid of main * process>
...you can get the Asterisk process identification number (PID) with asterisk -r ("-> currently running on blah (pid = 9075)"). Note: if the box is truly hosed and blocked on all I/O this will fail too; use ps ax, or look for the lowest PID after doing ps ax -C asterisk.
4) after gdb loads do
thread apply all bt
At the very least you are now going to save that bt output to a file and post that to bugs.digium.com
5) Identify deadlocked threads by this pattern
Note the "__pthread_wait_for_restart_signal". That means the thread is in a wait loop, waiting for the mutex lock.
Thread 23 (Thread 3576854 (LWP 2910)):
#0 0x400c787e in sigsuspend () from /lib/libc.so.6
#1 0x40022879 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2 0x40024a36 in __pthread_alt_lock () from /lib/libpthread.so.0
#3 0x40020fd2 in pthread_mutex_lock () from /lib/libpthread.so.0
Note that apparently not all systems implement "__pthread_wait_for_restart_signal", so you might just want to scan for at least "pthread_mutex_lock". You will usually find more than one of these patterns, because once a thread is deadlocked on a mutex lock, other threads that want the same lock will pile up quickly.
6) Try to identify the first thread, that is dead locked.
The sequence number of backtraced threads is not relevant, because threads are re-used.
Look in your log files at the time stamps and try to correlate the THREAD number (e.g. "Thread 23 (Thread 3576854 (LWP 2910))") to the earliest entry in the log file with that same THREAD number (e.g. a "VERBOSE" entry). Note the FRAME number just before the "pthread_mutex_lock()" call (that is the #0, #1, #2 number right after the backtrace's THREAD number).
Log files are usually in /var/log/asterisk/. Check the files messages and debug in this directory.
7) Find the position
Now that we have our potential guilty party as the first in line for the lock, do
thread <sequence number> for the THREAD of interest
frame <frame number> for the frame # before the pthread_mutex_lock ()
This should now put us in our Asterisk sources, right where we call ast_mutex_lock(). Record the name of the lock it was trying to get, e.g. ast_mutex_lock(&agentlock).
8) Check who has the lock
Now, if we have properly turned on thread debugging, we will be able to see into the ast_mutex_t struct in include/asterisk/lock.h, which records the file, function and line that last took each lock.
So now that we have our lock, we can see who has it and what we are waiting on; in gdb, print the lock structure (e.g. "print agentlock").
This is the guilty code that is holding the lock that we want to look at.
9) Now comes the hard part ...:)
Why is the code in that thread, file, function and line number not releasing our lock? We now have to scour the code, checking all places where that lock is set and released, looking for:
- places where there is a lock hierarchy and locks are set and released in a different order,
- places where a lock is held too long in for and while loops, before calling longer-running functions, etc.
- places where a mutex is locked in a critical section where we might receive operating system signals
Don't Forget Asterisk's Logging Features (see above)
If you really want to see what's going on under the hood you can start Asterisk using:
/usr/sbin/asterisk -vvvvvv -g -dddddd -c
Finally, issue "iax2 debug" or "sip debug" at the CLI prompt. You can easily expect 5 to 10 MB an hour of output.