Speech-to-Text Dir Assistance

wardmundy

Speech-to-Text Directory Assistance Comes to Asterisk



If you are running an existing copy of Incredible PBX, be sure to load this update after performing the steps in the article:

Code:
cd /var/lib/asterisk/agi-bin
wget http://bestof.nerdvittles.com/applications/asteridex4/callwho21.tgz
tar zxvf callwho21.tgz
rm callwho21.tgz
 
Wasn't Too Hard

OK. Here's the query you'd need to use in nv-callwho.php to implement soundex() lookups. Works great!

Code:
$query = "SELECT * FROM user1 where strcmp(soundex(name), soundex('$dialcode')) = 0 order by name asc";
 
I wonder if it's possible to incorporate some sort of fuzzy matching to the MySQL query, so if the search is off by a single character it would still yield a match. Maybe run an exact search and if no results come up then run a fuzzy search. I guess we are going to have to start building contact databases with a field for alternate pronunciations.
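Here is a rough sketch of that "exact first, fuzzy second" idea. It is written in Python rather than the PHP used by nv-callwho.php, and it works on a plain list of names instead of the MySQL table. The `lookup` function, the `contacts` list, and the 0.8 cutoff are all made up for illustration.

```python
import difflib

def lookup(spoken, names, cutoff=0.8):
    """Return exact (case-insensitive) matches, else close matches by similarity."""
    wanted = " ".join(spoken.lower().split())
    exact = [n for n in names if n.lower() == wanted]
    if exact:
        return exact
    # No exact hit: fall back to fuzzy matching on the lower-cased names.
    by_lower = {n.lower(): n for n in names}
    close = difflib.get_close_matches(wanted, list(by_lower), n=5, cutoff=cutoff)
    return [by_lower[c] for c in close]

# Hypothetical contact list standing in for the database table.
contacts = ["Delta Airlines", "American Airlines", "Mary Smith"]
print(lookup("mary smith", contacts))   # exact path
print(lookup("mary smyth", contacts))   # fuzzy path, off by one character
```

The cutoff controls how forgiving the fuzzy pass is; a lower value returns more (and noisier) candidates.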

The "right" (TM) :rolleyes: way to do that, and what is actually used in cases like this in speech recognition applications, is grammar rules. By defining a grammar for your application you can specify words and patterns of words to be recognised in a certain way. For instance, we can have Joshua or Josh always recognised as Joshua, have Mark and Marky recognised the same way, make surnames optional, and have Delta air lines always recognised as "delta air lines" and American airlines as "american airlines", and so on. In the case of 'Directory Assistance', a well-defined grammar written by the user can make the database lookup a lot more reliable.
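To be clear, the following is not a real grammar engine, just a small Python sketch of the effect a grammar would have on the lookup: several spoken variants collapse to one canonical form before the database is queried. The `ALIASES` table and the `normalise` function are hypothetical names invented for this example.

```python
# Hypothetical alias table: spoken variant -> canonical form used for the lookup.
ALIASES = {
    "josh": "joshua",
    "marky": "mark",
    "delta airlines": "delta air lines",
    "american air lines": "american airlines",
}

def normalise(spoken):
    """Map a recognised phrase to its canonical form, phrase first, then word by word."""
    phrase = " ".join(spoken.lower().split())
    if phrase in ALIASES:
        return ALIASES[phrase]
    return " ".join(ALIASES.get(word, word) for word in phrase.split())

print(normalise("Josh"))
print(normalise("Marky Smith"))
print(normalise("Delta Airlines"))
```

A real grammar would constrain the recogniser itself; this only cleans up its output afterwards.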

The W3C already defines a format for grammar syntax, the Speech Recognition Grammar Specification (SRGS).

Asterisk has an internal API for speech recognition that supports grammar rules.

Google's STT engine supports grammars, but as far as I know it supports only some predefined grammars which cannot be edited by the user. (This might not be true; I'll have to search a bit deeper in their code, but the only thing I have seen so far is the use of predefined grammars.)

The speech recognition AGI script for Asterisk has no grammar support yet. It is high on my TODO list, but it's not yet implemented. My first thought was to use the internal Asterisk speech recognition API, but that doesn't seem very practical, especially for an AGI script. What I'm actually planning to do is add support for Augmented Backus-Naur Form (ABNF) grammars without the use of any third-party APIs. This will take some time, and it all depends on the amount of free time I'll have in the next few weeks. ;)
 
My thought would be to use PHP to get an array of the words, then do the SQL query finding all entries with the first word, then from that subset find the second word, etc.

PHP has this function:

str_word_count("Your String", 2);

which would return an array with
[0] => Your
[5] => String

with the numbers being the starting positions of each word. (With 1 as the second argument, the keys are simply 0, 1, ...)
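The word-by-word narrowing could look something like this. It is a Python sketch over an in-memory list rather than PHP plus SQL, and `narrow_by_words` and the sample names are invented for the example.

```python
def narrow_by_words(spoken, names):
    """Keep only the names containing every spoken word, filtering one word at a time."""
    candidates = list(names)
    for word in spoken.lower().split():
        # Each pass searches only the subset left by the previous word.
        candidates = [n for n in candidates if word in n.lower().split()]
    return candidates

# Hypothetical directory entries.
names = ["Smith John", "Smith Mary", "Jones Mary"]
print(narrow_by_words("Smith", names))
print(narrow_by_words("Mary Smith", names))
```

Because each word is matched independently, the order in which the caller says the words does not matter.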
 
Standard Soundex returns a 4-character code, but MySQL's SOUNDEX() returns a string of arbitrary length. So... you may want to experiment with a substring of the result to get broader hits. For example, if you want to obtain all the Smiths when you say "Smith John" or when you just say "Smith" and you have "Smith John" and "Smith Mary" in your database, then use the function below. With MySQL's implementation of SOUNDEX(), searching for "Smith" would not return "Smith John" without using this substring approach. One caveat: short codes are padded with zeros, so "Smith" alone encodes as S530 while "Smith John" starts with S532; for short names a 3-character substring may be needed.

Code:
$query = "SELECT * FROM user1 where strcmp(substring(soundex(name),1,4), substring(soundex('$dialcode'),1,4)) = 0 order by name asc";
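
To see what the substring comparison does without a database handy, here is a Python sketch that approximates MySQL's untruncated SOUNDEX() as I understand it. It is not MySQL's code; `soundex_long` and `prefix_match` are hypothetical helpers, and edge cases may differ from the real function.

```python
# Digit for each letter A..Z; "0" marks letters that produce no digit.
CODES = "01230120022455012623010202"

def soundex_long(text):
    """Approximate MySQL-style Soundex: first letter plus digits, not cut to 4."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = [letters[0]]
    last = CODES[ord(letters[0]) - ord("A")]
    for c in letters[1:]:
        code = CODES[ord(c) - ord("A")]
        if code != last:
            last = code
            if code != "0":
                out.append(code)
    # Short results are padded with zeros to at least four characters.
    return "".join(out).ljust(4, "0")

def prefix_match(a, b, length=4):
    """Compare only the first `length` characters, like substring(soundex(x),1,4)."""
    return soundex_long(a)[:length] == soundex_long(b)[:length]

for pair in [("American", "American Airlines"), ("Smith", "Smith John")]:
    print(pair, soundex_long(pair[0]), soundex_long(pair[1]), prefix_match(*pair))
```

In this sketch the American Airlines pair matches on four characters, while the zero padding on the bare "Smith" keeps it from matching "Smith John" unless the prefix is shortened to three.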

You can retrieve and install our latest version of the CallWho AGI script by logging into your server as root and...

Code:
cd /var/lib/asterisk/agi-bin
wget http://bestof.nerdvittles.com/applications/asteridex4/callwho21.tgz
tar zxvf callwho21.tgz
rm callwho21.tgz
 
Just thinking out loud...

Soundex is not perfect. It's a simple phonetic algorithm with all the limitations that come with that. For example, it keeps the first letter of the name unchanged, so Katherine and Catherine don't match using Soundex even though they sound exactly the same.

Most folks won't want to match every single airline in a database when they say American Airlines or Delta Airlines. Using the 4-character soundex, saying American Airlines will match American, American Air, American Air Lines, and American Airlines. I think that's probably the best we can do. I've included a link in my previous post to the current version of the code.
 
Got it now. Try this one...

Code:
cd /var/lib/asterisk/agi-bin
wget http://bestof.nerdvittles.com/applications/asteridex4/callwho21.tgz
tar zxvf callwho21.tgz
rm callwho21.tgz


The way this works is that, if the words spoken (Mary Smith) don't match anything in the database, it turns the words around (Smith Mary) and tries again. Because the reversed lookup only runs when the first one finds nothing, we avoid returning every airline when someone says American Airlines and there is already an actual match.
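The retry logic boils down to something like the following. This is only a Python sketch: it uses a plain case-insensitive comparison where the real script uses the Soundex query, and `find_with_reversal` and the sample names are made up for the example.

```python
def find_with_reversal(spoken, names):
    """Try the words as spoken; only if nothing matches, retry in reverse order."""
    def matches(phrase):
        # Stand-in for the database query; the real lookup is Soundex-based.
        return [n for n in names if n.lower() == phrase]

    words = spoken.lower().split()
    hits = matches(" ".join(words))
    if hits:
        return hits
    return matches(" ".join(reversed(words)))

# Hypothetical directory entries stored as "Last First".
names = ["Smith Mary", "American Airlines"]
print(find_with_reversal("Mary Smith", names))
print(find_with_reversal("American Airlines", names))
```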
 
This is a fun tool.

I am looking at nv-callwho.php and trying to figure out how to edit it so that, instead of using flite or swift, you can use Google's TTS engine.

Thanks
 
Haven't gotten that far yet, but... SOON. One of the issues is that Google breaks up speech into separate files of about 10 words each. I don't think it will be a big deal, but we'll have a look.
 
I tried just swapping in the googletts.agi, but that didn't work as the command structure is different than flite or swift.

I'll fool around a bit more.

Thanks
 
Here's the syntax:

Code:
exten => 444,n,agi(googletts.agi,"Have a nice day! Good bye.",en)

As you can see, this is dialplan code which is the "normal way" that AGI scripts are called. The complexity goes up considerably when you want to call a Perl AGI script from within a PHP AGI script. There's probably a way, but it's gonna be U-G-L-Y. It has memory leak written all over it... even if it works. This might need to await some additional perl magic from lzaf.
 
Thanks for the heads up, now I know that I shouldn't even try.
 
The complexity goes up considerably when you want to call a Perl AGI script from within a PHP AGI script. There's probably a way, but it's gonna be U-G-L-Y. It has memory leak written all over it... even if it works. This might need to await some additional perl magic from lzaf.

Calling an AGI script directly from another AGI script will not work.
You can generally call a dialplan application from an AGI script by using the exec command. This also works for calling AGI scripts (exec "agi" "arguments"). It is definitely not pretty and might lead to some errors, but it is still possible.
 
