Speech applications that improve caller experience

Application Development and Tuning Tools

Dialog Design, Implementation, and Testing

Capturing and expressing recurring dialog patterns in a speech application has always been a challenge. Traditional development tools — graphical or otherwise — are usually poor at expressing these patterns, forcing application developers to extensively duplicate code, leading to applications that are difficult to maintain and evolve.

The Nü Echo Dialog Builder™, an Eclipse-based graphical environment for developing speech applications, was developed specifically to address this challenge. The tool transforms the process of designing and implementing dialogs by orienting it around the identification, definition, and reuse of hierarchical dialog patterns. The result is a very compact design in which any common dialog behavior needs to be described only once.

In addition to its powerful and intuitive visual dialog design capabilities, the Nü Echo Dialog Builder™ provides a complete suite of tools that enable us to efficiently build robust, high-quality applications.
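
The idea of describing a common dialog behavior once and reusing it can be sketched in a few lines of Python. This is only an illustration of the principle, not the Dialog Builder itself: the class `AskWithConfirmation`, its fields, and the `listen` callback (assumed to play a prompt and return the recognized text, or an empty string on no match) are all hypothetical names introduced for this example, and the sketch shows a single pattern rather than a hierarchy of them.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AskWithConfirmation:
    """Hypothetical reusable pattern: ask a question, then confirm the answer."""

    prompt: str
    confirm_template: str  # must contain a "{value}" placeholder
    max_attempts: int = 3

    def run(self, listen: Callable[[str], str]) -> Optional[str]:
        """Return the confirmed answer, or None after max_attempts failures."""
        for _ in range(self.max_attempts):
            answer = listen(self.prompt)
            if not answer:
                # No match or no input: ask again.
                continue
            reply = listen(self.confirm_template.format(value=answer))
            if reply == "yes":
                return answer
        return None


# The retry and confirmation logic is written once; each dialog step
# only supplies its own prompts.
get_city = AskWithConfirmation("Which city?", "Did you say {value}?")
get_date = AskWithConfirmation("Which date?", "I heard {value}. Is that correct?")
```

A change to the retry policy, for example, is then made in one place and applies to every step built from the pattern.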

Grammar Development

Because grammar development is such a critical component of a speech application development project, we have built a complete suite of tools to support this activity, including:

  • Grammar Development Environment (GDE) Eclipse Plugin — The environment features a sophisticated ABNF grammar editor, grammar coverage tools, a parse tree generator, a semantic single-stepper, and an interactive sentence explorer. See Grammar IDE for more information.
  • Grammar templating language and engine — Many speech applications require grammars to be dynamically generated based on run-time information. Based on the ABNF format, to which templating directives have been added, the Nü Echo grammar templating language greatly facilitates the development of dynamic grammars by representing them in a clear and intuitive way and making sure that generated grammars are always syntactically valid. Additional features include:
    • The ability to generate ABNF, XML, or Nuance GSL grammars from the same template;
    • The ability to parse recognition results in the application, thereby enabling the use of additional semantic slots in the application;
    • The ability to run automated dynamic grammar coverage tests using the grammar coverage testing framework.
  • Text normalization framework — Dynamic grammar generation often requires advanced processing of raw input text. The framework makes it possible to match specific patterns and to implement rules for text normalization, acronym expansion, or synonym generation.
  • Grammar coverage testing framework — Grammar coverage tests, for both static and dynamic grammars, can be run every time the application is built in order to verify the integrity and coverage of all grammars in the application. This is particularly important for dynamic grammars, which often build upon a complex infrastructure of grammar templates, programmatic grammar fragments, and normalization logic, any of which could have been accidentally broken.
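
As a rough illustration of how dynamic grammar generation, text normalization, and coverage testing fit together, here is a minimal Python sketch. It does not use the Nü Echo templating language, whose syntax is not described here; the function names (`normalize`, `render_abnf`, `alternatives_of`, `covers`) and the rule name `$contact` are assumptions made for the example, and the coverage check only handles the flat list of alternatives that this sketch itself generates.

```python
import re

# Header of a W3C SRGS grammar in ABNF form.
ABNF_HEADER = "#ABNF 1.0 UTF-8;\nlanguage en-US;\nroot $contact;\n"


def normalize(raw: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", raw.lower())
    return " ".join(text.split())


def render_abnf(names: list[str]) -> str:
    """Render a one-rule grammar whose alternatives are the normalized names."""
    alternatives: list[str] = []
    for name in names:
        phrase = normalize(name)
        if phrase and phrase not in alternatives:
            alternatives.append(phrase)
    if not alternatives:
        # An empty alternative list would not be a valid rule.
        raise ValueError("no usable entries to generate a grammar from")
    body = " |\n    ".join(alternatives)
    return f"{ABNF_HEADER}\npublic $contact =\n    {body};\n"


def alternatives_of(grammar: str) -> list[str]:
    """Read the alternatives back out of a grammar produced by render_abnf."""
    match = re.search(r"public \$contact =(.*?);", grammar, flags=re.S)
    if match is None:
        return []
    return [" ".join(alt.split()) for alt in match.group(1).split("|")]


def covers(grammar: str, sentence: str) -> bool:
    """Coverage check: is the normalized sentence one of the alternatives?"""
    return normalize(sentence) in alternatives_of(grammar)
```

A build-time coverage test could then call `render_abnf` on representative run-time data and assert that `covers` returns true for every sentence the grammar is expected to accept.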

Multilingual Phonetic Pronunciation Management

High performance speech applications require accurate phonetic pronunciations. This is as true for speech recognition — where incorrect pronunciations may result in speech recognition errors — as for text-to-speech — where incorrect pronunciations are likely to result in a call automation failure. Getting accurate pronunciations, however, is not trivial, especially in multilingual environments. To address this challenge, Nü Echo has developed a sophisticated multilingual pronunciation framework that enables us to efficiently produce accurate multilingual pronunciations for large specialized vocabularies.

Call Analysis

The ability to perform in-depth call analysis is critical in order to rapidly identify and fix problems with a speech application. Nü Echo has a full suite of tools to analyze calls, including:

  • The Call Logs Database — The Call Logs Database stores all call information required in order to produce reports and analyze calls. The database is incrementally loaded with new calls using a script that merges application logs with recognition engine logs. Manual transcriptions of caller utterances can also be added to the database.
  • The Call Viewer — The Call Viewer is a graphical application for viewing, analyzing, and annotating calls to any application developed within the Nü Echo VoiceXML application platform or producing logs in a compatible format. It is packaged as a stand-alone Java application that works within the Eclipse framework, and it connects to the Call Logs Database through a JDBC interface. The Call Viewer is used extensively in the Nü Echo Speech Practice to identify potential usability issues with an application and to annotate calls using a set of user-defined tags. These annotations can then be used in the production of reports, for instance to quantify how often certain types of events, problems, or user behavior occur.
  • The Application Reporting Environment — The Application Reporting Environment uses the Call Logs Database in order to produce a variety of reports about application calls.
  • The Transcription Environment — The ability to produce a large volume of high-quality caller utterance transcriptions is a fundamental element of any serious Speech Practice. Utterance transcriptions are required for a number of key activities, such as grammar coverage analysis, off-line speech recognition experiments, SLM training, and application reporting. Our Transcription Environment allows us to efficiently produce large volumes of high-quality transcriptions.
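
The flow described above (incrementally merging application logs with recognition engine logs, annotating calls with tags, and reporting tag frequencies) can be sketched with Python's standard `sqlite3` module. The schema, the log record fields (`call_id`, `turn`, `state`, `hypothesis`, `confidence`), and the function names are assumptions made for this example; the actual Call Logs Database schema is not documented here.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the hypothetical tables used by this sketch."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS turns (
            call_id TEXT, turn INTEGER, state TEXT,
            hypothesis TEXT, confidence REAL,
            PRIMARY KEY (call_id, turn)
        );
        CREATE TABLE IF NOT EXISTS tags (
            call_id TEXT, tag TEXT,
            UNIQUE (call_id, tag)
        );
    """)


def load_calls(conn: sqlite3.Connection, app_log: list, reco_log: list) -> None:
    """Merge application events with recognizer results on (call_id, turn).

    INSERT OR IGNORE keeps the load incremental: turns that are already
    in the database are skipped when the same logs are loaded again.
    """
    reco = {(r["call_id"], r["turn"]): r for r in reco_log}
    for event in app_log:
        key = (event["call_id"], event["turn"])
        match = reco.get(key, {})
        conn.execute(
            "INSERT OR IGNORE INTO turns VALUES (?, ?, ?, ?, ?)",
            (key[0], key[1], event["state"],
             match.get("hypothesis"), match.get("confidence")),
        )
    conn.commit()


def tag_call(conn: sqlite3.Connection, call_id: str, tag: str) -> None:
    """Attach a user-defined tag to a call (at most once per call)."""
    conn.execute("INSERT OR IGNORE INTO tags VALUES (?, ?)", (call_id, tag))
    conn.commit()


def tag_frequency(conn: sqlite3.Connection) -> dict:
    """Report how many calls carry each tag."""
    rows = conn.execute(
        "SELECT tag, COUNT(*) FROM tags GROUP BY tag ORDER BY tag"
    ).fetchall()
    return dict(rows)
```

Keying turns on `(call_id, turn)` is one simple way to make repeated loads safe; a production loader would more likely track which log files have already been processed.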

Performance Tuning — The Nü Echo Workbench

Nü Echo's tuning activities are supported by a full suite of tools, the most important of which are:

  • Batch Speech Recognition Environment — Advanced tuning of a speech application requires the ability to iteratively perform off-line speech recognition experiments with the grammars — static or dynamic — used by an application and analyze the results in order to improve speech recognition accuracy, set confidence thresholds, test post-processing algorithms, etc.
  • Confidence Score Training Framework — Confidence scores are extremely important for implementing effective speech applications. For instance, they are used to decide when a confirmation or a reprompt is required. Unfortunately, the confidence scores produced by commercial speech recognition engines are often far from optimal for a given grammar context. The reason is that these scores are normally designed to produce acceptable results for any type of grammar, which means that they are tuned for no grammar in particular. When field data is available for a given grammar context, it is possible to train significantly better confidence scores, which in turn can substantially improve an application's performance.
  • Automated grammar weights tuning — Sometimes, recognition grammars need to be tuned with grammar weights in order to compensate for frequent recognition problems (e.g., insertions of small words). Manually tuning these grammar weights, however, is a very cumbersome and sub-optimal process. We have developed specialized algorithms that enable us to automatically tune grammar weights in order to maximize speech recognition accuracy.
  • Simulation & Optimization Environment for Response Verification Applications — Speech applications that require validating that the response provided by the user is the expected one (e.g., for identity verification using security questions) are very different from other types of speech applications in that the expected response is known beforehand and the goal of the application is to decide whether or not the caller has indeed provided the response. Tuning such applications, therefore, substantially benefits from a dedicated simulation and tuning environment.
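
One of the simplest uses of field data mentioned above is setting a confidence threshold for a given grammar context. The Python sketch below picks the threshold that minimizes a weighted count of false accepts and false rejects over transcribed recognition results. It is a threshold search, not the Confidence Score Training Framework, and the function name, the sample format, and the default cost values are assumptions made for this example.

```python
def pick_threshold(samples, false_accept_cost=5.0, false_reject_cost=1.0):
    """Choose the accept threshold with the lowest total cost.

    samples: list of (confidence, is_correct) pairs taken from batch
    recognition results scored against manual transcriptions.
    A result is accepted when confidence >= threshold.
    Returns (threshold, cost). Only observed scores are tried as
    thresholds, and ties go to the lowest one.
    """
    if not samples:
        raise ValueError("no field data to tune on")
    best = None
    for threshold in sorted({score for score, _ in samples}):
        # Wrong results that would be accepted at this threshold.
        false_accepts = sum(
            1 for score, ok in samples if score >= threshold and not ok
        )
        # Correct results that would be rejected at this threshold.
        false_rejects = sum(
            1 for score, ok in samples if score < threshold and ok
        )
        cost = (false_accepts * false_accept_cost
                + false_rejects * false_reject_cost)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best
```

The cost weights encode an application decision: accepting a wrong result is usually more expensive than asking the caller to confirm or repeat, which pushes the chosen threshold upward.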