A few years ago, when one picked up the phone to reach an organization, it was mostly to ask simple questions or perform simple tasks. The phone channel was the main doorway to information and services. Over the years, however, digital channels and self-services, whether on the web or mobile, have been absorbing more and more of these requests. Nowadays, the phone channel is often used as a last resort. For example, having a highly interactive conversation is easier on the phone than on digital channels. Or, when a technical issue occurs while using a self-service, one doesn’t have much choice but to pick up the phone to get human assistance to solve the problem.
Consequently, the phone channel now handles more complex problems, and it does so in a larger proportion than before. Since it has become the ultimate channel for these more complex or problematic customer journeys, organizations should strategically pay special attention to their phone channel. No one today would raise doubts about the relevance of offering clients a simple and efficient phone service experience to gain and maintain their loyalty.
The word “conversational” on everyone’s lips
The word “conversational” has been part of the customer experience vocabulary for a few years now; for more on that concept, you may refer to the “Whaddya mean, conversational IVR?” post. Essentially, so-called conversational approaches allow users to express themselves freely by describing their issue in their own words, while supporting a low constraint dialogue structure. The role of the system in the conversation is to collect missing information while adapting to interruptions and topic changes.
When added to interactive voice response (IVR) systems, conversational interfaces revolutionize customer experience by allowing the caller to speak their entire request at once instead of navigating through a maze of menu options. The customer doesn’t need to adapt anymore. Rather, it’s the system that must.
The wake up phone call
For those who have yet to deploy conversational solutions on the phone channel, defining a roadmap for this transition is a crucial step. The strategy must consider potential business benefits, as well as technological risks and organizational impacts. So, how do we get there?
We need a multi-step plan that will allow the organization to progressively assimilate transformations while maximizing benefits. The organization must learn and adjust. Indeed, developing and deploying conversational systems demand that we change the way we define specifications, design the user interface, and develop and test the solution. In addition, natural language understanding (NLU) requires large quantities of user data, that is, recordings of calls along with their transcriptions. This is a new but essential element that we need to take into account during project planning.
First words: natural language call steering
As a first step in the transition towards a conversational IVR, a natural language call steering (NLCS) solution is well worth considering. The goal of such a solution is to allow callers to state the reason for their call in their own words. The system then analyzes and interprets the caller’s request and routes the call to the right destination, that is, either to the appropriate agents or to a self-service module. This replaces menus as the main interface, while keeping them as a useful fallback strategy. Illustrated below is a scenario of a caller interacting with an NLCS solution:
What level of effort is required to deploy an NLCS solution? First, we must collect data. On the one hand, collecting speech data from real clients allows us to understand how they express themselves when they communicate with the organization. On the other hand, an inventory of products and services, internal and external terminology, and the structure of the contact center (agent groups, skills, etc.) will help us understand the business domain.
Secondly, the data will be analyzed so that we can define an intent catalogue, that is, all the categories of customer requests/reasons for calling that can be handled by the system. For vague intents requiring a more precise classification, we will need to define a disambiguation strategy, which consists of defining the questions to ask the caller to help them clarify their intent. Finally, for every specific intent, we will need to determine the right destination, whether an agent group or a self-service, able to respond to the caller’s request.
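To make the idea of an intent catalogue more concrete, here is a minimal sketch in Python. It is only an illustration: the intent names, destinations and prompt wording are hypothetical, and a real catalogue would be derived from the collected data and the contact center structure. The sketch assumes that a vague intent is one with no destination of its own, and that such an intent must carry a disambiguation question.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Intent:
    name: str
    # Agent group or self-service able to handle the request; None for a vague intent.
    destination: Optional[str] = None
    # Question asked to the caller when the intent is too vague to route.
    disambiguation_prompt: Optional[str] = None

    @property
    def is_vague(self) -> bool:
        return self.destination is None


# Hypothetical entries, for illustration only.
CATALOGUE: Dict[str, Intent] = {
    intent.name: intent
    for intent in [
        Intent("billing", disambiguation_prompt="Is this about making a payment or about a charge on your invoice?"),
        Intent("billing.make_payment", destination="self_service:payment"),
        Intent("billing.dispute_charge", destination="agents:billing"),
        Intent("technical_support", destination="agents:tech_support"),
    ]
}


def validate(catalogue: Dict[str, Intent]) -> List[str]:
    """List intents that have neither a destination nor a disambiguation prompt."""
    return [
        f"{intent.name}: no destination and no disambiguation prompt"
        for intent in catalogue.values()
        if intent.destination is None and intent.disambiguation_prompt is None
    ]
```

A check such as `validate(CATALOGUE)` can be run whenever the catalogue changes, so that no intent is left without a way forward for the caller.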
With the data that was gathered during the data collection step, we will then train and tune the speech recognition and NLU modules to successfully recognize the utterances spoken by the callers and correctly classify their intents.
The IVR application for an NLCS is simple. This is partly what makes it an ideal first step in the transition towards conversational IVR. In the flow of such an application, the system asks an initial question, after which it may ask for a confirmation depending on the confidence level of the NLU module. Then, if the request is vague, a disambiguation dialogue may be necessary. Finally, based on the intent identified from the caller’s request, the call will be routed to the right destination.
There are thus very few states in this application. This significantly simplifies design, development, as well as testing.
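The flow described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a reference implementation: the two confidence thresholds, the prompts, the intent names, the destinations and the `ask` and `understand` callables are all hypothetical stand-ins for the telephony platform and the speech recognition and NLU modules.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class NluResult:
    intent: str
    confidence: float  # between 0.0 and 1.0


# Assumed thresholds; real values would be tuned on pilot data.
CONFIRM_BELOW = 0.80
REJECT_BELOW = 0.40

# Hypothetical routing table, disambiguation questions and menu fallback.
DESTINATIONS = {"make_payment": "self_service:payment", "tech_support": "agents:tech_support"}
DISAMBIGUATION = {"billing": "Would you like to make a payment or dispute a charge?"}
FALLBACK = "menu"


def steer_call(ask: Callable[[str], str],
               understand: Callable[[str], NluResult],
               max_turns: int = 3) -> str:
    """Ask, optionally confirm, optionally disambiguate, then route."""
    prompt = "In a few words, how can I help you today?"
    for _ in range(max_turns):
        result = understand(ask(prompt))
        if result.confidence < REJECT_BELOW:
            # Too uncertain to act on: ask again.
            prompt = "Sorry, I didn't get that. How can I help you?"
            continue
        if result.confidence < CONFIRM_BELOW:
            # Medium confidence: confirm before acting.
            answer = ask(f"You are calling about {result.intent}, is that right?")
            if answer.strip().lower() not in {"yes", "yeah", "right"}:
                prompt = "Sorry about that. How can I help you?"
                continue
        if result.intent in DISAMBIGUATION:
            # Vague request: ask a clarifying question on the next turn.
            prompt = DISAMBIGUATION[result.intent]
            continue
        return DESTINATIONS.get(result.intent, FALLBACK)
    # Too many unsuccessful turns: fall back to menus.
    return FALLBACK
```

The whole application fits in a single loop with three decision points, which is one way to see why design, development and testing stay manageable.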
Once the application is completed, we will proceed with a pilot phase. Typically, during the pilot, the new NLCS solution will only be exposed to a fraction of the customer base. This will allow us not only to measure, but also to improve, the application’s performance. Caller utterances will be collected and used to enhance the NLU model. During the pilot, it would also be wise to take the opportunity to survey users on their new experience. Once adjustments are made, it will be possible to make the application available at a larger scale.
Following deployment in production, it is important to periodically retrieve caller data in order to improve the NLU model. These post-deployment optimizations allow the application to adapt to evolving customer requests.
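One possible way to expose the pilot to only a fraction of callers is to hash a caller identifier and compare it to a threshold. The sketch below assumes this approach; it is not described in this post, and the `in_pilot` function, the choice of identifier and the 10% figure are purely illustrative. The benefit of hashing is that a given caller consistently gets the same experience on every call.

```python
import hashlib


def in_pilot(caller_id: str, pilot_fraction: float) -> bool:
    """Deterministically assign a caller to the pilot from a hash of their identifier."""
    digest = hashlib.sha256(caller_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer between 0 and 2**64 - 1.
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < pilot_fraction * 2**64


# Hypothetical usage: send roughly 10% of callers to the new NLCS application.
destination = "nlcs_pilot" if in_pilot("5145550123", 0.10) else "legacy_menu"
```

Raising `pilot_fraction` gradually keeps earlier pilot callers in the pilot while adding new ones, which is convenient when scaling up after adjustments.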
Here is a simplified view of the development process:
Harvesting the fruits
Reducing the number of incorrect transfers is the main benefit of an NLCS solution and will contribute to improving agent utilization. As a consequence, contact center operational efficiency will increase. By the same token, users will feel less frustrated, which will improve not only customer but also agent experience.
The efficiency and user-friendliness of the conversational interface, in conjunction with the fact that clients can express themselves naturally in their own words, are additional factors that contribute to a better customer experience. Since the contact center is the connection between the organization and its customers, a better customer experience translates into increased loyalty.
The new solution will shed light on the actual call reasons. A traditional IVR only reflects what callers select among options that the organization assumes they’re calling about, as opposed to unfiltered natural language utterances that can help identify recurring problems or questions that could be handled by self-services (phone or otherwise). Having this information on hand will also allow us to focus our efforts on optimizing design for the most frequent queries.
The NLCS also offers the possibility of regrouping many phone numbers under a single point of contact. It makes it unnecessary to separate services in order to reduce the complexity and size of the menus. All is available at the first interaction. With a single number, we avoid frustrating callers by forcing them to look for the right number to use or having them use the wrong doorway.
The NLCS allows the organization to gain experience in several aspects of conversational IVR. For example, when working with natural language speech recognition, it is not possible to use deterministic technical specifications, which are what we typically rely on when developing a traditional IVR. Since language is a human phenomenon, it is impossible to anticipate all possible phrasings, which is why we resort to artificial intelligence systems that learn from input data. It is thus necessary to deal with the concept of uncertainty. The system comprises two steps, speech recognition and natural language understanding, each adding a degree of uncertainty to the end result.
Therefore if, before, it was enough to say…
“When the user presses the Y key, the call is transferred to department X”
…we must now describe the behavior as follows:
“When the system recognizes intent Y, the call is transferred to department X”
The way the intent is recognized cannot be formally described; it must be treated as a black box. This impacts not only specifications, but also testing. We must consider the following aspects separately:
- The performance of intent detection, i.e. the combined performance of speech recognition and NLU
- The application behavior following intent detection, in other words, once a given intent is recognized, the application will have to behave in a predetermined manner
This principle will stay relevant throughout the evolution towards a complete conversational solution.
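As a rough illustration of this separation, the Python sketch below evaluates the two aspects with two different kinds of checks. The names `intent_accuracy`, `route` and `ROUTING`, the intents and the destinations are hypothetical. In practice the detector would take call recordings as input; plain strings stand in for them here.

```python
from typing import Callable, Iterable, Tuple


def intent_accuracy(detect_intent: Callable[[str], str],
                    labelled_utterances: Iterable[Tuple[str, str]]) -> float:
    """Share of utterances whose detected intent matches the human label."""
    pairs = list(labelled_utterances)
    correct = sum(1 for utterance, expected in pairs if detect_intent(utterance) == expected)
    return correct / len(pairs)


# Hypothetical routing table: the deterministic part of the application.
ROUTING = {"make_payment": "self_service:payment", "tech_support": "agents:tech_support"}


def route(intent: str) -> str:
    """Predetermined behavior once an intent is known; unknown intents go to a general queue."""
    return ROUTING.get(intent, "agents:general")
```

Intent detection is judged statistically, for example by requiring `intent_accuracy` on a labelled test set to stay above an agreed target, whereas `route` can be tested exhaustively with exact expected results, much like a traditional IVR.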
This is only one example of the many aspects of practice that require particular attention. We will also need to pay attention to performance evaluation reports, audio data collection and storage, the execution of pilot experiments, post-deployment tuning, and the continuous improvement process.
Once the NLCS solution is in place, there are many possible avenues for exploring the conversational approach further. We could, for instance, add a knowledge base query module. This solution would answer the user’s question directly instead of routing the call to an agent who would then need to run the search themselves in order to answer the caller.
It is also possible to offer transactional self-service modules in the phone channel. We could modify the existing self-services and introduce new ones. To help identify the most relevant self-services to add, we could leverage the data obtained through the call steering solution. These self-services will allow users to speak complex requests right off the bat. Contrary to the NLCS, whose dialogue structure is simple, conversational self-services demand a sophisticated dialogue engine that can take charge of mixed initiative dialogues and adequately manage specific dialogue events like digressions, topic changes and corrections (see this two-part post on corrections).
Since many self-services need to know the caller’s identity to execute operations linked to their profile, we will have to implement an identification and authentication module. And to preserve the naturalness of conversational interfaces, it would be sensible to opt for a voice biometrics solution to authenticate the caller. This method, which authenticates a client simply with their voice, is user-friendly, quick and efficient.
Having deployed an NLCS solution will allow the organization to face new challenges with confidence. The experience acquired will help the organization anticipate technical challenges and discover how conversational interfaces can bring value.
About the author: Jean-Philippe Gariépy
Software Architect, Conversational IVR