This post is the third installment of our series on developing a banking conversational IVR using Rasa. In this post, we will explore dialogue management strategies. I will first describe what our dialogue requirements were, then I will explain why we opted for a deterministic approach to implement these requirements. Next, I will describe how this approach was implemented with Rasa.
If you haven’t already done so, I strongly suggest that you read the two previous posts of this series:
Developing Conversational IVR Using Rasa
Developing Conversational IVR Using Rasa Part 2: The Rivr Bridge
Our banking IVR is a self-service application in which a caller can carry out tasks such as paying a bill or getting an account balance. The dialogue includes a loop that lets callers carry out as many tasks as they want.
The application accommodates expert users by allowing them to complete their tasks quickly.
It also adapts to less experienced users by providing a more directed dialogue when needed.
The IVR supports mixed-initiative strategies such as digressions, as well as change requests and corrections.
The dialogue also handles error recovery, including errors that are specific to the voice channel.
This is only an overview of the patterns handled by our dialogue model; several others are already implemented. We are also planning on adding other dialogue strategies, such as cancelling a task or handling more complex corrections and change requests.
Implementation: why we opted for a deterministic approach
Early on, we decided to rely on a deterministic approach for our use cases. The main reasons why machine learning was less suited to our current use cases are as follows:
- Task-related dialogues are predictable and relatively directed
- Tasks can be executed repeatedly and the application behavior should be identical every time
- Tasks should be interrupted and resumed reliably
- Tokens of information that are collected for a given task are not relevant for another
- Tasks are independent and must not interact with each other
In addition, our requirements for error recovery and change management were rather strict.
Our deterministic approach consists of managing actions using a stack. A stack is a last-in, first-out (LIFO) data structure; in other words, the last item added to the stack is the first one to be removed. The action at the top of the stack is the one in focus. When we add an action to the stack, it takes the focus. When the action in focus is completed, it is removed from the stack and the dialogue goes back to the previous action. This allows us to interrupt and resume actions in a predictable and robust manner.
As an example, consider a dialogue in which the user interrupts the ongoing bill payment action (digresses) to ask for a list of their accounts. The account list action is added to the top of the stack and takes the focus. Once the digression to hear the list of accounts is completed, this action is removed from the stack and the focus comes back to the previous action, that is, the bill payment action.
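The digression scenario can be sketched in a few lines of Python. This is a minimal illustration of the stack mechanics only, not the actual implementation; the class name `ActionStack` and the action names `pay_bill` and `list_accounts` are assumptions made for the example.

```python
from typing import List, Optional


class ActionStack:
    """Last-in, first-out stack of action names; the top item is the action in focus."""

    def __init__(self) -> None:
        self._items: List[str] = []

    def push(self, action: str) -> None:
        # The newly added action takes the focus.
        self._items.append(action)

    def complete_current(self) -> Optional[str]:
        # Remove the finished action; the focus returns to the one below it.
        if self._items:
            self._items.pop()
        return self.in_focus

    @property
    def in_focus(self) -> Optional[str]:
        return self._items[-1] if self._items else None


stack = ActionStack()
stack.push("pay_bill")         # the caller starts a bill payment
stack.push("list_accounts")    # the caller digresses to hear their accounts
focus_during_digression = stack.in_focus
focus_after_digression = stack.complete_current()  # digression done, payment resumes
```

Here `focus_during_digression` is `"list_accounts"` and `focus_after_digression` is `"pay_bill"`: the interrupted action is resumed simply because it is the next item on the stack.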
In addition to allowing us to manage digressions elegantly, our approach allows us to define isolated contexts for each action. This ensures that they remain independent.
Next are some additional details on how this was implemented using Rasa.
In Rasa, a policy decides which action to execute next in the dialogue, based on the conversation context (the tracker). Out of the box, Rasa offers a combination of deterministic and machine-learning-based policies. It is also possible to create a custom policy, which is what we did: we implemented a deterministic policy that alternates between waiting for the next user input and triggering an action that serves as the action manager (responsible for managing the action stack). Since we do not predict which action must be executed, our policy does not use stories and does not require a training phase. This is the only policy that we use.
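The alternation between listening and running the action manager can be sketched as follows. This is a simplified illustration written in plain Python so that it runs without Rasa installed; it is not the actual policy. The action name `action_manager` and the helper names are assumptions, and the event dictionaries only imitate the shape of serialized tracker events. In a real Rasa 1.x project, logic of this kind would typically live in the `predict_action_probabilities` method of a custom `Policy` subclass.

```python
from typing import Any, Dict, List

ACTION_LISTEN = "action_listen"      # Rasa's built-in "wait for user input" action
ACTION_MANAGER = "action_manager"    # hypothetical name for the action-manager action


def next_action(last_event: Dict[str, Any]) -> str:
    """Deterministically pick the next action from the most recent tracker event."""
    # A new user message hands control to the action manager ...
    if last_event.get("event") == "user":
        return ACTION_MANAGER
    # ... and once the manager has run, the bot waits for the next input.
    return ACTION_LISTEN


def one_hot(action: str, domain_actions: List[str]) -> List[float]:
    """Express the choice as a vector with 1.0 on the chosen action and 0.0 elsewhere."""
    return [1.0 if name == action else 0.0 for name in domain_actions]


actions = [ACTION_LISTEN, ACTION_MANAGER]
after_user = one_hot(next_action({"event": "user", "text": "pay my bill"}), actions)
after_manager = one_hot(next_action({"event": "action", "name": ACTION_MANAGER}), actions)
```

Because the choice depends only on the last event, there is nothing to learn from stories, which is consistent with a policy that needs no training phase.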
We have created an abstraction to manage the action stack that was described above. The stack is a complex object that we store in an unfeaturized slot. The action to be executed depends on the user’s input as well as the state of the stack.
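One way to keep such a stack in a slot is to represent it as plain, JSON-friendly data. The sketch below is an assumption about how this could be done, not a description of the actual abstraction: the slot name `action_stack`, the helper functions, and the sample payee value are all hypothetical. Each stack entry carries its own context dictionary, so the information collected for one action is not visible to another.

```python
import json
from typing import Any, Dict, List

STACK_SLOT = "action_stack"  # hypothetical slot name


def push(stack: List[Dict[str, Any]], action: str) -> List[Dict[str, Any]]:
    """Return a new stack with `action` on top and its own empty context."""
    return stack + [{"action": action, "context": {}}]


def slot_event(stack: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a slot-setting event in the dictionary form used for tracker events."""
    return {"event": "slot", "name": STACK_SLOT, "value": stack}


stack = push([], "pay_bill")
stack[-1]["context"]["payee"] = "hydro"   # sample value collected for the payment
stack = push(stack, "list_accounts")      # the digression gets a fresh context

event = slot_event(stack)
restored = json.loads(json.dumps(event))["value"]  # survives a JSON round trip
```

After the round trip, `restored` still holds the bill payment entry with its payee and the account list entry with an empty context.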
Custom dialogue patterns
One frequent dialogue pattern is information token collection, or slot filling. This pattern is used, for instance, by the bill payment action. Rasa provides an action for this pattern: the FormAction. However, we needed to support patterns that are more complex than what the FormAction offers, for example slot confirmation when the speech recognition confidence score is low, or a final confirmation at the end of an action. We therefore created a custom class, "Task", that handles these more complex patterns. Some of our actions inherit from this class. We appreciate that Rasa offers the flexibility that we need to implement our own dialogue management strategies.
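To make these patterns concrete, here is a minimal sketch of what a class of this kind might look like. It is an illustration under stated assumptions, not the actual "Task" class: the method names, the step labels, the `PayBill` subclass, and the confidence threshold of 0.6 are all hypothetical. For simplicity, a rejected final confirmation just leaves the task waiting for confirmation again.

```python
from typing import Dict, List, Optional, Tuple

CONFIDENCE_THRESHOLD = 0.6  # hypothetical cut-off for speech recognition confidence


class Task:
    """Collects required slots, confirming low-confidence values and the final result."""

    required_slots: List[str] = []

    def __init__(self) -> None:
        self.values: Dict[str, str] = {}
        self.unconfirmed: Optional[str] = None  # slot waiting for a yes/no answer
        self.confirmed_all = False

    def fill(self, slot: str, value: str, confidence: float) -> None:
        self.values[slot] = value
        # A low recognition score means the value must be confirmed explicitly.
        self.unconfirmed = slot if confidence < CONFIDENCE_THRESHOLD else None

    def confirm(self, accepted: bool) -> None:
        if self.unconfirmed is not None:
            if not accepted:
                del self.values[self.unconfirmed]  # rejected: ask for it again
            self.unconfirmed = None
        else:
            self.confirmed_all = accepted  # answer to the final confirmation

    def next_step(self) -> Tuple[str, Optional[str]]:
        if self.unconfirmed is not None:
            return ("confirm_slot", self.unconfirmed)
        for slot in self.required_slots:
            if slot not in self.values:
                return ("ask_slot", slot)
        if not self.confirmed_all:
            return ("confirm_task", None)
        return ("execute", None)


class PayBill(Task):
    required_slots = ["payee", "amount"]


task = PayBill()
steps = [task.next_step()]                   # nothing collected yet
task.fill("payee", "hydro", confidence=0.4)  # poorly recognized value
steps.append(task.next_step())
task.confirm(True)                           # caller confirms the payee
steps.append(task.next_step())
task.fill("amount", "50", confidence=0.9)    # well recognized value
steps.append(task.next_step())
task.confirm(True)                           # caller confirms the whole payment
steps.append(task.next_step())
```

The recorded `steps` show the payee being asked for and then confirmed because of its low score, the amount being asked for and accepted directly, and a final confirmation before the payment is executed.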
Unit test framework
Since we have implemented our dialogues using a deterministic approach, we were able to build our own unit test framework to test our dialogues. This allowed us to increase our application’s reliability.
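Because the same scripted input always yields the same behavior, a dialogue test can compare the whole sequence of actions in focus against an expected sequence. The sketch below uses Python's standard `unittest` module; it does not reproduce the actual test framework, and the `focus_trace` helper and the scripted turn format are assumptions made for the example.

```python
import unittest
from typing import List, Optional


def focus_trace(turns: List[str]) -> List[Optional[str]]:
    """Replay scripted turns against a stack and record the action in focus after each.

    A turn is either the name of an action to start, or "done" to complete the
    action currently in focus.
    """
    stack: List[str] = []
    trace: List[Optional[str]] = []
    for turn in turns:
        if turn == "done":
            if stack:
                stack.pop()
        else:
            stack.append(turn)
        trace.append(stack[-1] if stack else None)
    return trace


class DigressionTest(unittest.TestCase):
    def test_payment_resumes_after_digression(self) -> None:
        trace = focus_trace(["pay_bill", "list_accounts", "done", "done"])
        self.assertEqual(trace, ["pay_bill", "list_accounts", "pay_bill", None])

    def test_same_script_gives_same_result(self) -> None:
        script = ["pay_bill", "done", "pay_bill", "done"]
        self.assertEqual(focus_trace(script), focus_trace(script))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(DigressionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests of this kind are only meaningful because the dialogue is deterministic; with a learned policy, the expected sequence could change after each retraining.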
Although we have been relying on a deterministic approach to develop our banking use cases, we are also currently experimenting with machine learning to develop dialogues for different use cases.
Here are some of the next items that we will explore:
- Use the Recurrent Embedding Dialogue Policy to support uncooperative user behavior
- Use Rasa X to learn from real conversations
- Create more natural dialogues by using recorded prompts instead of TTS
- Try to integrate machine learning in our deterministic model
We will share our experience in future posts as we move forward. Stay tuned!
About the author: Laurence Dupont
Software Developer, Conversational IVR