Takeaways from the VUX World Live Google Contact Centre AI with Antony Passemard

by Yves Normandin | Mar 19, 2021 | Blog

Last week, Kane Simms and Dustin Coates of VUX World interviewed Antony Passemard, Head of Conversational AI at Google Cloud. The interview focused on the three core components of Google Contact Center AI (CCAI), namely Virtual Agents/Dialogflow, Agent Assist, and Insights, and on how they are made available to customers. Here are a few takeaways.

Dialogflow CX vs. ES

The interview started with a comparison between Dialogflow CX and ES. CX is not just an incremental improvement over ES; it is a complete redesign, with a more powerful and more intuitive dialogue model. It also cleanly separates intents from dialogue, which greatly increases intent reusability and dialogue manageability, and it offers a visual builder that Conversational Architects can easily use to create complex dialogues with less code.

According to Passemard, this had long been requested by many customers. While Dialogflow ES, which Google Cloud will continue to support and improve, is appropriate for simple dialogues, Dialogflow CX should be the platform of choice for longer and more complex dialogues. In addition, Dialogflow CX provides several advantages over ES:

  • More predictable (although not necessarily lower) pricing
  • Several IVR features (including barge-in, DTMF support, timeouts, and retries), which are needed to build a conversational IVR
  • Support for up to 40,000 intents (compared to 2,000 with ES)
  • More collaboration features that enable teams to work on large projects more efficiently
  • Better support for analytics, experiments, and feedback loops
  • A better NLU engine, based on the latest BERT model

Anybody can use Dialogflow today. However, for conversational IVR, integrating Dialogflow with a contact centre platform generally remains a challenge. Most IVR-specific features require tight integration with the IVR platform and depend on events or parameters that the platform must provide to Dialogflow, whether it is to leverage DTMF for use cases other than numerical parameters, or to use incremental no-input event handlers.
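To make the idea of incremental no-input handling concrete, here is a minimal sketch of the bookkeeping an IVR integration might do on its side. The event names are modelled on the `sys.no-input-N` naming pattern of Dialogflow CX built-in events, but the class, the constant, and the choice of three numbered prompts are hypothetical, and whether the platform or the agent keeps the count depends on the integration.

```python
from dataclasses import dataclass

# Assumption: event names follow the "sys.no-input-<n>" pattern, with
# "sys.no-input-default" as the catch-all. Treat them as placeholders.
MAX_NUMBERED_NO_INPUT = 3  # hypothetical: number of distinct reprompts defined


@dataclass
class NoInputTracker:
    """Counts consecutive caller silences so each one can escalate."""

    consecutive: int = 0

    def on_timeout(self) -> str:
        """Return the event name to send after one more silence."""
        self.consecutive += 1
        if self.consecutive <= MAX_NUMBERED_NO_INPUT:
            return f"sys.no-input-{self.consecutive}"
        return "sys.no-input-default"

    def on_caller_input(self) -> None:
        """Reset the counter as soon as the caller speaks or keys a digit."""
        self.consecutive = 0


tracker = NoInputTracker()
events = [tracker.on_timeout() for _ in range(4)]  # four silences in a row
tracker.on_caller_input()                          # caller finally responds
after_reset = tracker.on_timeout()                 # counting starts over
```

The point of the sketch is only that someone has to track the silence count and reset it on input; if the IVR platform does not surface timeouts at all, the agent cannot escalate its reprompts.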

Passemard mentioned that some solutions, such as AudioCodes, can facilitate this integration. Interestingly, he also mentioned that it is best to stream the audio directly to Dialogflow rather than using Google STT to transcribe the audio and send the transcription to Dialogflow. The reason for this is that Dialogflow has an Auto Speech Adaptation feature that automatically optimizes the transcription accuracy based on the agent’s training phrases. That said, our own experience shows that we can often achieve results that are as good or better by streaming the audio directly to Google STT and using speech adaptation. Moreover, it is often necessary to post-process transcription results in order to make them compatible with Dialogflow’s NLU, which is not possible when streaming audio directly to Dialogflow.
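As an illustration of the kind of post-processing we have in mind, the sketch below collapses runs of spoken digits into a single number before the text is passed on to the NLU. It is only an example under stated assumptions: the function name, the digit table, and the rule of collapsing runs of two or more digits are all hypothetical, and a real deployment would tailor the rules to the agent’s entities.

```python
# Hypothetical normalisation table; "oh" is a common spoken form of zero.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def normalize_transcript(text: str) -> str:
    """Collapse runs of two or more spoken digits, e.g. "four two" -> "42".

    Isolated digit words ("one more minute") are left untouched.
    Trailing punctuation on a collapsed run is dropped.
    """
    out: list[str] = []
    run: list[tuple[str, str]] = []  # (original token, digit string)

    def flush() -> None:
        if len(run) >= 2:
            out.append("".join(digit for _, digit in run))
        else:
            out.extend(token for token, _ in run)
        run.clear()

    for token in text.split():
        word = token.lower().strip(".,?!")
        if word in DIGIT_WORDS:
            run.append((token, DIGIT_WORDS[word]))
        elif word.isdigit():
            run.append((token, word))
        else:
            flush()
            out.append(token)
    flush()
    return " ".join(out)


account = normalize_transcript("my account number is four two seven oh one")
untouched = normalize_transcript("I need one more minute")
```

A step like this is only possible when the transcription passes through our own code, which is the case when streaming to Google STT but not when streaming audio straight to Dialogflow.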

Agent Assist for Voice

The next topic covered in the interview was Agent Assist. This is an important topic for at least two reasons. The first is that there are very promising use cases for Agent Assist. The second is that we’ve heard a lot about CCAI Agent Assist in the past couple of years, but it’s been hard to understand exactly how to access this capability. On this last point, Passemard confirmed what we suspected: there is no public API for Agent Assist for voice; Google decided to only make it available through CCAI telephony partners. As mentioned by Simms, this could be a smart business strategy. By working aggressively with telephony partners to integrate Agent Assist with their platforms and reselling only through them, Google may ensure that it becomes the de facto choice for Agent Assist.

The downside, however, is that enterprises are entirely dependent on the contact center vendors’ motivation and ability to make CCAI available to their customer base. It might be a while before many enterprises can leverage CCAI and, when that happens, it might require very expensive upgrades to their contact center infrastructure. For this reason, customers may end up looking for the alternative solutions that will inevitably become available.

This brings me to the Agent Assist use cases. Passemard mentioned that customers did not find it very useful to have relevant documents proposed to agents based on the conversation. Agents don’t want to read through full documents to find the answer to the customer’s needs. They want extractive search, which can automatically extract the relevant portion of a document. And, we heard, that’s coming soon. What is really taking off at the moment, according to Passemard, is the ability to automatically fill in forms in real time with information provided by the caller. That’s really powerful. And, of course, a side benefit of Agent Assist is getting a transcription of every single call.
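To give a feel for what real-time form filling involves, here is a deliberately simple sketch that updates a form as transcript segments arrive. It uses plain regular expressions, which is our own stand-in and not how Agent Assist does it; the field names, patterns, and sample utterances are all hypothetical.

```python
import re

# Hypothetical form fields and patterns, for illustration only.
FIELD_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "order_number": re.compile(
        r"\border\s+(?:number\s+)?(?:is\s+)?#?(\d{6,})\b", re.IGNORECASE
    ),
}


def update_form(form: dict, utterance: str) -> dict:
    """Fill every still-empty field whose pattern matches the utterance."""
    for field, pattern in FIELD_PATTERNS.items():
        if form.get(field) is None:
            match = pattern.search(utterance)
            if match:
                # Use the capture group when the pattern has one,
                # otherwise the whole match.
                form[field] = match.group(match.lastindex or 0)
    return form


# Simulated stream of transcript segments from one call.
segments = [
    "Hi, I'm calling about a delivery.",
    "Sure, my order number is 12345678",
    "You can reach me at jane.doe@example.com",
]

form = {"email": None, "order_number": None}
for segment in segments:
    update_form(form, segment)
```

Fields are filled once and never overwritten here; a production system would also need confidence scores and a way for the agent to correct a wrong value.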

Agent Assist for Chat

Passemard said that Agent Assist for chat has been shown to greatly improve agent productivity, agent satisfaction, and CSAT scores. In particular, Smart Reply and Smart Compose capabilities are provided using predictive models trained on the customer’s data, which makes them much more accurate. Agent Assist for chat is currently only available from chat vendors, but a public API is coming out soon.


Insights

The last CCAI capability mentioned is Insights, which is Google’s name for speech analytics. Insights is still in preview, but the good news is that it will be available to all with a public API. Insights is about understanding conversations that are happening in the contact center. Using Insights, enterprises will be able to look at conversations, index them, search through them, do topic modeling and sentiment analysis, navigate within a conversation, and perform NLU-based phrase matching (e.g., “Give me all conversations with a greeting”). Google will support a SIPREC integration.

Final Notes

Passemard mentioned that Conversational AI is probably the first application of AI that has a massive impact on customers. That’s an intriguing claim; it would be interesting to see some data that supports this. He also concluded by strongly advising against underestimating the value of a good Conversational Architect. We couldn’t agree more. It’s definitely not something you learn in two weeks. The very good ones have years of experience and they are critical to the success of any conversational project.

About the author: Yves Normandin

A leading authority in speech recognition, natural language processing and machine learning, Yves brings over 30 years of experience to the team. His career has included research, product and application development, and business development. Today, he’s responsible for defining the corporate direction and technological vision of Nu Echo, as well as leading our speech platform and building strategic alliances.