Samuel Stern

Blog Post Title

Generative AI, the area of artificial intelligence and machine learning focused on ‘generating’ new content rather than just labelling or structuring existing data, has emerged as the latest buzzword within the AI community. Although some claims about generative AI’s capabilities are exaggerated, it can offer compelling added value across a number of business applications. One of these is improving conversational agents, making automated dialogues with end users more engaging, relevant and personable.

Conversational agents, such as chatbots and digital assistants, are becoming increasingly ubiquitous in our everyday lives, from customer service bots to digital concierges. Unfortunately, most interactions with a conversational agent offer a pretty poor user experience, especially if you need to chat with it for more than one or two back-and-forth messages. Responses often feel generic and don’t really address the nuances of what you’re trying to tell it.

Both the practitioners who build such chatbots and the researchers who study them know this, and they keep looking for ways to build conversational agents that are more engaging and personable. Often the changes they come up with offer only minimal improvements, such as slightly better ways of inferring what the user is requesting, or marginally more accurate information retrieval for answering general-knowledge questions. The underlying process remains largely the same. Recently, however, there have been some significant innovations in AI technology, specifically in generative AI, that are resulting in a paradigm shift for the chatbot and conversational AI landscape. We’re now starting to see the emergence of Generative Conversational AI: the intersection between generative AI and conversational agents. The impact will be systems that offer both a better end-user experience and lower costs for businesses that wish to build and maintain their own chatbots and digital assistants.

So, what exactly is Generative Conversational AI? How does it differ from current methods? And how can I adopt this emerging technology within my business?  

Retrieval-Based

The vast majority of chatbots on the market use what’s known as the retrieval-based approach. The way this works is, when you send a message to a chatbot, it runs through a list of pre-defined ‘permitted’ responses, determines which one seems to fit best, retrieves it, and then sends the response back to you.     
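To make the retrieve-and-reply loop described above concrete, here is a minimal sketch in Python. It is an illustration only: the responses, the trigger phrases, the word-overlap scoring and the 0.2 threshold are all assumptions made up for this example, and commercial chatbot builders typically use more sophisticated intent matching.

```python
import re

# Hypothetical script: each permitted response is paired with example
# phrases that should trigger it. All of this content is invented.
PERMITTED_RESPONSES = {
    "Our opening hours are 9am to 5pm, Monday to Friday.": [
        "what are your opening hours",
        "when are you open",
    ],
    "You can reset your password from the login page.": [
        "i forgot my password",
        "how do i reset my password",
    ],
    "I'm sorry to hear that.": [
        "i'm feeling stressed",
        "i'm having a bad day",
    ],
}
FALLBACK = "Sorry, I didn't understand that."


def tokenize(text):
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def overlap(a, b):
    """Jaccard similarity between two sets of words."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve_response(message, threshold=0.2):
    """Return the permitted response whose trigger best matches the message."""
    words = tokenize(message)
    best_response, best_score = FALLBACK, 0.0
    for response, triggers in PERMITTED_RESPONSES.items():
        for trigger in triggers:
            score = overlap(words, tokenize(trigger))
            if score > best_score:
                best_response, best_score = response, score
    # Anything below the (assumed) threshold is treated as "not understood".
    return best_response if best_score >= threshold else FALLBACK


print(retrieve_response("When are you open?"))
print(retrieve_response("Tell me a joke about penguins"))
```

Note that the bot can only ever say one of the scripted lines or the fallback, which is the source of both the control and the scaling problem that come with this approach.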

With very few exceptions, retrieval-based models dominate the market. Part of this is because, until recently, they were the only available option: generating syntactically and semantically correct natural language on the fly is not an easy thing to teach computers to do. That’s not to say they’re all bad. Retrieval-based chatbots do have some distinct advantages.

For one, they are very convenient for PoCs and demos. Low-/No-code chatbot builders like Landbot and Drift let you build and deploy a basic chatbot in minutes, which is great for testing whether it’s a worthwhile investment for your business before investing hundreds of thousands, or even millions, of dollars into the product.   

Another advantage of retrieval-based chatbots for businesses is that, by restricting the AI’s vocabulary to a set of pre-defined messages, you maintain a great deal of control. It can’t simply go ‘off script’ or say something completely unexpected. A high-profile example of what can happen without that control was Microsoft’s Tay which, back in 2016, started sending racist and sexist tweets after learning to mimic what Twitter users were sending it.

The problem is that retrieval-based chatbots don’t scale well. Keeping track of all the different phrases you want your chatbot to be able to say, and when to say them, is easy enough at first, but it quickly grows in complexity (exponentially, in fact) as you add more scenarios. This is a major reason why so many people complain that chatbot interactions are too impersonal. Teaching a bot to say generic phrases that are appropriate in multiple contexts is much easier than trying to keep track of a long list of messages, each of which might only be applicable in a very specific scenario. The result, however, is an interaction where the user really doesn’t feel like they are being listened to.

The incongruity between businesses making the effort to scale up their responsiveness to stakeholders and those very same stakeholders feeling invalidated as a result of those efforts has negative repercussions for everyone. Bringing generative AI technology into conversational agents, however, offers a viable solution.

Generative

Generative Conversational AI removes the limitation of chatbots relying on a pre-defined list of allowed messages. It can create new messages in real time whilst incorporating elements of what the user has just said. The responses generated share attributes with the original message sent by the user, so they come across as more personable and personalised. For example, if I were to type, “I’m feeling very stressed because work has been really busy lately”, a generative conversational AI might respond with something like, “I’m sorry to hear that your work is causing you stress”. In contrast, if I were to send the same message to a retrieval-based chatbot, I would be likely to receive the more generic “I’m sorry to hear that”, because the chatbot has to rely on a limited set of responses that are applicable to a wide range of user utterances.

It is not surprising, then, that generative conversational AIs are very much seen as the ‘next generation’ of conversational agents. In fact, they are part of a much broader trend we’re seeing within AI towards generative models, such as AI-generated art and music. The title image of this post, for instance, was created using OpenAI’s DALL·E. While the concept of generative conversational AI has been around for a while, especially within academia, until recently the models haven’t been reliable enough to put into production. This is part of the reason why the chatbots that we see in industry are retrieval-based. People couldn’t be certain that AI-generated text would be syntactically correct, let alone semantically coherent.

This is changing, though. Large foundational language models (LLMs) like GPT-3 and LaMDA, while not dialogue systems themselves, have demonstrated their ability to generate high-quality text that is (almost) indistinguishable from what a human could come up with. We’re already seeing them being used in other Natural Language Processing (NLP) tasks such as translation and text summarisation. The next generation of conversational AI systems, such as the recent ChatGPT, build on top of LLMs by exploiting their ability to create very accurate and highly contextual text without the need to manually maintain a script for the AI to follow.

All that said, the inherent power of generative conversational AI presents its own unique set of challenges for those considering this technology in their own setting. LLMs can generate coherent text across a wide range of subjects and in different styles. Generative conversational AI built on top of LLMs can therefore, to a large extent, hold a domain-agnostic dialogue. That sounds good in theory, but most of the time you want to put some restrictions on the scope. After all, you wouldn’t want your sales bot to be giving out medical advice.

So how exactly do you restrict your conversational agent so that it can have robust dialogues about the subjects you want it to, without veering into areas outside your intended scope? Or similarly, how do you teach it to follow the same procedural steps for providing guidance, support or advice that you would follow in conversations with your customers or clients? As is inevitably the case with AI, the solutions lie in the data. Generative AIs learn behaviour from the examples they’ve seen. So, to control your AI’s responses, you need to control the data it is exposed to, so that it only learns to replicate the good behaviour and not the bad.
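As a rough illustration of what ‘controlling the data’ could look like in practice, the Python sketch below keeps only the example conversations that fall within the intended scope before they are used for training. The data format, the topic labels and the function name are all hypothetical, and real data curation would usually involve far more than a single filter.

```python
from dataclasses import dataclass


@dataclass
class DialogueExample:
    """One example exchange, labelled with a topic (hypothetical format)."""
    topic: str
    user: str
    agent: str


# Assumed scope for a sales bot; anything else is kept out of training.
ALLOWED_TOPICS = {"pricing", "orders"}


def select_training_examples(examples, allowed_topics=ALLOWED_TOPICS):
    """Keep only the examples whose topic is inside the intended scope."""
    return [ex for ex in examples if ex.topic in allowed_topics]


# Invented sample data for illustration.
examples = [
    DialogueExample("pricing", "How much is the premium plan?",
                    "I can walk you through our plans."),
    DialogueExample("medical", "What should I take for a headache?",
                    "You could try a painkiller."),
    DialogueExample("orders", "Where is my order?",
                    "Let me check the status of your order."),
]

in_scope = select_training_examples(examples)
print([ex.topic for ex in in_scope])
```

Here the medical exchange is dropped, so a model trained on the remaining examples never sees the bot giving out medical advice.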

How do I adopt a Generative Conversational AI?

When it comes to the practicalities of building conversational agents, the difference between retrieval-based and generative approaches is that of people vs data. It’s the difference between teaching by explicit instruction vs teaching by giving examples.   

From a practical implementation perspective, instead of needing to hire expensive teams of specialists to manage increasingly complex retrieval-based chatbots, businesses wishing to leverage the power of Generative AI can instead focus their resources on procuring domain-specific data. Having the right type of data in the right format is essential to producing a Generative Conversational AI that is capable of replicating domain-specific tasks and services. 

At Affiniti AI, we’ve been working on one such tool. Affiniti is a platform that lets you easily procure high-quality, domain-specific conversational data and seamlessly train a generative conversational AI chatbot to replicate and automate your services.

If you want to learn more about how you can create your own Generative Conversational AI then come check us out at www.affiniti.ai