Conversational systems, aka chatbots, are starting to become mainstream – here’s why you should stay ahead of the game:
The shape-shifting of the yin-yang between humans and technology is one of the hallmarks of digital technologies, but it is perhaps most pronounced and most exploited in the area of Conversational Systems. But to truly appreciate conversational systems, we need to go back a few steps.
For the longest part of the evolution of information technology, the technology has been the unwieldy and intransigent partner, requiring humans to contort in order to fit. Mainframe and ERP systems were largely built to defend the single version of truth and cared little for the experience. Cue hours of training, counter-intuitive interfaces, clunky experiences, and flows designed by analysts, not designers. Most of us who have lived through the many ages of this type of IT will have experienced this first hand. If these systems were buildings, they would be warehouses and fortresses, not homes or palaces. Too bad if you didn’t like it. What’s ‘like’ got to do with it? (As Tina Turner might have sung!)
Digital technology started to change this model. Because of its roots in consumer technology rather than the enterprise, design and adoption were very much the problem of the providers. This story weaves its way through the emergence of the web and social media, and culminates with the launch of the iPhone. There is no doubt – the iPhone made technology sexy. To extend the oft-quoted NASA analogy, it was the rocket in your pocket! With the emergence of the app ecosystem and the broadband internet that was key to Web 2.0, a whole new ingredient suddenly entered the technology cookbook – emotion! Steve Jobs didn’t just want technology to be likable, he wanted it to be lickable.
The balance between humans and technology has since been redressed significantly – apps and websites focus on intuitiveness, and on moulding the process around the user. It means that to deal with a bank, you don’t have to fit around the bank’s convenience of time and place, or follow its processes by filling in a lifetime’s worth of forms. Instead, banks work hard to make it work for you. And you want it 24/7 – on the train, at bus stops, in the elevator and before you get out from under your blanket in the morning. And the banks have to make that happen. The mouse has given way to the finger. Humans and technology are ever closer. This was almost a meeting of equals.
But now the pendulum is swinging the other way. Technology wants to make it even easier for humans. Why should you learn to use an iPhone or figure out how to install and manage an app? You should just ask for what you want, the way you would in any other situation, and technology should do your bidding. Instead of downloading, installing and launching an app, you should simply ask the question in plain English (or a language of your choice) and the bank should respond. Welcome to the world of Conversational Systems. Ask Siri, ask Alexa, or Cortana, or Google, or Bixby. But wait – we’ve gotten ahead of ourselves again.
The starting point for conversational systems is a chatbot. And a chatbot is an intelligent tool. Yes, we’re talking about AI and machine learning. Conversational systems are one of the early and universal applications of artificial intelligence. But it’s not as simple as just calling it AI. There are actually multiple points of intelligence in a conversational system. How does a chatbot work? Well, as a user, you just type as though you were chatting with a human, and you get human-like responses back in natural language. Your experience is no different from talking to another person on WhatsApp or Facebook Messenger, for example. The point here is that you are able to ‘speak’ in a way that you are used to, and the technology bends itself around you – your words, expressions, context, dialect, questions and even your mistakes.
Let’s look at that in a little more detail. This picture from Gartner does an excellent job of describing what goes into a chatbot:
The user interface is supported by a language processing and response generation engine. This means that the system needs to understand the user’s language. And it needs to generate responses that linguistically match the language of the user, and often be cognizant of the mood. There are language engines for this, such as Microsoft’s LUIS or Google’s natural language processing tools.
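To make the division of labour concrete, here is a toy stand-in for an intent-detection engine. Real engines such as LUIS use trained statistical models rather than keyword lists, but the contract is broadly the same: text in, intent plus confidence out. The intents and keywords below are invented purely for illustration.

```python
# A toy stand-in for an NLU engine: real engines use trained models,
# but the contract is the same -- text in, intent plus score out.
def detect_intent(utterance):
    """Return (intent, confidence) for a user utterance."""
    intents = {
        "pay_bill":        ["pay", "bill", "payment"],
        "change_password": ["password", "reset", "login"],
        "make_complaint":  ["complaint", "unhappy", "refund"],
    }
    words = utterance.lower().split()
    best_intent, best_score = "unknown", 0.0
    for intent, keywords in intents.items():
        hits = sum(1 for w in words if w in keywords)
        score = hits / len(keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent, best_score

detect_intent("I want to pay my bill")       # -> ("pay_bill", ...)
detect_intent("please reset my password")    # -> ("change_password", ...)
```

An utterance that matches nothing comes back as "unknown" – which is exactly where the graceful-failure behaviour discussed below takes over.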
Behind this, the system needs to understand the user’s intent. Is this person trying to pay a bill? Change a password? Make a complaint? Ask a question? It also needs to qualify the question or issue, understand its urgency, and so on. The third key area of intelligence is contextual awareness. A customer talking to an insurance company from a flood-hit area has a fundamentally different context from a new prospect, though they may be asking the same question: ‘does this policy cover xxx?’. And of course, the context needs to be maintained through the conversation – an area which Amazon Alexa is only just fixing now. When you say ‘Alexa, who was the last president of the US?’ and Alexa answers ‘Barack Obama’, and you then ask ‘how tall is he?’, Alexa doesn’t understand who ‘he’ is, because it hasn’t retained the context of the conversation.
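That Alexa example can be sketched in a few lines: a toy conversation object that retains the last entity mentioned, so a follow-up pronoun can be resolved. The hard-coded question and answer are purely illustrative stand-ins for a real knowledge lookup.

```python
# A minimal sketch of conversational context: the bot remembers the last
# entity mentioned, so a follow-up pronoun like 'he' can be resolved.
class Conversation:
    def __init__(self):
        self.last_entity = None

    def ask(self, question):
        words = question.lower().strip("?!. ").split()
        if question.lower().startswith("who was the last us president"):
            self.last_entity = "Barack Obama"   # stand-in for a real lookup
            return self.last_entity
        if "he" in words and self.last_entity:
            # The pronoun is resolved against the retained context.
            return f"You're asking about {self.last_entity}."
        return "I don't have enough context to answer that."

chat = Conversation()
chat.ask("Who was the last US president?")   # remembers the entity
chat.ask("How tall is he?")                  # resolved via retained context
```

Without the stored `last_entity`, the second question is unanswerable – which is precisely the failure mode described above.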
And finally, the system needs to connect to a load of other systems to extract or enter data. And needless to say, when something goes wrong, it needs to ‘fail gracefully’: such as “Hmm… I don’t seem to know the answer to that. Let me check…” rather than “incorrect command” or “error, file not found”. These components are the building blocks of any conversational system. Just as with any AI application, we also need data to train the chatbot, or to allow it to learn ‘on the job’. One of the challenges with the latter approach is that the chatbot is prone to the biases of its training data, and real-time data may well carry biases, as Microsoft discovered with its Twitter-based chatbot.
We believe that chatbots should be modular and very narrow in scope. You need to think of a network of chatbots, each doing a very small and focused task. One chatbot may just focus on verifying the customer’s information and authenticating her. Another may just do password changes. As far as the user is concerned, though, they may not know they’re communicating with many bots. The network of bots, therefore, acts as a single entity. We can even have humans and bots working in the same network, with customers moving seamlessly between bot and human interactions depending on the state of the conversation. In fact, triaging the initial conversation and deciding whether a human or a bot should address the issue is itself something a bot can be trained to do. My colleagues have built demos of bots which can walk a utility customer through a meter reading submission, for example, and also generate a bill for the customer.
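A bot network of this kind can be sketched as a simple dispatcher: a triage function routes each inbound message to a narrow, single-purpose bot, with a human hand-off as the default. The bot names and routing keywords below are purely illustrative.

```python
# Sketch of a bot network: a triage bot routes each message to a narrow,
# single-purpose bot; the user sees one seamless conversation.
def auth_bot(msg):
    return "Please confirm the last four digits of your account."

def password_bot(msg):
    return "I can reset that. Check your email for a link."

def human_handoff(msg):
    return "Let me connect you to a colleague."

def triage(msg):
    """Pick the right handler for an inbound message."""
    text = msg.lower()
    if "password" in text:
        return password_bot(msg)
    if "verify" in text or "authenticate" in text:
        return auth_bot(msg)
    return human_handoff(msg)   # anything unrecognised goes to a person
```

The user only ever calls `triage`, so the network of bots behaves as a single entity – exactly the property described above.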
Bots are, in themselves, individual micro-apps trained to perform specific tasks. You can have a meeting room bot which just helps you find and book the best available meeting room for your next meeting. Or a personal assistant bot that just manages your calendar, such as x.ai. We are building a number of these for our clients. Bots are excellent at handling multi-modal complexity – for example, when the source of complexity is that there are many sources of information. The classic case is five people trying to figure out the best time to meet, based on their calendars. As you well know, this is a repetitive, cyclical, time-consuming and often frustrating exercise, with dozens of emails and messages exchanged. This is the kind of thing a bot can do very well: identify, say, the three best slots that fit everybody’s calendars, keeping travel and distances in mind. Chatbots are just a special kind of bot that can also accept commands and generate responses in natural language. Another kind is the mailbot, which can read an inbound email, contextualise it, and generate a response while capturing the relevant information in a data store. In our labs we have examples of mailbots which can respond to customers looking to change their address, for example.
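The scheduling task reduces to a small interval-intersection problem. A minimal sketch, assuming calendars are just sets of busy hours within a 9-to-5 workday (real calendars, travel time and preferences would make this richer, not different in kind):

```python
# Sketch of the scheduling task: find slots where every attendee is free.
# Calendars are sets of busy hours on a single day (assumed 9-17 workday).
def best_slots(busy_calendars, meeting_hours=1, limit=3):
    """Return up to `limit` start hours that suit everyone."""
    workday = range(9, 17)
    free = []
    for start in workday:
        needed = range(start, start + meeting_hours)
        # The slot must fit in the workday and clash with nobody's calendar.
        if all(h < 17 for h in needed) and all(
            not any(h in busy for h in needed) for busy in busy_calendars
        ):
            free.append(start)
    return free[:limit]

calendars = [{9, 10, 13}, {11, 13, 14}, {9, 14, 16}]
best_slots(calendars)   # -> [12, 15]: the hours when all three are free
```

What takes five people dozens of emails is, for a bot, one pass over the calendars.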
Coming back to chatbots, if you also add a voice – i.e. a speech-to-text engine – to the interface, you get an Alexa or Siri kind of experience. Note that we’re now adding yet more intelligence, which needs to recognise spoken words, often against background noise, and with a range of accents (yes, including Scottish ones). Of course, when it’s on the phone, there are many additional cues to the user’s context. The golden mean lies in the space between recognising context and making appropriate suggestions, without making users feel that their privacy is being compromised. Quite apart from the intelligence, one of the real benefits for users is often the design of a guided interface that walks them step by step through what might be a daunting set of instructions or forms, or a complex transaction – such as an insurance claim or a mortgage quote.
Gartner suggests that organisations will spend more on conversational systems in the next three years than they do on mobile applications. This would suggest a shift to a ‘conversation first’ interface model. There are already some excellent examples of early movers here. Babylon offers a conversational interface for providing initial medical inputs and is approved by the NHS. Quartz delivers news using a conversational model. You can also build conversational applications on Facebook to connect with customers and users. Chatbots are even being used to target online human trafficking. Needless to say, all those clunky corporate systems could well do with more conversational interfaces. Imagine just typing “TravelBot – I need a ticket to Glasgow on Friday the 9th of February. Get me the first flight out from Heathrow and the last flight back to either Heathrow or Gatwick. The project code is 100153.” – and sitting back while the bot pulls up options for you, and even asks whether you need to book conveyance.
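Under the hood, the first job of such a TravelBot would be to pull structured fields out of the free-text request. A crude sketch using regular expressions – a real system would use an NLU engine for this, and the field names here are illustrative assumptions:

```python
import re

# Sketch: extract structured fields from a free-text travel request.
# A production bot would use a trained NLU engine, not regexes.
def parse_travel_request(text):
    request = {}
    dest = re.search(r"ticket to (\w+)", text)
    date = re.search(r"on (\w+ the \d+\w* of \w+)", text)
    code = re.search(r"project code is (\d+)", text, re.IGNORECASE)
    if dest:
        request["destination"] = dest.group(1)
    if date:
        request["date"] = date.group(1)
    if code:
        request["project_code"] = code.group(1)
    return request

msg = ("TravelBot - I need a ticket to Glasgow on Friday the 9th of "
       "February. The project code is 100153.")
parse_travel_request(msg)
# -> {'destination': 'Glasgow', 'date': 'Friday the 9th of February',
#     'project_code': '100153'}
```

Once the request is structured like this, the rest is the familiar territory of flight search and booking APIs.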
Conversational systems will certainly make technology friendlier. They will humanise it in ways we have never experienced before. I often find myself saying please and thank you to Alexa, and we will increasingly anthropomorphise technology via the nicknames we give these assistants. You may already have seen the movie “Her”. We should expect that this will bring many great new ideas and brilliant solutions, and equally pose new social and psychological questions. Consider, for example, the chatbot that is designed just for conversation – somebody to talk to when we need it. We often talk about how AI may take over the world and destroy us. But what if AI just wants to be our best friend?
My thanks to my colleagues for all the discussions which have sharpened my thinking on this – especially Anantha Sekar, who is my go-to person for all things chatbot.
My book: Doing Digital – Connect, Quantify, Optimise – is available here, for the price of a coffee!
As with all my posts, all opinions here are my own – and not reflective of or necessarily shared by my employers.