Blog · 07 Sep 2018

Is voice the key interface in a 'post-PC' world?

As we wave goodbye to the mouse and keyboard, will voice-based technology step in to fill the gap?

Nicola Milard
Principal Innovation Partner

In the next few years, it’s likely that we will increasingly be talking to our technology, our houses, our cars.

Voice is the way that we talk to each other, so it seems reasonable to expect that it will become the key way in which we talk to our technologies.

We may be entering a “post-PC” era, as we wave goodbye to the mouse and the keyboard. The smartphone is our window on the world today, but it is being joined by voice-driven smart home devices, smart speakers, smart cars, augmented reality, virtual reality and a myriad of connected devices with microphones. None of these lend themselves well to keyboards, so voice may be the new way that we interact with things.

BT and Cisco’s recent global ‘Chat, Tap, Talk’ research found that 28 per cent of customers thought a voice-based chatbot would be an effective way of interacting with a company (versus 37 per cent for a text-based one). Seventy-three per cent thought that chatbots would improve the customer experience, particularly for simple interactions such as getting train times, meter readings and flight check-in.

The art of conversation

The idea of conversational user interfaces has been around since Alan Turing proposed the Turing Test in 1950. But, despite much recent success in natural language processing, communication between human and machine is still in its infancy. Access to massive amounts of data, as well as processing power in the cloud, has meant that voice technologies can start to have more meaningful interactions – even though they are far from being scintillating conversationalists.

This is because language is one of the world’s biggest and most tangled data sets. For a proper conversation to take place, the machine needs to be able to answer back – because conversation is a two-way affair. This requires a surprisingly complex set of skills including speech recognition, speech synthesis, semantic analysis, syntactic analysis, sentiment analysis, common sense, real-world understanding and real-world knowledge. All these have to work together seamlessly to create what we know as conversation.

The nirvana of “ask me anything” is still a long way off.

Ease of use

One of the big factors driving the growth of voice is that it can be fast and easy to use – especially for specific task-based activities. We can speak 150 words per minute but type only 40. According to Forrester, 73 per cent of routine phone interactions are just a quick glance to check notifications, see the time, or read or send a text. This kind of simple interaction is ideal for voice because it is emotionally neutral and doesn’t typically require much sophistication in terms of the language used – “turn on the lights” or “turn up the temperature” are very specific commands that don’t need much understanding. However, if commands get more complex – e.g. “turn on the lights on the top landing” – it might be easier to flick a switch.

“I’m sorry, I don’t understand that!”

Those hated words are becoming the bane of customers’ lives.

This is why humans can’t always be cut out of the loop. Seventy-four per cent of customers in our ‘Chat, Tap, Talk’ research believed that chatbot responses needed human ‘checks and balances’, which means that voice devices need a route into more conventional human channels such as the contact centre. As a result, voice assistants may increase the number of voice calls coming into the contact centre.
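To make the idea concrete, here is a minimal sketch in Python of a voice assistant that handles only very specific commands and passes everything else to a human channel rather than stopping at “I don’t understand”. It assumes the speech has already been turned into text. The command table, the action names and the `handle_utterance` function are all hypothetical, made up for illustration, and do not come from any real voice assistant.

```python
# Hypothetical command table: the phrases come from the examples in this post,
# the action names are invented for illustration.
COMMANDS = {
    "turn on the lights": "lights_on",
    "turn up the temperature": "temperature_up",
}


def handle_utterance(text: str) -> str:
    """Map a transcribed utterance to an action, or hand off to a human."""
    # Lower-case, collapse repeated spaces and drop trailing punctuation
    # so that "Turn on the lights!" matches the table entry.
    normalised = " ".join(text.lower().split()).rstrip(".!?")
    action = COMMANDS.get(normalised)
    if action is not None:
        return action
    # Anything the table does not cover is routed to a human channel
    # (assumed here to be the contact centre) instead of a dead end.
    return "handoff_to_contact_centre"
```

With this sketch, “Turn on the lights!” maps to the lights action, while “turn on the lights on the top landing” is not recognised and is handed to a person. A real system would need far more than an exact-match table, which is rather the point.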

Do these systems need to pretend to be human?

We tend to interact with the world in human terms. As a result, we very easily project a human personality onto our technology. But should we bestow human characteristics, or a personality, on these virtual agents? Should we give them human names – Alexa, Siri, Cortana? Should we, as in the controversial recent demonstration of Google Duplex, build in the “ums”, “errs” and “uh-huhs” which punctuate human conversation?

If we think the voice assistant is human, it could create a dangerous belief that we can say anything to it and it will understand. This can result in frustration, which leads to the high levels of drop-off in use of the current set of personal voice assistants – particularly when we hit the “I don’t understand” dead end.

It is undoubtedly early days for voice assistants. But they show potential as a natural, easy and effective way of engaging with technologies which are gradually disappearing into the fabric of our everyday lives. They don’t need a line in witty banter; they just need to work.
