Types of Voice Assistants
--
Before diving into the different types of voice and chat assistants, I’d advise having a quick read about the fundamentals of Voice User Interface (VUI) design.
More complex key terms, such as Automatic Speech Recognition (ASR), Natural Language Processing (NLP), Natural Language Understanding (NLU), Natural Language Generation (NLG), Text-to-Speech (TTS), and Speech-to-Text (STT), will be explained in another article.
However, for now, we’ll simply refer to the back-end system of a voice assistant as its engine.
Most Common Voice Assistants
Rule-Based Bots
These bots are frequently used to fulfil intents with a narrow scope, for example “How can I retrieve my password?”. Answering this question doesn’t require a complex algorithm; the response can be scripted in the back-end.
However, the design process can still be fairly complex, as it involves a fair amount of analysis and the production of dialog flows (or conversation paths). Once you’ve built a dialog flow, you can validate it by filling in its elements with examples before testing the flow with users.
Text-based bots, on the other hand, are often tied to specific parts of an application. For example, the bot that appears while a user is browsing items on an e-commerce site could be different from the one that pops up during the check-out process, because the scope of each bot is limited to one part of the service. Nonetheless, complex rule-based bots can be used throughout the entire application.
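To make the idea of a scripted engine more concrete, here is a minimal sketch of a rule-based bot in Python. Everything in it is an assumption made for illustration: the intent names, the keywords, the replies, and the simple keyword-overlap matching are hypothetical and not taken from any particular platform.

```python
import re

# Hypothetical scripted rules: each intent has trigger keywords and a fixed reply.
RULES = [
    {
        "intent": "reset_password",
        "keywords": {"password", "reset", "retrieve"},
        "reply": "You can reset your password by choosing 'Forgot password' on the login page.",
    },
    {
        "intent": "opening_hours",
        "keywords": {"open", "opening", "hours"},
        "reply": "We are open from 9am to 5pm, Monday to Friday.",
    },
]

# Reply used when no rule matches the utterance.
FALLBACK = "Sorry, I didn't catch that. Could you rephrase?"


def match_intent(utterance):
    """Return the rule sharing the most keywords with the utterance, or None."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    best_rule, best_score = None, 0
    for rule in RULES:
        score = len(words & rule["keywords"])
        if score > best_score:
            best_rule, best_score = rule, score
    return best_rule


def respond(utterance):
    """Return the scripted reply for the matched intent, or the fallback."""
    rule = match_intent(utterance)
    return rule["reply"] if rule else FALLBACK


print(respond("How can I retrieve my password?"))
print(respond("Tell me a joke"))
```

The first call matches the password rule, while the second falls through to the fallback reply. A real dialog flow would have many more branches, but the principle is the same: every answer is written by hand.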
When building and using rule-based bots, analytics are extremely important. As a VUI designer, you will have to prioritise user queries by figuring out which questions are asked most frequently, which frustrate the user, and which block the user from completing a task.
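As a rough illustration of that prioritisation, the sketch below ranks intents from an interaction log. The log format, the field names (`frustrated`, `completed`), and the weights are all assumptions made for this example; in practice you would tune them to your own analytics data.

```python
from collections import Counter

# Hypothetical interaction log: one entry per user query.
LOG = [
    {"intent": "reset_password", "frustrated": False, "completed": True},
    {"intent": "reset_password", "frustrated": False, "completed": True},
    {"intent": "reset_password", "frustrated": False, "completed": True},
    {"intent": "track_order", "frustrated": True, "completed": False},
    {"intent": "track_order", "frustrated": False, "completed": True},
    {"intent": "cancel_order", "frustrated": False, "completed": False},
]


def prioritise(log, w_frequency=1, w_frustration=2, w_blocked=3):
    """Score each intent by how often it is asked, how often it frustrates
    the user, and how often it blocks the user from completing a task.
    The default weights are arbitrary illustrative values."""
    scores = Counter()
    for event in log:
        scores[event["intent"]] += w_frequency
        if event["frustrated"]:
            scores[event["intent"]] += w_frustration
        if not event["completed"]:
            scores[event["intent"]] += w_blocked
    # Highest score first: these are the queries to look at first.
    return scores.most_common()


for intent, score in prioritise(LOG):
    print(intent, score)
```

With this sample log, `track_order` ranks above `reset_password` even though it is asked less often, because it frustrated and blocked a user.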
Artificial Intelligence Assistants
These assistants are capable of analysing the user, answering complex questions, and predicting the user’s behaviour. Rather than running on hard-coded rules, artificial intelligence is about producing its own rules through learning. To produce these rules, the Artificial Intelligence (AI) system is given instructions and a set of training data. As a consequence, AI-powered assistants can complete tasks that would be impossible using traditionally scripted algorithms. Improving an AI-powered assistant requires data, and a lot of it.
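To show the difference with hand-written rules, here is a toy sketch in which the mapping from words to intents is derived from labelled examples instead of being scripted. It is a deliberately simple word-frequency model, nowhere near what a production assistant uses, and the training utterances and intent names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical labelled training data: (utterance, intent).
TRAINING_DATA = [
    ("I forgot my password", "reset_password"),
    ("how do I reset my password", "reset_password"),
    ("I cannot log in to my account", "reset_password"),
    ("where is my order", "track_order"),
    ("track my parcel", "track_order"),
    ("when will my package arrive", "track_order"),
]


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count how often each word appears in the examples of each intent."""
    word_counts = defaultdict(Counter)
    for utterance, intent in examples:
        word_counts[intent].update(tokenize(utterance))
    return word_counts


def predict(model, utterance):
    """Pick the intent whose training vocabulary best matches the utterance."""
    words = tokenize(utterance)
    scores = {}
    for intent, counts in model.items():
        total = sum(counts.values())
        # Relative frequency of each word within this intent's examples.
        scores[intent] = sum(counts[word] / total for word in words)
    return max(scores, key=scores.get)


model = train(TRAINING_DATA)
print(predict(model, "I lost my password"))
print(predict(model, "when does my parcel arrive"))
```

Neither test sentence appears in the training data, yet the model still picks a plausible intent. Adding more examples changes its behaviour without anyone rewriting a rule, which is why data matters so much for this kind of assistant.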
These machines learn to comprehend and answer complex queries, and are even capable of imitating human voices so well that it is almost impossible to tell the difference between real and artificial. A famous example is Google Duplex, demonstrated by Google’s CEO Sundar Pichai at Google I/O: a technology that uses natural language conversations over the phone to carry out tasks, such as making a reservation at a restaurant.
AI can be extremely useful in the field of VUI design but, as mentioned before, it requires enormous sets of data because it relies on an ongoing learning mechanism. For example, as we grow older, our faces and voices gradually change. Recognition applications therefore keep updating their memory every time they scan our faces or listen to our voices.
Tech giants like Google, Amazon, and Apple invest heavily in AI-based assistants, whereas smaller companies rely more on scripted bots.
Grouping Voice Assistants
Although rare, more than one voice assistant can be used to create a voice user experience. Having multiple artificial assistants could increase credibility and engagement, for example by grouping voice assistants that each have a specialty. Think of a news show, where a different person presents and discusses each topic, such as the weather, sports, politics, and so on.
Let’s say you’re building an assistant for your fashion application and you have a section on trends. A female voice assistant could talk about trends for women and a male voice assistant about trends for men.
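Sticking with the news show analogy, grouping assistants can be sketched as a simple routing step: each topic is handed to the assistant that specialises in it. The persona names, voice identifiers, and the default host in this example are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    name: str
    voice_id: str  # hypothetical identifier of the voice to synthesise with
    topics: frozenset


# Default persona used when no specialist covers the topic.
HOST = Persona("Host", "voice-host", frozenset())

# Hypothetical specialists, one per topic, as in a news show.
PERSONAS = [
    Persona("Weather presenter", "voice-weather", frozenset({"weather"})),
    Persona("Sports presenter", "voice-sports", frozenset({"sports"})),
    Persona("Politics presenter", "voice-politics", frozenset({"politics"})),
]


def pick_persona(topic):
    """Return the specialist for a topic, falling back to the host."""
    for persona in PERSONAS:
        if topic.lower() in persona.topics:
            return persona
    return HOST


print(pick_persona("Sports").name)
print(pick_persona("fashion").name)
```

The fashion example would work the same way, with one persona per trends section instead of one per news topic.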
Custom Voices
Celebrities, Podcasts, and VUIs
Now that celebrities have started lending their voices to VUIs, with John Legend being the first to do so (for a while), influencers who became famous through YouTube, Facebook, Instagram, Twitter, etc. have the opportunity to do the same. After John Legend left Google Assistant, YouTube star Issa Rae became available as one of Google Assistant’s custom voices.
User-Generated VUIs
Most successful digital services you can think of are built on user-generated content. Medium, for example, consists entirely of user-generated content. This is merely a prediction, but VUI platforms could enable users to integrate their own voice as well. In other words, users could build VUI applications from user-generated content spoken in the author’s voice.
For example, a teacher could put complex material on a voice application, with the added benefits of finding out which questions students struggle with (through analytics) and saving time by providing answers through voice rather than typing. If we think multimodal, visuals such as sketches could be added as well.
Companies Providing AI-Powered VUI Technologies
Primarily in Western regions, companies have built their VUIs on assistants offered by Apple (Siri), Google (Assistant), Amazon (Alexa), and Microsoft (Cortana). Since these companies also have stand-alone systems, such as smart speakers, and install their voice assistants on their own devices, building on them is often the most strategic choice. In addition to the aforementioned companies, Facebook is building its own AI-powered assistant as well.
At the same time, Chinese tech giants have been delivering incredible voice assistants as well: Huawei Celia, Baidu DuerOS, Tencent Xiaowei, and Alibaba AliGenie.
Lastly, our focus is on designing applications for voice assistants. However, these assistants offer text-based input and output as well, and who knows what else they will bring to the market. The key theme of these articles is designing VUIs, which falls under Human-Computer Conversations (HCC).