Exploring Voice User Interface Design — A Voiceflow Guide and Tutorial
--
After using several applications that focus on producing Voice User Interfaces (VUIs), I’m more than certain that Voiceflow will become the industry standard.
If you’re interested in designing and launching VUIs, I highly recommend picking up Voiceflow.
Before We Get Started
When you fire up Voiceflow, you will be asked to create a workspace and within that workspace, you can create projects.
Your workspace could represent an entity like your company, freelance business, or agency. You can change this at any time.
Once you’ve created your workspace, you can start creating projects, depending on whether you’re on the free plan or a paid plan (which I highly recommend).
Once you’ve created a workspace and a project, you will have to choose between building a VUI for Amazon Alexa (AA) or Google Assistant (GA). I tend to go for Google Assistant given Alexa’s regional restrictions. However, especially for US users, Alexa is an extremely powerful assistant.
The Building Blocks
Before we start designing, let’s get familiar with the “blocks” or features.
Response
As the name suggests, this is the response that the user will receive from the assistant.
Voiceflow doesn’t only provide simple speak and audio playback features. What really sets the application apart are the card and stream blocks:
- Speak: Make the assistant talk using text input.
- Audio: Add audio to the experience. Don’t just think of music, but also radio quotes and so on.
- Card: Personally one of my favourite features, perhaps because of my background in UX and UI design. This feature allows the designer to add images. Think Echo Show, which combines voice and standardised visuals.
- Stream: Similar to audio and card, stream allows you to add long audio files and images that take the user out of the voice application (or skill) until the user says a keyword like “next” or “previous”. In other words, the user is streaming through media.
User Input
VUI designers are familiar with intents, basically what the user wants from the assistant. For example, “book me a taxi to work”.
- Choice: When asking the user what s/he wants, there are often several options. Going back to the taxi booking example: once the user opens the booking app, AA or GA could ask “do you want to book a taxi now?”. This is where you provide options such as “yes” or “later”. You will also have to enter utterances and slots, which we will discuss below.
- Prompt: Users tend to jump from topic to topic. Prompt makes non-linear conversations possible by listening to intents when the user and the assistant are already having a conversation. This will trigger the assistant to bring up and answer that particular intent.
- Intent: An intent is the user’s intention, which is often used at the beginning of a session. The choice block on the other hand is used to provide options to the user when being asked a question by the assistant.
- Utterances: Utterances are the exact phrases that users use to communicate their intent. Enter as many utterances as possible so that the assistant has enough examples to understand what the user wants.
- Slots: These are comparable to variables and make a user’s intent and response more dynamic and specific. Voiceflow has a set of predefined slots, and you can also define your own (see the sketch below).
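To make intents, utterances, and slots a bit more concrete, here is a rough sketch in plain JavaScript based on the taxi example above. The structure is purely illustrative (it is not Voiceflow’s internal format), and the intent name and slot values are made up.

```javascript
// Illustrative only: a made-up structure, not Voiceflow's internal format.
const bookTaxiIntent = {
  name: "book_taxi",
  // Utterances: exact phrases a user might say; {destination} marks the slot.
  utterances: [
    "book me a taxi to {destination}",
    "I need a ride to {destination}",
    "get me a cab to {destination} please",
  ],
  // Slots: the dynamic parts of an utterance, comparable to variables.
  slots: {
    destination: ["work", "the airport", "home"],
  },
};
```

The more utterance variations you enter, the better the assistant can map a spoken phrase to the intent and fill the destination slot.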
Logic
Changing variables, if-statements, randomising items, and so on are features that impact the logic and structure of your VUI. Many of these features will be familiar from coding and from traditional UX design software like Axure RP.
- IF: Just like in programming, we can use if-statements in a plethora of scenarios, ranging from matching a distance to checking how many times a user has opened the application.
- Set: Voiceflow offers variables, which can be defined by the designer and the user, and manipulated during the experience through set. Let’s say that a user is going through a test: with the help of set, we can change the score and add a point for every correct answer (see the sketch after this list).
- Capture: In this case, the assistant will capture the user’s response and store it in a variable.
- Random: As you might have guessed, you can provide several items to the user, and the system will randomly pick one and feed it back to the user.
- Flow: The best way to describe this is to compare it to symbols in Sketch, for example. With flow, you can turn a part of the design into a reusable component.
- Exit: A simple feature that lets the user exit the application.
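To tie the logic blocks together, below is a small plain-JavaScript sketch of the test example mentioned under set. It only mirrors what the blocks do conceptually (capture an answer, branch with IF, update a score with set, pick an item with random); it is not code you would paste into Voiceflow, and the questions are made up.

```javascript
// Conceptual sketch of the test example; not Voiceflow code.
let score = 0; // a variable we will update with "set"

// Made-up questions, purely for illustration.
const questions = [
  { speak: "What is the capital of France?", answer: "paris" },
  { speak: "How many continents are there?", answer: "seven" },
];

// random: pick one item and feed it back to the user
const question = questions[Math.floor(Math.random() * questions.length)];

// capture: store the user's response in a variable
const captured = "Paris".trim().toLowerCase(); // pretend spoken reply

// IF + set: branch on the captured value and add a point when it is correct
if (captured === question.answer) {
  score = score + 1;
}
```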
Integration
Comparable to a plug-in section, integration allows the user to integrate external assets. The most powerful tool here is the API. However, a lot is possible through Zapier too and there’s also the option to add custom code. Interestingly, Google Sheets is also one of the features. This allows the user to connect the voice application with external data.
- API: Many APIs are already available and some are even free to use. However, custom-made APIs can make the voice assistant extremely powerful and connect with almost anything digital.
- Google Sheets: As mentioned before, Google Sheets allows the designer to connect the voice application with an external source of data.
- Zapier: This feature allows the user to create a sequence of actions using different applications. In other words, automating online services.
- Custom Code: If needed, there is always the option to add JavaScript. A lot of example code can be found online, especially in the field of VUI production (see the sketch below).
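As a rough illustration of what a custom code step could do, the snippet below calls an external API and keeps part of the response so the assistant can speak it later. The endpoint URL, the field names, and the {EventTime} variable are hypothetical, and the exact way Voiceflow exposes variables to custom code may differ, so treat this as a sketch rather than copy-paste code.

```javascript
// Hypothetical sketch: call an external API and keep part of the response.
// The endpoint and the field names are made up for illustration.
async function lookupEvent(eventName) {
  const response = await fetch(
    `https://example.com/api/events?name=${encodeURIComponent(eventName)}`
  );
  const data = await response.json();
  return data.startTime; // the piece of information the assistant will speak
}

// Example usage: the result could end up in a variable such as {EventTime}.
lookupEvent("design meetup").then((startTime) => {
  console.log("Event starts at", startTime);
});
```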
Channel
These blocks are made specifically for AA and automate a lot of the shopping experience. For example, permissions and user info allow the assistant to request certain information about the user.
A Simple Example — Registration Form
Before creating a comprehensive VUI with Voiceflow, we’ll start with a simple concept and show how it’s designed in Voiceflow. In our example, we will build a registration form where the user’s registration details are stored in a Google Sheets document.
Easing in the User
When the user opens a voice application, s/he doesn’t want to waste time. Still, to avoid errors and make the experience more consumable, we start the session with a quick question asking whether the user wants to register for a particular event. This block consists of a speak item combined with a choice item.
Storing Information in a Google Sheets Document
Since we want to link our experience to a Google Sheets document to store information, we create a Sheets document. When you create a spreadsheet to work with Voiceflow, make sure your first row contains the column labels.
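For the registration form in this example, the first row simply holds the three labels that match the variables we will create next, and every completed registration appends a row underneath:

```
Name | Job | Company        <- row 1: the column labels
...  | ... | ...            <- each completed registration adds a row here
```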
We can then go back and create the block that creates the variables {Name}, {Job}, and {Company}. Next, we can add the Google Sheet item to that block.
I’ve used the set feature here to create the variables.
However, a better practice is to go to “Model” and add the variables there. You can find the entry point to this section in the bottom-left corner of the canvas, next to start, commenting, markup, and the zoom functionalities.
To create the actual sequence of questions, we create a block with questions, using the speak item, and get answers from the user that will be stored in a variable through the capture item. The information, each answer stored in its respective variable, will then be added to the spreadsheet.
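Conceptually, the sequence behaves like the plain-JavaScript sketch below: ask a question, capture the reply into a variable, repeat for each field, then add the three values to the spreadsheet as one row. The ask and appendRow helpers are placeholders for what the speak, capture, and Google Sheets items do inside Voiceflow, not real functions.

```javascript
// Conceptual sketch of the registration sequence; ask() and appendRow()
// are placeholders, not real Voiceflow functions.
async function runRegistration(ask, appendRow) {
  // speak + capture: ask a question and store the reply in a variable
  const name = await ask("What is your name?");                 // -> {Name}
  const job = await ask("What is your job title?");             // -> {Job}
  const company = await ask("Which company do you work for?");  // -> {Company}

  // Google Sheets item: add one row with the three captured values
  await appendRow([name, job, company]);
}
```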
Closing the Application
Lastly, once the registration is completed, we thank the user and close the voice application. Make sure not to forget any of the error scenarios.
Testing the Prototype
Once we’re done or think we’re done, we can test our prototype.
When we go to our Google Sheets document, we can see that the requested information has been added to its respective columns. When we run the test again, another row of information is added.
Slots and Variables
Both terms are very similar but slightly different. Slots are used in utterances, for example when mentioning a time or destination, whereas variables are broader in terms of storing information.
As a rule of thumb, use slots in utterances, such as destination, and variables when storing information about the user, such as contact details.
If you’re working with different flows that apply to several users, use global variables since local variables are meant to be used for one flow only.
More Comprehensive Flows
Voiceflow is capable of much more than what we’ve shown in the example and there is a difference in functionality between Amazon Alexa and Google Assistant.
Using APIs and custom code, the possibilities are almost endless. As an advocate of Voiceflow, I can’t wait for the next set of features to come out.