Sous Chefbot for Google Assistant

Timeframe:
150 hours

Role:
UX/UI Designer, UX Researcher, Conversation Designer

Tools:
Adobe XD, Google Actions Console, Google Suite, Miro, Whimsical

Deliverables:
Voice Scripts, Limited Voice Prototype (Adobe XD)

Project Overview

The Problem

VUI is still new, and we lack standardization in designing and documenting this interface. Voice technology is also limited by a lack of quantitative data. We are still working toward users becoming fluent and confident in voice interaction as a supplement to their visual devices, not as a last resort.

How Might We?

Design a voice smart assistant for cooking that anticipates user needs and provides assistance quickly and efficiently.

Elevator Mockup on Nest Hub
Visual supplement to VUI shown on a Google Nest Hub

RESEARCH

What is conversation design? Designing for VUI and how it differs from standard UX processes

Preliminary Research

User Interviews

In this project, I didn't set out with a specific improvement in mind. There was no stakeholder telling me: 'X performed as such and we think it could do better. We want you to streamline feature Y to promote usage.'

I went into research interviews with a very open mind and vague, exploratory questions in my script. I interviewed 4 smart assistant users — Google Assistant, Alexa, Siri, or some combination of the three. Due to the pandemic, I conducted the interviews over Zoom.

Interview Goals

  • What smart assistants were people using? If a combination, what differences did they notice, if any?
  • Mini usability test: I provided tasks and took note of how participants navigated errors and engaged the assistant overall.
  • How do participants use their smart assistants? E.g. routine tasks vs. exploring new features?

Interview Trends & Outcomes

Needs

  • 3 out of 4 participants used multiple assistants. Use cases depended on device type.
  • All participants cited efficiency as the motive for VUI use over alternatives (searching for info, playing music, etc.).
  • 2 out of 4 participants liked receiving feedback about the interaction while it was occurring. E.g. visual feedback or “listening” beeps.

Behaviors

  • All participants framed questions in simple, concise language. If the question was complex, users opted to manually search.
  • 3 out of 4 participants used their smart assistant for trivia, telling them a joke, or other ‘fun’ uses.
  • One participant customized and bundled actions using Siri/Google (shortcuts/routines).

Frustrations

  • All participants recalled or demonstrated abandoning a task if the assistant didn’t execute as expected on the first try.
  • No participants perceived their assistant learning from them through continuous usage.
  • All participants categorized tasks as either innocuous or too invasive, and all expressed wariness about the latter.

DEFINE

Defining the product - Who uses smart assistants and what do they use them for?

Research Synthesis

Personas were fun in this project: not only did I create personas to depict the main user groups I identified in user interviews, I also had to create a system persona.

By cultivating a system personality, we can put the user at ease and set the tone of the interaction. Doing so furthers brand recognition and personifies the product (which is a good thing: we tend to be more patient and forgiving with a person than we are with a robot).

System Persona
The Investigator
The Delegator

Information Architecture & Content Planning

I created task flows to map out the main product tasks.

View flows on Whimsical

Mapping out the IA initially stumped me. After a bit of research, I was able to visually sketch out my understanding of how interactions work when broken down into scenes:

Linear Task Flow Architecture

Info architecture task flow chart

Potential Intents (while in main task flow)

I want the "follow the recipe" intent to be a thru-line to which all extra queries return to once completed. Each of the satellite intents would go through the linear task flow architecture (see above) then return to the main task (see below). Each step in the recipe completes in this manner.

The information architecture is not nested: once the agent is invoked, if any of the following queries are posed, the agent returns to ‘follow the recipe’ upon completion.

Info architecture main task mind map
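To make this concrete, here is a rough sketch of how one satellite scene could be wired up in the Actions SDK's YAML, based on my reading of its scene schema. This is illustrative only, not one of my actual project files; the scene, intent, and handler names are hypothetical placeholders. The satellite scene answers the query, then transitions back to the main recipe scene:

# custom/scenes/Substitution.yaml (illustrative sketch; names are placeholders)
onEnter:
  webhookHandler: suggest_substitute      # hypothetical webhook that looks up an alternate ingredient
intentEvents:
  - intent: ResumeRecipe                  # "okay, back to the recipe"
    transitionToScene: FollowRecipe       # every satellite path funnels back to the main task
  - intent: AnotherSubstitution
    transitionToScene: Substitution       # loop if the user has another ingredient to swap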

DESIGN

Voice scripting and dialog, designing for a voice forward product

Putting it all together

I initially wanted to use Adobe XD for the prototype, but it only accepts a single utterance per interaction. In order to test conversational abilities, training phrases, and natural user utterances, I used the Google Actions Console.
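In the Actions SDK, each custom intent is defined by a small YAML file of training phrases, and the console matches natural utterances against those phrases rather than requiring an exact chip tap. The file below is an illustrative sketch of what such a definition can look like; the intent, parameter, and type names are hypothetical, not pulled from my project:

# custom/intents/IngredientSubstitution.yaml (illustrative; names are hypothetical)
parameters:
  - name: ingredient
    type:
      name: Ingredient                    # a custom type listing common ingredients
trainingPhrases:
  - what can I use instead of ($ingredient 'muscovado sugar' auto=true)
  - I don't have any ($ingredient 'buttermilk' auto=true)
  - substitute for ($ingredient 'muscovado sugar' auto=true)
  - is there an alternative ingredient I can use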

I wanted the design to blend in with Google's own Assistant design: chips, cards, etc.

Example of Google Assistant's UI

Some examples of styling and design system in the Google Assistant for Android (Android 12)

Design Progression

Due to technical and time constraints, I completed and fine-tuned the sample dialogs (the best-case scenario for each interaction), possible intents, and a working prototype before getting user feedback.

Minimum Viable Prototype - Version 1

The MVP prototype consists of: the voice scripts and the Google Actions console voice prototype.

The Google Actions testing UI is bare-bones, but unlike Adobe XD, it allows users to speak conversationally, and the system understands prompts beyond the verbatim suggestion-chip text.

I plugged my conversation components into the interface in order to use the Actions Console for voice prototype testing:

Visual Prototype MVP

Most of the screens for this app are simple and text-based. Keep in mind that the capabilities of Google Actions are far more limited than those of a true chatbot or smart assistant.

Initial Interaction and Getting Started

Sous Chefbot Prototype Screens

The Meat and Potatoes: Following the Recipe & Assistance

Sous Chefbot Help Triage Prototype Screens

USABILITY TESTING

Putting the scripts to the test: Prototyping in the Google Actions Console

Usability Test Planning

Working prototype done, it was time to test. Due to the technical constraints of a Google Actions Console prototype, I performed in-person, moderated usability tests. I was able to recruit 4 smart speaker users for the test.

Task #1 - Invoke the action and select a recipe (chicken soup)

"You’re going to call up the action, and choose a recipe to get started with. The recipe is going to be for “chicken soup”

Task #2 - Cooking tip / Substitution

"Ask for a chocolate cake recipe. In step 1, there is going to be an ingredient you don’t have: muscovado sugar. Ask for an alternate ingredient that you can use instead."

Task #3 - Make a conversion

"Repeat the last flow: invoke the assistant, start the cake recipe, but this time you want to convert measurements to cups."

Usability Test Findings - Key Patterns

  • All 4 participants used utterances that didn't match the chips. Users naturally want to be conversational.
  • 2.5 out of 4 participants encountered an error state prompt from the system and successfully navigated back to the flow/their desired action without getting kicked out of the conversation.
  • All 4 participants had trouble with task #2: they told the bot their intent and what they wanted done at the same time. V1 had this process broken up over 3 scenes. The IA needs to be flattened so these intents (cooking tip, substitution, conversion) are recognized at any time.
  • All participants found Sous Chefbot to be friendly, playful, and eager.

Iterations, Creating V2

The one big experience change that was needed: flatten the hierarchy so users can access features more broadly, and allow the system to pivot and accept more intents rather than requesting a rephrase during error handling.
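Sketching what that flattening might look like in the scene YAML (again illustrative: intent and handler names are hypothetical, and the placement of the no-match handler reflects my reading of the Actions SDK scene schema), V2 handles the satellite intents directly inside the recipe scene instead of routing through a help menu, and the first no-match reprompt reminds the user where they are rather than ending the conversation:

# custom/scenes/FollowRecipe.yaml (V2, illustrative sketch)
intentEvents:
  - intent: NextStep
    handler:
      webhookHandler: read_next_step      # hypothetical webhook handlers throughout
  - intent: IngredientSubstitution
    handler:
      webhookHandler: suggest_substitute  # answered in place, no menu detour
  - intent: UnitConversion
    handler:
      webhookHandler: convert_units
  - intent: CookingTip
    handler:
      webhookHandler: give_tip
  - intent: actions.intent.NO_MATCH_1     # treat the first miss as a checkpoint, not a dead end
    handler:
      staticPrompt:
        candidates:
          - promptResponse:
              firstSimple:
                variants:
                  - speech: No rush, we are still on this step. You can ask for a substitution, a conversion, or for the next step.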

Revisions: Menu Changes

PROJECT WRAP UP

Reflecting on the process and what I learned.

New Technologies: The Learning Curve

I’m excited to work on more VUI projects in the future with a team. I would have loved to publish this as an actual Google Action but in typical Google fashion, they announced the sunsetting of the Google Actions program a few weeks after I finished the MVP prototype.

I'd never used YAML before, and while it was super straightforward, I don't know what I don't know, and I did basically no styling of the VUI prototype. I may be wrong, but I think designing for voice is even more closely intertwined with development than standard visual design. I can't just verbalize: "At this point the user should be able to access x, y, and z." Intents need to be standardized, and parameters and error handling need to sit in the correct section of each scene.

If I had the chance to implement it, I would need to learn more about what a webhook actually is, how to set one up, and probably a heck of a lot more about different cooking techniques.

What I learned...

Retrospectively, I got really lucky: the issues found while usability testing prototype V1 were structural and fixable. In the future, I'd be interested in working on a similar project, but with more time to implement testing phases along the way.

Through the V2 adjustments, I can confidently say that broader error handling lowers the perceived stakes for the user. We should encourage this and treat error states as checkpoints: the user may not know what they want and may need a moment to recalibrate.

Elevator Mockup on Mobile
Elevator Mockup on Mobile 2
View prototype