Let voice assistants sound like a machine: Voice and task type effects on perceived fluency, competence, and consumer attitude

Hyunjoo Im, Billy Sung, Garim Lee, Keegan Qi Xian Kok

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Voice assistants (VAs) are increasingly penetrating consumers’ daily lives. This study aimed to investigate the effects of synthetic vs. human voice on users’ perception of the voice, social judgments of the VA, and attitudes towards VAs. Drawing from the CASA (Computers-as-Social-Actors) framework and the social perception literature, we developed a theoretical model that explains the underlying psychological mechanism of the voice effects. Through two online experiments, we rejected our initial hypotheses that a human voice would improve users’ perception and evaluation of the VAs. Instead, our findings indicate that VAs speaking with a synthetic voice were favored, but only when users engaged in functional tasks. There was no difference between the voices for social tasks. Further investigation revealed that participants perceived the synthetic voice to be more fluent when the VA responded to functional tasks. This enhanced perception of fluency increased competence perception and attitudes. The findings imply that VAs should not be designed to closely resemble humans. Rather, consideration of usage contexts and consumer expectations should be prioritized in developing the most likable VAs.

Original language: English (US)
Article number: 107791
Journal: Computers in Human Behavior
Volume: 145
State: Published - Aug 2023

Bibliographical note

Funding Information:
Humans treat people and objects (e.g., people on TV, puppets, computers) that display social behaviors as if they were actual humans because human brains do not distinguish actual people from mediated representations of people (Reeves & Nass, 1996). The CASA framework confirms this ‘media equation’ behavior in the context of human-computer interactions and proposes that humans subconsciously and automatically apply social rules to computers and respond to computers as interaction partners (Gambino et al., 2020; Nass & Moon, 2000). The CASA framework has guided many studies explaining how users behave and interact with computers (Nass & Lee, 2001; Nijholt, 2003; Wang, 2017). The existing literature provides robust evidence to support the main premise of the CASA framework. Participants in previous studies automatically applied the social rules they would typically use in human-to-human interactions when interacting with computers (Lee & Nass, 2010; Nass & Lee, 2001). A few recent studies reported corroborating results for VAs, providing evidence that people treat VAs as social interaction partners (Carolus et al., 2021; Seymour & Van Kleek, 2021; Whang & Im, 2021). The current study shares the view of the CASA framework and assumes that consumers interact with and respond to VAs as if they were interacting with another person.

Publisher Copyright:
© 2023

Keywords

  • Synthetic voice
  • Voice assistant
  • Voice perception
