Subgroup Identification of Speakers with and without ASD Using Network Models of Acoustic Features of Prosody and Voice

Ethan Weed, Riccardo Fusaroli, Jessica Mayo, Inge-Marie Eigsti

Research output: Contribution to conference › Conference abstract for conference › Research › peer-review

Abstract

Background: Untrained peers can recognize an atypical quality in the speech of some people with ASD (Grossman, 2015; Redford et al., 2018). Although certain acoustic features, such as pitch range (Nadig & Shaw, 2012), have been shown to be important groupwise factors for distinguishing between ASD and TD speech, no clear pattern of features associated with the speech of people with ASD has yet been shown. The speech of people with ASD has even been described in contradictory terms, e.g. as both "monotonous" and "variable," and as both "stilted" and "exaggerated" (Fusaroli et al., 2017). This raises the intriguing possibility that while autistic speech may sound atypical, this atypicality may manifest in different ways in different individuals. Because speech quality is defined by a complex set of interrelated acoustic variables, we used network models to explore the multivariate space describing the speech qualities to which raters respond.

Objectives: Study goals were (1) to identify potential subgroups of typical and atypical speakers, using acoustic features of prosody and voice, and (2) to describe the vocal and prosodic qualities characteristic of these subgroups.

Methods: We analyzed speech data (8 scripted sentences per participant) from 15 adolescents diagnosed with ASD (mean age = 14.4 years, SD = 1.48) with IQ scores in the typical range, and 15 adolescents with typical development (TD; mean age = 14.1 years, SD = 1.91); groups did not differ on chronological age or full-scale IQ. Participants in both the ASD and the TD groups demonstrated average to high-average performance on standardized language measures (see Mayo, 2015, for details). These recordings were categorized by 15 naive raters as "typical," "somewhat unusual," or "definitely atypical." Acoustic features extracted from the participants' speech were used to build a network of speakers, using positive and negative partial correlations between acoustic variables to define the strength and direction of the connections. A community-detection algorithm (Reichardt & Bornholdt, 2006) was used to identify acoustic profiles of speakers.

Results: When speakers were modeled as nodes in a network, they clustered roughly in correspondence with their classification as "typical" or "atypical" by naive raters. The community-detection algorithm identified four acoustic profiles among the speakers. One profile contained primarily speakers characterized as "typical," while the other three showed distinct patterns of acoustic quantities. Atypical speakers were distinguished from typical speakers by slower speech rate, and from each other primarily by different patterns and degrees of breathiness and nasality.

Conclusions: These results suggest that while the acoustic patterns of typical speakers tend to resemble one another, there may exist identifiable subgroups of atypical speakers. Although the community-detection algorithm was free to identify up to ten different groups, it identified only four meaningfully different clusters of speakers, suggesting that similarities between atypical speakers within each subgroup were stronger than their differences. These data point toward the possibility of moving beyond broad descriptions such as "monotonous" toward a more nuanced understanding of the different types of atypical prosody and voice quality in ASD.
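
Illustrative sketch (not the authors' code): the abstract does not say how the partial correlations were estimated, so the following is only one plausible way to turn a data matrix into signed edge weights for a network. The function name `partial_correlations`, the convention that each column is a network node, the small `ridge` shrinkage term, and the synthetic data are all assumptions made for this example. The community-detection step is not reproduced here.

```python
import numpy as np


def partial_correlations(data, ridge=1e-3):
    """Signed partial correlation between every pair of columns of `data`.

    data:  array of shape (n_observations, n_nodes); each column is one node.
    ridge: small shrinkage term (an assumption of this sketch) that keeps the
           correlation matrix invertible when observations are scarce.
    """
    data = np.asarray(data, dtype=float)
    corr = np.corrcoef(data, rowvar=False)
    corr = corr + ridge * np.eye(corr.shape[0])

    # Partial correlations are the negated, rescaled off-diagonal entries
    # of the inverse correlation (precision) matrix.
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 0.0)  # no self-connections in the network
    return pcor


# Synthetic demonstration data (NOT from the study): a chain a -> b -> c,
# so a and c are related only through b.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.5 * rng.normal(size=500)
c = b + 0.5 * rng.normal(size=500)

weights = partial_correlations(np.column_stack([a, b, c]))
print(np.round(weights, 2))
```

In the synthetic chain, the direct links (a-b and b-c) receive clearly positive weights, while the a-c weight stays close to zero once b is accounted for. Removing indirect associations in this way is the usual motivation for using partial rather than simple correlations as network edges.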
Original language: English
Publication date: 2020
Publication status: Published - 2020
Event: INSAR 2020 Virtual
Duration: 3 Jun 2020 → …
https://insar.confex.com/insar/2020/meetingapp.cgi/Home/0


Keywords

  • Autism spectrum disorder
  • voice analysis
