Relative Importance of Prosody Versus Voice Quality for Clinician Assessments of Speech in ASD

Ethan Weed, Riccardo Fusaroli, Jessica Mayo, Inge-Marie Eigsti

Research output: Contribution to conference › Conference abstract for conference › Research › peer-review


Background: Trained clinicians (Nadig and Shaw, 2012) and untrained peers (Grossman, 2015) can distinguish speakers with autism spectrum disorder (ASD) from neurotypical (NT) speakers on the basis of short speech samples. However, they are not always aware of which acoustic features, or combinations of features, they are responding to (Nadig and Shaw, 2012; Redford et al., 2018). Prosody and voice quality both contribute to the overall impression of a speaker, but a recent meta-analysis (Fusaroli et al., 2017) found almost no studies of voice quality in ASD.

Objectives: To assess whether trained clinicians' classification of adolescent speakers as either ASD or NT, made on the basis of speech alone, relies primarily on acoustic features supporting prosody or on those supporting voice quality.

Methods: We analyzed speech data (8 scripted sentences per participant) from 15 adolescents diagnosed with ASD (mean age = 14.4 years, SD = 1.48) with IQ scores in the typical range, and 15 adolescents with typical development (the NT group; mean age = 14.1 years, SD = 1.91). The groups did not differ in chronological age or full-scale IQ. Participants in both the ASD and NT groups demonstrated average to high-average performance on standardized language measures (see Mayo, 2015, for details). Using acoustic features selected on the basis of previous literature (Fusaroli et al., 2017; McCann and Peppé, 2003), we constructed two Bayesian logistic regression models. Model 1 predicted clinicians’ classifications on the basis of prosody measures: standard deviation of the fundamental frequency, utterance duration, and articulation rate. Model 2 added measures of voice quality (creak, jitter, shimmer, and H1-H2) to the prosody measures of Model 1.

Results: Model 1 (prosodic measures only) had an accuracy of 0.74 (CI: 0.72–0.76), a sensitivity of 0.66 (CI: 0.62–0.70), and a specificity of 0.82 (CI: 0.78–0.85). Model 2 (prosodic plus voice quality measures) had an accuracy of 0.72 (CI: 0.69–0.75), a sensitivity of 0.68 (CI: 0.63–0.73), and a specificity of 0.76 (CI: 0.72–0.80). Thus, adding measures of voice quality gave no gain in predicting clinical rater intuitions.

Conclusions: Judgements based on clinical intuition display high sensitivity (0.86) and specificity (0.86) in classifying these samples (Eigsti, Mayo and Simmons, INSAR 2016). Although the literature suggests that the speech of people with ASD may be atypical in terms of both prosody and voice quality, these data suggest that clinicians may base their assessment primarily on prosodic features supporting pitch and rhythm.
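For readers unfamiliar with the metrics reported in the Results, the following is a minimal sketch of how accuracy, sensitivity, and specificity can be computed when a model's predictions are compared against clinicians' classifications. The function name `classification_metrics` and the label lists are hypothetical illustrations; they are not the authors' code or the study's data, and the sketch does not reproduce the Bayesian logistic regression itself.

```python
def classification_metrics(reference, predicted, positive="ASD"):
    """Return accuracy, sensitivity, and specificity for binary labels.

    `reference` holds the labels being predicted (here, clinicians'
    classifications); `predicted` holds the model's output.
    Assumes both classes occur in `reference`; otherwise a division
    by zero is raised.
    """
    pairs = list(zip(reference, predicted))
    tp = sum(r == positive and p == positive for r, p in pairs)  # true positives
    tn = sum(r != positive and p != positive for r, p in pairs)  # true negatives
    fp = sum(r != positive and p == positive for r, p in pairs)  # false positives
    fn = sum(r == positive and p != positive for r, p in pairs)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of reference-ASD samples recovered
        "specificity": tn / (tn + fp),  # share of reference-NT samples recovered
    }


# Hypothetical labels for eight speech samples (not the study's data).
clinician = ["ASD", "ASD", "ASD", "ASD", "NT", "NT", "NT", "NT"]
model = ["ASD", "ASD", "ASD", "NT", "NT", "NT", "NT", "ASD"]

metrics = classification_metrics(clinician, model)
print(metrics)
```

In this toy example the model agrees with the clinician on three of four ASD samples and three of four NT samples, so all three metrics equal 0.75.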
Original language: English
Publication date: 2020
Publication status: Published - 2020
Event: INSAR 2020 Virtual
Duration: 3 Jun 2020 → …




  • Autism spectrum disorder
  • Prosody
  • Voice analysis


