Speech disturbances in schizophrenia: Assessing cross-linguistic generalizability of NLP automated measures of coherence

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper; Journal article; Research; peer-review

  • Alberto Parola
  • Jessica Mary Lin
  • Arndis Simonsen
  • Vibeke Bliksted
  • Yuan Zhou, Institute of Psychology Chinese Academy of Sciences
  • Huiling Wang
  • Lana Inoue, University of Duisburg-Essen
  • Katja Koelkebeck, University of Duisburg-Essen
  • Riccardo Fusaroli

Introduction: Language disorders, disorganized and incoherent speech in particular, are distinctive features of schizophrenia. Natural language processing (NLP) offers automated measures of incoherent speech as promising markers for schizophrenia. However, the scientific and clinical impact of NLP markers depends on their generalizability across contexts, samples, and languages, which we systematically assessed in the present study using a large, novel, cross-linguistic corpus.

Methods: We collected a cross-linguistic dataset in Danish (DK), German (GE), and Chinese (CH), comprising transcripts from 187 participants with schizophrenia (111 DK, 25 GE, 51 CH) and 200 matched controls (129 DK, 29 GE, 42 CH) performing the Animated Triangles Task. Fourteen previously published NLP coherence measures were calculated, and between-group differences and associations with symptoms were tested for cross-linguistic generalizability.

Results: One coherence measure, second-order coherence, generalized robustly across samples and languages. We found several language-specific effects, some of which partially replicated previous findings (lower coherence in German and Chinese patients), while others did not (higher coherence in Danish patients). We found several associations between symptoms and measures of coherence, but the effects were generally inconsistent across languages and rating scales.

Conclusions: Using a cumulative approach, we have shown that NLP findings of reduced semantic coherence in schizophrenia have limited generalizability across different languages, samples, and measures. We argue that several factors, such as sociodemographic and clinical heterogeneity, cross-linguistic variation, and the fact that different NLP measures reflect different clinical aspects, may be responsible for this variability. Future studies should take this variability into account in order to develop effective clinical applications targeting different patient populations.
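The abstract names second-order coherence without defining it. As a rough illustration only, the sketch below assumes the definition common in the NLP coherence literature: the mean cosine similarity between embedding vectors of text units (words or sentences) that are k positions apart, with k = 1 for first-order and k = 2 for second-order coherence. The function names, the toy vectors, and the choice of unit are hypothetical and are not taken from the study, which may compute the measure differently.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def kth_order_coherence(embeddings: List[Vector], k: int = 2) -> float:
    """Mean cosine similarity between units that are k positions apart.

    Assumed definition (not confirmed by the abstract): k=1 compares
    adjacent units, k=2 compares units separated by one intervening unit.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    sims = [
        cosine_similarity(embeddings[i], embeddings[i + k])
        for i in range(len(embeddings) - k)
    ]
    if not sims:
        raise ValueError("need more than k units to compute coherence")
    return sum(sims) / len(sims)


# Toy 2-D "embeddings" (hypothetical): the speaker alternates between two topics.
toy_embeddings = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 0.0],
    [0.0, 1.0],
]

first_order = kth_order_coherence(toy_embeddings, k=1)   # adjacent units
second_order = kth_order_coherence(toy_embeddings, k=2)  # units two apart
```

In this toy case adjacent units are orthogonal while units two apart are identical, so the two orders of coherence diverge. In practice the vectors would come from a word or sentence embedding model trained for each language, which is one place where cross-linguistic differences could enter.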

Original language: English
Journal: Schizophrenia Research
ISSN: 0920-9964
DOIs
Publication status: E-pub ahead of print - 1 Aug 2022

Bibliographical note

Copyright © 2022 The Authors. Published by Elsevier B.V. All rights reserved.

    Research areas

  • Biomarker, Communication disorders, Digital phenotyping, Natural language processing, Schizophrenia spectrum disorder, Semantic coherence, Thought disorder

ID: 278657040