Towards a machine-readable literature: finding relevant papers based on an uploaded powder diffraction pattern

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper, Journal article, Research, peer-review

  • Berrak Ozer, Columbia University
  • Martin A. Karlsen, University of Southern Denmark
  • Zachary Thatcher, Columbia University
  • Ling Lan, Columbia University
  • Brian McMahon, International Union of Crystallography
  • Peter R. Strickland, International Union of Crystallography
  • Simon P. Westrip, International Union of Crystallography
  • Koh S. Sang, International Union of Crystallography
  • David G. Billing, University of the Witwatersrand
  • Dorthe B. Ravnsbaek
  • Simon J. L. Billinge, Columbia University, Brookhaven National Laboratory

A prototype application for machine-readable literature is investigated. The program, called pyDataRecognition, serves as an example of a data-driven literature search, in which the search query is an experimental data set provided by the user. The user uploads a powder pattern together with the radiation wavelength. The program compares the user data to a database of existing powder patterns associated with published papers and produces a ranked list of the database entries, ordered by similarity score. The program returns the digital object identifier and full reference of the top-ranked papers, together with a stack plot of the user data alongside the top five database entries. The paper describes the approach and explores successes and challenges.
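
The compare-and-rank step described in the abstract can be illustrated with a minimal sketch. This is not the pyDataRecognition implementation: the abstract does not name the similarity measure, so the Pearson correlation used here is an assumption, as are the function names, the database layout (a list of dicts with "doi", "q" and "intensity" keys) and the choice to compare patterns on a common Q grid. The only fixed relation is the standard conversion Q = 4π sin(θ)/λ, which is why the wavelength is needed alongside the pattern.

```python
import numpy as np


def two_theta_to_q(two_theta_deg, wavelength):
    """Convert scattering angle 2-theta (degrees) to Q = 4*pi*sin(theta)/lambda."""
    theta = np.radians(np.asarray(two_theta_deg, dtype=float)) / 2.0
    return 4.0 * np.pi * np.sin(theta) / wavelength


def similarity(q_user, i_user, q_ref, i_ref, n_points=500):
    """Assumed metric: Pearson correlation on the Q range shared by both patterns."""
    q_user, i_user = np.asarray(q_user, float), np.asarray(i_user, float)
    q_ref, i_ref = np.asarray(q_ref, float), np.asarray(i_ref, float)
    q_min = max(q_user.min(), q_ref.min())
    q_max = min(q_user.max(), q_ref.max())
    if q_max <= q_min:
        return -1.0  # no overlapping Q range: give the lowest possible score
    # Resample both patterns onto one grid so they can be compared point by point.
    grid = np.linspace(q_min, q_max, n_points)
    a = np.interp(grid, q_user, i_user)
    b = np.interp(grid, q_ref, i_ref)
    return float(np.corrcoef(a, b)[0, 1])


def rank_entries(q_user, i_user, database, top_n=5):
    """Return the top_n (doi, score) pairs, highest similarity first.

    `database` is a hypothetical in-memory list of dicts with keys
    "doi", "q" and "intensity"; a real service would query stored patterns.
    """
    scored = [
        (entry["doi"], similarity(q_user, i_user, entry["q"], entry["intensity"]))
        for entry in database
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

In use, an uploaded pattern measured against 2θ would first be converted with two_theta_to_q and the supplied wavelength, then passed to rank_entries; the returned identifiers would be the papers to report. Correlation is insensitive to overall intensity scale, which is convenient for patterns recorded on different instruments, but other measures could be substituted without changing the ranking logic.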

Original language: English
Journal: Acta Crystallographica Section A: Foundations and Advances
Issue: Part 5
Pages (from-to): 386-394
Number of pages: 9
Publication status: Published - Sep 2022

Bibliographical note

Publisher Copyright:
© 2022 International Union of Crystallography. All rights reserved.

Research areas

  • machine-readable scientific literature, data-driven literature search, powder diffraction, data similarity, CIF, CRYSTAL-STRUCTURE, DATABASE, PHASES, FILE


ID: 282613282