Protein NMR assignment by isotope pattern recognition

Uluk Rasulov, Harrison K. Wang, Thibault Viennet, Maxim A. Droemer, Srđan Matosin, Sebastian Schindler, Zhen Yu J. Sun, Luca Mureddu, Geerten W. Vuister, Scott A. Robson, Haribabu Arthanari*, Ilya Kuprov*

*Corresponding author for this work

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper › Journal article › Research › peer-review

Abstract

The current standard method for amino acid signal identification in protein NMR spectra is sequential assignment using triple-resonance experiments. Good software and elaborate heuristics exist, but the process remains laboriously manual. Machine learning does help, but its training databases need millions of samples that cover all relevant physics and every kind of instrumental artifact. In this communication, we offer a solution to this problem. We propose polyadic decompositions to store millions of simulated three-dimensional NMR spectra, on-the-fly generation of artifacts during training, a probabilistic way to incorporate prior and posterior information, and integration with the industry standard CcpNmr software framework. The resulting neural nets take [1H,13C] slices of mixed pyruvate–labeled HNCA spectra (different CA signal shapes for different residue types) and return an amino acid probability table. In combination with primary sequence information, backbones of common proteins (GB1, MBP, and INMT) are rapidly assigned from just the HNCA spectrum.
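To illustrate why a polyadic format can make storing millions of simulated three-dimensional spectra practical, the following is a minimal sketch and not the authors' implementation. It assumes that each peak is separable, that is, a product of one-dimensional line shapes along the three axes, so a spectrum with R peaks is a sum of R rank-one terms. The function names (`lorentzian`, `peaks_to_factors`, `reconstruct`), the Lorentzian line shape, and all grid sizes and peak parameters are hypothetical choices made for the example.

```python
import numpy as np

def lorentzian(axis, center, width):
    """Absorptive Lorentzian line shape sampled on a 1D frequency axis."""
    return width**2 / ((axis - center) ** 2 + width**2)

def peaks_to_factors(axes, centers, widths):
    """Return one factor matrix per dimension.

    centers and widths have shape (3, R); column r of each factor
    matrix holds the 1D profile of peak r along that dimension.
    """
    return [
        np.stack([lorentzian(ax, c, w) for c, w in zip(cs, ws)], axis=1)
        for ax, cs, ws in zip(axes, centers, widths)
    ]

def reconstruct(factors, weights):
    """Expand the polyadic form into the full 3D array (sum of rank-one terms)."""
    a, b, c = factors
    return np.einsum("r,ir,jr,kr->ijk", weights, a, b, c)

# Hypothetical example: two peaks on a 64 x 48 x 32 grid (arbitrary units).
axes = [np.linspace(0, 1, 64), np.linspace(0, 1, 48), np.linspace(0, 1, 32)]
centers = np.array([[0.2, 0.7], [0.5, 0.3], [0.4, 0.8]])
widths = np.full((3, 2), 0.02)
weights = np.array([1.0, 0.5])

factors = peaks_to_factors(axes, centers, widths)
spectrum = reconstruct(factors, weights)

# Numbers stored in polyadic form versus the full array.
stored = sum(f.size for f in factors) + weights.size
full = spectrum.size
print(stored, full)
```

Under these assumptions the factor matrices hold a few hundred numbers while the expanded array holds close to a hundred thousand, and the full spectrum only needs to be materialized when a training sample is drawn.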

Original language: English
Article number: eado0403
Journal: Science Advances
Volume: 10
Issue: 36
ISSN: 2375-2548
DOIs
Publication status: Published - 6 Sept 2024
