Modeling 0.6 million genes for the rational design of functional cis-regulatory variants and de novo design of cis-regulatory sequences

Tianyi Li, Hui Xu, Shouzhen Teng, Mingrui Suo, Revocatus Bahitwa, Mingchi Xu, Yiheng Qian, Guillaume P. Ramstein, Baoxing Song, Edward S. Buckler, Hai Wang

Publication: Contribution to journal/Conference contribution in journal/Contribution to newspaper, Journal article, Research, peer-reviewed

7 Citations (Scopus)

Abstract

Rational design of plant cis-regulatory DNA sequences without expert intervention or prior domain knowledge is still a daunting task. Here, we developed PhytoExpr, a deep learning framework capable of predicting both mRNA abundance and plant species using the proximal regulatory sequence as the sole input. PhytoExpr was trained over 17 species representative of major clades of the plant kingdom to enhance its generalizability. Via input perturbation, quantitative functional annotation of the input sequence was achieved at single-nucleotide resolution, revealing an abundance of predicted high-impact nucleotides in conserved noncoding sequences and transcription factor binding sites. Evaluation of maize HapMap3 single-nucleotide polymorphisms (SNPs) by PhytoExpr demonstrates an enrichment of predicted high-impact SNPs in cis-eQTL. Additionally, we provided two algorithms that harnessed the power of PhytoExpr in designing functional cis-regulatory variants, and de novo creation of species-specific cis-regulatory sequences through in silico evolution of random DNA sequences. Our model represents a general and robust approach for functional variant discovery in population genetics and rational design of regulatory sequences for genome editing and synthetic biology.
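The input-perturbation and in silico evolution ideas mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `toy_score` is a hypothetical stand-in for a trained predictor (it simply counts a TATAAA motif), and the greedy mutate-and-accept loop is a generic scheme, not necessarily the algorithm used in PhytoExpr.

```python
import random

BASES = "ACGT"


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for a trained expression predictor.

    Counts occurrences of the motif TATAAA. A real model would be a
    neural network; this toy function only makes the sketch runnable.
    """
    return float(sum(seq[i:i + 6] == "TATAAA" for i in range(len(seq) - 5)))


def perturbation_impact(seq: str, score=toy_score) -> list:
    """Per-position impact: the largest absolute change in the score
    over the three possible single-nucleotide substitutions."""
    base = score(seq)
    impacts = []
    for i, ref in enumerate(seq):
        deltas = [
            abs(score(seq[:i] + alt + seq[i + 1:]) - base)
            for alt in BASES
            if alt != ref
        ]
        impacts.append(max(deltas))
    return impacts


def evolve(seq: str, score=toy_score, rounds: int = 200, rng=None):
    """Greedy in silico evolution (assumed scheme): propose a random
    point mutation and keep it only if the score does not decrease."""
    rng = rng or random.Random(0)
    best = score(seq)
    for _ in range(rounds):
        i = rng.randrange(len(seq))
        alt = rng.choice([b for b in BASES if b != seq[i]])
        candidate = seq[:i] + alt + seq[i + 1:]
        s = score(candidate)
        if s >= best:
            seq, best = candidate, s
    return seq, best


if __name__ == "__main__":
    # Positions inside the motif are the only high-impact ones here.
    print(perturbation_impact("GGTATAAAGG"))

    # Evolve a random 60-nt sequence toward a higher toy score.
    start = "".join(random.Random(1).choice(BASES) for _ in range(60))
    evolved, final = evolve(start)
    print(toy_score(start), final)
```

With a real predictor, `score` would be swapped for the model's forward pass, and the per-position impacts could then be compared against annotations such as conserved noncoding sequences or known variants.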

Original language: English
Article number: e2319811121
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 121
Issue number: 26
ISSN: 0027-8424
DOI
Status: Published - 1 Jun 2024
