Characterising RNA secondary structure space using information entropy

Publication: Contribution to journal/Conference contribution in journal/Newspaper contribution › Journal article › Research › peer-reviewed

  • Zsuzsanna Sükösd
  • Bjarne Knudsen, CLC bio, Aarhus, Denmark
  • James WJ Anderson, Department of Statistics, University of Oxford, United Kingdom
  • Adám Novák, Department of Statistics and Oxford Centre for Integrative Systems Biology, University of Oxford, United Kingdom
  • Jørgen Kjems
  • Christian Nørgaard Storm Pedersen
Comparative methods for RNA secondary structure prediction use evolutionary information from RNA alignments to increase prediction accuracy. The model is often described in terms of stochastic context-free grammars (SCFGs), which generate a probability distribution over secondary structures. It is, however, unclear how this probability distribution changes as a function of the input alignment. As prediction programs typically only return a single secondary structure, better characterisation of the underlying probability space of RNA secondary structures is of great interest. In this work, we show how to efficiently compute the information entropy of the probability distribution over RNA secondary structures produced for RNA alignments by a phylo-SCFG, and implement it for the PPfold model. We also discuss interpretations and applications of this quantity, including how it can clarify reasons for low prediction reliability scores. PPfold and its source code are available from http://birc.au.dk/software/ppfold/.
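To make the quantity in the abstract concrete, the following is a minimal sketch of how the entropy of a grammar-defined distribution over secondary structures can be computed with an inside-style recursion, without enumerating structures. It is not the PPfold model or the algorithm from the paper: the toy grammar S -> end | unpaired S | ( S ) S, the rule probabilities, the pairing table, and the function names `structure_entropy` and `brute_force_entropy` are all assumptions made for illustration, and the input is a single sequence rather than an alignment.

```python
import math
from functools import lru_cache

# Hypothetical rule probabilities for the toy grammar
# S -> end | unpaired S | ( S ) S  (not taken from PPfold).
P_END, P_UNPAIRED, P_PAIR = 0.2, 0.5, 0.3
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def structure_entropy(seq):
    """Entropy (in nats) of the structure distribution, via an inside-style DP."""

    @lru_cache(maxsize=None)
    def inside(i, j):
        # Returns (Z, R) for seq[i:j]: Z sums the derivation weights w,
        # R sums w * log(w) over the same derivations.
        if i == j:
            return P_END, P_END * math.log(P_END)
        # Case 1: position i is unpaired.
        z_rest, r_rest = inside(i + 1, j)
        z = P_UNPAIRED * z_rest
        r = P_UNPAIRED * math.log(P_UNPAIRED) * z_rest + P_UNPAIRED * r_rest
        # Case 2: position i pairs with some position k.
        for k in range(i + 1, j):
            if (seq[i], seq[k]) in CAN_PAIR:
                za, ra = inside(i + 1, k)
                zb, rb = inside(k + 1, j)
                z += P_PAIR * za * zb
                r += P_PAIR * math.log(P_PAIR) * za * zb + P_PAIR * (ra * zb + za * rb)
        return z, r

    z, r = inside(0, len(seq))
    # H = -sum (w/Z) log(w/Z) = log Z - R/Z
    return math.log(z) - r / z


def brute_force_entropy(seq):
    """Same quantity by explicit enumeration; only feasible for short sequences."""

    def weights(i, j):
        if i == j:
            return [P_END]
        out = [P_UNPAIRED * w for w in weights(i + 1, j)]
        for k in range(i + 1, j):
            if (seq[i], seq[k]) in CAN_PAIR:
                out += [P_PAIR * a * b for a in weights(i + 1, k) for b in weights(k + 1, j)]
        return out

    ws = weights(0, len(seq))
    total = sum(ws)
    return -sum(w / total * math.log(w / total) for w in ws)
```

On a short sequence such as "GGGAAACCC" the two functions should agree up to floating-point error, while a sequence with no pairable bases (for example "AAAA") admits a single structure and therefore has zero entropy. The recursion carries the pair (Z, R) instead of Z alone, which is what lets the entropy be read off as log Z - R/Z at the end.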
Original language: English
Article number: S22
Journal: BMC Bioinformatics
Volume: 14
Issue number: Suppl 2
Pages (from-to): 1-9
Number of pages: 9
ISSN: 1471-2105
DOI
Status: Published - 21 Jan 2013
Event: The Eleventh Asia Pacific Bioinformatic Conference - Vancouver, Canada
Duration: 21 Jan 2013 to 24 Jan 2013
Conference number: 11th

Conference

Conference: The Eleventh Asia Pacific Bioinformatic Conference
Number: 11th
Country: Canada
City: Vancouver
Period: 21/01/2013 to 24/01/2013

ID: 52558394