Comparison of methods for calculating conditional expectations of sufficient statistics for continuous time Markov chains

Publication: Contribution to journal/Conference contribution in journal/Contribution to newspaper, Journal article, Research, peer-reviewed

BACKGROUND:
Continuous time Markov chains (CTMCs) are a widely used model for describing the evolution of DNA sequences at the nucleotide, amino acid or codon level. The sufficient statistics for CTMCs are the time spent in each state and the number of changes between any two states. In applications, past evolutionary events (the exact times and types of changes) are inaccessible, and the past must be inferred from DNA sequence data observed in the present.
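To make the two sufficient statistics concrete, the following minimal Python sketch computes them for a fully observed path. The function name `sufficient_statistics`, the path representation (a list of state and holding-time pairs) and the example values are assumptions made for illustration only; they are not taken from the article or its R implementation.

```python
from collections import defaultdict


def sufficient_statistics(path):
    """Summarise a fully observed CTMC path.

    `path` is a chronological list of (state, holding_time) pairs
    (a hypothetical representation chosen for this sketch).
    Returns (time spent in each state, number of jumps for each ordered pair).
    """
    time_in_state = defaultdict(float)
    jump_counts = defaultdict(int)
    for i, (state, holding_time) in enumerate(path):
        time_in_state[state] += holding_time
        if i + 1 < len(path):
            # A jump occurs from this state to the next one on the path.
            next_state = path[i + 1][0]
            jump_counts[(state, next_state)] += 1
    return dict(time_in_state), dict(jump_counts)


# Illustrative nucleotide path over one unit of time (made-up numbers).
example_path = [("A", 0.30), ("G", 0.25), ("A", 0.10), ("C", 0.35)]
dwell, jumps = sufficient_statistics(example_path)
print(dwell)
print(jumps)
```

In practice such a path is not observed, which is why the conditional expectations of these quantities, given only the end-points, are needed.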

RESULTS:
We describe and implement three algorithms for computing linear combinations of expected values of the sufficient statistics, conditioned on the end-points of the chain, and compare their performance with respect to accuracy and running time. The first algorithm is based on an eigenvalue decomposition of the rate matrix (EVD), the second on uniformization (UNI), and the third on integrals of matrix exponentials (EXPM). An R implementation of the algorithms is available at www.birc.au.dk/~paula/.
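As a rough illustration of the integral-based idea, the sketch below uses the standard identity that, for a chain with rate matrix Q observed at X(0) = a and X(T) = b, the expected time spent in state c is the integral of P_ac(t) P_cb(T - t) over [0, T] divided by P_ab(T), and the expected number of c-to-d jumps is q_cd times the corresponding integral with P_db in place of P_cb. One common way to evaluate such integrals is the block-matrix identity of Van Loan: the upper-right block of exp(T [[Q, B], [0, Q]]) equals the integral of exp(Qt) B exp(Q(T - t)). This is a sketch under those assumptions and is not the authors' R code; the names `expm` and `conditional_expectation` are hypothetical, and the pure-Python matrix exponential (scaling and squaring with a truncated Taylor series) is a simple stand-in for a library routine.

```python
def mat_mul(A, B):
    """Plain dense matrix product for lists of lists."""
    inner, cols = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]


def expm(A, terms=20):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    n = len(A)
    norm = max(sum(abs(x) for x in row) for row in A)
    squarings = 0
    while norm > 0.5:
        norm /= 2.0
        squarings += 1
    scale = 2.0 ** squarings
    S = [[x / scale for x in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term holds S^k / k! after this update.
        term = [[x / k for x in row] for row in mat_mul(term, S)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def conditional_expectation(Q, T, a, b, c, d):
    """E[statistic | X(0)=a, X(T)=b].

    If c == d the statistic is the time spent in state c;
    otherwise it is the number of jumps from c to d.
    """
    n = len(Q)
    # Auxiliary matrix [[Q, B], [0, Q]] with B having a single 1 at (c, d).
    aux = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            aux[i][j] = Q[i][j]
            aux[n + i][n + j] = Q[i][j]
    aux[c][n + d] = 1.0
    big = expm([[x * T for x in row] for row in aux])
    p_ab = big[a][b]            # transition probability P_ab(T)
    integral = big[a][n + b]    # integral of P_ac(t) * P_db(T - t) over [0, T]
    weight = 1.0 if c == d else Q[c][d]
    return weight * integral / p_ab


# Illustrative two-state chain (made-up rates).
Q = [[-1.0, 1.0], [2.0, -2.0]]
print(conditional_expectation(Q, 1.5, 0, 1, 0, 0))  # expected time in state 0
print(conditional_expectation(Q, 1.5, 0, 1, 0, 1))  # expected number of 0 -> 1 jumps
```

A useful sanity check for any of the three approaches is that the expected times spent in all states, conditioned on the same end-points, sum to T.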

CONCLUSIONS:
We use two different models to analyze the accuracy and eight experiments to investigate the speed of the three algorithms. We find that they have similar accuracy and that EXPM is the slowest method. Furthermore, we find that UNI is usually faster than EVD.
Original language: English
Journal: BMC Bioinformatics
Volume: 12
Issue: 1
Pages (from-to): 465
ISSN: 1471-2105
DOI
Status: Published - 5 Dec 2011

ID: 44029169