Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper › Journal article › Research › peer-review

Standard

Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling. / Simonsen, Martin; Maetschke, S.R.; Ragan, M.A.

In: Bioinformatics, Vol. 28, No. 6, 01.03.2012, p. 851-857.

Author

Simonsen, Martin ; Maetschke, S.R. ; Ragan, M.A. / Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling. In: Bioinformatics. 2012 ; Vol. 28, No. 6. pp. 851-857.

Bibtex

@article{aa8a55fd69934c78be85671d5c94e892,
title = "Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling",
abstract = "Motivation: Phylogenetic profiling methods can achieve good accuracy in predicting protein-protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly available, identifying the most-informative RT is becoming increasingly difficult. Previous studies on the selection of RT have provided guidelines for manual taxon selection, and for eliminating closely related taxa. However, no general strategy for automatic selection of RT is currently available. Results: We present three novel methods for automating the selection of RT, using machine learning based on known protein-protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting phylogenetic profiles often require very different RT sets to support high prediction accuracy.",
author = "Martin Simonsen and S.R. Maetschke and M.A. Ragan",
note = "Copyright 2012 Elsevier B.V., All rights reserved.",
year = "2012",
month = mar,
day = "1",
doi = "10.1093/bioinformatics/btr720",
language = "English",
volume = "28",
pages = "851--857",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "6",
}

RIS

TY - JOUR

T1 - Automatic selection of reference taxa for protein-protein interaction prediction with phylogenetic profiling

AU - Simonsen, Martin

AU - Maetschke, S.R.

AU - Ragan, M.A.

N1 - Copyright 2012 Elsevier B.V., All rights reserved.

PY - 2012/3/1

Y1 - 2012/3/1

N2 - Motivation: Phylogenetic profiling methods can achieve good accuracy in predicting protein-protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly available, identifying the most-informative RT is becoming increasingly difficult. Previous studies on the selection of RT have provided guidelines for manual taxon selection, and for eliminating closely related taxa. However, no general strategy for automatic selection of RT is currently available. Results: We present three novel methods for automating the selection of RT, using machine learning based on known protein-protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting phylogenetic profiles often require very different RT sets to support high prediction accuracy.

AB - Motivation: Phylogenetic profiling methods can achieve good accuracy in predicting protein-protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly available, identifying the most-informative RT is becoming increasingly difficult. Previous studies on the selection of RT have provided guidelines for manual taxon selection, and for eliminating closely related taxa. However, no general strategy for automatic selection of RT is currently available. Results: We present three novel methods for automating the selection of RT, using machine learning based on known protein-protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting phylogenetic profiles often require very different RT sets to support high prediction accuracy.

UR - http://www.scopus.com/inward/record.url?scp=84859055139&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btr720

DO - 10.1093/bioinformatics/btr720

M3 - Journal article

C2 - 22219205

AN - SCOPUS:84859055139

VL - 28

SP - 851

EP - 857

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 6

ER -