PINCAGE: Probabilistic integration of cancer genomics data for perturbed gene identification and sample classification

Publication: Contribution to journal / Conference contribution in journal / Contribution to newspaper; Journal article; Research; Peer-reviewed

MOTIVATION: Cancer development and progression are driven by a complex pattern of genomic and epigenomic perturbations. Both types of perturbations can affect gene expression levels and disease outcome. Integrative analysis of cancer genomics data may therefore improve detection of perturbed genes and prediction of disease state. As different data types are usually dependent, analysis based on independence assumptions will make inefficient use of the data and potentially lead to false conclusions.

MODEL: Here we present PINCAGE, a method that uses probabilistic integration of cancer genomics data for combined evaluation of RNA-seq gene expression and 450K array DNA methylation measurements of promoters as well as gene bodies. It models the dependence between expression and methylation using modular graphical models, which also allows future inclusion of additional data types.
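ILLUSTRATION: The following is a minimal toy sketch, not the PINCAGE model itself, of why modelling the dependence between expression and methylation can matter. It assumes a hypothetical binary discretisation (low/high) of both measurements and made-up sample lists; the function names are illustrative only. Both classes are constructed to have identical marginals, so a score that assumes independence cannot separate them, while a score based on the joint table can.

```python
import math
from collections import Counter
from itertools import product

STATES = (0, 1)  # 0 = low, 1 = high (hypothetical binary discretisation)


def joint_table(samples, pseudocount=1.0):
    """Estimate P(expression, methylation) with additive smoothing."""
    counts = Counter(samples)
    total = len(samples) + pseudocount * len(STATES) ** 2
    return {s: (counts[s] + pseudocount) / total
            for s in product(STATES, STATES)}


def marginals(table):
    """Collapse a joint table into P(expression) and P(methylation)."""
    p_expr = {e: sum(table[(e, m)] for m in STATES) for e in STATES}
    p_meth = {m: sum(table[(e, m)] for e in STATES) for m in STATES}
    return p_expr, p_meth


def log_ratio_joint(sample, tumour, normal):
    """Tumour-vs-normal log-likelihood ratio using the joint tables."""
    return math.log(tumour[sample] / normal[sample])


def log_ratio_independent(sample, tumour, normal):
    """Same ratio, but treating the two data types as independent."""
    e, m = sample
    t_expr, t_meth = marginals(tumour)
    n_expr, n_meth = marginals(normal)
    return (math.log(t_expr[e] * t_meth[m])
            - math.log(n_expr[e] * n_meth[m]))


# Made-up toy data as (expression, methylation) pairs.
# Normal: methylation and expression are anti-correlated.
# Tumour: the relationship is reversed, but the marginals are unchanged.
normal_samples = [(0, 1), (1, 0)] * 10
tumour_samples = [(0, 0), (1, 1)] * 10

normal = joint_table(normal_samples)
tumour = joint_table(tumour_samples)

query = (1, 1)  # high expression together with high methylation
joint_score = log_ratio_joint(query, tumour, normal)
indep_score = log_ratio_independent(query, tumour, normal)
print(f"joint: {joint_score:.3f}  independent: {indep_score:.3f}")
```

In this constructed case the independence-based score is essentially zero, because each class has the same marginal distributions, whereas the joint score is clearly positive. Real data would need continuous or count-based likelihoods rather than a two-state table.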

RESULTS: We apply our approach to a Breast Invasive Carcinoma data set from The Cancer Genome Atlas consortium, which includes 82 adjacent normal and 730 cancer samples. We identify new biomarker candidates of breast cancer development (PTF1A, RABIF, RAG1AP1, TIMM17A, LOC148145) and progression (SERPINE3, ZNF706). PINCAGE discriminates better between normal and tumour tissue and between progressing and non-progressing tumours in comparison with established methods that assume independence between tested data types, especially when using evidence from multiple genes. Our method can be applied to any type of cancer or, more generally, to any genomic disease for which a sufficient amount of molecular data is available.

AVAILABILITY: R scripts available at http://moma.ki.au.dk/prj/pincage/

CONTACT: michal.switnicki@clin.au.dk, jakob.skou@clin.au.dk

SUPPLEMENTARY INFORMATION: Available at Bioinformatics online.

Original language: English
Journal: Bioinformatics
Volume: 32
Issue: 9
Pages (from-to): 1353-1365
ISSN: 1367-4803
Status: Published - 6 Jan 2016


ID: 96063002