Data-driven analysis of collections of big datasets by the Bi-CoPaM method yields field-specific novel insights

Research output: Contribution to book/anthology/report/proceeding › Book chapter › Research › peer-review

Standard

Data-driven analysis of collections of big datasets by the Bi-CoPaM method yields field-specific novel insights. / Abu-Jamous, Basel; Liu, Chao; Roberts, David J.; Brattico, Elvira; Nandi, Asoke.

Frontiers in Electronic Technologies. Singapore : Springer, 2017.

Harvard

Abu-Jamous, B, Liu, C, Roberts, DJ, Brattico, E & Nandi, A 2017, Data-driven analysis of collections of big datasets by the Bi-CoPaM method yields field-specific novel insights. in Frontiers in Electronic Technologies. Springer, Singapore.

APA

Abu-Jamous, B., Liu, C., Roberts, D. J., Brattico, E., & Nandi, A. (2017). Data-driven analysis of collections of big datasets by the Bi-CoPaM method yields field-specific novel insights. In Frontiers in Electronic Technologies. Singapore: Springer.

CBE

Abu-Jamous B, Liu C, Roberts DJ, Brattico E, Nandi A. 2017. Data-driven analysis of collections of big datasets by the Bi-CoPaM method yields field-specific novel insights. In Frontiers in Electronic Technologies. Singapore: Springer.

MLA

Abu-Jamous, Basel et al. "Data-driven analysis of collections of big datasets by the Bi-CoPaM method yields field-specific novel insights". Frontiers in Electronic Technologies. Chapter 2, Singapore: Springer. 2017.

Vancouver

Abu-Jamous B, Liu C, Roberts DJ, Brattico E, Nandi A. Data-driven analysis of collections of big datasets by the Bi-CoPaM method yields field-specific novel insights. In Frontiers in Electronic Technologies. Singapore: Springer. 2017

Author

Abu-Jamous, Basel ; Liu, Chao ; Roberts, David J. ; Brattico, Elvira ; Nandi, Asoke. / Data-driven analysis of collections of big datasets by the Bi-CoPaM method yields field-specific novel insights. Frontiers in Electronic Technologies. Singapore : Springer, 2017.

Bibtex

@inbook{7acc75822b5146bba5a71d7cec2e0902,
title = "Data-driven analysis of collections of big datasets by the Bi-CoPaM method yields field-specific novel insights",
abstract = "Massive amounts of data have recently been, and are increasingly being, generated from various fields, such as bioinformatics, neuroscience and social networks. Many of these big datasets were generated to answer specific research questions, and were analysed accordingly. However, the scope of information contained in these datasets can usually answer much broader questions than what was originally intended. Moreover, many existing big datasets are related to each other but have different detailed specifications, and the mutual information that can be extracted from them collectively has not been commonly considered. To bridge this gap between the fast pace of data generation and the slower pace of data analysis, and to exploit the massive amounts of existing data, we suggest employing data-driven explorations to analyse collections of related big datasets. This approach aims at extracting field-specific novel findings which can be revealed from the data without being driven by specific questions or hypotheses. To realise this paradigm, we introduced the binarisation of consensus partition matrices (Bi-CoPaM) method, with the ability of analysing collections of heterogeneous big datasets to identify clusters of consistently correlated objects. We demonstrate the power of data-driven explorations by applying the Bi-CoPaM to two collections of big datasets from two distinct fields, namely bioinformatics and neuroscience. In the first application, the collective analysis of forty yeast gene expression datasets identified a novel cluster of genes and some new biological hypotheses regarding its function and regulation. In the other application, the analysis of 1,856 big fMRI datasets identified three functionally connected neural networks related to visual, reward and auditory systems during affective processing. These experiments reveal the broad applicability of this paradigm to various fields, and thus encourage exploring the large amounts of partially exploited existing datasets, preferably as collections of related datasets, with a similar approach.",
author = "Basel Abu-Jamous and Chao Liu and David J. Roberts and Elvira Brattico and Asoke Nandi",
year = "2017",
language = "English",
isbn = "978-981-10-4234-8",
booktitle = "Frontiers in Electronic Technologies",
publisher = "Springer",

}

RIS

TY - CHAP

T1 - Data-driven analysis of collections of big datasets by the Bi-CoPaM method yields field-specific novel insights

AU - Abu-Jamous, Basel

AU - Liu, Chao

AU - Roberts, David J.

AU - Brattico, Elvira

AU - Nandi, Asoke

PY - 2017

Y1 - 2017

N2 - Massive amounts of data have recently been, and are increasingly being, generated from various fields, such as bioinformatics, neuroscience and social networks. Many of these big datasets were generated to answer specific research questions, and were analysed accordingly. However, the scope of information contained in these datasets can usually answer much broader questions than what was originally intended. Moreover, many existing big datasets are related to each other but have different detailed specifications, and the mutual information that can be extracted from them collectively has not been commonly considered. To bridge this gap between the fast pace of data generation and the slower pace of data analysis, and to exploit the massive amounts of existing data, we suggest employing data-driven explorations to analyse collections of related big datasets. This approach aims at extracting field-specific novel findings which can be revealed from the data without being driven by specific questions or hypotheses. To realise this paradigm, we introduced the binarisation of consensus partition matrices (Bi-CoPaM) method, with the ability of analysing collections of heterogeneous big datasets to identify clusters of consistently correlated objects. We demonstrate the power of data-driven explorations by applying the Bi-CoPaM to two collections of big datasets from two distinct fields, namely bioinformatics and neuroscience. In the first application, the collective analysis of forty yeast gene expression datasets identified a novel cluster of genes and some new biological hypotheses regarding its function and regulation. In the other application, the analysis of 1,856 big fMRI datasets identified three functionally connected neural networks related to visual, reward and auditory systems during affective processing. These experiments reveal the broad applicability of this paradigm to various fields, and thus encourage exploring the large amounts of partially exploited existing datasets, preferably as collections of related datasets, with a similar approach.

AB - Massive amounts of data have recently been, and are increasingly being, generated from various fields, such as bioinformatics, neuroscience and social networks. Many of these big datasets were generated to answer specific research questions, and were analysed accordingly. However, the scope of information contained in these datasets can usually answer much broader questions than what was originally intended. Moreover, many existing big datasets are related to each other but have different detailed specifications, and the mutual information that can be extracted from them collectively has not been commonly considered. To bridge this gap between the fast pace of data generation and the slower pace of data analysis, and to exploit the massive amounts of existing data, we suggest employing data-driven explorations to analyse collections of related big datasets. This approach aims at extracting field-specific novel findings which can be revealed from the data without being driven by specific questions or hypotheses. To realise this paradigm, we introduced the binarisation of consensus partition matrices (Bi-CoPaM) method, with the ability of analysing collections of heterogeneous big datasets to identify clusters of consistently correlated objects. We demonstrate the power of data-driven explorations by applying the Bi-CoPaM to two collections of big datasets from two distinct fields, namely bioinformatics and neuroscience. In the first application, the collective analysis of forty yeast gene expression datasets identified a novel cluster of genes and some new biological hypotheses regarding its function and regulation. In the other application, the analysis of 1,856 big fMRI datasets identified three functionally connected neural networks related to visual, reward and auditory systems during affective processing. These experiments reveal the broad applicability of this paradigm to various fields, and thus encourage exploring the large amounts of partially exploited existing datasets, preferably as collections of related datasets, with a similar approach.

M3 - Book chapter

SN - 978-981-10-4234-8

BT - Frontiers in Electronic Technologies

PB - Springer

CY - Singapore

ER -