Network-based Computational Frameworks for Multi-omics Integration and Genome-wide Epistasis Detection in Complex Disease

Abstract

Advances in high-throughput technologies have enabled the generation of large-scale, high-dimensional genomic datasets, creating new opportunities to investigate the molecular basis of complex traits and diseases. In parallel, developments in computer science and applied mathematics have driven the adoption of machine learning methods, ranging from statistical models to network-based algorithms, within systems biology. However, existing approaches often face limitations, including reliance on restrictive biological assumptions, computational inefficiencies, and limited scalability. This thesis develops and customizes novel network-based frameworks to address key challenges in systems genetics, with a particular focus on multi-omics data integration and genome-wide epistatic interaction analysis. The proposed methods integrate topological network modeling, dynamic data fusion, GPU-accelerated computation, heterogeneous computing architectures, and adaptations of unsupervised learning algorithms, combined with biological validation using both publicly available resources and data from the Canadian Healthy Infant Longitudinal Development (CHILD) cohort. A central contribution is the Epistatic SNP Network Analysis (ESNA) framework, an efficient, scalable method for detecting higher-order SNP interactions across the genome. Unlike conventional epistasis approaches restricted to pairwise interactions or predefined genomic regions, ESNA employs a parallelized, scale-free network construction algorithm to identify modules of interacting SNPs from genome-wide data. Applied to the CHILD Cohort Study, ESNA analyzed 775,569 SNPs from 1,899 children and identified 914 network modules, of which nine were significantly associated with recurrent wheeze in early childhood. 
Notably, seven of these modules were also associated with asthma by age five, with several mapping to genes and pathways previously implicated in airway disease, including immune signaling and nervous system development. Benchmark evaluations demonstrated substantial computational gains over existing network-based epistasis methods, achieving a 48-fold increase in processing speed and a 50% reduction in memory usage. Collectively, this thesis presents integrated computational frameworks for multi-omics integration and epistasis detection, demonstrating their utility through real-world biological applications. These contributions advance the capacity of systems genetics to interpret complex, polygenic architectures underlying human disease.

Keywords

network-based algorithm, genome-wide epistasis, scale-free network, block-wise parallel computing, multi-omics integration model, unsupervised learning, asthma, recurrent wheeze, algorithm development

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution-NoDerivatives 4.0 International