Author ORCID Identifier

https://orcid.org/0000-0002-3634-8370

Document Type

Dissertation

Date of Award

2023

Degree Name

Doctor of Philosophy (PhD)

Department

Biomedical Engineering

First Advisor

Etienne Z Gnimpieba

Abstract

The binary model of existence consists of health and disease, with health being complete physical, mental, and social well-being and disease being a pathological process. Humanity has been plagued by genetic and infectious diseases since the dawn of civilization. These diseases cause changes in this model at the systems biology level. Omics technologies, data science methodologies, and big data are being used to study these changes at the molecular level, leveraging the interconnectedness of biological processes. Advanced computational methodologies are also essential for efficient analytics and actionable discovery. In this dissertation, genetic and infectious-based diseases are investigated using omics data and data science methods by developing a framework called GOMICs: a Generalizable Multi-OMICs Framework for Biomedical Data Analytics and Discovery. The GOMICs framework is applied to investigate the molecular signature of disease versus control in Friedreich ataxia (FRDA). Herein, BioID and Co-IP proteomes and bulk transcriptomes were analyzed to elucidate the pathomechanism of FRDA. Moreover, multivariate and machine learning models were used in the vertical integration analysis of transcriptomes. As a result, differentially expressed genes (DEGs) that discriminated disease from control at the multi-omics systems level were obtained. Subsequently, the 42 optimal genes were subjected to functional enrichment analysis and protein-protein interaction (PPI) discovery. Consequently, mitochondrial fatty acid oxidation was the most enriched pathway in our analysis. This key finding informed computational drug repurposing in FRDA. By developing a machine learning model, five candidate drugs were identified as repositionable in FRDA. The GOMICs framework was also used to profile the human lice microbiome and possible pathogens vectored by lice.
It was also used to discover new biosynthetic gene clusters (BGCs) that produce natural products in the microbiome of the subsurface biosphere, a harsh host environment; assemble the first draft genome of H. verbena; and analyze de novo molecular signatures in germline variants that cause "infant leukemias," the two rare types of leukemia that affect children younger than one year. In summary, this work used omics data, data mining, and machine learning methods to address biomedical problems involving genetic and infectious-based diseases.

Subject Categories

Biomedical Engineering and Bioengineering

Keywords

BIOMEDICINE INVESTIGATION, GENETIC DISEASES, INFECTIOUS-BASED DISEASES, OMICS DATA, DATA SCIENCE METHODOLOGIES.

Number of Pages

187

Publisher

University of South Dakota

Available for download on Sunday, July 14, 2024
