Dissertations and Theses

BiofilmGeneSet: Leveraging Multi-Omics Data Mining and ICA To Discover Biofilm Stage Genes of Interest from Condition-Specific Expression Dataset

Mathew Olakunle Alaba

Author ORCID Identifier

https://orcid.org/0000-0002-6802-8021

Document Type

Thesis

Date of Award

2022

Degree Name

Master of Science (MS)

Department

Biomedical Engineering

First Advisor

Etienne Z Gnimpieba

Abstract

Biofilm formation proceeds through attachment, colony, maturation, and dispersion stages. Understanding the molecular basis of each stage is essential to developing efficient diagnostic devices and effective antibiofilm agents. Gene expression data provide molecular insight into both static and temporal biofilm development. The analytic techniques most often used for biofilm gene expression data are clustering and network inference algorithms, which group genes with similar expression across the samples. However, these methods are inherently limited because they do not capture genes expressed in only a subset of the samples; such subsets might, for example, be unique to a developmental stage. In addition, these methods assign each gene to a single, non-overlapping class. This also leads to loss of information because gene expression is combinatorial, and a gene product can participate simultaneously, to a greater or lesser degree, in different pathways. In this study, I developed an analysis framework, referred to as BiofilmGeneSet, to classify genes that contribute significantly to biofilm developmental stages. I applied the JADE algorithm to the expression data (X) to extract statistically independent expression modules (S) and their module activity (A). Next, Pearson correlation coefficients between the module activity and the expression profile were computed to determine significant modules. BioNERO, an all-in-one Bioconductor package for comprehensive and easy biological network reconstruction, was applied to the same data to evaluate the performance of this workflow. Of the 15 independent expression modules, modules 14, 11, and 4 were significantly associated with the attachment, colony, and maturation stages.
The significance of this work can be summarized as follows: (i) a new data mining and gene expression classification framework with high accuracy compared to weighted gene co-expression network methods for problem-based gene set identification; (ii) a new gene set as a potential biomarker for each biofilm development stage; (iii) a framework general enough to find gene sets relevant to other related biological events such as quorum sensing, EPS, and antibiotic resistance; (iv) a relevant functional annotation that will guide scientists in designing experiments to validate the newly discovered marker gene sets.
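The module-selection step described in the abstract (correlating module activity with the stages) can be illustrated with a minimal sketch. This is not the thesis code: the JADE decomposition itself is not implemented here, the activity matrix is invented toy data, and representing each stage as a 0/1 indicator over the samples is an assumption about how the correlation target was built. The function names `pearson`, `stage_indicator`, and `best_module_per_stage` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / (sd_x * sd_y)


def stage_indicator(stages, stage):
    """Assumed encoding: 1.0 for samples from `stage`, 0.0 otherwise."""
    return [1.0 if s == stage else 0.0 for s in stages]


def best_module_per_stage(activity, stages):
    """For each stage, return (module index, r) with the largest |r|.

    `activity` holds one row per module and one column per sample,
    playing the role of the module activity matrix A.
    """
    result = {}
    for stage in sorted(set(stages)):
        indicator = stage_indicator(stages, stage)
        scored = [(k, pearson(row, indicator)) for k, row in enumerate(activity)]
        result[stage] = max(scored, key=lambda item: abs(item[1]))
    return result


# Toy data (invented for illustration): 3 modules x 6 samples.
stages = ["attachment", "attachment", "colony", "colony",
          "maturation", "maturation"]
activity = [
    [0.1, 0.0, 0.9, 1.1, 0.0, 0.1],  # module 0: active in colony samples
    [1.0, 0.9, 0.1, 0.0, 0.1, 0.0],  # module 1: active in attachment samples
    [0.0, 0.1, 0.0, 0.1, 1.0, 0.9],  # module 2: active in maturation samples
]

best = best_module_per_stage(activity, stages)
for stage, (module, r) in best.items():
    print(f"{stage}: module {module} (r = {r:.3f})")
```

In a real analysis the rows of `activity` would come from an ICA decomposition of the expression matrix, and a significance test on each coefficient would be needed before calling a module stage-associated; the sketch only ranks modules by absolute correlation.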

Subject Categories

Bioinformatics

Keywords

Biofilm, Class Discovery, Clustering, Independent Component Analysis, RNA-Seq

Publisher

University of South Dakota

Recommended Citation

Alaba, Mathew Olakunle, "BiofilmGeneSet: Leveraging Multi-Omics Data Mining and ICA To Discover Biofilm Stage Genes of Interest from Condition-Specific Expression Dataset" (2022). Dissertations and Theses. 98.
https://red.library.usd.edu/diss-thesis/98

Included in

Bioinformatics Commons
