Doctor of Philosophy (PhD)
One approach to interrogating the complexities of human systems in their well-regulated and dysregulated states is through the use of digital twins. Digital twins are virtual representations of physical systems that describe an individual's state of health, a concept fundamental to precision medicine. A key element for building a functional digital twin type for a disease, or for predicting the therapeutic efficacy of a potential treatment, is harmonized, machine-parsable domain knowledge. Hypothesis-driven investigations are the gold standard for representing subsystems, but their results capture only limited knowledge of the full biosystem. Multi-omics data are one rich source of knowledge for characterizing disease- and therapy-induced shifts across the systems biology landscape. However, systematic biases within and between the data types limit the functionality of big multi-omics data. In this dissertation, the generation of and results from transcriptomic analysis pipelines are assessed in their biological context and with respect to their usability for applications such as digital twins. The latter is achieved by assessing the adherence of the workflows to the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) and the extent to which they connect to the broader systems biology landscape. The first two specific aims of this work focus on the transcriptomic shifts induced by atypical teratoid rhabdoid tumors (ATRT) relative to the normal brain and those induced by treatment of tumor models with 4SC-202 across disease states including medulloblastoma, ATRT, triple-negative breast cancer, osteosarcoma, and pancreatic cancer. These are problem-driven workflows, tightly connected to biological hypotheses, that contribute to disease- and therapy-specific domain knowledge. In contrast, the third specific aim introduces a domain-agnostic approach for developing transcriptomic pipelines to harmonize bulk RNA-sequencing datasets.
This framework does not directly contribute to a given biological domain, but instead provides a generalized approach for integrating large RNA-sequencing datasets and assessing whether the resulting representation is biologically meaningful. This harmonization framework may also have utility in assessing the clinical relevance of in vitro biomodels. Collectively, this work presents multiple transcriptomic workflows and assesses their efficacy, both within their biological context and in terms of their broader applicability to machine learning.
Bioinformatics | Biomedical Engineering and Bioengineering | Biostatistics
Bioinformatics, Data mining, Digital twin, FAIR, Machine learning, Systems biology
University of South Dakota
Hoffman, Mariah Marie, "FRAMEWORK FOR THE EVALUATION OF PERTURBATIONS IN THE SYSTEMS BIOLOGY LANDSCAPE AND INTER-SAMPLE SIMILARITY FROM TRANSCRIPTOMIC DATASETS — A DIGITAL TWIN PERSPECTIVE" (2022). Dissertations and Theses. 94.
Available for download on Tuesday, February 20, 2024