Tutorial: William Yang
School of Computer Science, Carnegie Mellon University
The rapid advancement of high-throughput next-generation sequencing technologies has generated vast volumes of multidimensional big data that will ultimately transform current healthcare, which is based on the average patient, into individualized precision diagnosis and accurate treatment. This tutorial presents our newly developed synergistic high-performance computing and big-data analytics approaches to facilitate the advancement of precision medicine research.
In particular, identifying cancer-causing genetic alterations and their disrupted pathways remains highly challenging because of complex biological interactions and the heterogeneity of the disease, even with the power of single-cell genomics. Genetic mutations in disease-causing genes can disturb signaling pathways that affect the expression of sets of genes, each performing certain biological functions. We consider that driver mutations are likely to affect the expression of disease-associated functional genes, and that the causal relationship between the mutations and the perturbed transcription signals can be reconstructed from profiles of differential gene expression patterns and disturbed gene networks. Therefore, the first step toward improving personalized treatment of tumors is to systematically identify the differentially expressed genes in cancer. We will present a novel online tool called IDEAS (Identify Differential Expression of genes for Applications in genome-wide Studies) and further use this tool for the integrative acquisition and analysis of multi-layer genomic big data. Using IDEAS along with pathway analysis facilitates the construction of high-level gene signaling networks that will eventually lead to the discovery of disease-causing genomic alterations and effective drug targets. Synergistic high-performance computing and big-data analytics approaches have been used efficiently in multidimensional genomic big-data integration; hence we will demonstrate our newly developed computational framework to automate data quality assessment, mapping to a reference genome, variant identification and annotation, and identification of single nucleotide polymorphisms and differentially expressed genes. We combine multiplatform genomic big data to enhance the detection power for genomic alterations and drug targets.
The synergistic development of high-performance computing and big-data analytics methods that use high-dimensional data has provided computational solutions for important precision medicine research that can ultimately lead to the improvement of human health and the prolongation of human life.
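To make the differential expression step mentioned in the abstract concrete, the following is a minimal sketch, not the method used by IDEAS, which the abstract does not describe. It assumes normalized expression values per gene for tumor and normal samples, and flags a gene when both the log2 fold change and Welch's t-statistic exceed thresholds. The function name, parameter names, and threshold values are hypothetical choices for illustration only.

```python
import math
from statistics import mean, variance


def differential_expression(tumor, normal, min_abs_log2fc=1.0,
                            min_abs_t=2.0, pseudocount=1.0):
    """Flag genes whose expression differs between two sample groups.

    tumor, normal: dicts mapping a gene name to a list of normalized
    expression values (at least two samples per group).
    Returns a dict: gene -> (log2 fold change, Welch t-statistic, flagged).
    All thresholds are illustrative assumptions, not recommended defaults.
    """
    results = {}
    # Only genes measured in both groups can be compared.
    for gene in sorted(tumor.keys() & normal.keys()):
        t_vals, n_vals = tumor[gene], normal[gene]
        t_mean, n_mean = mean(t_vals), mean(n_vals)

        # The pseudocount avoids division by zero for unexpressed genes.
        log2fc = math.log2((t_mean + pseudocount) / (n_mean + pseudocount))

        # Welch's t-statistic: difference of means over the combined
        # standard error, without assuming equal variances.
        se = math.sqrt(variance(t_vals) / len(t_vals)
                       + variance(n_vals) / len(n_vals))
        diff = t_mean - n_mean
        if se > 0:
            t_stat = diff / se
        elif diff == 0:
            t_stat = 0.0
        else:
            t_stat = math.copysign(math.inf, diff)

        flagged = abs(log2fc) >= min_abs_log2fc and abs(t_stat) >= min_abs_t
        results[gene] = (log2fc, t_stat, flagged)
    return results
```

A production analysis would typically go further than this sketch, for example by converting the statistic to a p-value, correcting for multiple testing across genes, and using models suited to sequencing count data.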
William Yang is an American software developer, researcher, educator, advocate and writer in synergistic computer science and big-data genomics research. He completed a United States National Science Foundation (NSF) Research Scholarship from the University of Texas at Austin for the NSF iPlant collaborative project. Although he holds only a computer science degree with honors, he has contributed significantly to both computer science and biomedicine and has published extensively in both computer science and biomedical journals, including Journal of Supercomputing, Human Genomics, BMC Bioinformatics and BMC Genomics. He has also frequently presented his research at IEEE and ACM events, international conferences and academic events to promote the synergies of high-performance computing and big-data analytics in precision medicine research. One of his award-winning, highly accessed scientific research articles (https://www.researchgate.net/publication/282747684) was selected by Harvard University in Cambridge, Massachusetts, U.S.A. for open digital access to scholarship (https://dash.harvard.edu/handle/1/14065393). William has received numerous recognitions, including an ACM SIGHPC (Special Interest Group on High Performance Computing) Travel Fellowship, an Academic All Star Award, a Best Hack Award in a hackathon competition, a Best Programmer Award in a non-COBOL competition, a Highly Accessed distinction from a journal and a Best Paper Award at a research conference. William is the founder of LearnCTF (Capture The Flag), an online coding academy for cybersecurity research and education. He considers his cybersecurity research a hobby that hones his thinking and problem-solving skills, but his key interest is developing powerful computer science methods to solve difficult and challenging biomedical problems.
William is currently conducting cutting-edge supercomputing and artificial intelligence research at the School of Computer Science at Carnegie Mellon University in Pittsburgh, Pennsylvania, U.S.A.