Statistical Methods for Reliable Inference in RNA-seq Experiments to Facilitate Regenerative Medicine
Released: 2015

Book excerpt: The last decade of genome research has led to major technological advances in sequencing, genotyping, and phenotyping. However, how best to derive useful information from the resulting data remains to be explored by statistical scientists. In this dissertation, I develop, implement, evaluate, and apply three statistical methods for high-dimensional data analysis to facilitate efforts in regenerative medicine. The first method is an empirical Bayes model called EBSeq for identifying differentially expressed (DE) genes and isoforms. Unlike microarrays, RNA-seq experiments allow for the identification of not only DE genes but also their corresponding isoforms on a genome-wide scale. Taking advantage of the merits of empirical Bayesian methods, we developed EBSeq, which models the uncertainty groups via different priors. Our results demonstrate that EBSeq has substantially improved power and performance for identifying DE isoforms compared with competing methods. The second method is an autoregressive hidden Markov model called EBSeq-HMM for identifying expression changes across ordered conditions. With improvements in next-generation sequencing technologies and reductions in price, ordered RNA-seq experiments are becoming common. Of primary interest in these experiments is identifying genes that change over time or space, for example, and then characterizing the specific expression changes. In EBSeq-HMM, an autoregressive hidden Markov model is implemented to accommodate dependence in gene expression across ordered conditions.
As demonstrated in simulation and case studies, the output proves useful in identifying DE genes, characterizing their changes over conditions, and classifying genes into particular expression paths. The third method is a statistical pipeline called Oscope for identifying oscillatory gene sets using unsynchronized single-cell RNA-seq data. Recent advances in single-cell RNA-seq enable precise quantification of gene expression among individual cells. This provides the potential to uncover oscillatory systems at the single-cell level. However, methods to identify candidate oscillatory gene sets in an unsynchronized cell population are still lacking. Here we developed a statistical pipeline with three main modules: a paired-sine model to identify co-oscillating gene pairs, a K-medoids clustering module to group gene pairs into oscillatory gene sets, and an extended nearest insertion algorithm to recover the base cycle profile of oscillatory genes.
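
To make the last module more concrete, the following is a minimal Python sketch of the basic nearest insertion heuristic for building a cyclic ordering from pairwise distances. It is not the extended algorithm developed in the dissertation, whose details the excerpt does not give. The function name `nearest_insertion` and the `dist` matrix (a symmetric matrix of distances between items, for example between cells) are assumptions made for illustration only.

```python
import math


def nearest_insertion(dist):
    """Build a cyclic ordering of items 0..n-1 with the basic nearest
    insertion heuristic (illustrative only; not the extended variant).

    dist is assumed to be a symmetric n x n matrix of pairwise distances.
    """
    n = len(dist)
    if n == 0:
        return []
    tour = [0]                      # start the cycle from an arbitrary item
    remaining = set(range(1, n))
    while remaining:
        # Selection: the unplaced item closest to any item already in the tour.
        k = min(remaining, key=lambda c: min(dist[c][t] for t in tour))
        # Insertion: the gap between neighbours that adds the least length.
        best_pos, best_cost = 0, math.inf
        for i in range(len(tour)):
            a, b = tour[i], tour[(i + 1) % len(tour)]
            cost = dist[a][k] + dist[k][b] - dist[a][b]
            if cost < best_cost:
                best_pos, best_cost = i + 1, cost
        tour.insert(best_pos, k)
        remaining.remove(k)
    return tour
```

Given a distance matrix computed from points scattered around a circle in arbitrary order, the returned list visits the points in their order around the circle, up to the starting point and direction. That is the sense in which an insertion heuristic can recover a cyclic ordering from unsynchronized observations.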