Estimation and Selection in High-Dimensional Genomic Studies

Author : Hisashi Noma
Publisher : Springer
ISBN 13 : 9784431555667
Total Pages : 90 pages

Book Synopsis Estimation and Selection in High-Dimensional Genomic Studies by : Hisashi Noma

Written by Hisashi Noma and published by Springer. Released on 2020-04-23; 90 pages. Book excerpt: This book provides an overview of the statistical methods used in genome-wide screening of relevant genomic features or genes. Gene screening can facilitate deeper understanding of disease biology at the molecular level, possibly leading to discovery of new molecular targets for developing new treatments and developing diagnostic tests to predict patients’ prognosis or response to treatment. The most common approach to such gene screening studies is to apply multiple univariate analysis based on separate statistical tests for individual genes to test the null hypothesis of no association with clinical variables. This book first provides an overview of the state of the art of such multiple testing methodologies for gene screening, including frequentist multiple tests, empirical Bayes, and full-Bayes model-based methods for controlling the family-wise error rate or false discovery rate. Optimal discovery procedures and model-based variants are also discussed. Although great effort has been directed toward developing multiple testing methods, there are other, more relevant and effective analyses that should be given much attention in gene screening, including gene ranking, estimation of effect sizes, and classification accuracy based on selected genes. The core contents of this book provide a framework for integrated gene screening analysis based on hierarchical mixture modeling and empirical Bayes. Within this framework, effective tools for multiple testing, ranking, estimation of effect size, and classification accuracy are derived. Methods for sample size determination for gene screening studies are also provided.
With this content, the book is certain to expand the existing framework of statistical analysis based on multiple testing for gene screening to one based on estimation and selection.
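
As a concrete illustration of the false discovery rate control mentioned in the synopsis above, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure, a standard frequentist method. It is not taken from the book; the function name `benjamini_hochberg` and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return sorted indices of hypotheses rejected by the BH step-up rule."""
    m = len(p_values)
    # Order hypothesis indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff])


# Hypothetical per-gene p-values from separate univariate tests.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p, alpha=0.05)
print(rejected)
```

With these toy values only the two smallest p-values fall under their rank-scaled thresholds, so only those two genes would be flagged.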

High-Dimensional Data Analysis in Cancer Research

Author : Xiaochun Li
Publisher : Springer Science & Business Media
ISBN 13 : 0387697659
Total Pages : 164 pages

Book Synopsis High-Dimensional Data Analysis in Cancer Research by : Xiaochun Li

Written by Xiaochun Li and published by Springer Science & Business Media. Released on 2008-12-19; 164 pages. Book excerpt: Multivariate analysis is a mainstay of statistical tools in the analysis of biomedical data. It is concerned with associating data matrices of n rows by p columns, with rows representing samples (or patients) and columns representing attributes of samples, to some response variables, e.g., patient outcomes. Classically, the sample size n is much larger than p, the number of variables. The properties of statistical models have been mostly discussed under the assumption of fixed p and infinite n. The advance of biological sciences and technologies has revolutionized the process of investigations of cancer. Biomedical data collection has become more automatic and more extensive. We are in the era of p as a large fraction of n, and even much larger than n. Take proteomics as an example. Although proteomic techniques have been researched and developed for many decades to identify proteins or peptides uniquely associated with a given disease state, until recently this has been mostly a laborious process, carried out one protein at a time. The advent of high-throughput proteome-wide technologies such as liquid chromatography-tandem mass spectroscopy makes it possible to generate proteomic signatures that facilitate rapid development of new strategies for proteomics-based detection of disease. This poses new challenges and calls for scalable solutions to the analysis of such high-dimensional data. In this volume, we will present the systematic and analytical approaches and strategies from both biostatistics and bioinformatics to the analysis of correlated and high-dimensional data.

High-dimensional Variable Selection for Genomics Data, from Both Frequentist and Bayesian Perspectives

Author : Jie Ren

Book Synopsis High-dimensional Variable Selection for Genomics Data, from Both Frequentist and Bayesian Perspectives by : Jie Ren

Written by Jie Ren. Released in 2020. Book excerpt: Variable selection is one of the most popular tools for analyzing high-dimensional genomic data. It has been developed to accommodate complex data structures and lead to structured sparse identification of important genomic features. We focus on the network and interaction structures that commonly exist in genomic data, and develop novel variable selection methods from both frequentist and Bayesian perspectives. Network-based regularization has achieved success in variable selection for high-dimensional cancer genomic data, due to its ability to incorporate the correlations among genomic features. However, as survival time data usually follow skewed distributions, and are contaminated by outliers, network-constrained regularization that does not take robustness into account leads to false identifications of network structure and biased estimation of patients' survival. In the first project, we develop a novel robust network-based variable selection method under the accelerated failure time (AFT) model. Extensive simulation studies show the advantage of the proposed method over the alternative methods. Promising findings are made in two case studies of lung cancer datasets with high-dimensional gene expression measurements. Gene-environment (G×E) interactions are important for the elucidation of disease etiology beyond the main genetic and environmental effects. In the second project, a novel and powerful semi-parametric Bayesian variable selection model has been proposed to investigate linear and nonlinear G×E interactions simultaneously. It can further conduct structural identification by distinguishing nonlinear interactions from the main-effects-only case within the Bayesian framework.
The proposed method conducts Bayesian variable selection more efficiently and accurately than alternatives. Simulation shows that the proposed model outperforms competing alternatives in terms of both identification and prediction. In the case study, the proposed Bayesian method leads to the identification of effects with important implications in a high-throughput profiling study with high-dimensional SNP data. In the last project, a robust Bayesian variable selection method has been developed for G×E interaction studies. The proposed robust Bayesian method can effectively accommodate heavy-tailed errors and outliers in the response variable while conducting variable selection by accounting for structural sparsity. Spike and slab priors are incorporated on both individual and group levels to identify the sparse main and interaction effects. Extensive simulation studies and analysis of both the diabetes data with SNP measurements from the Nurses' Health Study and TCGA melanoma data with gene expression measurements demonstrate the superior performance of the proposed method over multiple competing alternatives. To facilitate reproducible research and fast computation, we have developed open source R packages for each project, which provide highly efficient C++ implementation for all the proposed and alternative approaches. The R packages regnet and spinBayes, associated with the first and second projects respectively, are available on CRAN. For the third project, the R package robin is available from GitHub and will be submitted to CRAN soon.

Design and Analysis of Clinical Trials for Predictive Medicine

Author : Shigeyuki Matsui
Publisher : CRC Press
ISBN 13 : 1466558164
Total Pages : 394 pages

Book Synopsis Design and Analysis of Clinical Trials for Predictive Medicine by : Shigeyuki Matsui

Written by Shigeyuki Matsui and published by CRC Press. Released on 2015-03-19; 394 pages. Book excerpt: Design and Analysis of Clinical Trials for Predictive Medicine provides statistical guidance on conducting clinical trials for predictive medicine. It covers statistical topics relevant to the main clinical research phases for developing molecular diagnostics and therapeutics, from identifying molecular biomarkers using DNA microarrays to confirming

Variable Selection and Supervised Dimension Reduction for Large-Scale Genomic Data with Censored Survival Outcomes

Author : Lauren Nicole Spirko
Total Pages : 189 pages

Book Synopsis Variable Selection and Supervised Dimension Reduction for Large-Scale Genomic Data with Censored Survival Outcomes by : Lauren Nicole Spirko

Written by Lauren Nicole Spirko. Released in 2017; 189 pages. Book excerpt: One of the major goals in large-scale genomic studies is to identify genes with a prognostic impact on time-to-event outcomes, providing insight into the disease process. With the rapid developments in high-throughput genomic technologies in the past two decades, the scientific community is able to monitor the expression levels of thousands of genes and proteins, resulting in enormous data sets where the number of genomic variables (covariates) is far greater than the number of subjects. It is also typical for such data sets to have a high proportion of censored observations. Methods based on univariate Cox regression are often used to select genes related to survival outcome. However, the Cox model assumes proportional hazards (PH), which is unlikely to hold for each gene. When applied to genes exhibiting some form of non-proportional hazards (NPH), these methods could lead to an under- or over-estimation of the effects. In this thesis, we develop methods that will directly address t.

16th International Conference on Information Technology-New Generations (ITNG 2019)

Author : Shahram Latifi
Publisher : Springer
ISBN 13 : 3030140709
Total Pages : 652 pages

Book Synopsis 16th International Conference on Information Technology-New Generations (ITNG 2019) by : Shahram Latifi

Written by Shahram Latifi and published by Springer. Released on 2019-05-22; 652 pages. Book excerpt: This 16th International Conference on Information Technology - New Generations (ITNG) continues an annual event focusing on state-of-the-art technologies pertaining to digital information and communications. The applications of advanced information technology to such domains as astronomy, biology, education, geosciences, security and health care are among topics of relevance to ITNG. Visionary ideas, theoretical and experimental results, as well as prototypes, designs, and tools that help the information readily flow to the user are of special interest. Machine Learning, Robotics, High Performance Computing, and Innovative Methods of Computing are examples of related topics. The conference features keynote speakers, the best student award, poster award, service award, a technical open panel, and workshops/exhibits from industry, government and academia.

Statistical Methods for High-dimensional Genomic Data

Author : Michael Chiao-An Wu
Total Pages : 200 pages

Book Synopsis Statistical Methods for High-dimensional Genomic Data by : Michael Chiao-An Wu

Written by Michael Chiao-An Wu. Released in 2009; 200 pages. Book excerpt: High-throughput genomic studies hold great promise for providing insight into key biological and medical problems, but the high dimensionality of the data from these studies constitutes a great challenge for researchers. This thesis seeks to address some of the methodological challenges posed by high-dimensional genomic data. First, the need to develop accurate classifiers based on genomic markers motivated the development of sparse linear discriminant analysis (sLDA), a regularized form of linear discriminant analysis, which performs simultaneous classification and variable selection. The second and third chapters of this thesis are concerned with multifeature testing. In the gene expression setting, we apply sLDA to test for differential expression of gene pathways by using the sLDA weights to reduce each pathway to a univariate score which may be evaluated via permutation. Then, for genome-wide association studies, we consider using the logistic kernel machine based testing framework to evaluate the significance of SNPs grouped on the basis of proximity to known genomic features. Finally, in the last chapter we study the use of sparse regularized regression for making inference in high-dimensional data. Specifically, we develop a parametric permutation test based on the LASSO estimator for testing the effect of individual markers in "omics" settings.

Handbook of Statistics in Clinical Oncology, Third Edition

Author : John Crowley
Publisher : CRC Press
ISBN 13 : 1439862001
Total Pages : 661 pages

Book Synopsis Handbook of Statistics in Clinical Oncology, Third Edition by : John Crowley

Written by John Crowley and published by CRC Press. Released on 2012-03-26; 661 pages. Book excerpt: Many new challenges have arisen in the area of oncology clinical trials. New cancer therapies are often based on cytostatic or targeted agents, which pose new challenges in the design and analysis of all phases of trials. The literature on adaptive trial designs and early stopping has been exploding. Inclusion of high-dimensional data and imaging techniques has become common practice, and statistical methods for analysing such data have been refined in this area. A compilation of statistical topics relevant to these new advances in cancer research, this third edition of Handbook of Statistics in Clinical Oncology focuses on the design and analysis of oncology clinical trials and translational research. Addressing the many challenges that have arisen since the publication of its predecessor, this third edition covers the newest developments involved in the design and analysis of cancer clinical trials, incorporating updates to all four parts: Phase I trials: Updated recommendations regarding the standard 3 + 3 and continual reassessment approaches, along with new chapters on phase 0 trials and phase I trial design for targeted agents. Phase II trials: Updates to current experience in single-arm and randomized phase II trial designs. New chapters include phase II designs with multiple strata and phase II/III designs. Phase III trials: Many new chapters include interim analyses and early stopping considerations, phase III trial designs for targeted agents and for testing the ability of markers, adaptive trial designs, cure rate survival models, statistical methods of imaging, as well as a thorough review of software for the design and analysis of clinical trials.
Exploratory and high-dimensional data analyses: All chapters in this part have been thoroughly updated since the last edition. New chapters address methods for analyzing SNP data and for developing a score based on gene expression data. In addition, chapters on risk calculators and forensic bioinformatics have been added. Accessible to statisticians and oncologists interested in clinical trial methodology, the book is a single-source collection of up-to-date statistical approaches to research in clinical oncology.

Probabilistic Graphical Models for Genetics, Genomics, and Postgenomics

Author : Christine Sinoquet
Publisher : OUP Oxford
ISBN 13 : 0191019208
Total Pages : 415 pages

Book Synopsis Probabilistic Graphical Models for Genetics, Genomics, and Postgenomics by : Christine Sinoquet

Written by Christine Sinoquet and published by OUP Oxford. Released on 2014-09-18; 415 pages. Book excerpt: Nowadays bioinformaticians and geneticists are faced with myriad high-throughput data usually presenting the characteristics of uncertainty, high dimensionality and large complexity. These data will only allow insights into this wealth of so-called 'omics' data if represented by flexible and scalable models, prior to any further analysis. At the interface between statistics and machine learning, probabilistic graphical models (PGMs) represent a powerful formalism to discover complex networks of relations. These models are also amenable to incorporating a priori biological information. Network reconstruction from gene expression data represents perhaps the most emblematic area of research where PGMs have been successfully applied. However, these models have also created renewed interest in genetics in the broad sense, in particular regarding association genetics, causality discovery, prediction of outcomes, detection of copy number variations, and epigenetics. This book provides an overview of the applications of PGMs to genetics, genomics and postgenomics to meet this increased interest. A salient feature of bioinformatics, interdisciplinarity, reaches its limit when an intricate cooperation between domain specialists is requested. Currently, few people are specialists in the design of advanced methods using probabilistic graphical models for postgenomics or genetics. This book deciphers such models so that their perceived difficulty no longer hinders their use and focuses on fifteen illustrations showing the mechanisms behind the models.
Probabilistic Graphical Models for Genetics, Genomics and Postgenomics covers six main themes: (1) Gene network inference (2) Causality discovery (3) Association genetics (4) Epigenetics (5) Detection of copy number variations (6) Prediction of outcomes from high-dimensional genomic data. Written by leading international experts, this is a collection of the most advanced work at the crossroads of probabilistic graphical models and genetics, genomics, and postgenomics. The self-contained chapters provide an enlightened account of the pros and cons of applying these powerful techniques.

Advanced Mean Field Methods

Author : Manfred Opper
Publisher : MIT Press
ISBN 13 : 9780262150545
Total Pages : 300 pages

Book Synopsis Advanced Mean Field Methods by : Manfred Opper

Written by Manfred Opper and published by MIT Press. Released in 2001; 300 pages. Book excerpt: A major problem in modern probabilistic modeling is the huge computational complexity involved in typical calculations with multivariate probability distributions when the number of random variables is large. Because exact computations are infeasible in such cases and Monte Carlo sampling techniques may reach their limits, there is a need for methods that allow for efficient approximate computations. One of the simplest approximations is based on the mean field method, which has a long history in statistical physics. The method is widely used, particularly in the growing field of graphical models. Researchers from disciplines such as statistical physics, computer science, and mathematical statistics are studying ways to improve this and related methods and are exploring novel application areas. Leading approaches include the variational approach, which goes beyond factorizable distributions to achieve systematic improvements; the TAP (Thouless-Anderson-Palmer) approach, which incorporates correlations by including effective reaction terms in the mean field theory; and the more general methods of graphical models. Bringing together ideas and techniques from these diverse disciplines, this book covers the theoretical foundations of advanced mean field methods, explores the relation between the different approaches, examines the quality of the approximation obtained, and demonstrates their application to various areas of probabilistic modeling.

High-Dimensional Covariance Estimation

Author : Mohsen Pourahmadi
Publisher : John Wiley & Sons
ISBN 13 : 1118034295
Total Pages : 204 pages

Book Synopsis High-Dimensional Covariance Estimation by : Mohsen Pourahmadi

Written by Mohsen Pourahmadi and published by John Wiley & Sons. Released on 2013-06-24; 204 pages. Book excerpt: Methods for estimating sparse and large covariance matrices. Covariance and correlation matrices play fundamental roles in every aspect of the analysis of multivariate data collected from a variety of fields including business and economics, health care, engineering, and environmental and physical sciences. High-Dimensional Covariance Estimation provides accessible and comprehensive coverage of the classical and modern approaches for estimating covariance matrices as well as their applications to the rapidly developing areas lying at the intersection of statistics and machine learning. Recently, the classical sample covariance methodologies have been modified and improved upon to meet the needs of statisticians and researchers dealing with large correlated datasets. High-Dimensional Covariance Estimation focuses on the methodologies based on shrinkage, thresholding, and penalized likelihood with applications to Gaussian graphical models, prediction, and mean-variance portfolio management. The book relies heavily on regression-based ideas and interpretations to connect and unify many existing methods and algorithms for the task. High-Dimensional Covariance Estimation features chapters on: Data, Sparsity, and Regularization; Regularizing the Eigenstructure; Banding, Tapering, and Thresholding; Covariance Matrices; Sparse Gaussian Graphical Models; and Multivariate Regression. The book is an ideal resource for researchers in statistics, mathematics, business and economics, computer sciences, and engineering, as well as a useful text or supplement for graduate-level courses in multivariate analysis, covariance estimation, statistical learning, and high-dimensional data analysis.
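
To make the thresholding idea named in the synopsis above concrete, here is a minimal sketch of hard-thresholding a sample covariance matrix: off-diagonal entries smaller in magnitude than a cutoff are set to zero, while the diagonal is left alone. This is a generic illustration, not code from the book; the function names, the toy data, and the cutoff `t = 1.0` are all assumptions chosen for the example.

```python
def sample_covariance(rows):
    """Unbiased sample covariance of a list of equal-length observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]


def hard_threshold(cov, t):
    """Zero out off-diagonal entries with |value| < t; keep the diagonal."""
    p = len(cov)
    return [
        [cov[i][j] if i == j or abs(cov[i][j]) >= t else 0.0 for j in range(p)]
        for i in range(p)
    ]


# Toy data: 4 observations of 3 variables (hypothetical values).
data = [[1, 2, 0], [2, 4, 1], [3, 6, 0], [4, 8, 1]]
cov = sample_covariance(data)
sparse = hard_threshold(cov, t=1.0)
for row in sparse:
    print(row)
```

In this toy data the first two variables are strongly related and keep their covariance entry, while the weak covariances with the third variable fall below the cutoff and are zeroed. In practice the cutoff would be chosen by a data-driven rule such as cross-validation rather than fixed by hand.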

Statistical Diagnostics for Cancer

Author : Matthias Dehmer
Publisher : John Wiley & Sons
ISBN 13 : 3527665455
Total Pages : 301 pages

Book Synopsis Statistical Diagnostics for Cancer by : Matthias Dehmer

Written by Matthias Dehmer and published by John Wiley & Sons. Released on 2012-11-28; 301 pages. Book excerpt: This ready reference discusses different methods for statistically analyzing and validating data created with high-throughput methods. As opposed to other titles, this book focusses on systems approaches, meaning that no single gene or protein forms the basis of the analysis but rather a more or less complex biological network. From a methodological point of view, the well-balanced contributions describe a variety of modern supervised and unsupervised statistical methods applied to various large-scale datasets from genomics and genetics experiments. Furthermore, since the availability of sufficient computer power in recent years has shifted attention from parametric to nonparametric methods, the methods presented here make use of such computer-intensive approaches as bootstrap, Markov chain Monte Carlo, or general resampling methods. Finally, due to the large amount of information available in public databases, a chapter on Bayesian methods is included, which also provides a systematic means to integrate this information. A welcome guide for mathematicians and the medical and basic research communities.

Phenotypes and Genotypes

Author : Florian Frommlet
Publisher : Springer
ISBN 13 : 1447153103
Total Pages : 232 pages

Book Synopsis Phenotypes and Genotypes by : Florian Frommlet

Written by Florian Frommlet and published by Springer. Released on 2016-02-12; 232 pages. Book excerpt: This timely text presents a comprehensive guide to genetic association, a new and rapidly expanding field that aims to elucidate how our genetic code (genotypes) influences the traits we possess (phenotypes). The book provides a detailed review of methods of gene mapping used in association with experimental crosses, as well as genome-wide association studies. Emphasis is placed on model selection procedures for analyzing data from large-scale genome scans based on specifically designed modifications of the Bayesian information criterion. Features: presents a thorough introduction to the theoretical background to studies of genetic association (both genetic and statistical); reviews the latest advances in the field; illustrates the properties of methods for mapping quantitative trait loci using computer simulations and the analysis of real data; discusses open challenges; includes an extensive statistical appendix as a reference for those who are not totally familiar with the fundamentals of statistics.

Statistical Methods for the Analysis of Genomic Data

Author : Hui Jiang
Publisher : MDPI
ISBN 13 : 3039361406
Total Pages : 136 pages

Book Synopsis Statistical Methods for the Analysis of Genomic Data by : Hui Jiang

Written by Hui Jiang and published by MDPI. Released on 2020-12-29; 136 pages. Book excerpt: In recent years, technological breakthroughs have greatly enhanced our ability to understand the complex world of molecular biology. Rapid developments in genomic profiling techniques, such as high-throughput sequencing, have brought new opportunities and challenges to the fields of computational biology and bioinformatics. Furthermore, by combining genomic profiling techniques with other experimental techniques, many powerful approaches (e.g., RNA-Seq, ChIP-Seq, single-cell assays, and Hi-C) have been developed in order to help explore complex biological systems. As a result of the increasing availability of genomic datasets, in terms of both volume and variety, the analysis of such data has become a critical challenge as well as a topic of great interest. Therefore, statistical methods that address the problems associated with these newly developed techniques are in high demand. This book includes a number of studies that highlight the state-of-the-art statistical methods for the analysis of genomic data and explore future directions for improvement.

Goodness-of-Fit Tests and Model Validity

Author : C. Huber-Carol
Publisher : Springer Science & Business Media
ISBN 13 : 1461201039
Total Pages : 512 pages

Book Synopsis Goodness-of-Fit Tests and Model Validity by : C. Huber-Carol

Written by C. Huber-Carol and published by Springer Science & Business Media. Released on 2012-12-06; 512 pages. Book excerpt: The 37 expository articles in this volume provide broad coverage of important topics relating to the theory, methods, and applications of goodness-of-fit tests and model validity. The book is divided into eight parts, each of which presents topics written by expert researchers in their areas. Key features include: state-of-the-art exposition of modern model validity methods, graphical techniques, and computer-intensive methods; systematic presentation with sufficient history and coverage of the fundamentals of the subject; exposure to recent research and a variety of open problems; many interesting real-life examples for practitioners; an extensive bibliography, with special emphasis on recent literature; and a subject index. This comprehensive reference work will serve the statistical and applied mathematics communities as well as practitioners in the field.

Targeted Learning

Author : Mark J. van der Laan
Publisher : Springer
ISBN 13 : 9781461429111
Total Pages : 628 pages

Book Synopsis Targeted Learning by : Mark J. van der Laan

Written by Mark J. van der Laan and published by Springer. Released on 2013-08-03; 628 pages. Book excerpt: The statistics profession is at a unique point in history. The need for valid statistical tools is greater than ever; data sets are massive, often comprising hundreds of thousands of measurements for a single subject. The field is ready to move towards clear objective benchmarks under which tools can be evaluated. Targeted learning allows (1) the full generalization and utilization of cross-validation as an estimator selection tool so that the subjective choices made by humans are now made by the machine, and (2) targeting the fitting of the probability distribution of the data toward the target parameter representing the scientific question of interest. This book is aimed at both statisticians and applied researchers interested in causal inference and general effect estimation for observational and experimental data. Part I is an accessible introduction to super learning and the targeted maximum likelihood estimator, including related concepts necessary to understand and apply these methods. Parts II-IX handle complex data structures and topics applied researchers will immediately recognize from their own research, including time-to-event outcomes, direct and indirect effects, positivity violations, case-control studies, censored data, longitudinal data, and genomic studies.

Variable Selection for High-dimensional Data with Error Control

Author : Han Fu

Book Synopsis Variable Selection for High-dimensional Data with Error Control by : Han Fu (Ph. D. in biostatistics)

Download or read book Variable Selection for High-dimensional Data with Error Control written by Han Fu (Ph. D. in biostatistics). This work was released in 2022. Available in PDF, EPUB and Kindle. Book excerpt: Many high-throughput genomic applications involve a large set of covariates, and it is crucial to discover which variables are truly associated with the response. Researchers often want the selected variables to be both truly associated with the response and reproducible in follow-up studies. Effectively controlling the false discovery rate (FDR) increases the reproducibility of the discoveries and has been a major challenge in variable selection research, especially for high-dimensional data. Existing error control approaches include augmentation approaches, such as model-X knockoffs, which utilize artificial variables as benchmarks for decision making. We introduce another augmentation-based selection framework extended from a Bayesian screening approach called reference distribution variable selection. Ordinal responses, which were not previously considered in this area, were used to compare different variable selection approaches. We constructed various importance measures that fit into the selection frameworks, using either L1 penalized regression or machine learning techniques, and compared these measures in terms of the FDR and power using simulated data. Moreover, we applied these selection methods to high-throughput methylation data for identifying features associated with the progression from normal liver tissue to hepatocellular carcinoma to further compare and contrast their performances. Having established the effectiveness of FDR control for model-X knockoffs, we turned our attention to another important data type: survival data with long-term survivors. Medical breakthroughs in recent years have led to cures for many diseases, resulting in increased observations of long-term survivors.
The mixture cure model (MCM) is a type of survival model that is often used when a cured fraction exists. Unfortunately, few variable selection methods currently exist for MCMs when there are more predictors than samples. To fill the gap, we developed penalized MCMs for high-dimensional datasets which allow for identification of prognostic factors associated with cure status, survival, or both. Both parametric models and semi-parametric proportional hazards models were considered for modeling the survival component. For penalized parametric MCMs, we demonstrated how the estimation proceeded using two different iterative algorithms, the generalized monotone incremental forward stagewise (GMIFS) and Expectation-Maximization (E-M). For semi-parametric MCMs where multiple types of penalty functions were considered, the coordinate descent algorithm was combined with E-M for optimization. The model-X knockoffs method was combined with these algorithms to allow for FDR control in variable selection. Through extensive simulation studies, our penalized MCMs have been shown to outperform alternative methods on multiple metrics and achieve high statistical power while controlling the FDR. In two acute myeloid leukemia (AML) applications with gene expression data, our proposed approaches identified important genes associated with potential cure or time-to-relapse, which may help inform treatment decisions for AML patients.