Feature Screening For Ultra-high Dimensional Longitudinal Data

Book Synopsis Feature Screening For Ultra-high Dimensional Longitudinal Data by : Wanghuan Chu

Feature Screening For Ultra-high Dimensional Longitudinal Data, written by Wanghuan Chu, was released in 2016. Book excerpt: High and ultrahigh dimensional data analysis is receiving more and more attention in many scientific fields. Various variable selection methods have been proposed for high dimensional data, where the feature dimension p increases with the sample size n at polynomial rates. In the ultrahigh dimensional setting, p is allowed to grow with n at an exponential rate. Instead of jointly selecting active covariates, a more effective approach is to incorporate a screening rule that filters out unimportant covariates through marginal regression techniques. This thesis is concerned with feature screening methods for ultrahigh dimensional longitudinal data. Such data occur frequently in longitudinal genetic studies, where phenotypes and some covariates are measured repeatedly over a certain time period. Along with the genetic measurements, longitudinal genetic studies provide valuable resources for exploring primary genetic and environmental factors that influence complex phenotypes over time. The proposed statistical methods allow us not only to identify genetic determinants of common complex diseases, but also to understand at which stage of human life these determinants become important.

In Chapter 3, we propose a new feature screening procedure for ultrahigh dimensional time-varying coefficient models. We present an effective screening rule based on marginal B-spline regression that incorporates time-varying variance and within-subject correlations. We show that, under certain conditions, this procedure possesses the sure screening property and that the false selection rates can be controlled. Monte Carlo simulation studies demonstrate how within-subject variability can be harnessed to increase screening accuracy. Furthermore, we illustrate the proposed screening rule via an empirical analysis of the Childhood Asthma Management Program (CAMP) data. Our empirical analysis clearly shows that the proposed approach is especially useful for such studies, because children change quite extensively over a four-year period, with highly nonlinear patterns.

In Chapter 4, we study screening rules for ultrahigh dimensional covariates that are potentially associated with random effects. Mixed effects models are popular for taking into account the dependence structure of longitudinal data, as subject-specific random effects can explicitly account for within-subject correlation. We propose a two-step screening procedure for generalized varying-coefficient mixed effects models. The two-step procedure screens fixed effects first and then random effects. We conduct simulation studies to assess the finite sample performance of this two-step screening approach for a continuous response with linear regression, a binary response with logistic regression, a count response with Poisson regression, and an ordinal response with the proportional-odds cumulative logit model. In the real data application, we apply this procedure to data from the Framingham Heart Study (FHS) and explore the genetic and environmental effects on body mass index (BMI), obesity and blood pressure in three separate analyses. Our results confirm some findings from previous studies, and also identify genetic markers with highly significant effects and interesting time-dependent patterns that are worth further exploration.
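
The excerpt's idea of filtering out unimportant covariates one at a time can be illustrated with a minimal sketch. It uses plain marginal correlation screening on independent observations, which is an assumption made here for brevity; it is not the thesis's B-spline procedure and ignores within-subject correlation. The function names, the simulated data and the choice to keep 10 covariates are all hypothetical.

```python
import math
import random


def marginal_correlation(x, y):
    """Absolute Pearson correlation between one covariate column and the response."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant column carries no marginal signal
    return abs(sxy) / math.sqrt(sxx * syy)


def screen(columns, y, keep):
    """Rank covariates by marginal utility and return the indices of the top `keep`."""
    scores = [marginal_correlation(col, y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]


def simulate(n=100, p=500, seed=0):
    """Toy data with p > n: only covariates 0 and 1 affect the response (hypothetical)."""
    rng = random.Random(seed)
    columns = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
    y = [3.0 * columns[0][i] - 3.0 * columns[1][i] + rng.gauss(0.0, 0.5)
         for i in range(n)]
    return columns, y


columns, y = simulate()
selected = screen(columns, y, keep=10)
print(sorted(selected))
```

Each covariate is scored on its own, so the cost grows linearly in p, which is what makes this kind of rule usable before a refined selection method is applied to the reduced set.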

Feature Screening and Variable Selection for Ultrahigh Dimensional Data Analysis

Total Pages : 155 pages

Book Synopsis Feature Screening and Variable Selection for Ultrahigh Dimensional Data Analysis by : Wei Zhong

Feature Screening and Variable Selection for Ultrahigh Dimensional Data Analysis, written by Wei Zhong, was released in 2012 with 155 pages.

New Screening Procedure for Ultrahigh Dimensional Varying-coefficient Model in Longitudinal Data Analysis

Book Synopsis New Screening Procedure for Ultrahigh Dimensional Varying-coefficient Model in Longitudinal Data Analysis by : Wanghuan Chu

New Screening Procedure for Ultrahigh Dimensional Varying-coefficient Model in Longitudinal Data Analysis, written by Wanghuan Chu, was released in 2014. Book excerpt: This thesis is concerned with feature screening methods for varying-coefficient models in the ultrahigh dimensional longitudinal setting. Motivated by an empirical analysis of the Childhood Asthma Management Program (CAMP), we introduce a new screening procedure for time-varying coefficient models with ultrahigh dimensional longitudinal predictor variables. The performance of the proposed procedure is investigated via Monte Carlo simulation. Numerical comparisons indicate that it can substantially outperform existing procedures, resulting in significant improvements in explained variability and prediction error. Applying these methods to CAMP, we find a number of potentially important genetic mutations related to lung function, several of which exhibit interesting nonlinear patterns around puberty.

Feature Screening in Ultra-high Dimensional Survival Data Analysis

Book Synopsis Feature Screening in Ultra-high Dimensional Survival Data Analysis by : Wei Sun

Feature Screening in Ultra-high Dimensional Survival Data Analysis, written by Wei Sun, was released in 2014. Book excerpt: For decades, much research has been devoted to developing variable selection methods, since high dimensional data arise in many scientific and technological fields. Adopting continuous penalties such as the LASSO (Tibshirani, 1996) and the SCAD (Fan and Li, 2001) made it possible to cope with high dimensionality. Independence screening is a very useful tool for identifying all the important covariates at less computational cost than the traditional methods when the number of covariates grows at a non-polynomial rate in the sample size. When the response is survival time, feature screening is more challenging because the responses are subject to censoring. In this thesis we propose a model-free independence feature screening procedure for ultra-high dimensional survival data. This new procedure can be directly applied to the most commonly used models in survival data analysis, such as Cox's model, Cox's frailty model, the additive Cox's model, parametric, nonparametric and semiparametric proportional odds models, and accelerated failure time models. This model-free property is desirable, since little prior information about the true model is usually available for ultra-high dimensional data. The newly proposed procedure is easy to implement and computationally efficient. We systematically study the theoretical properties of the proposed procedure and establish its sure screening property and its consistency in ranking property. Its performance is evaluated and compared with the existing procedure based on Cox's model (Fan, Feng, & Wu, 2010) through extensive simulation studies and a real data analysis.

Since our proposed procedure uses a marginal correlation utility measure, an inherent issue is that it cannot identify important features that are marginally independent of the response. To resolve this issue, we propose an iterative procedure similar in spirit to the iterative sure independence screening procedure proposed by Fan and Lv (2008). The major challenge in developing the iterative procedure is the lack of a definition of residuals under the model-free framework for survival data analysis. The commonly used residuals, such as the martingale residual, the Schoenfeld residual and the deviance residual, are all defined with respect to certain semiparametric models. Therefore, those residuals are not applicable in our model-free framework. We instead use the residuals from regressing the entire feature space on the previously selected active features. We also carefully study the performance of the proposed iterative procedures. Our Monte Carlo simulation studies show that the proposed iterative procedures perform quite well with moderate sample sizes.

Feature Screening for Ultrahigh Dimensional Categorical Data with Applications

Book Synopsis Feature Screening for Ultrahigh Dimensional Categorical Data with Applications by : Danyang Huang

Feature Screening for Ultrahigh Dimensional Categorical Data with Applications, written by Danyang Huang, was released in 2014. Book excerpt: Ultrahigh dimensional data with both categorical responses and categorical covariates are frequently encountered in the analysis of big data, for which feature screening has become an indispensable statistical tool. We propose a Pearson chi-square based feature screening procedure for a categorical response with ultrahigh dimensional categorical covariates. The proposed procedure can be directly applied to the detection of important interaction effects. We further show that the proposed procedure possesses the screening consistency property in the terminology of Fan and Lv (2008). We investigate the finite sample performance of the proposed procedure by Monte Carlo simulation studies, and illustrate the proposed method with two empirical datasets.
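
As a rough illustration of chi-square based screening for categorical data, the sketch below scores each categorical covariate by the textbook Pearson chi-square statistic of its contingency table with the response and keeps the top-ranked ones. The excerpt does not give the exact statistic or threshold used by the author, so the normalization, the function names and the tiny dataset are assumptions made for illustration only.

```python
from collections import Counter


def pearson_chi_square(x, y):
    """Textbook Pearson chi-square statistic for the contingency table of x and y."""
    n = len(y)
    joint = Counter(zip(x, y))   # observed cell counts
    row = Counter(x)             # covariate margins
    col = Counter(y)             # response margins
    stat = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_screen(columns, y, keep):
    """Score every covariate marginally and return (top indices, all scores)."""
    scores = [pearson_chi_square(col, y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:keep], scores


# Hypothetical toy data: a binary response and three binary covariates.
y = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # perfectly associated with y
    [0, 1, 0, 1, 0, 1, 0, 1],  # independent of y in this sample
    [0, 0, 0, 1, 0, 1, 1, 1],  # weakly associated with y
]
selected, scores = chi_square_screen(columns, y, keep=2)
print(selected, scores)
```

Because the statistic only needs cell counts, it works for covariates with any number of categories without coding them numerically.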

Procedures for Feature Screening and Interaction Identification in High-dimensional Data Modelling

Book Synopsis Procedures for Feature Screening and Interaction Identification in High-dimensional Data Modelling by : Ling Zhang

Procedures for Feature Screening and Interaction Identification in High-dimensional Data Modelling, written by Ling Zhang, was released in 2019. Book excerpt: Rapid developments in computer technologies have greatly reduced the cost of collecting and storing massive amounts of data. As a result, data with ultrahigh dimensionality have become increasingly common. This makes new levels of scientific discovery possible, but also brings new challenges in analyzing and understanding these data. Variable selection methods, feature screening procedures, and random forest algorithms have been widely used in many scientific fields such as computational biology, health studies, and financial engineering. The goal is to recover the underlying model structure and make accurate predictions when a large number of predictors are introduced at the initial stage, but only a small subset of them are truly associated with the response. High dimensional survival data analysis is one such field.

In the first part of the dissertation, we propose a two-stage feature screening procedure for the varying-coefficient Cox model with ultrahigh dimensional covariates. The varying-coefficient model is flexible and powerful for modeling the dynamic effects of coefficients. In the literature, the screening methods for the varying-coefficient Cox model are limited to marginal measurements. In contrast to marginal screening, the proposed screening procedure is based on the joint partial likelihood of all predictors. As a result, the proposed procedure can effectively identify active predictors that are jointly dependent on, but marginally independent of, the response. To carry out the proposed procedure, we propose an efficient algorithm and establish its ascent property. We further prove that the proposed procedure possesses the sure screening property: with probability tending to one, the selected variable set includes the actual active predictors. Monte Carlo simulation is conducted to evaluate the finite sample performance of the proposed procedure, with comparison to the SIS procedure (Fan and Lv, 2008) and SJS (Yang et al., 2016) for the Cox model. The proposed methodology is also illustrated through the analysis of two real data examples.

Although very helpful and computationally efficient, feature screening is not very powerful at detecting marginally unimportant variables that participate in high order interaction effects. This is where random forest algorithms have an advantage, because the tree structure is a natural and powerful structure for detecting interaction effects. The drawback of random forest algorithms is that they do not pay enough attention to feature selection, and therefore include a lot of redundancy when constructing the forest. This severely affects the interpretability and prediction performance of the forest, especially when only a small proportion of a large number of candidate variables are important.

In the second part of the dissertation, we propose combining the advantages of forest algorithms and feature screening for a better understanding of the hidden mechanism. To achieve this, we propose a new two-layer random forest algorithm, "Iteratively Kings' Forests" (iKF), for feature selection and interaction detection in classification and regression problems. In the first layer, we modify the traditional forest construction process so that we can fully explore the mechanism, both marginal and interaction effects, related to a given important variable (say, the "King" variable). In the second layer, we iteratively search for the next important variable and repeat the process of the first layer for it. Finally, we not only obtain a screened variable index set but also output a short list of ranked, highly probable interaction effects. Simulations are conducted to compare its performance with the feature screening procedure DC-SIS (Li et al., 2012) and the random forest algorithm "iRF" (Basu et al., 2018). We also apply the iKF procedure in an empirical analysis to identify important interactions in early Drosophila embryo data and compare its performance with "iRF".

Macroeconomic Forecasting in the Era of Big Data

Publisher : Springer Nature
ISBN 13 : 3030311503
Total Pages : 716 pages
Book Rating : 4.0/5 (33 download)

Book Synopsis Macroeconomic Forecasting in the Era of Big Data by : Peter Fuleky

Macroeconomic Forecasting in the Era of Big Data, written by Peter Fuleky and published by Springer Nature, was released on 2019-11-28 with 716 pages. Book excerpt: This book surveys big data tools used in macroeconomic forecasting and addresses related econometric issues, including how to capture dynamic relationships among variables; how to select parsimonious models; how to deal with model uncertainty, instability, non-stationarity, and mixed frequency data; and how to evaluate forecasts, among others. Each chapter is self-contained with references, and provides solid background information, while also reviewing the latest advances in the field. Accordingly, the book offers a valuable resource for researchers, professional forecasters, and students of quantitative economics.

Statistical Foundations of Data Science

Publisher : CRC Press
ISBN 13 : 1466510854
Total Pages : 752 pages
Book Rating : 4.4/5 (665 download)

Book Synopsis Statistical Foundations of Data Science by : Jianqing Fan

Statistical Foundations of Data Science, written by Jianqing Fan and published by CRC Press, was released on 2020-09-21 with 752 pages. Book excerpt: Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models, contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as a graduate-level textbook and a research monograph on high-dimensional statistics, sparsity and covariance learning, machine learning, and statistical inference. It includes ample exercises that involve both theoretical studies and empirical applications. The book begins with an introduction to the stylized features of big data and their impacts on statistical analysis. It then introduces multiple linear regression and expands the techniques of model building via nonparametric regression and kernel tricks. It provides a comprehensive account of sparsity explorations and model selections for multiple regression, generalized linear models, quantile regression, robust regression, and hazards regression, among others. High-dimensional inference is also thoroughly addressed, and so is feature screening. The book also provides a comprehensive account of high-dimensional covariance estimation, learning latent factors and hidden structures, as well as their applications to statistical estimation, inference, prediction and machine learning problems. It also thoroughly introduces statistical machine learning theory and methods for classification, clustering, and prediction. These include CART, random forests, boosting, support vector machines, clustering algorithms, sparse PCA, and deep learning.

Statistical Methods for Different Ultrahigh Dimensional Models

Total Pages : 166 pages

Book Synopsis Statistical Methods for Different Ultrahigh Dimensional Models by : Jingyuan Liu

Statistical Methods for Different Ultrahigh Dimensional Models, written by Jingyuan Liu, was released in 2013 with 166 pages.

Modeling and Analysis of Longitudinal Data

Publisher : Elsevier
ISBN 13 : 0443136521
Total Pages : 362 pages
Book Rating : 4.4/5 (431 download)

Book Synopsis Modeling and Analysis of Longitudinal Data

Modeling and Analysis of Longitudinal Data, published by Elsevier, was released on 2024-02-20 with 362 pages. Book excerpt: Longitudinal Data Analysis, Volume 50 in the Handbook of Statistics series, covers longitudinal data, which consist of a series of repeated observations of the same subjects over an extended time frame and are thus useful for measuring change. Such studies and data arise in a variety of fields, such as health sciences, genomic studies, experimental physics, sociology, sports and student enrollment in universities. For example, in health studies, intra-subject correlation of responses must be accounted for, covariates vary with time, and bias can arise if patients drop out of the study. Key Features: • Provides the authority and expertise of leading contributors from an international board of authors • Presents the latest release in the Handbook of Statistics series • Includes the latest information on Modeling and Analysis of Longitudinal Data

Advances and Innovations in Statistics and Data Science

Publisher : Springer Nature
ISBN 13 : 3031083296
Total Pages : 339 pages
Book Rating : 4.0/5 (31 download)

Book Synopsis Advances and Innovations in Statistics and Data Science by : Wenqing He

Advances and Innovations in Statistics and Data Science, written by Wenqing He and published by Springer Nature, was released on 2022-10-27 with 339 pages. Book excerpt: This book highlights selected papers from the 4th ICSA-Canada Chapter Symposium, as well as invited articles from established researchers in the areas of statistics and data science. It covers a variety of topics, including methodology development in data science, such as methodology in the analysis of high dimensional data, feature screening in ultra-high dimensional data and natural language ranking; statistical analysis challenges in sampling, multivariate survival models and contaminated data; and applications of statistical methods. With this book, readers can make use of frontier research methods to tackle their problems in research, education, training and consultation.

Analysis of Longitudinal Data with Example

Publisher : CRC Press
ISBN 13 : 1498764622
Total Pages : 248 pages
Book Rating : 4.4/5 (987 download)

Book Synopsis Analysis of Longitudinal Data with Example by : You-Gan Wang

Analysis of Longitudinal Data with Example, written by You-Gan Wang and published by CRC Press, was released on 2022-01-28 with 248 pages. Book excerpt: Methodology for longitudinal data is developing fast. Currently, there is a lack of intermediate/advanced level textbooks that introduce students and practicing statisticians to up-to-date methods for correlated data inference. This book will present a discussion of the modern approaches to inference, including the links between the theories of estimators and various types of efficient statistical models, including likelihood-based approaches. The theory will be supported with practical examples of R code and R packages applied to interesting case studies from a number of different areas. Key Features: • Includes the most up-to-date methods • Uses simple examples to demonstrate complex methods • Uses real data from a number of areas • Examples utilize R code

Handbook of Big Data Analytics

Publisher : Springer
ISBN 13 : 3319182846
Total Pages : 532 pages
Book Rating : 4.3/5 (191 download)

Book Synopsis Handbook of Big Data Analytics by : Wolfgang Karl Härdle

Handbook of Big Data Analytics, written by Wolfgang Karl Härdle and published by Springer, was released on 2018-07-20 with 532 pages. Book excerpt: Addressing a broad range of big data analytics in cross-disciplinary applications, this essential handbook focuses on the statistical prospects offered by recent developments in this field. To do so, it covers statistical methods for high-dimensional problems, algorithmic designs, computation tools, analysis flows and the software-hardware co-designs that are needed to support insightful discoveries from big data. The book is primarily intended for statisticians, computer experts, engineers and application developers interested in using big data analytics with statistics. Readers should have a solid background in statistics and computer science.

Insights in Statistical Genetics and Methodology: 2022

Publisher : Frontiers Media SA
ISBN 13 : 283253645X
Total Pages : 172 pages
Book Rating : 4.8/5 (325 download)

Book Synopsis Insights in Statistical Genetics and Methodology: 2022 by : Simon Charles Heath

Insights in Statistical Genetics and Methodology: 2022, written by Simon Charles Heath and published by Frontiers Media SA, was released on 2023-10-24 with 172 pages. Book excerpt: This Research Topic is part of the Insights in Frontiers in Genetics series.

Analysis of Longitudinal Data with Example

Publisher : CRC Press
ISBN 13 : 1351649671
Total Pages : 213 pages
Book Rating : 4.3/5 (516 download)

Book Synopsis Analysis of Longitudinal Data with Example by : You-Gan Wang

Analysis of Longitudinal Data with Example, written by You-Gan Wang and published by CRC Press, was released on 2022-01-28 with 213 pages. Book excerpt: Methodology for longitudinal data is developing fast. Currently, there is a lack of intermediate/advanced level textbooks that introduce students and practicing statisticians to up-to-date methods for correlated data inference. This book will present a discussion of the modern approaches to inference, including the links between the theories of estimators and various types of efficient statistical models, including likelihood-based approaches. The theory will be supported with practical examples of R code and R packages applied to interesting case studies from a number of different areas. Key Features: • Includes the most up-to-date methods • Uses simple examples to demonstrate complex methods • Uses real data from a number of areas • Examples utilize R code

Variable Screening Methods in Multi-Category Problems for Ultra-High Dimensional Data

Book Synopsis Variable Screening Methods in Multi-Category Problems for Ultra-High Dimensional Data by : Yue Zeng

Variable Screening Methods in Multi-Category Problems for Ultra-High Dimensional Data, written by Yue Zeng, was released in 2017. Book excerpt: Variable screening techniques are fast and crude techniques for scanning high-dimensional data and reducing dimension before a refined variable selection method is applied. Their reliance on marginal analysis makes them computationally feasible for ultra-high dimensional problems. However, most existing screening methods for classification are designed only for binary classification problems. There is a lack of comprehensive study of variable screening for multi-class classification problems. This research aims to fill the gap by developing variable screening for multi-class problems, to meet the needs of high dimensional classification. The work has useful applications in cancer studies, medicine, engineering and biology. In this research, we propose and investigate new and effective screening methods for multi-class classification problems. We consider two types of screening methods. The first conducts screening for multiple binary classification problems separately and then aggregates the selected variables. The second conducts screening for multi-class classification problems directly. In particular, for each method we investigate important issues such as the choice of classification algorithm, variable ranking, and model size determination. We implement various selection criteria and compare their performance. We conduct extensive simulation studies to evaluate and compare the proposed screening methods with existing ones, which show that the new methods are promising. Furthermore, we apply the proposed methods to four cancer studies. R code has been developed for each method.
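
The first type of method in the excerpt, screening several binary problems separately and then aggregating, might look roughly like the sketch below. It assumes a one-vs-rest split of the classes, an absolute Welch t-statistic as the marginal utility, and a plain union as the aggregation rule; none of these choices is stated in the excerpt, and the function names and toy data are hypothetical.

```python
import math


def abs_t_statistic(x, labels):
    """Absolute Welch two-sample t-statistic of x between label 1 and label 0."""
    g1 = [v for v, l in zip(x, labels) if l == 1]
    g0 = [v for v, l in zip(x, labels) if l == 0]
    m1, m0 = sum(g1) / len(g1), sum(g0) / len(g0)
    v1 = sum((v - m1) ** 2 for v in g1) / (len(g1) - 1)
    v0 = sum((v - m0) ** 2 for v in g0) / (len(g0) - 1)
    se = math.sqrt(v1 / len(g1) + v0 / len(g0))
    return abs(m1 - m0) / se if se > 0 else 0.0


def one_vs_rest_screen(columns, y, keep_per_class):
    """Screen each class-vs-rest problem separately, then take the union."""
    selected = set()
    for k in sorted(set(y)):
        labels = [1 if c == k else 0 for c in y]  # binary problem for class k
        scores = [abs_t_statistic(col, labels) for col in columns]
        ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
        selected.update(ranked[:keep_per_class])
    return sorted(selected)


# Hypothetical toy data: three classes, four observations each.
y = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
columns = [
    [5.0, 5.2, 4.8, 5.1, 0.1, -0.2, 0.0, 0.2, 0.1, 0.0, -0.1, 0.2],   # marks class 0
    [0.0, 0.1, -0.1, 0.2, 4.9, 5.1, 5.0, 5.2, -0.2, 0.1, 0.0, 0.1],   # marks class 1
    [0.2, -0.1, 0.0, 0.1, 0.0, 0.1, -0.2, 0.1, 5.1, 4.9, 5.0, 5.2],   # marks class 2
    [0.3, -0.4, 0.1, 0.2, -0.3, 0.4, 0.0, -0.1, 0.2, -0.2, 0.1, -0.3],  # noise
]
selected = one_vs_rest_screen(columns, y, keep_per_class=1)
print(selected)
```

Taking the union keeps any variable that separates at least one class from the rest, at the price of a final set that can grow with the number of classes.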

Quantile Regression

Publisher : John Wiley & Sons
ISBN 13 : 111997528X
Total Pages : 288 pages
Book Rating : 4.1/5 (199 download)

Book Synopsis Quantile Regression by : Cristina Davino

Quantile Regression, written by Cristina Davino and published by John Wiley & Sons, was released on 2013-12-31 with 288 pages. Book excerpt: A guide to the implementation and interpretation of quantile regression models. This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensive description of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues of model validity and diagnostic tools. Each methodological aspect is explored and followed by applications using real data. Quantile Regression: • Presents a complete treatment of quantile regression methods, including estimation, inference issues and application of methods • Delivers a balance between methodology and application • Offers an overview of recent developments in the quantile regression framework and why to use quantile regression in a variety of areas such as economics, finance and computing • Features a supporting website (www.wiley.com/go/quantile_regression) hosting datasets along with R, Stata and SAS software code. Researchers and PhD students in the fields of statistics, economics, econometrics, social and environmental science and chemistry will benefit from this book.