Feature Screening For High Dimensional Variable Selection In Generalized Linear Models

Feature Screening for High-dimensional Variable Selection in Generalized Linear Models

Author : Jinzhu Jiang
Publisher :
ISBN 13 :
Total Pages : 105 pages
Book Rating : 4.:/5 (126 download)

Book Synopsis Feature Screening for High-dimensional Variable Selection in Generalized Linear Models by : Jinzhu Jiang

Download or read book Feature Screening for High-dimensional Variable Selection in Generalized Linear Models written by Jinzhu Jiang. This book was released on 2021 with total page 105 pages. Available in PDF, EPUB and Kindle. Book excerpt: High-dimensional data have been widely encountered over the past few decades in a great variety of areas such as bioinformatics, medicine, marketing, and finance. The curse of high dimensionality presents both methodological and computational challenges. Many traditional statistical modeling techniques perform well for low-dimensional data, but their performance begins to deteriorate when they are extended to high-dimensional data. Among all modeling techniques, variable selection plays a fundamental role in high-dimensional data modeling. To deal with the high-dimensionality problem, a large number of variable selection approaches based on regularization have been developed, including but not limited to LASSO (Tibshirani, 1996), SCAD (Fan and Li, 2001), and the Dantzig selector (Candes and Tao, 2007). However, as the dimensionality gets higher and higher, those regularization approaches may not perform well due to the simultaneous challenges in computational expediency, statistical accuracy, and algorithm stability (Fan et al., 2009). To address those challenges, a series of feature screening procedures have been proposed. Sure independence screening (SIS) is a well-known procedure for variable selection in linear models with high and ultrahigh dimensional data based on the Pearson correlation (Fan and Lv, 2008). Yet, the original SIS procedure mainly focused on linear models with a continuous response variable. Fan and Song (2010) extended this method to generalized linear models by ranking the maximum marginal likelihood estimator (MMLE) or the maximum marginal likelihood itself.
In this dissertation, we consider extending the SIS procedure to high-dimensional generalized linear models with a binary response variable. We propose a two-stage feature screening procedure for generalized linear models with a binary response based on point-biserial correlation. The point-biserial correlation is an estimate of the correlation between one continuous variable and one binary variable. The two-stage point-biserial sure independence screening (PB-SIS) can be implemented as straightforwardly as the original SIS procedure, but it specifically targets high-dimensional generalized linear models with a binary response variable. In the first stage, we perform the SIS procedure using point-biserial correlation to reduce the high dimensionality of a model to a moderate size. In the second stage, we apply a regularization method, such as LASSO, SCAD, or MCP, to further select important variables and find the final sparse model. We establish the sure screening property under certain conditions for the PB-SIS method for high-dimensional generalized linear models with a binary response variable. The sure screening property for PB-SIS shows that our proposed method can select all the important variables in the screened submodel with probability very close to one. We also conduct simulation studies for generalized linear models with a binary response variable by generating data from different link functions. To evaluate the performance of our proposed method, we compare the proportion P of size-d submodels, among 1000 simulations, that contain all the true predictors, as well as the computing time, for our proposed method against the MMLE and Kolmogorov filter methods after the first-stage screening. We also compare the performance of two-stage PB-SIS methods with different penalized methods by using different tuning parameter selection criteria.
The simulation results demonstrate that PB-SIS outperforms the Kolmogorov filter methods in both the selection accuracy and computational cost in different settings and has almost the same selection accuracy as MMLE but with much lower computational cost. A real data application is given to illustrate the performance of the proposed two-stage PB-SIS method.
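The first screening stage described in this excerpt can be sketched roughly as follows. This is a minimal illustration and not the dissertation's own code: the function name `point_biserial_screen`, the simulated logistic data, and the submodel size d = n / log(n) (a choice often used with SIS) are all assumptions made here. The second stage (LASSO, SCAD, or MCP on the retained columns) is left out.

```python
import numpy as np


def point_biserial_screen(X, y, d):
    """Rank the columns of X by |point-biserial correlation| with a
    binary response y (coded 0/1) and return the indices of the top d.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = y.shape[0]
    g1 = y == 1
    n1 = int(g1.sum())
    n0 = n - n1
    if n1 == 0 or n0 == 0:
        raise ValueError("y must contain both classes")

    mean1 = X[g1].mean(axis=0)   # column means in the y == 1 group
    mean0 = X[~g1].mean(axis=0)  # column means in the y == 0 group
    sd = X.std(axis=0)           # population SD (ddof=0) of each column
    sd = np.where(sd > 0, sd, np.inf)  # constant columns get correlation 0

    # Point-biserial correlation; equal to the Pearson correlation
    # between each column and the 0/1 response.
    r = (mean1 - mean0) / sd * np.sqrt(n1 * n0) / n

    keep = np.argsort(-np.abs(r))[:d]  # indices of the d largest |r|
    return keep, r


# Simulated example (assumed setup): 3 active predictors out of 1000.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
eta = 2.0 * X[:, 0] - 2.0 * X[:, 1] + 2.0 * X[:, 2]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta))).astype(int)

d = int(n / np.log(n))  # assumed submodel size
keep, r = point_biserial_screen(X, y, d)
print("retained columns:", np.sort(keep))
```

Because each correlation is computed marginally, the cost is a single pass over the columns of X, which is consistent with the low computational cost the excerpt reports relative to fitting a marginal likelihood per predictor.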

Macroeconomic Forecasting in the Era of Big Data

Author : Peter Fuleky
Publisher : Springer Nature
ISBN 13 : 3030311503
Total Pages : 716 pages
Book Rating : 4.0/5 (33 download)

Book Synopsis Macroeconomic Forecasting in the Era of Big Data by : Peter Fuleky

Download or read book Macroeconomic Forecasting in the Era of Big Data written by Peter Fuleky and published by Springer Nature. This book was released on 2019-11-28 with total page 716 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book surveys big data tools used in macroeconomic forecasting and addresses related econometric issues, including how to capture dynamic relationships among variables; how to select parsimonious models; how to deal with model uncertainty, instability, non-stationarity, and mixed frequency data; and how to evaluate forecasts, among others. Each chapter is self-contained with references, and provides solid background information, while also reviewing the latest advances in the field. Accordingly, the book offers a valuable resource for researchers, professional forecasters, and students of quantitative economics.

Nonparametric and Semiparametric Models

Author : Wolfgang Karl Härdle
Publisher : Springer Science & Business Media
ISBN 13 : 364217146X
Total Pages : 317 pages
Book Rating : 4.6/5 (421 download)

Book Synopsis Nonparametric and Semiparametric Models by : Wolfgang Karl Härdle

Download or read book Nonparametric and Semiparametric Models written by Wolfgang Karl Härdle and published by Springer Science & Business Media. This book was released on 2012-08-27 with total page 317 pages. Available in PDF, EPUB and Kindle. Book excerpt: The statistical and mathematical principles of smoothing with a focus on applicable techniques are presented in this book. It naturally splits into two parts: The first part is intended for undergraduate students majoring in mathematics, statistics, econometrics or biometrics whereas the second part is intended to be used by master and PhD students or researchers. The material is easy to accomplish since the e-book character of the text gives a maximum of flexibility in learning (and teaching) intensity.

The Estimation and Inference of Complex Models

Author : Min Zhou
Publisher :
ISBN 13 :
Total Pages : 148 pages
Book Rating : 4.:/5 (1 download)

Book Synopsis The Estimation and Inference of Complex Models by : Min Zhou

Download or read book The Estimation and Inference of Complex Models written by Min Zhou. This book was released on 2017 with total page 148 pages. Available in PDF, EPUB and Kindle. Book excerpt: In this thesis, we investigate estimation and inference problems for complex models. We emphasize two major categories of complex models: generalized linear models and time series models. For generalized linear models, we consider a fundamental problem of sure screening for interaction terms in ultra-high dimensional feature space; for time series models, we consider an important model assumption, the Markov property. The first part of this thesis addresses the significant interaction pursuit problem for ultra-high dimensional models with two-way interaction effects. We propose a simple sure screening procedure (SSI) to detect significant interactions between the explanatory variables and the response variable in high or ultra-high dimensional generalized linear regression models. Sure screening is a simple but powerful tool for the first step of feature selection or variable selection for ultra-high dimensional data. We investigate the sure screening properties of the proposed method theoretically. Furthermore, we show that our proposed method can control the false discovery rate at a reasonable size, so that regularized variable selection methods can be easily applied to obtain more accurate feature selection in the subsequent model selection procedures. Moreover, from the viewpoint of computational efficiency, we suggest a much more efficient algorithm, discretized SSI (DSSI), to realize our proposed sure screening method in practice. We also investigate the properties of the two algorithms, SSI and DSSI, in simulation studies and apply them to some real data analyses for illustration.
For the second part, our concern is the testing of the Markov property in time series processes. The Markovian assumption plays an extremely important role in time series analysis and is also a fundamental assumption in economic and financial models. However, little existing research has focused on how to test the Markov property for time series processes. Therefore, we propose a new test procedure to check whether a time series with beta-mixing possesses the Markov property. Our test is based on the Conditional Distance Covariance (CDCov). We investigate the theoretical properties of the proposed method. The asymptotic distribution of the proposed test statistic under the null hypothesis is obtained, and the power of the test procedure under local alternative hypotheses is studied. Simulation studies are conducted to demonstrate the finite sample performance of our test.

Sparse Graphical Modeling for High Dimensional Data

Author : Faming Liang
Publisher : CRC Press
ISBN 13 : 0429584806
Total Pages : 151 pages
Book Rating : 4.4/5 (295 download)

Book Synopsis Sparse Graphical Modeling for High Dimensional Data by : Faming Liang

Download or read book Sparse Graphical Modeling for High Dimensional Data written by Faming Liang and published by CRC Press. This book was released on 2023-08-02 with total page 151 pages. Available in PDF, EPUB and Kindle. Book excerpt: A general framework for learning sparse graphical models with conditional independence tests; complete treatments for different types of data (Gaussian, Poisson, multinomial, and mixed data); unified treatments for data integration, network comparison, and covariate adjustment; unified treatments for missing data and heterogeneous data; efficient methods for joint estimation of multiple graphical models; effective methods of high-dimensional variable selection; effective methods of high-dimensional inference.

Feature Screening For Ultra-high Dimensional Longitudinal Data

Author : Wanghuan Chu
Publisher :
ISBN 13 :
Total Pages : pages
Book Rating : 4.:/5 (959 download)

Book Synopsis Feature Screening For Ultra-high Dimensional Longitudinal Data by : Wanghuan Chu

Download or read book Feature Screening For Ultra-high Dimensional Longitudinal Data written by Wanghuan Chu. This book was released on 2016. Available in PDF, EPUB and Kindle. Book excerpt: High and ultrahigh dimensional data analysis is now receiving more and more attention in many scientific fields. Various variable selection methods have been proposed for high dimensional data where the feature dimension p increases with the sample size n at polynomial rates. In the ultrahigh dimensional setting, p is allowed to grow with n at an exponential rate. Instead of jointly selecting active covariates, a more effective approach is to incorporate a screening rule that aims at filtering out unimportant covariates through marginal regression techniques. This thesis is concerned with feature screening methods for ultrahigh dimensional longitudinal data. Such data occur frequently in longitudinal genetic studies, where phenotypes and some covariates are measured repeatedly over a certain time period. Along with the genetic measurements, longitudinal genetic studies provide valuable resources for exploring primary genetic and environmental factors that influence complex phenotypes over time. The statistical methods proposed in this work allow us not only to identify genetic determinants of common complex diseases, but also to understand at which stage of human life the genetic determinants become important. In Chapter 3, we propose a new feature screening procedure for ultrahigh dimensional time-varying coefficient models. We present an effective screening rule based on marginal B-spline regression that incorporates time-varying variance and within-subject correlations. We show that under certain conditions, this procedure possesses the sure screening property, and the false selection rates can be controlled. We demonstrate through Monte Carlo simulation studies how within-subject variability can be harnessed to increase screening accuracy.
Furthermore, we illustrate the proposed screening rule via an empirical analysis of the Childhood Asthma Management Program (CAMP) data. Our empirical analysis clearly shows that the proposed approach is especially useful for such studies, as children change quite extensively over a four-year period with highly nonlinear patterns. In Chapter 4, we study screening rules for ultrahigh dimensional covariates that are potentially associated with random effects. Mixed effects models are popular for taking into account the dependence structure of longitudinal data, as subject-specific random effects can explicitly account for within-subject correlation. We propose a two-step screening procedure for generalized varying-coefficient mixed effects models. The two-step procedure screens fixed effects first and then random effects. We conduct simulation studies to assess the finite sample performance of this two-step screening approach for continuous response with linear regression, binary response with logistic regression, count response with Poisson regression, and ordinal response with the proportional-odds cumulative logit model. In a real data application, we apply this procedure to data from the Framingham Heart Study (FHS), and explore the genetic and environmental effects on body mass index (BMI), obesity, and blood pressure in three separate analyses. Our results confirm some findings from previous studies, and also identify genetic markers with highly significant effects and interesting time-dependent patterns that are worth further exploration.

The Collected Works of Wassily Hoeffding

Author : Wassily Hoeffding
Publisher : Springer Science & Business Media
ISBN 13 : 1461208653
Total Pages : 653 pages
Book Rating : 4.4/5 (612 download)

Book Synopsis The Collected Works of Wassily Hoeffding by : Wassily Hoeffding

Download or read book The Collected Works of Wassily Hoeffding written by Wassily Hoeffding and published by Springer Science & Business Media. This book was released on 2012-12-06 with total page 653 pages. Available in PDF, EPUB and Kindle. Book excerpt: It has been a rare privilege to assemble this volume of Wassily Hoeffding's Collected Works. Wassily was, variously, a teacher, supervisor and colleague to us, and his work has had a profound influence on our own. Yet this would not be sufficient reason to publish his collected works. The additional and overwhelmingly compelling justification comes from the fundamental nature of his contributions to Statistics and Probability. Not only were his ideas original, and far-reaching in their implications; Wassily developed them so completely and elegantly in his papers that they are still cited as prime references up to half a century later. However, three of his earliest papers are cited rarely, if ever. These include material from his doctoral dissertation. They were written in German, and two of them were published in relatively obscure series. Rather than reprint the original articles, we have chosen to have them translated into English. These translations appear in this book, making Wassily's earliest research available to a wide audience for the first time. All other articles (including those of his contributions to Mathematical Reviews which go beyond a simple reporting of contents of articles) have been reproduced as they appeared, together with annotations and corrections made by Wassily on some private copies of his papers. Preceding these articles are three review papers which discuss the impact of his work in some of the areas where he made major contributions.

Statistical Foundations of Data Science

Author : Jianqing Fan
Publisher : CRC Press
ISBN 13 : 0429527616
Total Pages : 942 pages
Book Rating : 4.4/5 (295 download)

Book Synopsis Statistical Foundations of Data Science by : Jianqing Fan

Download or read book Statistical Foundations of Data Science written by Jianqing Fan and published by CRC Press. This book was released on 2020-09-21 with total page 942 pages. Available in PDF, EPUB and Kindle. Book excerpt: Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models, contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as a graduate-level textbook and a research monograph on high-dimensional statistics, sparsity and covariance learning, machine learning, and statistical inference. It includes ample exercises that involve both theoretical studies as well as empirical applications. The book begins with an introduction to the stylized features of big data and their impacts on statistical analysis. It then introduces multiple linear regression and expands the techniques of model building via nonparametric regression and kernel tricks. It provides a comprehensive account on sparsity explorations and model selections for multiple regression, generalized linear models, quantile regression, robust regression, hazards regression, among others. High-dimensional inference is also thoroughly addressed and so is feature screening. The book also provides a comprehensive account on high-dimensional covariance estimation, learning latent factors and hidden structures, as well as their applications to statistical estimation, inference, prediction and machine learning problems. It also introduces thoroughly statistical machine learning theory and methods for classification, clustering, and prediction. These include CART, random forests, boosting, support vector machines, clustering algorithms, sparse PCA, and deep learning.

Statistical Learning with Sparsity

Author : Trevor Hastie
Publisher : CRC Press
ISBN 13 : 1498712177
Total Pages : 354 pages
Book Rating : 4.4/5 (987 download)

Book Synopsis Statistical Learning with Sparsity by : Trevor Hastie

Download or read book Statistical Learning with Sparsity written by Trevor Hastie and published by CRC Press. This book was released on 2015-05-07 with total page 354 pages. Available in PDF, EPUB and Kindle. Book excerpt: Discover New Methods for Dealing with High-Dimensional Data. A sparse statistical model has only a small number of nonzero parameters or weights; therefore, it is much easier to estimate and interpret than a dense model. Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underl

High Dimensional Classification and Variable Selection

Author :
Publisher :
ISBN 13 :
Total Pages : 0 pages
Book Rating : 4.:/5 (864 download)

Book Synopsis High Dimensional Classification and Variable Selection by :

Download or read book High Dimensional Classification and Variable Selection. This book was released on 2013 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: Recent advances in biotechnology and other disciplines have led to the generation of large amounts of high-dimensional data, which creates the challenge of developing new statistical methodologies to handle them. This dissertation focuses on two aspects of high-dimensional data inference: (1) classification based on high-dimensional covariates; (2) variable selection for high-dimensional linear regression models. Both aspects are of great importance in high-dimensional data inference and are related to each other. Variable selection plays a critical role in reducing the dimension of data. It usually boosts the signal-to-noise ratio and results in a simpler model that is much easier to interpret. Classification has many important applications in practice, such as face detection and handwriting recognition. For the high-dimensional classification problem, I have developed a new Sparse Quadratic Discriminant Analysis (SQDA) approach, which extends the application of traditional low-dimensional Quadratic Discriminant Analysis. The theoretical properties of the new SQDA approach are thoroughly addressed. Simulation studies have been conducted to compare SQDA with many other well-known classifiers in the literature. This new approach has also been applied to analyze one dataset from a colon cancer study. For the variable selection problem, a Regularized LASSO approach has been proposed, which relaxes the strong conditions that the classical LASSO method needs in order to perform well. It has been found that the new Regularized LASSO approach includes many other well-known variable selection methods as special cases, which makes it a very general approach. The asymptotic properties of Regularized LASSO are thoroughly studied.
It has been shown that the Regularized LASSO asymptotically identifies the correct model under mild assumptions. The new method has also been investigated through simulation studies, where it outperforms many other variable selection methods.

Two Tales of Variable Selection for High Dimensional Data

Author : Cong Liu
Publisher :
ISBN 13 :
Total Pages : 95 pages
Book Rating : 4.:/5 (811 download)

Book Synopsis Two Tales of Variable Selection for High Dimensional Data by : Cong Liu

Download or read book Two Tales of Variable Selection for High Dimensional Data written by Cong Liu. This book was released on 2012 with total page 95 pages. Available in PDF, EPUB and Kindle. Book excerpt: We also conduct similar studies in the classification setting to compare the two corresponding screening and selection procedures of LASSO and correlation screening, i.e., $L_{1}$ penalized logistic regression and the two-sample t-test. Initial results of exploratory analysis are presented to provide some insight into the scenarios in which each of the two methods is preferred. We discuss possible extensions, future work, and the differences between the regression and classification settings.

Statistical Foundations of Data Science

Author : Jianqing Fan
Publisher : CRC Press
ISBN 13 : 1466510854
Total Pages : 752 pages
Book Rating : 4.4/5 (665 download)

Book Synopsis Statistical Foundations of Data Science by : Jianqing Fan

Download or read book Statistical Foundations of Data Science written by Jianqing Fan and published by CRC Press. This book was released on 2020-09-21 with total page 752 pages. Available in PDF, EPUB and Kindle. Book excerpt: Statistical Foundations of Data Science gives a thorough introduction to commonly used statistical models, contemporary statistical machine learning techniques and algorithms, along with their mathematical insights and statistical theories. It aims to serve as a graduate-level textbook and a research monograph on high-dimensional statistics, sparsity and covariance learning, machine learning, and statistical inference. It includes ample exercises that involve both theoretical studies as well as empirical applications. The book begins with an introduction to the stylized features of big data and their impacts on statistical analysis. It then introduces multiple linear regression and expands the techniques of model building via nonparametric regression and kernel tricks. It provides a comprehensive account on sparsity explorations and model selections for multiple regression, generalized linear models, quantile regression, robust regression, hazards regression, among others. High-dimensional inference is also thoroughly addressed and so is feature screening. The book also provides a comprehensive account on high-dimensional covariance estimation, learning latent factors and hidden structures, as well as their applications to statistical estimation, inference, prediction and machine learning problems. It also introduces thoroughly statistical machine learning theory and methods for classification, clustering, and prediction. These include CART, random forests, boosting, support vector machines, clustering algorithms, sparse PCA, and deep learning.

Flexible Imputation of Missing Data, Second Edition

Author : Stef van Buuren
Publisher : CRC Press
ISBN 13 : 0429960352
Total Pages : 444 pages
Book Rating : 4.4/5 (299 download)

Book Synopsis Flexible Imputation of Missing Data, Second Edition by : Stef van Buuren

Download or read book Flexible Imputation of Missing Data, Second Edition written by Stef van Buuren and published by CRC Press. This book was released on 2018-07-17 with total page 444 pages. Available in PDF, EPUB and Kindle. Book excerpt: Missing data pose challenges to real-life data analysis. Simple ad-hoc fixes, like deletion or mean imputation, only work under highly restrictive conditions, which are often not met in practice. Multiple imputation replaces each missing value by multiple plausible values. The variability between these replacements reflects our ignorance of the true (but missing) value. Each of the completed data sets is then analyzed by standard methods, and the results are pooled to obtain unbiased estimates with correct confidence intervals. Multiple imputation is a general approach that also inspires novel solutions to old problems by reformulating the task at hand as a missing-data problem. This is the second edition of a popular book on multiple imputation, focused on explaining the application of methods through detailed worked examples using the MICE package as developed by the author. This new edition incorporates the recent developments in this fast-moving field. This class-tested book avoids mathematical and technical details as much as possible: formulas are accompanied by verbal statements that explain the formula in accessible terms. The book sharpens the reader’s intuition on how to think about missing data, and provides all the tools needed to execute a well-grounded quantitative analysis in the presence of missing data.
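The pooling step mentioned in this excerpt is commonly carried out with Rubin's rules. The sketch below is an illustration under that assumption; the excerpt itself does not name the rule, the function `pool_rubin` is a hypothetical helper and not part of the MICE package, and the input numbers are made up.

```python
import numpy as np


def pool_rubin(estimates, variances):
    """Pool m completed-data analyses of one scalar parameter.

    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns the pooled estimate and its total variance.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    if m < 2:
        raise ValueError("need at least two imputed data sets")

    q_bar = q.mean()        # pooled point estimate
    u_bar = u.mean()        # within-imputation variance
    b = q.var(ddof=1)       # between-imputation variance
    total = u_bar + (1.0 + 1.0 / m) * b  # total variance
    return q_bar, total


# Illustrative numbers only: three imputed data sets.
est, var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(est, var)
```

The between-imputation term is what carries the extra uncertainty due to the missing values; a single imputation has no such term, which is why it tends to understate standard errors.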

Procedures for Feature Screening and Interaction Identification in High-dimensional Data Modelling

Author : Ling Zhang
Publisher :
ISBN 13 :
Total Pages : pages
Book Rating : 4.:/5 (111 download)

Book Synopsis Procedures for Feature Screening and Interaction Identification in High-dimensional Data Modelling by : Ling Zhang

Download or read book Procedures for Feature Screening and Interaction Identification in High-dimensional Data Modelling written by Ling Zhang. This book was released on 2019. Available in PDF, EPUB and Kindle. Book excerpt: Rapid developments in computer technologies have greatly reduced the cost of collecting and storing massive amounts of data. As a result, data with ultrahigh dimensionality have become increasingly common. This makes new levels of scientific discovery promising, but also brings new challenges in analyzing and understanding these data. Variable selection methods, feature screening procedures, and random forest algorithms have been widely used in many scientific fields such as computational biology, health studies, and financial engineering. The goal is to recover the underlying model structure and make accurate predictions when a large number of predictors are introduced at the initial stage, but only a small subset of them are truly associated with the response. High dimensional survival data analysis is one such scientific field. In the first part of the dissertation, we propose a two-stage feature screening procedure for the varying-coefficient Cox model with ultrahigh dimensional covariates. The varying-coefficient model is flexible and powerful for modeling the dynamic effects of coefficients. In the literature, the screening methods for the varying-coefficient Cox model are limited to marginal measurements. In contrast to marginal screening, the proposed screening procedure is based on the joint partial likelihood of all predictors. In this way, the proposed procedure can effectively identify active predictors that are jointly dependent on, but marginally independent of, the response. To carry out the proposed procedure, we propose an efficient algorithm and establish its ascent property.
We further prove that the proposed procedure possesses the sure screening property: with probability tending to one, the selected variable set includes the actual active predictors. Monte Carlo simulation is conducted to evaluate the finite sample performance of the proposed procedure, with comparison to the SIS (Fan and Lv, 2008) procedure and SJS (Yang et al., 2016) for the Cox model. The proposed methodology is also illustrated through the analysis of two real data examples. Although very helpful and computationally efficient, feature screening is not a very powerful method for detecting marginally unimportant variables that participate in high-order interaction effects. This, however, is the advantage of random forest algorithms, because a tree is a natural and powerful structure for detecting interaction effects. The drawback of random forest algorithms is that they do not pay enough attention to feature selection, and therefore include a lot of redundancy when constructing the forest. This phenomenon severely affects the interpretability and prediction performance of the forest, especially when only a small proportion of a large number of candidate variables are important. In the second part of the dissertation, we propose combining the advantages of forest algorithms and feature screening for a better understanding of the hidden mechanism. To achieve this, we propose a new two-layer random forest algorithm, "Iteratively Kings' Forests" (iKF), for feature selection and interaction detection in classification and regression problems. In the first layer, we modify the traditional forest construction process so that we can fully explore the mechanism, both marginal and interaction effects, related to a given important variable (say, the "King" variable). In the second layer, we iteratively search for the next important variable and repeat the process of the first layer for it.
Finally, we not only obtain a screened variable index set but also output a short ranked list of likely interaction effects. Simulations are conducted to compare its performance with the feature screening procedure DC-SIS (Li et al., 2012) and the random forest algorithm "iRF" (Basu et al., 2018). We also apply the iKF procedure in an empirical analysis to identify important interactions in early Drosophila embryo data and compare its performance with "iRF".

Partially Linear Models

Author : Wolfgang Härdle
Publisher : Springer Science & Business Media
ISBN 13 : 3642577008
Total Pages : 210 pages
Book Rating : 4.6/5 (425 download)

Book Synopsis Partially Linear Models by : Wolfgang Härdle

Download or read book Partially Linear Models written by Wolfgang Härdle and published by Springer Science & Business Media. This book was released on 2012-12-06 with total page 210 pages. Available in PDF, EPUB and Kindle. Book excerpt: In the last ten years, there has been increasing interest and activity in the general area of partially linear regression smoothing in statistics. Many methods and techniques have been proposed and studied. This monograph hopes to bring an up-to-date presentation of the state of the art of partially linear regression techniques. The emphasis is on methodologies rather than on the theory, with a particular focus on applications of partially linear regression techniques to various statistical problems. These problems include least squares regression, asymptotically efficient estimation, bootstrap resampling, censored data analysis, linear measurement error models, nonlinear measurement models, nonlinear and nonparametric time series models.

Statistical Inference

Author : Ayanendranath Basu
Publisher : CRC Press
ISBN 13 : 1420099663
Total Pages : 424 pages
Book Rating : 4.4/5 (2 download)

Book Synopsis Statistical Inference by : Ayanendranath Basu

Download or read book Statistical Inference written by Ayanendranath Basu and published by CRC Press. This book was released on 2011-06-22 with total page 424 pages. Available in PDF, EPUB and Kindle. Book excerpt: In many ways, estimation by an appropriate minimum distance method is one of the most natural ideas in statistics. However, there are many different ways of constructing an appropriate distance between the data and the model: the scope of study referred to by "Minimum Distance Estimation" is literally huge. Filling a statistical resource gap, Stati

Feature Screening and Variable Selection for Ultrahigh Dimensional Data Analysis

Author : Wei Zhong
Publisher :
ISBN 13 :
Total Pages : 155 pages
Book Rating : 4.:/5 (82 download)

Book Synopsis Feature Screening and Variable Selection for Ultrahigh Dimensional Data Analysis by : Wei Zhong

Download or read book Feature Screening and Variable Selection for Ultrahigh Dimensional Data Analysis written by Wei Zhong. This book was released on 2012 with total page 155 pages. Available in PDF, EPUB and Kindle. Book excerpt: