Dimension Reduction and Regression for Tensor Data and Mixture Models

Book Synopsis Dimension Reduction and Regression for Tensor Data and Mixture Models by : Ning Wang

Download or read book Dimension Reduction and Regression for Tensor Data and Mixture Models written by Ning Wang. This book was released in 2022. Available in PDF, EPUB and Kindle. Book excerpt: In modern statistics, many data sets have complex structure, including but not limited to high dimensionality, higher-order structure, and heterogeneity. Recently, there has been growing interest in developing valid and efficient statistical methods for these data sets. In this thesis, we study three types of data complexity: (1) tensor data (a.k.a. array-valued random objects); (2) heavy-tailed data; (3) data from heterogeneous subpopulations. We address these three challenges by developing novel methodologies and efficient algorithms. Specifically, we propose likelihood-based dimension folding methods for tensor data, study robust tensor $t$ regression through a proposed tensor $t$ distribution, and develop an algorithm and theory for high-dimensional mixture linear regression. The work on these three topics is elaborated as follows. In recent years, traditional multivariate analysis tools, such as multivariate regression and discriminant analysis, have been generalized from modeling random vectors and matrices to higher-order random tensors (a.k.a. array-valued random objects). Equipped with tensor algebra and high-dimensional computation techniques, concise and interpretable statistical models and estimation procedures prevail in various applications. One challenge for tensor data analysis is caused by the large dimensions of the tensor. Many statistical methods such as linear discriminant analysis and quadratic discriminant analysis are not applicable or are unstable for data sets whose dimension exceeds the sample size. Sufficient dimension reduction methods are flexible tools for data visualization and exploratory analysis, typically in a regression of a univariate response on a multivariate predictor. 
For regressions with tensor predictors, a general framework of dimension folding and several moment-based estimation procedures have been proposed in the literature. In this essay, we propose two likelihood-based dimension folding methods motivated by quadratic discriminant analysis for tensor data: the maximum likelihood estimators are derived under a general covariance setting and a structured envelope covariance setting. We study the asymptotic properties of both estimators and show using simulation studies and a real-data analysis that they are more accurate than existing moment-based estimators. Another challenge to statistical tensor models is the non-Gaussian nature of many real-world data. Unfortunately, existing approaches are either restricted to normality or implicitly use least-squares-type objective functions that are computationally efficient but sensitive to data contamination. Motivated by this, we adopt a simple tensor $t$-distribution that is, unlike the commonly used matrix $t$-distributions, compatible with tensor operators and reshaping of the data. We study the tensor response regression with tensor $t$-error, and develop penalized likelihood-based estimation and a novel one-step estimation. We study the asymptotic relative efficiency of various estimators and establish the one-step estimator's oracle properties and near-optimal asymptotic efficiency. We further propose a high-dimensional modification to the one-step estimation procedure and show that it attains the minimax optimal rate in estimation. Numerical studies show the excellent performance of the one-step estimator. In the last chapter, we consider the high-dimensional mixture linear regression. The expectation-maximization (EM) algorithm and its variants are widely used in statistics. In high-dimensional mixture linear regression, the model is assumed to be a finite mixture of linear regression forms and the number of predictors is much larger than the sample size. 
The standard EM algorithm, which attempts to find the maximum likelihood estimator, becomes infeasible. We devise a penalized EM algorithm and study its statistical properties. Existing theoretical results for regularized EM algorithms often rely on dividing the sample into many independent batches and employing a fresh batch of samples in each iteration of the algorithm. Our algorithm and theoretical analysis do not require sample-splitting. The proposed method also shows encouraging performance in simulation studies and a real data example.
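To make the idea of a penalized EM algorithm concrete, here is a minimal sketch in plain Python. It is not the algorithm from the thesis: it assumes equal, fixed mixing proportions, a known common noise standard deviation `sigma`, an L1 (lasso) penalty, and a single coordinate-descent sweep per M-step. The names `soft_threshold` and `penalized_em` are hypothetical. Every iteration reuses the full sample, so there is no sample-splitting.

```python
import math


def soft_threshold(z, lam):
    """Soft-thresholding operator: closed-form lasso update for one coordinate."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def penalized_em(X, y, betas, lam, sigma=1.0, n_iter=50):
    """L1-penalized EM for a K-component mixture of linear regressions.

    Assumptions (illustrative only): equal mixing proportions, known sigma,
    and one coordinate-descent sweep per M-step.
    X: list of n rows with p predictors each; y: list of n responses;
    betas: list of K initial coefficient vectors.
    """
    n, p, K = len(X), len(X[0]), len(betas)
    betas = [list(b) for b in betas]
    for _ in range(n_iter):
        # E-step: posterior probability that observation i belongs to component k.
        resp = []
        for i in range(n):
            logs = []
            for b in betas:
                r = y[i] - sum(X[i][j] * b[j] for j in range(p))
                logs.append(-0.5 * (r / sigma) ** 2)
            m = max(logs)  # subtract the max for numerical stability
            w = [math.exp(v - m) for v in logs]
            total = sum(w)
            resp.append([wk / total for wk in w])
        # M-step: weighted lasso per component, minimizing
        # (1/2n) * sum_i resp_ik * (y_i - x_i'b)^2 + lam * ||b||_1.
        for k in range(K):
            b = betas[k]
            for j in range(p):
                num = 0.0
                den = 0.0
                for i in range(n):
                    partial = y[i] - sum(X[i][l] * b[l] for l in range(p) if l != j)
                    num += resp[i][k] * X[i][j] * partial
                    den += resp[i][k] * X[i][j] ** 2
                b[j] = soft_threshold(num / n, lam) / (den / n) if den > 0 else 0.0
    return betas
```

Larger values of `lam` shrink more coefficients to exactly zero, which is what makes the M-step well defined when the number of predictors exceeds the sample size.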

Topics on Mixture Models and Discriminant Analysis

Book Synopsis Topics on Mixture Models and Discriminant Analysis by : Kai Deng

Download or read book Topics on Mixture Models and Discriminant Analysis written by Kai Deng and published by . This book was released on 2023 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: Mixture models for clustering and regressions and discriminant analysis are the cornerstones of multivariate statistics and supervised/unsupervised learning research. The structure of data has become increasingly complex in many modern applications including but not limited to computational biology, recommendation systems and text/image analysis. Therefore, it is of great interest to develop methodologies and algorithms for mixture models and discriminant analysis that target the challenges arising from such complex data. In this dissertation, I address three types of challenging supervised and unsupervised topics with novel methodologies and algorithms: (1) tensor data simultaneous clustering and multiway dimension reduction; (2) high-dimensional heterogeneous data in mixture linear regression; (3) multivariate and multi-label response classification in high dimensions. The three chapters are elaborated as follows. In the form of multi-dimensional arrays, tensor data have become increasingly prevalent in modern scientific studies and biomedical applications such as computational biology, brain imaging analysis, and process monitoring system. These data are intrinsically heterogeneous with complex dependencies and structure. Therefore, ad-hoc dimension reduction methods on tensor data may lack statistical efficiency and can obscure essential findings. Model-based clustering is a cornerstone of multivariate statistics and unsupervised learning; however, existing methods and algorithms are not designed for tensor-variate samples. In the first chapter, we propose a Tensor Envelope Mixture Model (TEMM) for simultaneous clustering and multiway dimension reduction of tensor data. 
TEMM incorporates tensor-structure-preserving dimension reduction into mixture modeling and drastically reduces the number of free parameters and estimative variability. An EM-type algorithm is developed to obtain likelihood-based estimators of the cluster means and covariances, which are jointly parameterized and constrained onto a series of lower-dimensional subspaces known as the tensor envelopes. We demonstrate the encouraging empirical performance of the proposed method in extensive simulation studies and a real data application in comparison with existing vector and tensor clustering methods. In the second chapter, we consider the problem of finite mixture of linear regressions (MLR) for high-dimensional heterogeneous data where the sample size is much smaller than the number of random variables, a model that is widely used in many modern applications such as biological science, genetics and engineering. When used to capture the common sparse structure in large heterogeneous data, traditional high-dimensional EM algorithms can be computationally intractable and thus fail to produce meaningful estimates. We propose a fast group-penalized EM algorithm (FGEM) for high-dimensional MLR that estimates the regression coefficients from a group sparsity perspective and is computationally efficient and less sensitive to initialization. The statistical properties of the proposed algorithm are established without requiring sample-splitting, while allowing the predictor dimension to grow exponentially with the sample size. We demonstrate the encouraging performance of FGEM in numerical studies in comparison with traditional high-dimensional EM algorithms. The problem of classifying multiple categorical responses is pervasive in modern machine learning and statistics, with diverse applications in fields such as bioinformatics and image classification. The third chapter investigates linear discriminant analysis (LDA) with high-dimensional predictors and multiple multi-class responses. 
Specifically, we examine two different classification scenarios under the bivariate LDA model: joint classification of the two responses and conditional classification of one response while observing the other. To achieve optimal classification rules for both scenarios, we introduce two novel tensor formulations of the discriminant coefficients and corresponding penalties. For joint classification, we propose an overlapping group lasso penalty and a blockwise coordinate descent algorithm to efficiently compute joint tensor discriminant coefficients. For conditional classification, we utilize an alternating direction method of multipliers (ADMM) algorithm to compute tensor discriminant coefficients under new constraints. We extend our method and algorithms to general multivariate responses. Finally, we validate the effectiveness of our approach through simulation studies and real data examples.

Multimodal and Tensor Data Analytics for Industrial Systems Improvement

Publisher : Springer Nature
ISBN 13 : 3031530926
Total Pages : 388 pages

Book Synopsis Multimodal and Tensor Data Analytics for Industrial Systems Improvement by : Nathan Gaw

Download or read book Multimodal and Tensor Data Analytics for Industrial Systems Improvement written by Nathan Gaw and published by Springer Nature. This book has 388 pages. Available in PDF, EPUB and Kindle.

Modern Dimension Reduction

Publisher : Cambridge University Press
ISBN 13 : 1108991645
Total Pages : 98 pages

Book Synopsis Modern Dimension Reduction by : Philip D. Waggoner

Download or read book Modern Dimension Reduction written by Philip D. Waggoner and published by Cambridge University Press. This book was released on 2021-08-05 with total page 98 pages. Available in PDF, EPUB and Kindle. Book excerpt: Data are not only ubiquitous in society, but are increasingly complex both in size and dimensionality. Dimension reduction offers researchers and scholars the ability to make such complex, high dimensional data spaces simpler and more manageable. This Element offers readers a suite of modern unsupervised dimension reduction techniques along with hundreds of lines of R code, to efficiently represent the original high dimensional data space in a simplified, lower dimensional subspace. Launching from the earliest dimension reduction technique principal components analysis and using real social science data, I introduce and walk readers through application of the following techniques: locally linear embedding, t-distributed stochastic neighbor embedding (t-SNE), uniform manifold approximation and projection, self-organizing maps, and deep autoencoders. The result is a well-stocked toolbox of unsupervised algorithms for tackling the complexities of high dimensional data so common in modern society. All code is publicly accessible on Github.
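As a small illustration of the starting point named in the excerpt, principal components analysis, the following sketch finds the first principal component by power iteration on the sample covariance matrix. It is written in plain Python rather than the R used in the book, and the function name `first_principal_component` is hypothetical. The sketch assumes the starting vector is not orthogonal to the leading eigenvector.

```python
import math


def first_principal_component(rows, n_iter=500, tol=1e-12):
    """Return (direction, explained_variance, scores) for the leading component."""
    n, p = len(rows), len(rows[0])
    # Center each column.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration: repeatedly apply cov and renormalize.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate case: no variance along the current direction
            break
        w = [x / norm for x in w]
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    # Variance explained by the component is the Rayleigh quotient v' cov v.
    variance = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
    # Scores: coordinates of each centered observation along the component.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, variance, scores
```

Projecting the data onto the leading few directions gives the simplified lower-dimensional representation; the nonlinear techniques listed in the excerpt replace this linear projection with a learned embedding.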

Supervised Dimensionality Reduction Using Mixture Models

Total Pages : 15 pages

Book Synopsis Supervised Dimensionality Reduction Using Mixture Models by : Sajama Sajama

Download or read book Supervised Dimensionality Reduction Using Mixture Models written by Sajama Sajama and published by . This book was released on 2004 with total page 15 pages. Available in PDF, EPUB and Kindle. Book excerpt: Given a classification problem, our goal is to find a low-dimensional linear transformation of the feature vectors which retains information needed to predict the class labels. We present a method based on maximum conditional likelihood estimation of mixture models. Use of mixture models allows us to approximate the distributions to any desired accuracy while use of conditional likelihood as the contrast function ensures that the selected subspace retains maximum possible mutual information between feature vectors and class labels. Classification experiments using Gaussian mixture components show that this method compares favorably to related dimension reduction techniques. Other distributions belonging to the exponential family can be used to reduce dimensions when data is of a special type, for example binary or integer valued data. We provide an EM-like algorithm for model estimation and present visualization experiments using both the Gaussian and the Bernoulli mixture models.

Tensor Networks for Dimensionality Reduction and Large-Scale Optimization

ISBN 13 : 9781680832761
Total Pages : 262 pages

Book Synopsis Tensor Networks for Dimensionality Reduction and Large-Scale Optimization by : Andrzej Cichocki

Download or read book Tensor Networks for Dimensionality Reduction and Large-Scale Optimization written by Andrzej Cichocki and published by . This book was released on 2017-05-28 with total page 262 pages. Available in PDF, EPUB and Kindle. Book excerpt: This monograph builds on Tensor Networks for Dimensionality Reduction and Large-scale Optimization: Part 1 Low-Rank Tensor Decompositions by discussing tensor network models for super-compressed higher-order representation of data/parameters and cost functions, together with an outline of their applications in machine learning and data analytics. A particular emphasis is on elucidating, through graphical illustrations, that by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volume of data/parameters, thereby alleviating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification, generalized eigenvalue decomposition and in the optimization of deep neural networks. The monograph focuses on tensor train (TT) and Hierarchical Tucker (HT) decompositions and their extensions, and on demonstrating the ability of tensor networks to provide scalable solutions for a variety of otherwise intractable large-scale optimization problems. Tensor Networks for Dimensionality Reduction and Large-scale Optimization Parts 1 and 2 can be used as stand-alone texts, or together as a comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions. See also: Tensor Networks for Dimensionality Reduction and Large-scale Optimization: Part 1 Low-Rank Tensor Decompositions. ISBN 978-1-68083-222-8
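The claim that low-rank tensor networks alleviate the curse of dimensionality can be checked with simple arithmetic. The sketch below compares the number of entries in a full tensor with the number stored by a tensor train (TT) representation, whose k-th core has shape r_{k-1} x n_k x r_k with boundary ranks fixed at 1. The function names are hypothetical and the mode sizes and ranks in the usage note are illustrative, not taken from the monograph.

```python
from math import prod


def full_size(dims):
    """Number of entries in a dense tensor with the given mode sizes."""
    return prod(dims)


def tt_size(dims, ranks):
    """Number of entries stored by a tensor-train representation.

    dims:  mode sizes n_1, ..., n_N
    ranks: internal TT ranks r_1, ..., r_{N-1} (boundary ranks r_0 = r_N = 1)
    """
    if len(ranks) != len(dims) - 1:
        raise ValueError("need exactly len(dims) - 1 internal ranks")
    r = [1] + list(ranks) + [1]
    # Core k has shape r[k] x dims[k] x r[k + 1].
    return sum(r[k] * dims[k] * r[k + 1] for k in range(len(dims)))
```

For an order-10 tensor with every mode of size 10 and every internal rank 5, `full_size` gives 10,000,000,000 entries while `tt_size` gives 2,100: storage grows linearly in the order rather than exponentially.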

Dimensionality Reduction with Unsupervised Nearest Neighbors

Publisher : Springer Science & Business Media
ISBN 13 : 3642386520
Total Pages : 137 pages

Book Synopsis Dimensionality Reduction with Unsupervised Nearest Neighbors by : Oliver Kramer

Download or read book Dimensionality Reduction with Unsupervised Nearest Neighbors written by Oliver Kramer and published by Springer Science & Business Media. This book was released on 2013-05-30 with total page 137 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book is devoted to a novel approach for dimensionality reduction based on the famous nearest neighbor method, a powerful classification and regression approach. It starts with an introduction to machine learning concepts and a real-world application from the energy domain. Then, unsupervised nearest neighbors (UNN) is introduced as an efficient iterative method for dimensionality reduction. Various UNN models are developed step by step, ranging from a simple iterative strategy for discrete latent spaces to a stochastic kernel-based algorithm for learning submanifolds with independent parameterizations. Extensions that allow the embedding of incomplete and noisy patterns are introduced. Various optimization approaches are compared, from evolutionary to swarm-based heuristics. Experimental comparisons with related methodologies, using both artificial test data sets and real-world data, demonstrate the behavior of UNN in practical scenarios. The book contains numerous color figures to illustrate the introduced concepts and to highlight the experimental results.

Dimension Reduction

Publisher : Now Publishers Inc
ISBN 13 : 1601983786
Total Pages : 104 pages

Book Synopsis Dimension Reduction by : Christopher J. C. Burges

Download or read book Dimension Reduction written by Christopher J. C. Burges and published by Now Publishers Inc. This book was released on 2010 with total page 104 pages. Available in PDF, EPUB and Kindle. Book excerpt: We give a tutorial overview of several foundational methods for dimension reduction. We divide the methods into projective methods and methods that model the manifold on which the data lies. For projective methods, we review projection pursuit, principal component analysis (PCA), kernel PCA, probabilistic PCA, canonical correlation analysis (CCA), kernel CCA, Fisher discriminant analysis, oriented PCA, and several techniques for sufficient dimension reduction. For the manifold methods, we review multidimensional scaling (MDS), landmark MDS, Isomap, locally linear embedding, Laplacian eigenmaps, and spectral clustering. Although the review focuses on foundations, we also provide pointers to some more modern techniques. We also describe the correlation dimension as one method for estimating the intrinsic dimension, and we point out that the notion of dimension can be a scale-dependent quantity. The Nyström method, which links several of the manifold algorithms, is also reviewed. We use a publicly available dataset to illustrate some of the methods. The goal is to provide a self-contained overview of key concepts underlying many of these algorithms, and to give pointers for further reading.
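The correlation dimension mentioned in the excerpt can be sketched in a few lines. The correlation sum C(r) is the fraction of point pairs closer than r; if C(r) grows like r^d, the slope of log C(r) against log r estimates the intrinsic dimension d. The code below is a plain-Python illustration with hypothetical function names, and it uses a two-radius slope rather than a fitted regression line. Because the slope depends on which radii are chosen, it also shows why the estimate is scale-dependent.

```python
import math


def correlation_sum(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is below r."""
    n = len(points)
    close = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < r
    )
    return 2.0 * close / (n * (n - 1))


def correlation_dimension(points, r_small, r_large):
    """Slope of log C(r) versus log r between two radii (both must give C(r) > 0)."""
    c_small = correlation_sum(points, r_small)
    c_large = correlation_sum(points, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small)
    )
```

For points spaced evenly along a line segment embedded in three dimensions, the estimate comes out close to 1 even though each point has three coordinates.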

Tensor Regression and Tensor Time Series Analyses for High Dimensional Data

Total Pages : 100 pages

Book Synopsis Tensor Regression and Tensor Time Series Analyses for High Dimensional Data by : Herath Mudiyanselage Wiranthe Bandara Herath

Download or read book Tensor Regression and Tensor Time Series Analyses for High Dimensional Data written by Herath Mudiyanselage Wiranthe Bandara Herath and published by . This book was released on 2019 with total page 100 pages. Available in PDF, EPUB and Kindle. Book excerpt: Many real data are naturally represented as a multidimensional array called a tensor. In classical regression and time series models, the predictors and covariate variables are considered as a vector. However, due to high dimensionality of predictor variables, these types of models are inefficient for analyzing multidimensional data. In contrast, tensor structured models use predictors and covariate variables in a tensor format. Tensor regression and tensor time series models can reduce high dimensional data to a low dimensional framework and lead to efficient estimation and prediction. In this thesis, we discuss the modeling and estimation procedures for both tensor regression models and tensor time series models. The results of simulation studies and a numerical analysis are provided.

Tensor Dimension Reduction Methods for Modeling High Dimensional Spatio-temporal Data

Book Synopsis Tensor Dimension Reduction Methods for Modeling High Dimensional Spatio-temporal Data by : Rukayya Sani Ibrahim

Download or read book Tensor Dimension Reduction Methods for Modeling High Dimensional Spatio-temporal Data written by Rukayya Sani Ibrahim. This book was released in 2022. Available in PDF, EPUB and Kindle. Book excerpt: Data observed simultaneously in both space and time are becoming increasingly prevalent, with applications in diverse areas from ecology to financial econometrics. The datasets are massive, with several variables observed at varying locations and times, and are often accompanied by irregularities. Therefore, there is a need to formulate models that can efficiently handle the size and all dependencies of massive datasets while predicting and forecasting well. In this work, we propose a new model for matrix-valued spatio-temporal data using the classic vector autoregressive (VAR) model on each column (location) of the matrix. This allows us to present the coefficient matrices in a unified format. To achieve dimension reduction, we decompose the folded coefficient matrix using tensor decomposition, which reduces the dimension in four directions; this not only reduces the number of model parameters significantly but also achieves substantial efficiency gains. We propose an alternating least squares algorithm to estimate the parameters of interest and derive the asymptotic properties of the proposed estimators in the low-dimensional setting. For the high-dimensional setting, we propose regularized estimation with sparsity-inducing norms. An alternating least squares algorithm with sparsity-inducing norms is presented. We present simulation results and a real data analysis to demonstrate the superiority of our estimators.
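The building block named in the excerpt, a VAR model fitted by least squares, can be sketched as follows. This is only the unreduced baseline for a single two-variable series of order 1, x_t = A x_{t-1} + e_t, with the estimate A_hat = S1 S0^{-1}, where S1 sums x_t x_{t-1}' and S0 sums x_{t-1} x_{t-1}'. It does not include the tensor decomposition, the alternating least squares steps, or the sparsity penalties described in the thesis. The function names are hypothetical.

```python
def simulate_var1_2d(A, x0, steps):
    """Generate a noise-free 2-dimensional VAR(1) path x_t = A x_{t-1}."""
    series = [tuple(x0)]
    for _ in range(steps):
        prev = series[-1]
        series.append((
            A[0][0] * prev[0] + A[0][1] * prev[1],
            A[1][0] * prev[0] + A[1][1] * prev[1],
        ))
    return series


def fit_var1_2d(series):
    """Least-squares estimate of the 2 x 2 coefficient matrix A."""
    s1 = [[0.0, 0.0], [0.0, 0.0]]  # sum of x_t x_{t-1}'
    s0 = [[0.0, 0.0], [0.0, 0.0]]  # sum of x_{t-1} x_{t-1}'
    for prev, cur in zip(series, series[1:]):
        for a in range(2):
            for b in range(2):
                s1[a][b] += cur[a] * prev[b]
                s0[a][b] += prev[a] * prev[b]
    det = s0[0][0] * s0[1][1] - s0[0][1] * s0[1][0]
    if det == 0.0:
        raise ValueError("lagged observations do not span the plane")
    inv = [[s0[1][1] / det, -s0[0][1] / det],
           [-s0[1][0] / det, s0[0][0] / det]]
    # A_hat = S1 * S0^{-1}
    return [[sum(s1[a][k] * inv[k][b] for k in range(2)) for b in range(2)]
            for a in range(2)]
```

With d variables the coefficient matrix has d^2 free entries per lag and per location, which is the parameter growth that a low-rank decomposition of the stacked coefficients is meant to curb.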

Tensor Computation for Data Analysis

Publisher : Springer Nature
ISBN 13 : 3030743861
Total Pages : 347 pages

Book Synopsis Tensor Computation for Data Analysis by : Yipeng Liu

Download or read book Tensor Computation for Data Analysis written by Yipeng Liu and published by Springer Nature. This book was released on 2021-08-31 with total page 347 pages. Available in PDF, EPUB and Kindle. Book excerpt: Tensors are a natural representation for multi-dimensional data, and tensor computation can avoid possible multi-linear data structure loss in classical matrix computation-based data analysis. This book is intended to provide non-specialists an overall understanding of tensor computation and its applications in data analysis, and benefits researchers, engineers, and students with theoretical, computational, technical and experimental details. It presents a systematic and up-to-date overview of tensor decompositions from the engineer's point of view, and comprehensive coverage of tensor computation based data analysis techniques. In addition, some practical examples in machine learning, signal processing, data mining, computer vision, remote sensing, and biomedical engineering are also presented for easy understanding and implementation. These data analysis techniques may be further applied in other applications on neuroscience, communication, psychometrics, chemometrics, biometrics, quantum physics, quantum chemistry, etc. The discussion begins with basic coverage of notations, preliminary operations in tensor computations, main tensor decompositions and their properties. Based on them, a series of tensor-based data analysis techniques are presented as the tensor extensions of their classical matrix counterparts, including tensor dictionary learning, low rank tensor recovery, tensor completion, coupled tensor analysis, robust principal tensor component analysis, tensor regression, logistic tensor regression, support tensor machine, multilinear discriminant analysis, tensor subspace clustering, tensor-based deep learning, tensor graphical model and tensor sketch. 
The discussion also includes a number of typical applications with experimental results, such as image reconstruction, image enhancement, data fusion, signal recovery, recommendation system, knowledge graph acquisition, traffic flow prediction, link prediction, environmental prediction, weather forecasting, background extraction, human pose estimation, cognitive state classification from fMRI, infrared small target detection, heterogeneous information networks clustering, multi-view image clustering, and deep neural network compression.

Statistical Learning for High-order Tensors

Book Synopsis Statistical Learning for High-order Tensors by : Rungang Han

Download or read book Statistical Learning for High-order Tensors written by Rungang Han and published by . This book was released on 2021 with total page 0 pages. Available in PDF, EPUB and Kindle. Book excerpt: Statistical learning for high-dimensional high-order tensor data has attracted increasing interests in recent years. The challenges arise as many classic statistical methods suffer from either statistical sub-optimality or computational limitation due to the fundamental complicated tensor algebra. In this thesis, we introduce three statistical tensor inference frameworks under different low-dimensional structures driven by real data applications: low-rankness, clustering, and smoothness. In Chapter 2, we describe a flexible framework for generalized low-rank tensor estimation problems that includes many important instances arising from applications in computational imaging, genomics, and network analysis. We propose an estimator which consists of finding a low-rank tensor fit to the data under generalized parametric models. To overcome the difficulty of non-convexity in these problems, we introduce a unified approach of projected gradient descent that adapts to the underlying low-rank structure. Under mild conditions on the loss function, we establish both an upper bound on statistical error and the linear rate of computational convergence through a general deterministic analysis. These algorithmic and theoretical results are then applied to a suite of generalized tensor estimation problems, including sub-Gaussian tensor PCA, tensor regression, and Poisson and binomial tensor PCA. We prove that the proposed algorithm achieves the minimax optimal rate of convergence in estimation error. Finally, we demonstrate the superiority of the proposed framework via extensive experiments on both simulated and real data. 
The main target of Chapter 3 is to propose a statistical tensor model for high-order clustering, which aims to identify heterogeneous substructure in multiway dataset that arises commonly in multilayer network studies. In addition to the non-convexity that widely appears in statistical tensor problem, this model is more complicated because of its discontinuous nature. In that chapter, we propose a tensor block model and the computationally efficient methods, high-order Lloyd algorithm (HLloyd) and high-order spectral clustering (HSC), for the high-order clustering task. We similarly establish the convergence of the proposed procedure, and we show that our method achieves exact clustering under some reasonable assumptions. We also give the complete characterization for the statistical-computational trade-off in high-order clustering based on three different signal-to-noise ratio regimes. Finally, we show the merits of the proposed procedures via extensive experiments on both synthetic and real datasets. Finally, Chapter 4 introduces the functional tensor singular value decomposition model (FTSVD), a novel dimension reduction framework for tensors with one functional mode and several tabular modes. This problem is motivated by high-order longitudinal data analysis. Our model assumes the observed data to be a random realization of an approximate CP low-rank functional tensor measured on a discrete time grid. Incorporating tensor algebra and the theory of Reproducing Kernel Hilbert Space (RKHS), we propose a novel RKHS-based constrained power iteration with spectral initialization. Our method can successfully estimate both singular vectors and functions of the low-rank structure in the observed data. With mild assumptions, we establish the non-asymptotic contractive error bounds for the proposed algorithm. We also perform extensive experiments on both simulated and real data to illustrate the superiority of the proposed framework.
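For readers unfamiliar with Lloyd-type iterations, the sketch below shows the classical Lloyd algorithm (k-means) for ordinary vector data: alternate between assigning each point to its nearest center and recomputing each center as the mean of its members. This is the standard vector version, offered only as background; it is not the high-order Lloyd algorithm (HLloyd) from the thesis, which clusters along the modes of a tensor. The function name `lloyd` and the requirement that initial centers be supplied are choices made for this illustration.

```python
def lloyd(points, centers, n_iter=100):
    """Classical Lloyd iterations for k-means.

    points:  list of equal-length numeric tuples
    centers: list of initial centers (same length as the points)
    Returns (labels, centers) after convergence or n_iter sweeps.
    """
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest center in squared Euclidean distance.
        new_labels = [
            min(
                range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])),
            )
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: each center becomes the mean of its assigned points.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

The result depends on the initial centers, which is one reason a spectral method is commonly used to initialize Lloyd-type iterations.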

Synthetic Data and Generative AI

Publisher : Elsevier
ISBN 13 : 0443218560
Total Pages : 410 pages

Book Synopsis Synthetic Data and Generative AI by : Vincent Granville

Download or read book Synthetic Data and Generative AI written by Vincent Granville and published by Elsevier. This book was released on 2024-01-25 with total page 410 pages. Available in PDF, EPUB and Kindle. Book excerpt: Synthetic Data and Generative AI covers the foundations of machine learning, with modern approaches to solving complex problems and the systematic generation and use of synthetic data. Emphasis is on scalability, automation, testing, optimizing, and interpretability (explainable AI). For instance, regression techniques – including logistic and Lasso – are presented as a single method, without using advanced linear algebra. Confidence regions and prediction intervals are built using parametric bootstrap, without statistical models or probability distributions. Models (including generative models and mixtures) are mostly used to create rich synthetic data to test and benchmark various methods.
- Emphasizes numerical stability and performance of algorithms (computational complexity)
- Focuses on explainable AI/interpretable machine learning, with heavy use of synthetic data and generative models, a new trend in the field
- Includes new, easier construction of confidence regions, without statistics, a simple alternative to the powerful, well-known XGBoost technique
- Covers automation of data cleaning, favoring easier solutions when possible
- Includes chapters dedicated fully to synthetic data applications: fractal-like terrain generation with the diamond-square algorithm, and synthetic star clusters evolving over time and bound by gravity

Discovery of Latent Factors in High-dimensional Data Using Tensor Methods

Author : Furong Huang
ISBN 13 : 9781339834047
Total Pages : 261 pages
Book Rating : 4.8/5 (34 download)

Book Synopsis Discovery of Latent Factors in High-dimensional Data Using Tensor Methods by : Furong Huang

Download or read book Discovery of Latent Factors in High-dimensional Data Using Tensor Methods written by Furong Huang. This book was released in 2016 with a total of 261 pages. Available in PDF, EPUB and Kindle. Book excerpt: Unsupervised learning aims at the discovery of hidden structure that drives the observations in the real world. It is essential for success in modern machine learning and artificial intelligence. Latent variable models are versatile in unsupervised learning and have applications in almost every domain, e.g., social network analysis, natural language processing, computer vision and computational biology. Training latent variable models is challenging due to the non-convexity of the likelihood objective function. An alternative method is based on the spectral decomposition of low-order moment matrices and tensors. This versatile framework is guaranteed to estimate the correct model consistently. My thesis spans both theoretical analysis of the tensor decomposition framework and practical implementation of various applications. This thesis presents theoretical results on convergence to the globally optimal solution of tensor decomposition using stochastic gradient descent, despite the non-convexity of the objective. This is the first work that gives global convergence guarantees for stochastic gradient descent on non-convex functions with exponentially many local minima and saddle points. This thesis also presents large-scale deployment of spectral methods (matrix and tensor decomposition) carried out on CPU, GPU and Spark platforms. Dimensionality reduction techniques such as random projection are incorporated for a highly parallel and scalable tensor decomposition algorithm. We obtain gains in both accuracy and running time by several orders of magnitude compared to the state-of-the-art variational methods. To solve real-world problems, more advanced models and learning algorithms are proposed. After introducing the tensor decomposition framework under the latent Dirichlet allocation (LDA) model, this thesis discusses generalizations of the LDA model to the mixed membership stochastic block model for learning hidden user commonalities or communities in social networks, the convolutional dictionary model for learning phrase templates and word-sequence embeddings, hierarchical tensor decomposition and the latent tree structure model for learning disease hierarchy in healthcare analytics, and the spatial point process mixture model for detecting cell types in neuroscience.

High-Dimensional Methodologies for Sufficient Dimension Reduction, Discriminant Analysis, and Tensor Data

Author : Jing Zeng
Book Rating : 4.:/5 (135 download)

Book Synopsis High-Dimensional Methodologies for Sufficient Dimension Reduction, Discriminant Analysis, and Tensor Data by : Jing Zeng

Download or read book High-Dimensional Methodologies for Sufficient Dimension Reduction, Discriminant Analysis, and Tensor Data written by Jing Zeng. This book was released in 2022. Available in PDF, EPUB and Kindle. Book excerpt: Thanks to the advancement of data-collecting technology in brain imaging, genomics, financial econometrics, and machine learning, scientific data tend to grow in both size and structural complexity, and such data are not amenable to traditional statistical analysis. In this dissertation, we developed novel high-dimensional methodologies for dimension reduction, discriminant analysis, and tensor data. In the first chapter, we proposed a unified framework, called subspace estimation with automatic dimension and variable selection (SEAS), to extend many existing low-dimensional sufficient dimension reduction (SDR) methods to the high-dimensional setting. The flexibility of SEAS considerably widens the application scope of many SDR methods. Our proposal relies only on a double-penalized convex formulation, which can be solved efficiently. From the theoretical perspective, we established a convergence rate for our proposal that is optimal in a minimax sense. In the second chapter, we established a population model for the reduced-rank linear discriminant analysis (LDA) problem, which arises naturally in many scenarios. We also developed an efficient algorithm and derived non-asymptotic results in the high-dimensional setting. In the last chapter, we studied how two data modalities associate and interact with each other given a third modality, which is a crucial problem in multimodal integrative analysis but has no available statistical solution. We formulated this problem as a tensor decomposition problem and proposed a novel generalized liquid association analysis (GLAA) method. A high-order orthogonal iteration algorithm is provided accordingly. Furthermore, we established non-asymptotic results for the proposed estimators.

Multilinear Subspace Learning

Author : Haiping Lu
Publisher : CRC Press
ISBN 13 : 1439857245
Total Pages : 298 pages
Book Rating : 4.4/5 (398 download)

Book Synopsis Multilinear Subspace Learning by : Haiping Lu

Download or read book Multilinear Subspace Learning written by Haiping Lu and published by CRC Press. This book was released on 2013-12-11 with a total of 298 pages. Available in PDF, EPUB and Kindle. Book excerpt: Due to advances in sensor, storage, and networking technologies, data is being generated on a daily basis at an ever-increasing pace in a wide range of applications, including cloud computing, mobile Internet, and medical imaging. This large multidimensional data requires more efficient dimensionality reduction schemes than the traditional techniques. Addressing this need, multilinear subspace learning (MSL) reduces the dimensionality of big data directly from its natural multidimensional representation, a tensor. Multilinear Subspace Learning: Dimensionality Reduction of Multidimensional Data gives a comprehensive introduction to both theoretical and practical aspects of MSL for the dimensionality reduction of multidimensional data based on tensors. It covers the fundamentals, algorithms, and applications of MSL. Emphasizing essential concepts and system-level perspectives, the authors provide a foundation for solving many of today’s most interesting and challenging problems in big multidimensional data processing. They trace the history of MSL, detail recent advances, and explore future developments and emerging applications. The book follows a unifying MSL framework formulation to systematically derive representative MSL algorithms. It describes various applications of the algorithms, along with their pseudocode. Implementation tips help practitioners in further development, evaluation, and application. The book also provides researchers with useful theoretical information on big multidimensional data in machine learning and pattern recognition. MATLAB® source code, data, and other materials are available at www.comp.hkbu.edu.hk/~haiping/MSL.html
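
To make the idea of reducing dimensionality "directly from the tensor representation" concrete, here is a minimal sketch of a multilinear projection: a 3-way tensor is multiplied along each mode by its own projection matrix, without being flattened into a vector first. This is an illustration under stated assumptions, not code from the book (whose materials are in MATLAB); the helper names `mode_product` and `multilinear_project` are hypothetical, and the projection matrices are simply given rather than learned.

```python
def mode_product(tensor, matrix, mode):
    """Multiply a 3-way tensor (nested lists) by a matrix along one mode.

    `matrix` has shape (P, I_mode); the result has P entries along `mode`
    in place of the original I_mode entries.
    """
    I, J, K = len(tensor), len(tensor[0]), len(tensor[0][0])
    P = len(matrix)
    if mode == 0:
        return [[[sum(matrix[p][i] * tensor[i][j][k] for i in range(I))
                  for k in range(K)] for j in range(J)] for p in range(P)]
    if mode == 1:
        return [[[sum(matrix[p][j] * tensor[i][j][k] for j in range(J))
                  for k in range(K)] for p in range(P)] for i in range(I)]
    if mode == 2:
        return [[[sum(matrix[p][k] * tensor[i][j][k] for k in range(K))
                  for p in range(P)] for j in range(J)] for i in range(I)]
    raise ValueError("mode must be 0, 1, or 2")


def multilinear_project(tensor, matrices):
    """Apply one projection matrix per mode, in order (modes 0, 1, 2)."""
    for mode, matrix in enumerate(matrices):
        tensor = mode_product(tensor, matrix, mode)
    return tensor
```

For an input of shape I x J x K and matrices of shapes P x I, Q x J, and R x K, the output has shape P x Q x R, so the number of parameters is PI + QJ + RK rather than the (PQR)(IJK) a projection of the flattened vector would need.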

Handbook of Regression Methods

Author : Derek Scott Young
Publisher : CRC Press
ISBN 13 : 1351650742
Total Pages : 507 pages
Book Rating : 4.3/5 (516 download)

Book Synopsis Handbook of Regression Methods by : Derek Scott Young

Download or read book Handbook of Regression Methods written by Derek Scott Young and published by CRC Press. This book was released on 2018-10-03 with a total of 507 pages. Available in PDF, EPUB and Kindle. Book excerpt: Handbook of Regression Methods concisely covers numerous traditional, contemporary, and nonstandard regression methods. The handbook provides a broad overview of regression models, diagnostic procedures, and inference procedures, with emphasis on how these methods are applied. The organization of the handbook benefits both practitioners and researchers, who seek either to obtain a quick understanding of regression methods for specialized problems or to expand their own breadth of knowledge of regression topics. This handbook covers classic material about simple linear regression and multiple linear regression, including assumptions, effective visualizations, and inference procedures. It presents an overview of advanced diagnostic tests, remedial strategies, and model selection procedures. Finally, many chapters are devoted to a diverse range of topics, including censored regression, nonlinear regression, generalized linear models, and semiparametric regression. Features: Presents a concise overview of a wide range of regression topics not usually covered in a single text. Includes over 80 examples using nearly 70 real datasets, with results obtained using R. Offers a Shiny app containing all examples, thus allowing access to the source code and the ability to interact with the analyses.