Author : C. Scott
Publisher : Createspace Independent Publishing Platform
ISBN 13 : 9781976274305
Total Pages : 216 pages
Book Rating : 4.2/5 (743 downloads)
Book Synopsis: Big Data Analytics With Matlab. Segmentation Techniques, by C. Scott
This book was released on 2017-09-11. Book excerpt:

Big data analytics examines large amounts of data to uncover hidden patterns, correlations and other insights. With today's technology, it's possible to analyze your data and get answers from it almost immediately, an effort that's slower and less efficient with more traditional business intelligence solutions. MATLAB has the tools to work with large datasets and apply the necessary data analysis techniques. This book covers segmentation techniques: cluster analysis and parametric classification.

Cluster analysis, also called segmentation analysis or taxonomy analysis, partitions sample data into groups or clusters. Clusters are formed such that objects in the same cluster are very similar, and objects in different clusters are very distinct. Statistics and Machine Learning Toolbox provides several clustering techniques and measures of similarity (also called distance measures) to create the clusters. Additionally, cluster evaluation determines the optimal number of clusters for the data using different evaluation criteria. Cluster visualization options include dendrograms and silhouette plots.

Hierarchical clustering groups data over a variety of scales by creating a cluster tree or dendrogram. The tree is not a single set of clusters, but rather a multilevel hierarchy, where clusters at one level are joined as clusters at the next level. This allows you to decide the level or scale of clustering that is most appropriate for your application. The Statistics and Machine Learning Toolbox function clusterdata performs all of the necessary steps for you. It incorporates the pdist, linkage, and cluster functions, which may be used separately for more detailed analysis.
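The three steps named above (pairwise distances, linkage into a tree, cutting the tree into clusters) can be sketched in plain Python. This is an illustration of the idea only, not MATLAB code and not the toolbox's implementation: the function names pairwise_distances, single_linkage and cut_tree are hypothetical, the sample points are made up, and Euclidean distance with single linkage is assumed for simplicity.

```python
import math

def pairwise_distances(points):
    """Euclidean distance between every pair of observations (the distance step)."""
    n = len(points)
    return {(i, j): math.dist(points[i], points[j])
            for i in range(n) for j in range(i + 1, n)}

def single_linkage(points):
    """Build the cluster tree by repeatedly merging the two closest clusters.

    Returns the merges bottom-up as (members_a, members_b, distance) tuples.
    """
    dist = pairwise_distances(points)
    clusters = [frozenset([i]) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist[min(i, j), max(i, j)]
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges

def cut_tree(merges, n_points, k):
    """Replay merges until k clusters remain; return one label per observation."""
    clusters = {frozenset([i]) for i in range(n_points)}
    for left, right, _ in merges[:n_points - k]:
        clusters -= {left, right}
        clusters.add(left | right)
    labels = [0] * n_points
    # Number clusters by their smallest member so labels are deterministic.
    for label, members in enumerate(sorted(clusters, key=min)):
        for i in members:
            labels[i] = label
    return labels

# Made-up data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.3, 5.0), (9.0, 0.0)]
merges = single_linkage(points)
labels_3 = cut_tree(merges, len(points), 3)
labels_2 = cut_tree(merges, len(points), 2)
print(labels_3, labels_2)
```

Because the tree is built once and cut afterwards, the same merges list yields a clustering at any chosen level, which is the multilevel property the excerpt describes.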
The dendrogram function plots the cluster tree.

k-means clustering is a partitioning method. The function kmeans partitions data into k mutually exclusive clusters, and returns the index of the cluster to which it has assigned each observation. Unlike hierarchical clustering, k-means clustering operates on actual observations (rather than the larger set of dissimilarity measures), and creates a single level of clusters. These distinctions mean that k-means clustering is often more suitable than hierarchical clustering for large amounts of data.

Gaussian mixture models form clusters by representing the probability density function of observed variables as a mixture of multivariate normal densities. Mixture models of the gmdistribution class use an expectation-maximization (EM) algorithm to fit data, which assigns posterior probabilities to each component density with respect to each observation. Clusters are assigned by selecting the component that maximizes the posterior probability. Clustering using Gaussian mixture models is sometimes considered a soft clustering method: the posterior probabilities indicate that each data point has some probability of belonging to each cluster. Like k-means clustering, Gaussian mixture modeling uses an iterative algorithm that converges to a local optimum. Gaussian mixture modeling may be more appropriate than k-means clustering when clusters have different sizes and correlation within them.

Discriminant analysis is a classification method. It assumes that different classes generate data based on different Gaussian distributions. Linear discriminant analysis is also known as the Fisher discriminant, named for its inventor, Sir R. A. Fisher.

Classification is a type of supervised machine learning in which an algorithm "learns" to classify new observations from examples of labeled data. To explore classification models interactively, use the Classification Learner app.
For greater flexibility, you can pass predictor or feature data with corresponding responses or labels to an algorithm-fitting function in the command-line interface.
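As an illustration of passing predictor data and labels to a fitting function, here is a minimal sketch of linear discriminant analysis for a single predictor, written in plain Python rather than MATLAB. It is not the toolbox's implementation: the names fit_lda_1d and predict_lda_1d are hypothetical, the data are made up, and each class is assumed to be Gaussian with its own mean and a shared (pooled) variance.

```python
import math
from collections import defaultdict

def fit_lda_1d(x, labels):
    """Estimate class means, class priors and a pooled variance from labeled data.

    Assumes more observations than classes, so the pooled variance is defined.
    """
    groups = defaultdict(list)
    for value, label in zip(x, labels):
        groups[label].append(value)
    n = len(x)
    means = {c: sum(v) / len(v) for c, v in groups.items()}
    priors = {c: len(v) / n for c, v in groups.items()}
    # Pooled within-class variance, shared by all classes (the "linear" assumption).
    ss = sum((value - means[label]) ** 2 for value, label in zip(x, labels))
    variance = ss / (n - len(groups))
    return {"means": means, "priors": priors, "variance": variance}

def predict_lda_1d(model, value):
    """Return the class with the largest linear discriminant score."""
    def score(c):
        mu = model["means"][c]
        return (value * mu / model["variance"]
                - mu ** 2 / (2 * model["variance"])
                + math.log(model["priors"][c]))
    return max(model["means"], key=score)

# Made-up training data: one predictor, two labeled classes.
x = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
y = ["a", "a", "a", "b", "b", "b"]
model = fit_lda_1d(x, y)
print(predict_lda_1d(model, 1.5), predict_lda_1d(model, 2.6))
```

With equal priors and a shared variance, the decision boundary sits midway between the two class means, so new observations are assigned to whichever mean is closer.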