Nonlinear Compensation and Heterogeneous Data Modeling for Robust Speech Recognition

Author : Yong Zhao
Release Date : 2013

The goal of robust speech recognition is to maintain satisfactory recognition accuracy under mismatched operating conditions. This dissertation addresses the robustness issue from two directions. In the first part of the dissertation, we propose the Gauss-Newton method as a unified approach to estimating noise parameters for use in prevalent nonlinear compensation models, such as vector Taylor series (VTS), data-driven parallel model combination (DPMC), and unscented transform (UT), for noise-robust speech recognition. While iterative estimation of noise means in a generalized EM framework is widely known, we demonstrate that such approaches are variants of the Gauss-Newton method. Furthermore, we propose a novel noise variance estimation algorithm that is consistent with the Gauss-Newton principle. The formulation of the Gauss-Newton method reduces the noise estimation problem to determining the Jacobians of the corrupted speech parameters. For sampling-based compensations, we present two methods, sample Jacobian average (SJA) and cross-covariance (XCOV), to evaluate these Jacobians. The Gauss-Newton method is closely related to another noise estimation approach, which views the model compensation from a generative perspective, giving rise to an EM-based algorithm analogous to the ML estimation for factor analysis (EM-FA). We demonstrate a close connection between these two approaches: both belong to the family of gradient-based methods but have different convergence rates. Note that the convergence property can be crucial to noise estimation in many applications where model compensation may have to be carried out frequently in changing noisy environments to retain the desired performance.
Furthermore, several techniques are explored to further improve the nonlinear compensation approaches. To remove the need for clean speech data when training acoustic models, we integrate nonlinear compensation with adaptive training. We also investigate fast VTS compensation to improve the noise estimation efficiency, and combine VTS compensation with acoustic echo cancellation (AEC) to mitigate issues due to interfering background speech. The proposed noise estimation algorithm is evaluated for various compensation models on three tasks. The first is to fit a GMM to artificially corrupted samples, the second is to perform speech recognition on the Aurora 2 database, and the third is speech recognition on a corpus simulating a meeting of multiple competing speakers. The significant performance improvements confirm the efficacy of the Gauss-Newton method in estimating the noise parameters of the nonlinear compensation models.
The second part of the dissertation is devoted to developing more effective models to take full advantage of heterogeneous speech data, which are typically collected from thousands of speakers in various environments via different transducers. The proposed synchronous HMM, in contrast to the conventional HMMs, introduces an additional layer of substates between the HMM state and the Gaussian component variables. The substates have the capability to register long-span non-phonetic attributes, such as gender, speaker identity, and environmental condition, which are collectively called speech scenes in this study. The hierarchical modeling scheme allows an accurate description of the probability distribution of speech units in different speech scenes. To address the data sparsity problem in estimating parameters of multiple speech scene sub-models, a decision-based clustering algorithm is presented to determine the set of speech scenes and to tie the substate parameters, allowing us to achieve an excellent balance between modeling accuracy and robustness.
In addition, by exploiting the synchronous relationship among the speech scene sub-models, we propose the multiplex Viterbi algorithm to efficiently decode the synchronous HMM within a search space of the same size as for the standard HMM. The multiplex Viterbi can also be generalized to decode an ensemble of isomorphic HMM sets, a problem often arising in multi-model systems. The experiments on the Aurora 2 task show that the synchronous HMMs produce a significant improvement in recognition performance over the HMM baseline at the expense of a moderate increase in memory requirement and computational complexity.
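To make the Gauss-Newton idea in the abstract above more concrete, here is a minimal sketch under strong simplifying assumptions: a single scalar log-spectral channel, no channel distortion, and a plain least-squares fit instead of the likelihood-based estimation the dissertation describes. The mismatch function y = log(exp(x) + exp(n)) is the usual log-spectral form behind VTS-style compensation; the function names and toy data are hypothetical and are not taken from the dissertation.

```python
import math

def corrupt(x, n):
    """Log-spectral mismatch function: y = x + log(1 + exp(n - x))."""
    return x + math.log1p(math.exp(n - x))

def jacobian_wrt_noise(x, n):
    """dy/dn = exp(n - x) / (1 + exp(n - x)), i.e. a logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(x - n))

def gauss_newton_noise(clean, observed, n0=0.0, iters=20):
    """Estimate a scalar noise level n by Gauss-Newton on squared residuals."""
    n = n0
    for _ in range(iters):
        num = 0.0  # accumulates J^T r
        den = 0.0  # accumulates J^T J
        for x, y in zip(clean, observed):
            r = y - corrupt(x, n)          # residual of the current fit
            j = jacobian_wrt_noise(x, n)   # Jacobian of the model w.r.t. n
            num += j * r
            den += j * j
        n += num / den                     # Gauss-Newton step
    return n

# Toy data (assumption): clean log-energies corrupted by a known noise level.
clean = [2.0, 4.0, 6.0, 8.0]
true_noise = 5.0
observed = [corrupt(x, true_noise) for x in clean]
estimate = gauss_newton_noise(clean, observed, n0=3.0)
print(round(estimate, 6))
```

The only model-specific ingredient in the update is the Jacobian of the corrupted value with respect to the noise parameter, which mirrors the abstract's remark that the method reduces noise estimation to determining Jacobians.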

Robust Speech Recognition of Uncertain or Missing Data

Author : Dorothea Kolossa
Publisher : Springer Science & Business Media
ISBN 13 : 3642213170
Total Pages : 387 pages
Book Rating : 4.6/5 (422 download)
Release Date : 2011-07-14

Automatic speech recognition suffers from a lack of robustness with respect to noise, reverberation and interfering speech. The growing field of speech recognition in the presence of missing or uncertain input data seeks to ameliorate those problems by using not only a preprocessed speech signal but also an estimate of its reliability to selectively focus on those segments and features that are most reliable for recognition. This book presents the state of the art in recognition in the presence of uncertainty, offering examples that utilize uncertainty information for noise robustness, reverberation robustness, simultaneous recognition of multiple speech signals, and audiovisual speech recognition. The book is appropriate for scientists and researchers in the field of speech recognition who will find an overview of the state of the art in robust speech recognition, professionals working in speech recognition who will find strategies for improving recognition results in various conditions of mismatch, and lecturers of advanced courses on speech processing or speech recognition who will find a reference and a comprehensive introduction to the field. The book assumes an understanding of the fundamentals of speech recognition using Hidden Markov Models.
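As an illustration of how a reliability estimate can enter recognition, the sketch below uses one common formulation, often called uncertainty decoding, in which the variance of the feature estimate is added to the model variance before a Gaussian likelihood is evaluated. This is an assumed, simplified scalar example and is not claimed to be the specific method presented in the book; all names and numbers are hypothetical.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a scalar Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def uncertain_log_likelihood(x_hat, x_var, mean, var):
    """Score a feature estimate x_hat whose own uncertainty is x_var.

    The feature variance is added to the model variance, so unreliable
    features produce flatter, less decisive likelihoods.
    """
    return log_gauss(x_hat, mean, var + x_var)

def log_likelihood_ratio(x_hat, x_var):
    """Compare two hypothetical states with means 0 and 3, unit variance."""
    state_a = uncertain_log_likelihood(x_hat, x_var, 0.0, 1.0)
    state_b = uncertain_log_likelihood(x_hat, x_var, 3.0, 1.0)
    return state_a - state_b

reliable = log_likelihood_ratio(0.5, 0.0)    # feature trusted fully
unreliable = log_likelihood_ratio(0.5, 9.0)  # feature estimate is very uncertain
print(reliable, unreliable)
```

With the same feature value, the reliable case separates the two states far more strongly than the unreliable one, so uncertain segments contribute less to the final decision.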

Compensation for Nonlinear Distortion in Noise for Robust Speech Recognition

Author : Mark J. Harvilla
Release Date : 2014

Model Compensation Methods for Robust Speech Recognition

Author : Stephen Mingyu Chu
Total Pages : 66 pages
Release Date : 1999

New Era for Robust Speech Recognition

Author : Shinji Watanabe
Publisher : Springer
ISBN 13 : 331964680X
Total Pages : 433 pages
Book Rating : 4.3/5 (196 download)
Release Date : 2017-10-30

This book covers the state of the art in deep-neural-network-based methods for noise robustness in distant speech recognition applications. It provides insights and detailed descriptions of some of the new concepts and key technologies in the field, including novel architectures for speech enhancement, microphone arrays, robust features, acoustic model adaptation, training data augmentation, and training criteria. The contributed chapters also include descriptions of real-world applications, benchmark tools and datasets widely used in the field. This book is intended for researchers and practitioners working in the field of speech processing and recognition who are interested in the latest deep learning techniques for noise robustness. It will also be of interest to graduate students in electrical engineering or computer science, who will find it a useful guide to this field of research.

Nonlinear Speech Modeling and Applications

Author : Gerard Chollet
Publisher : Springer
ISBN 13 : 3540318860
Total Pages : 444 pages
Book Rating : 4.5/5 (43 download)
Release Date : 2005-07-12

This book presents the revised tutorial lectures given at the International Summer School on Nonlinear Speech Processing-Algorithms and Analysis held in Vietri sul Mare, Salerno, Italy in September 2004. The 14 revised tutorial lectures by leading international researchers are organized in topical sections on dealing with nonlinearities in speech signals, acoustic-to-articulatory modeling of speech phenomena, data-driven and speech processing algorithms, and algorithms and models based on speech perception mechanisms. Besides the tutorial lectures, 15 revised reviewed papers are included, presenting original research results on task-oriented speech applications.

Advances in Non-Linear Modeling for Speech Processing

Author : Raghunath S. Holambe
Publisher : Springer Science & Business Media
ISBN 13 : 1461415055
Total Pages : 109 pages
Book Rating : 4.4/5 (614 download)
Release Date : 2012-02-21

Advances in Non-Linear Modeling for Speech Processing includes advanced topics in non-linear estimation and modeling techniques along with their applications to speaker recognition. A non-linear aeroacoustic modeling approach is used to estimate the important fine-structure speech events, which are not revealed by the short time Fourier transform (STFT). This aeroacoustic modeling approach provides the impetus for the high resolution Teager energy operator (TEO). This operator is characterized by a time resolution that can track rapid signal energy changes within a glottal cycle. Cepstral features such as linear prediction cepstral coefficients (LPCC) and mel frequency cepstral coefficients (MFCC) are computed from the magnitude spectrum of the speech frame, and the phase spectrum is neglected. To overcome the problem of neglecting the phase spectrum, the speech production system can be represented as an amplitude modulation-frequency modulation (AM-FM) model. To demodulate the speech signal, that is, to estimate the amplitude envelope and instantaneous frequency components, the energy separation algorithm (ESA) and the Hilbert transform demodulation (HTD) algorithm are discussed. Different features derived using the above non-linear modeling techniques are used to develop a speaker identification system. Finally, it is shown that the fusion of speech production and speech perception mechanisms can lead to a robust feature set.
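The Teager energy operator mentioned in the excerpt has a compact discrete-time form, Psi[x(n)] = x(n)^2 - x(n-1) * x(n+1). The sketch below applies that standard definition to a synthetic tone; the function name and test signal are illustrative choices, not material from the book.

```python
import math

def teager_energy(signal):
    """Discrete Teager energy operator: x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input because the
    operator needs one neighbour on each side.
    """
    return [
        signal[n] ** 2 - signal[n - 1] * signal[n + 1]
        for n in range(1, len(signal) - 1)
    ]

# Synthetic tone (assumption): amplitude 2.0, digital frequency 0.3 rad/sample.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n + 0.7) for n in range(50)]
energy = teager_energy(tone)

# For a pure sinusoid the operator is constant: A^2 * sin^2(omega).
expected = (amplitude * math.sin(omega)) ** 2
print(energy[0], expected)
```

Because each output sample depends on only three neighbouring input samples, the operator can follow rapid energy changes, which is the time-resolution property the excerpt highlights.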

Robust Automatic Speech Recognition

Author : Jinyu Li
Publisher : Academic Press
ISBN 13 : 0128026162
Total Pages : 308 pages
Book Rating : 4.1/5 (28 download)
Release Date : 2015-10-30

Robust Automatic Speech Recognition: A Bridge to Practical Applications establishes a solid foundation for automatic speech recognition that is robust against acoustic environmental distortion. It provides a thorough overview of classical and modern noise- and reverberation-robust techniques that have been developed over the past thirty years, with an emphasis on practical methods that have been proven to be successful and which are likely to be further developed for future applications. The strengths and weaknesses of robustness-enhancing speech recognition techniques are carefully analyzed. The book covers noise-robust techniques designed for acoustic models which are based on both Gaussian mixture models and deep neural networks. In addition, a guide to selecting the best methods for practical applications is provided.
The reader will: gain a unified, deep and systematic understanding of the state-of-the-art technologies for robust speech recognition; learn the links and relationship between alternative technologies for robust speech recognition; be able to use the technology analysis and categorization detailed in the book to guide future technology development; and be able to develop new noise-robust methods in the current era of deep learning for acoustic modeling in speech recognition.
It is the first book that provides a comprehensive review of noise- and reverberation-robust speech recognition methods in the era of deep neural networks; it connects robust speech recognition techniques to machine learning paradigms with rigorous mathematical treatment; it provides elegant and structural ways to categorize and analyze noise-robust speech recognition techniques; and it is written by leading researchers who have been actively working on the subject matter in both industrial and academic organizations for many years.

Robust Speech Recognition Using Neural Networks and Hidden Markov Models

Author : DongSuk Yuk
Total Pages : 212 pages
Release Date : 1999

Robust Speech Recognition of Uncertain or Missing Data

Author : Dorothea Kolossa
Publisher : Springer
ISBN 13 : 9783642213182
Total Pages : 380 pages
Book Rating : 4.2/5 (131 download)
Release Date : 2013-01-02

Automatic speech recognition suffers from a lack of robustness with respect to noise, reverberation and interfering speech. The growing field of speech recognition in the presence of missing or uncertain input data seeks to ameliorate those problems by using not only a preprocessed speech signal but also an estimate of its reliability to selectively focus on those segments and features that are most reliable for recognition. This book presents the state of the art in recognition in the presence of uncertainty, offering examples that utilize uncertainty information for noise robustness, reverberation robustness, simultaneous recognition of multiple speech signals, and audiovisual speech recognition. The book is appropriate for scientists and researchers in the field of speech recognition who will find an overview of the state of the art in robust speech recognition, professionals working in speech recognition who will find strategies for improving recognition results in various conditions of mismatch, and lecturers of advanced courses on speech processing or speech recognition who will find a reference and a comprehensive introduction to the field. The book assumes an understanding of the fundamentals of speech recognition using Hidden Markov Models.

Graphical Models for Robust Speech Recognition in Adverse Environments

Author : Steven John Rennie
ISBN 13 : 9780494579183
Total Pages : 538 pages
Book Rating : 4.5/5 (791 download)
Release Date : 2008

Robust speech recognition in acoustic environments that contain multiple speech sources and/or complex non-stationary noise is a difficult problem, but one of great practical interest. The formalism of probabilistic graphical models constitutes a relatively new and very powerful tool for better understanding and extending existing models, learning, and inference algorithms; and a bedrock for the creative, quasi-systematic development of new ones. In this thesis a collection of new graphical models and inference algorithms for robust speech recognition is presented.
The problem of speech separation using multiple microphones is first treated. A family of variational algorithms for tractably combining multiple acoustic models of speech with observed sensor likelihoods is presented. The algorithms recover high quality estimates of the speech sources even when there are more sources than microphones, and have improved upon the state of the art in terms of SNR gain by over 10 dB.
Next the problem of background compensation in non-stationary acoustic environments is treated. A new dynamic noise adaptation (DNA) algorithm for robust noise compensation is presented, and shown to outperform several existing state-of-the-art front-end denoising systems on the new DNA + Aurora II and Aurora II-M extensions of the Aurora II task.
Finally, the problem of speech recognition in speech using a single microphone is treated. The Iroquois system for multi-talker speech separation and recognition is presented. The system won the 2006 Pascal International Speech Separation Challenge, and amazingly, achieved super-human recognition performance on a majority of test cases in the task. The result marks a significant first in automatic speech recognition, and a milestone in computing.

Nonlinear Feature Transformations for Noise Robust Speech Recognition

Author : Shajith Ikbal
Total Pages : 153 pages
Release Date : 2004

Environment Mismatch Compensation Methods for Robust Speech Recognition

Author : Abhishek Kumar
Total Pages : 86 pages
Release Date : 2009

HMM Compensation for Robust Speech Recognition in Different Environments

Author : Michael Berkovitch
Total Pages : 172 pages
Release Date : 2009

Signal Modeling for Robust Speech Recognition with Frequency Warping and Convex Optimization

Author : Yoon Kim
Total Pages : 276 pages
Release Date : 2000

Multi-modal and Deep Learning for Robust Speech Recognition

Author : Xue Feng (Ph. D.)
Total Pages : 115 pages
Release Date : 2017

Automatic speech recognition (ASR) decodes speech signals into text. While ASR can produce accurate word recognition in clean environments, system performance can degrade dramatically when noise and reverberation are present. In this thesis, speech denoising and model adaptation for robust speech recognition were studied, and four novel methods were introduced to improve ASR robustness. First, we developed an ASR system using multi-channel information from microphone arrays via accurate speaker tracking with Kalman filtering and subsequent beamforming. The system was evaluated on the publicly available Reverb Challenge corpus, and placed second (out of 49 submitted systems) in the recognition task on real data. Second, we explored a speech feature denoising and dereverberation method via deep denoising autoencoders (DDA). The method was evaluated on the CHiME2-WSJ0 corpus and achieved a 16% to 25% absolute improvement in word error rate (WER) compared to the baseline. Third, we developed a method to incorporate heterogeneous multi-modal data with a deep neural network (DNN) based acoustic model. Our experiments on a noisy vehicle-based speech corpus demonstrated that WERs can be reduced by 6.3% relative to the baseline system. Finally, we explored the use of a low-dimensional environmentally-aware feature derived from the total acoustic variability space. Two extraction methods are presented: one via linear discriminant analysis (LDA) projection, and the other via a bottleneck deep neural network (BN-DNN). Our evaluations showed that by adapting ASR systems with the proposed feature, ASR performance was significantly improved. We also demonstrated that the proposed feature yielded promising results on environment identification tasks.
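The abstract above reports both absolute and relative WER improvements, which are easy to confuse. The sketch below computes WER with a standard word-level edit distance and then derives both kinds of improvement; the sentences and resulting numbers are made-up toy values, not results from the thesis.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # edit distances for an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

reference = "turn the radio volume down"
baseline = word_error_rate(reference, "turn a radio volume town")    # 2 errors
improved = word_error_rate(reference, "turn the radio volume town")  # 1 error

absolute_gain = baseline - improved        # difference in WER points
relative_gain = absolute_gain / baseline   # fraction of baseline errors removed
print(baseline, improved, absolute_gain, relative_gain)
```

In this toy case the WER drops from 40% to 20%, which is a 20-point absolute improvement but a 50% relative improvement; the two figures only coincide when the baseline WER is 100%.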