Robust Emotion Recognition Using Spectral and Prosodic Features (Springerbriefs in Speech Technology) (Paperback)

By K. Sreenivasa Rao, Shashidhar G. Koolagudi

$54.99


Not On Our Shelves—Ships in 1-5 Days
1 Introduction
  1.1 Introduction
  1.2 Emotion from psychological perspective
  1.3 Emotion from speech signal perspective
    1.3.1 Speech production mechanism
    1.3.2 Source features
    1.3.3 System features
    1.3.4 Prosodic features
  1.4 Emotional speech databases
  1.5 Applications of speech emotion recognition
  1.6 Issues in speech emotion recognition
  1.7 Objectives and scope of the work
  1.8 Main highlights of research investigations
  1.9 Brief overview of contributions in this book
    1.9.1 Emotion recognition using spectral features extracted from sub-syllabic regions and pitch synchronous analysis
    1.9.2 Emotion recognition using global and local prosodic features extracted from words and syllables
    1.9.3 Emotion recognition using combination of features
    1.9.4 Emotion recognition on real life emotional speech database
  1.10 Organization of the book
2 Robust Emotion Recognition using Pitch Synchronous and Sub-syllabic Spectral Features
  2.1 Introduction
  2.2 Emotional speech corpora
    2.2.1 Indian Institute of Technology Kharagpur-Simulated Emotional Speech Corpus: IITKGP-SESC
    2.2.2 Berlin Emotional Speech Database: Emo-DB
  2.3 Feature extraction
    2.3.1 Linear prediction cepstral coefficients (LPCCs)
    2.3.2 Mel frequency cepstral coefficients (MFCCs)
    2.3.3 Formant features
    2.3.4 Extraction of sub-syllabic spectral features
    2.3.5 Pitch synchronous analysis
  2.4 Classifiers
    2.4.1 Gaussian mixture models (GMM)
    2.4.2 Auto-associative neural networks
  2.5 Results and discussion
  2.6 Summary
3 Robust Emotion Recognition using Word and Syllable Level Prosodic Features
  3.1 Introduction
  3.2 Prosodic features: Importance in emotion recognition
  3.3 Motivation
  3.4 Extraction of global and local prosodic features
    3.4.1 Sentence level features
    3.4.2 Word and syllab.

K. Sreenivasa Rao is at Indian Institute of Technology, Kharagpur, India. Shashidhar G. Koolagudi is at Graphic Era University, Dehradun, India.
Product Details
ISBN: 9781461463597
ISBN-10: 1461463599
Publisher: Springer
Publication Date: January 12th, 2013
Pages: 118
Language: English
Series: Springerbriefs in Speech Technology