Theory and Applications of Digital Speech Processing

Paperback, English, 2010

3 749 kr

Sold out

Theory and Applications of Digital Speech Processing is ideal for graduate students in digital signal processing, and undergraduate students in Electrical and Computer Engineering. With its clear, up-to-date, hands-on coverage of digital speech processing, this text is also suitable for practicing engineers in speech processing.

This new text presents the basic concepts and theories of speech processing with clarity and currency, while providing hands-on computer-based laboratory experiences for students. The material is organized in a manner that builds a strong foundation of basics first, and then concentrates on a range of signal processing methods for representing and processing the speech signal.

Product information

Publication date: 2010-05-25
Dimensions: 184 x 239 x 42 mm
Weight: 1,680 g
Format: Paperback
Language: English
Number of pages: 1,056
Edition: 1
Publisher: Pearson Education
ISBN: 9780136034285

Belongs to the following categories

Systems science and AI, in Computing and IT
Energy technology, in Science and technology

Lawrence Rabiner was born in Brooklyn, New York, on September 28, 1943. He received the S.B. and S.M. degrees simultaneously in June 1964, and the Ph.D. degree in Electrical Engineering in June 1967, all from the Massachusetts Institute of Technology, Cambridge, Massachusetts. From 1962 through 1964, Dr. Rabiner participated in the cooperative program in Electrical Engineering at AT&T Bell Laboratories, Whippany and Murray Hill, New Jersey. During this period he worked on digital circuit design, military communications problems, and problems in binaural hearing. Dr. Rabiner joined AT&T Bell Labs in 1967 as a Member of the Technical Staff. He was promoted to Supervisor in 1972, Department Head in 1985, Director in 1990, and Functional Vice President in 1995. He joined the newly created AT&T Labs in 1996 as Director of the Speech and Image Processing Services Research Lab, and was promoted to Vice President of Research in 1998, where he managed a broad research program in communications, computing, and information sciences technologies. Dr. Rabiner retired from AT&T at the end of March 2002 and is now a Professor of Electrical and Computer Engineering at Rutgers University, and the Associate Director of the Center for Advanced Information Processing (CAIP) at Rutgers. Dr. Rabiner is co-author of the books “Theory and Application of Digital Signal Processing” (Prentice-Hall, 1975), “Digital Processing of Speech Signals” (Prentice-Hall, 1978), “Multirate Digital Signal Processing” (Prentice-Hall, 1983), and “Fundamentals of Speech Recognition” (Prentice-Hall, 1993). Dr. Rabiner is a member of Eta Kappa Nu, Sigma Xi, Tau Beta Pi, the National Academy of Engineering, the National Academy of Sciences, and a Fellow of the Acoustical Society of America, the IEEE, Bell Laboratories, and AT&T. He is a former President of the IEEE Acoustics, Speech, and Signal Processing Society, a former Vice-President of the Acoustical Society of America, a former editor of the ASSP Transactions, and a former member of the IEEE Proceedings Editorial Board.

Ronald W. Schafer is an electrical engineer notable for his contributions to digital signal processing. After receiving his Ph.D. degree at MIT in 1968, he joined the Acoustics Research Department at Bell Laboratories, where he did research on digital signal processing and digital speech coding. He came to the Georgia Institute of Technology in 1974, where he stayed until joining Hewlett Packard in March 2005. He has served as Associate Editor of IEEE Transactions on Acoustics, Speech, and Signal Processing and as Vice-President and President of the IEEE Signal Processing Society. He is a Life Fellow of the IEEE and a Fellow of the Acoustical Society of America. He has received the IEEE Region 3 Outstanding Engineer Award, the 1980 IEEE Emanuel R. Piore Award, the Distinguished Professor Award at the Georgia Institute of Technology, the 1992 IEEE Education Medal, and the 2010 IEEE Jack S. Kilby Signal Processing Medal.

CHAPTER 1 Introduction to Digital Speech Processing
1.1 The Speech Signal
1.2 The Speech Stack
1.3 Applications of Digital Speech Processing
1.4 Comment on the References
1.5 Summary

CHAPTER 2 Review of Fundamentals of Digital Signal Processing
2.1 Introduction
2.2 Discrete-Time Signals and Systems
2.3 Transform Representation of Signals and Systems
2.4 Fundamentals of Digital Filters
2.5 Sampling
2.6 Summary
Problems

CHAPTER 3 Fundamentals of Human Speech Production
3.1 Introduction
3.2 The Process of Speech Production
3.3 Short-Time Fourier Representation of Speech
3.4 Acoustic Phonetics
3.5 Distinctive Features of the Phonemes of American English
3.6 Summary
Problems

CHAPTER 4 Hearing, Auditory Models, and Speech Perception
4.1 Introduction
4.2 The Speech Chain
4.3 Anatomy and Function of the Ear
4.4 The Perception of Sound
4.5 Auditory Models
4.6 Human Speech Perception Experiments
4.7 Measurement of Speech Quality and Intelligibility
4.8 Summary
Problems

CHAPTER 5 Sound Propagation in the Human Vocal Tract
5.1 The Acoustic Theory of Speech Production
5.2 Lossless Tube Models
5.3 Digital Models for Sampled Speech Signals
5.4 Summary
Problems

CHAPTER 6 Time-Domain Methods for Speech Processing
6.1 Introduction
6.2 Short-Time Analysis of Speech
6.3 Short-Time Energy and Short-Time Magnitude
6.4 Short-Time Zero-Crossing Rate
6.5 The Short-Time Autocorrelation Function
6.6 The Modified Short-Time Autocorrelation Function
6.7 The Short-Time Average Magnitude Difference Function
6.8 Summary
Problems

CHAPTER 7 Frequency-Domain Representations
7.1 Introduction
7.2 Discrete-Time Fourier Analysis
7.3 Short-Time Fourier Analysis
7.4 Spectrographic Displays
7.5 Overlap Addition Method of Synthesis
7.6 Filter Bank Summation Method of Synthesis
7.7 Time-Decimated Filter Banks
7.8 Two-Channel Filter Banks
7.9 Implementation of the FBS Method Using the FFT
7.10 OLA Revisited
7.11 Modifications of the STFT
7.12 Summary
Problems

CHAPTER 8 The Cepstrum and Homomorphic Speech Processing
8.1 Introduction
8.2 Homomorphic Systems for Convolution
8.3 Homomorphic Analysis of the Speech Model
8.4 Computing the Short-Time Cepstrum and Complex Cepstrum of Speech
8.5 Homomorphic Filtering of Natural Speech
8.6 Cepstrum Analysis of All-Pole Models
8.7 Cepstrum Distance Measures
8.8 Summary
Problems

CHAPTER 9 Linear Predictive Analysis of Speech Signals
9.1 Introduction
9.2 Basic Principles of Linear Predictive Analysis
9.3 Computation of the Gain for the Model
9.4 Frequency Domain Interpretations of Linear Predictive Analysis
9.5 Solution of the LPC Equations
9.6 The Prediction Error Signal
9.7 Some Properties of the LPC Polynomial A(z)
9.8 Relation of Linear Predictive Analysis to Lossless Tube Models
9.9 Alternative Representations of the LP Parameters
9.10 Summary
Problems

CHAPTER 10 Algorithms for Estimating Speech Parameters
10.1 Introduction
10.2 Median Smoothing and Speech Processing
10.3 Speech-Background/Silence Discrimination
10.4 A Bayesian Approach to Voiced/Unvoiced/Silence Detection
10.5 Pitch Period Estimation (Pitch Detection)
10.6 Formant Estimation
10.7 Summary
Problems

CHAPTER 11 Digital Coding of Speech Signals
11.1 Introduction
11.2 Sampling Speech Signals
11.3 A Statistical Model for Speech
11.4 Instantaneous Quantization
11.5 Adaptive Quantization
11.6 Quantizing of Speech Model Parameters
11.7 General Theory of Differential Quantization
11.8 Delta Modulation
11.9 Differential PCM (DPCM)
11.10 Enhancements for ADPCM Coders
11.11 Analysis-by-Synthesis Speech Coders
11.12 Open-Loop Speech Coders
11.13 Applications of Speech Coders
11.14 Summary
Problems

CHAPTER 12 Frequency-Domain Coding of Speech and Audio
12.1 Introduction
12.2 Historical Perspective
12.3 Subband Coding
12.4 Adaptive Transform Coding
12.5 A Perception Model for Audio Coding
12.6 MPEG-1 Audio Coding Standard
12.7 Other Audio Coding Standards
12.8 Summary
Problems

CHAPTER 13 Text-to-Speech Synthesis Methods
13.1 Introduction
13.2 Text Analysis
13.3 Evolution of Speech Synthesis Methods
13.4 Early Speech Synthesis Approaches
13.5 Unit Selection Methods
13.6 TTS Future Needs
13.7 Visual TTS
13.8 Summary
Problems

CHAPTER 14 Automatic Speech Recognition and Natural Language Understanding
14.1 Introduction
14.2 Basic ASR Formulation
14.3 Overall Speech Recognition Process
14.4 Building a Speech Recognition System
14.5 The Decision Processes in ASR
14.6 Step 3: The Search Problem
14.7 Simple ASR System: Isolated Digit Recognition
14.8 Performance Evaluation of Speech Recognizers
14.9 Spoken Language Understanding
14.10 Dialog Management and Spoken Language Generation
14.11 User Interfaces
14.12 Multimodal User Interfaces
14.13 Summary
Problems

Appendices
A Speech and Audio Processing Demonstrations
B Solution of Frequency-Domain Differential Equations
Bibliography
Index

Even after more than 30 years, the 1978 textbook by Rabiner and Schafer remains one of the most comprehensive texts for teaching a one-semester graduate-level speech processing course. The new book manages to top that and is a definite improvement over the old one. It doubles the content of the must-have classic, adds many technology developments from recent years, and expands the coverage of application areas, which have seen tremendous growth in the last two decades. The additional problems and MATLAB exercises, along with real-world speech samples, also provide a convenient stepping stone for designing in-depth class projects. The intended course website will contain demos, illustrations, and speech examples. Such extra material is ideal for demonstrating key processing concepts, and it will add real value to an already appealing textbook.