Communication Acoustics

An Introduction to Speech, Audio and Psychoacoustics

Hardcover, English, 2015

By Ville Pulkki (University of Aalto, Finland) and Matti Karjalainen (University of Aalto, Finland)

In communication acoustics, the communication channel consists of a sound source, a channel (acoustic and/or electric) and finally the receiver: the human auditory system, a complex and intricate system that shapes the way sound is heard. Thus, when developing techniques in communication acoustics, such as in speech, audio and aided hearing, it is important to understand the time–frequency–space resolution of hearing. This book facilitates the reader’s understanding and development of speech and audio techniques based on our knowledge of the auditory perceptual mechanisms by introducing the physical, signal-processing and psychophysical background to communication acoustics. It then provides a detailed explanation of sound technologies where a human listener is involved, including audio and speech techniques, sound quality measurement, hearing aids and audiology.

Key features:

Explains perceptually-based audio: the authors take a detailed but accessible engineering perspective on sound and hearing with a focus on the human place in the audio communications signal chain, from psychoacoustics and audiology to optimizing digital signal processing for human listening.

Presents a wide overview of speech, from the human production of speech sounds and basics of phonetics to major speech technologies, recognition and synthesis of speech and methods for speech quality evaluation.

Includes MATLAB examples that serve as an excellent basis for the reader’s own investigations into communication acoustics interaction schemes which intuitively combine touch, vision and voice for lifelike interactions.

Product information

Publication date: 2015-01-30
Dimensions: 180 x 254 x 28 mm
Weight: 816 g
Format: Hardcover
Language: English
Number of pages: 464
Publisher: John Wiley & Sons Inc
ISBN: 9781118866542

Ville Pulkki, Department of Signal Processing and Acoustics, University of Aalto, Finland

Professor Pulkki is currently affiliated with the Department of Signal Processing and Acoustics at the University of Aalto, Finland, where he leads the Spatial Sound Research Group. He is an AES Fellow. Professor Pulkki was General Chair of the AES 45th International Conference on Applications of Time-Frequency Processing in Audio (2012), and he is Associate Technical Editor of the Journal of the Audio Engineering Society.

Matti Karjalainen, Department of Signal Processing and Acoustics, University of Aalto, Finland

Professor Karjalainen was previously Head of the Laboratory of Acoustics and Audio Signal Processing at Helsinki University of Technology, which now forms part of the Department of Signal Processing and Acoustics at the University of Aalto, Finland. Professor Karjalainen had long-term cooperation with companies such as Nokia and loudspeaker manufacturer Genelec. For his scientific and educational merits in audio signal processing, he received the Audio Engineering Society Fellowship in 1999, the AES Silver Medal in 2006, and the IEEE Fellowship in 2009. He authored over 350 scientific and technical publications. Professor Karjalainen passed away in May 2010.

About the Authors
Preface
Preface to the Unfinished Manuscript of the Book
Introduction

1 How to Study and Develop Communication Acoustics
1.1 Domains of Knowledge
1.2 Methodology of Research and Development
1.3 Systems Approach to Modelling
1.4 About the Rest of this Book
1.5 Focus of the Book
1.6 Intended Audience
References

2 Physics of Sound
2.1 Vibration and Wave Behaviour of Sound
2.1.1 From Vibration to Waves
2.1.2 A Simple Vibrating System
2.1.3 Resonance
2.1.4 Complex Mass–Spring Systems
2.1.5 Modal Behaviour
2.1.6 Waves
2.2 Acoustic Measures and Quantities
2.2.1 Sound and Voice as Signals
2.2.2 Sound Pressure
2.2.3 Sound Pressure Level
2.2.4 Sound Power
2.2.5 Sound Intensity
2.2.6 Computation with Amplitude and Level Quantities
2.3 Wave Phenomena
2.3.1 Spherical Waves
2.3.2 Plane Waves and the Wave Field in a Tube
2.3.3 Wave Propagation in Solid Materials
2.3.4 Reflection, Absorption, and Refraction
2.3.5 Scattering and Diffraction
2.3.6 Doppler Effect
2.4 Sound in Closed Spaces: Acoustics of Rooms and Halls
2.4.1 Sound Field in a Room
2.4.2 Reverberation
2.4.3 Sound Pressure Level in a Room
2.4.4 Modal Behaviour of Sound in a Room
2.4.5 Computational Modelling of Closed Space Acoustics
Summary
Further Reading
References

3 Signal Processing and Signals
3.1 Signals
3.1.1 Sounds as Signals
3.1.2 Typical Signals
3.2 Fundamental Concepts of Signal Processing
3.2.1 Linear and Time-Invariant Systems
3.2.2 Convolution
3.2.3 Signal Transforms
3.2.4 Fourier Analysis and Synthesis
3.2.5 Spectrum Analysis
3.2.6 Time–Frequency Representations
3.2.7 Filter Banks
3.2.8 Auto- and Cross-Correlation
3.2.9 Cepstrum
3.3 Digital Signal Processing (DSP)
3.3.1 Sampling and Signal Conversion
3.3.2 Z Transform
3.3.3 Filters as LTI Systems
3.3.4 Digital Filtering
3.3.5 Linear Prediction
3.3.6 Adaptive Filtering
3.4 Hidden Markov Models
3.5 Concepts of Intelligent and Learning Systems
Summary
Further Reading
References

4 Electroacoustics and Responses of Audio Systems
4.1 Electroacoustics
4.1.1 Loudspeakers
4.1.2 Microphones
4.2 Audio System Responses
4.2.1 Measurement of System Response
4.2.2 Ideal Reproduction of Sound
4.2.3 Impulse Response and Magnitude Response
4.2.4 Phase Response
4.2.5 Non-Linear Distortion
4.2.6 Signal-to-Noise Ratio
4.3 Response Equalization
Summary
Further Reading
References

5 Human Voice
5.1 Speech Production
5.1.1 Speech Production Mechanism
5.1.2 Vocal Folds and Phonation
5.1.3 Vocal and Nasal Tract and Articulation
5.1.4 Lip Radiation Measurements
5.2 Units and Notation of Speech used in Phonetics
5.2.1 Vowels
5.2.2 Consonants
5.2.3 Prosody and Suprasegmental Features
5.3 Modelling of Speech Production
5.3.1 Glottal Modelling
5.3.2 Vocal Tract Modelling
5.3.3 Articulatory Synthesis
5.3.4 Formant Synthesis
5.4 Singing Voice
Summary
Further Reading
References

6 Musical Instruments and Sound Synthesis
6.1 Acoustic Instruments
6.1.1 Types of Musical Instruments
6.1.2 Resonators in Instruments
6.1.3 Sources of Excitation
6.1.4 Controlling the Frequency of Vibration
6.1.5 Combining the Excitation and Resonant Structures
6.2 Sound Synthesis in Music
6.2.1 Envelope of Sounds
6.2.2 Synthesis Methods
6.2.3 Synthesis of Plucked String Instruments with a One-Dimensional Physical Model
Summary
Further Reading
References

7 Physiology and Anatomy of Hearing
7.1 Global Structure of the Ear
7.2 External Ear
7.3 Middle Ear
7.4 Inner Ear
7.4.1 Structure of the Cochlea
7.4.2 Passive Cochlear Processing
7.4.3 Active Function of the Cochlea
7.4.4 The Inner Hair Cells
7.4.5 Cochlear Non-Linearities
7.5 Otoacoustic Emissions
7.6 Auditory Nerve
7.6.1 Information Transmission using the Firing Rate
7.6.2 Phase Locking
7.7 Auditory Nervous System
7.7.1 Structure of the Auditory Pathway
7.7.2 Studying Brain Function
7.8 Motivation for Building Computational Models of Hearing
Summary
Further Reading
References

8 The Approach and Methodology of Psychoacoustics
8.1 Sound Events versus Auditory Events
8.2 Psychophysical Functions
8.3 Generation of Sound Events
8.3.1 Synthesis of Sound Signals
8.3.2 Listening Set-up and Conditions
8.3.3 Steering Attention to Certain Details of An Auditory Event
8.4 Selection of Subjects for Listening Tests
8.5 What are We Measuring?
8.5.1 Thresholds
8.5.2 Scales and Categorization of Percepts
8.5.3 Numbering Scales in Listening Tests
8.6 Tasks for Subjects
8.7 Basic Psychoacoustic Test Methods
8.7.1 Method of Constant Stimuli
8.7.2 Method of Limits
8.7.3 Method of Adjustment
8.7.4 Method of Tracking
8.7.5 Direct Scaling Methods
8.7.6 Adaptive Staircase Methods
8.8 Descriptive Sensory Analysis
8.8.1 Verbal Elicitation
8.8.2 Non-Verbal Elicitation
8.8.3 Indirect Elicitation
8.9 Psychoacoustic Tests from the Point of View of Statistics
Summary
Further Reading
References

9 Basic Function of Hearing
9.1 Effective Hearing Area
9.1.1 Equal Loudness Curves
9.1.2 Sound Level and its Measurement
9.2 Spectral Masking
9.2.1 Masking by Noise
9.2.2 Masking by Pure Tones
9.2.3 Masking by Complex Tones
9.2.4 Other Masking Phenomena
9.3 Temporal Masking
9.4 Frequency Selectivity of Hearing
9.4.1 Psychoacoustic Tuning Curves
9.4.2 ERB Bandwidths
9.4.3 Bark, ERB, and Greenwood Scales
Summary
Further Reading
References

10 Basic Psychoacoustic Quantities
10.1 Pitch
10.1.1 Pitch Strength and Frequency Range
10.1.2 JND of Pitch
10.1.3 Pitch Perception versus Duration of Sound
10.1.4 Mel Scale
10.1.5 Logarithmic Pitch Scale and Musical Scale
10.1.6 Detection Threshold of Pitch Change and Frequency Modulation
10.1.7 Pitch of Coloured Noise
10.1.8 Repetition Pitch
10.1.9 Virtual Pitch
10.1.10 Pitch of Non-Harmonic Complex Sounds
10.1.11 Pitch Theories
10.1.12 Absolute Pitch
10.2 Loudness
10.2.1 Loudness Determination Experiments
10.2.2 Loudness Level
10.2.3 Loudness of a Pure Tone
10.2.4 Loudness of Broadband Signals
10.2.5 Excitation Pattern, Specific Loudness, and Loudness
10.2.6 Difference Threshold of Loudness
10.2.7 Loudness versus Duration of Sound
10.3 Timbre
10.3.1 Timbre of Steady-State Sounds
10.3.2 Timbre of Sound Including Modulations
10.4 Subjective Duration of Sound
Summary
Further Reading
References

11 Further Analysis in Hearing
11.1 Sharpness
11.2 Detection of Modulation and Sound Onset
11.2.1 Fluctuation Strength
11.2.2 Impulsiveness
11.3 Roughness
11.4 Tonality
11.5 Discrimination of Changes in Signal Magnitude and Phase Spectra
11.5.1 Adaptation to the Magnitude Spectrum
11.5.2 Perception of Phase and Time Differences
11.6 Psychoacoustic Concepts and Music
11.6.1 Sensory Consonance and Dissonance
11.6.2 Intervals, Scales, and Tuning in Music
11.6.3 Rhythm, Tempo, Bar, and Measure
11.7 Perceptual Organization of Sound
11.7.1 Segregation of Sound Sources
11.7.2 Sound Streaming and Auditory Scene Analysis
Summary
Further Reading
References

12 Spatial Hearing
12.1 Concepts and Definitions for Spatial Hearing
12.1.1 Basic Concepts
12.1.2 Coordinate Systems for Spatial Hearing
12.2 Head-Related Acoustics
12.3 Localization Cues
12.3.1 Interaural Time Difference
12.3.2 Interaural Level Difference
12.3.3 Interaural Coherence
12.3.4 Cues to Resolve the Direction on the Cone of Confusion
12.3.5 Interaction Between Spatial Hearing and Vision
12.4 Localization Accuracy
12.4.1 Localization in the Horizontal Plane
12.4.2 Localization in the Median Plane
12.4.3 3D Localization
12.4.4 Perception of the Distribution of a Spatially Extended Source
12.5 Directional Hearing in Enclosed Spaces
12.5.1 Precedence Effect
12.5.2 Adaptation to the Room Effect in Localization
12.6 Binaural Advantages in Timbre Perception
12.6.1 Binaural Detection and Unmasking
12.6.2 Binaural Decolouration
12.7 Perception of Source Distance
12.7.1 Cues for Distance Perception
12.7.2 Accuracy of Distance Perception
Summary
Further Reading
References

13 Auditory Modelling
13.1 Simple Psychoacoustic Modelling with DFT
13.1.1 Computation of the Auditory Spectrum through DFT
13.2 Filter Bank Models
13.2.1 Modelling the Outer and Middle Ear
13.2.2 Gammatone Filter Bank and Auditory Nerve Responses
13.2.3 Level-Dependent Filter Banks
13.2.4 Envelope Detection and Temporal Dynamics
13.3 Cochlear Models
13.3.1 Basilar Membrane Models
13.3.2 Hair-Cell Models
13.4 Modelling of Higher-Level Systemic Properties
13.4.1 Analysis of Pitch and Periodicity
13.4.2 Modelling of Loudness Perception
13.5 Models of Spatial Hearing
13.5.1 Delay-Network-Based Models of Binaural Hearing
13.5.2 Equalization Cancellation and ILD Models
13.5.3 Count-Comparison Models
13.5.4 Models of Localization in the Median Plane
13.6 Matlab Examples
13.6.1 Filter-Bank Model with Autocorrelation-Based Pitch Analysis
13.6.2 Binaural Filter-Bank Model with Cross-Correlation-Based ITD Analysis
Summary
Further Reading
References

14 Sound Reproduction
14.1 Need for Sound Reproduction
14.2 Audio Content Production
14.3 Listening Set-ups
14.3.1 Loudspeaker Set-ups
14.3.2 Listening Room Acoustics
14.3.3 Audiovisual Systems
14.3.4 Auditory-Tactile Systems
14.4 Recording Techniques
14.4.1 Monophonic Techniques
14.4.2 Spot Microphone Technique
14.4.3 Coincident Microphone Techniques for Two-Channel Stereophony
14.4.4 Spaced Microphone Techniques for Two-Channel Stereophony
14.4.5 Spaced Microphone Techniques for Multi-Channel Loudspeaker Systems
14.4.6 Coincident Recording for Multi-Channel Set-up with Ambisonics
14.4.7 Non-Linear Time–Frequency-domain Reproduction of Spatial Sound
14.5 Virtual Source Positioning
14.5.1 Amplitude Panning
14.5.2 Amplitude Panning in a Stereophonic Set-up
14.5.3 Amplitude Panning in Horizontal Multi-Channel Loudspeaker Set-ups
14.5.4 3D Amplitude Panning
14.5.5 Virtual Source Positioning using Ambisonics
14.5.6 Wave Field Synthesis
14.5.7 Time Delay Panning
14.5.8 Synthesizing the Width of Virtual Sources
14.6 Binaural Techniques
14.6.1 Listening to Binaural Recordings with Headphones
14.6.2 HRTF Processing for Headphone Listening
14.6.3 Virtual Listening of Loudspeakers with Headphones
14.6.4 Headphone Listening to Two-Channel Stereophonic Content
14.6.5 Binaural Techniques with Cross-Talk-Cancelled Loudspeakers
14.7 Digital Audio Effects
14.8 Reverberators
14.8.1 Using Room Impulse Responses in Reverberators
14.8.2 DSP Structures for Reverberators
Summary
Further Reading and Available Toolboxes
References

15 Time–Frequency-domain Processing and Coding of Audio
15.1 Basic Techniques and Concepts for Time–Frequency Processing
15.1.1 Frame-Based Processing
15.1.2 Downsampled Filter-Bank Processing
15.1.3 Modulation with Tone Sequences
15.1.4 Aliasing
15.2 Time–Frequency Transforms
15.2.1 Short-Time Fourier Transform (STFT)
15.2.2 Alias-Free STFT
15.2.3 Modified Discrete Cosine Transform (MDCT)
15.2.4 Pseudo-Quadrature Mirror Filter (PQMF) Bank
15.2.5 Complex QMF
15.2.6 Sub-Sub-Band Filtering of the Complex QMF Bands
15.2.7 Stochastic Measures of Time–Frequency Signals
15.2.8 Decorrelation
15.3 Time–Frequency-Domain Audio-Processing Techniques
15.3.1 Masking-Based Audio Coding
15.3.2 Audio Coding with Spectral Band Replication
15.3.3 Parametric Stereo, MPEG Surround, and Spatial Audio Object Coding
15.3.4 Stereo Upmixing and Enhancement for Loudspeakers and Headphones
Summary
Further Reading
References

16 Speech Technologies
16.1 Speech Coding
16.2 Text-to-Speech Synthesis
16.2.1 Early Knowledge-Based Text-to-Speech (TTS) Synthesis
16.2.2 Unit-Selection Synthesis
16.2.3 Statistical Parametric Synthesis
16.3 Speech Recognition
Summary
Further Reading
References

17 Sound Quality
17.1 Historical Background of Sound Quality
17.2 The Many Facets of Sound Quality
17.3 Systemic Framework for Sound Quality
17.4 Subjective Sound Quality Measurement
17.4.1 Mean Opinion Score
17.4.2 MUSHRA
17.5 Audio Quality
17.5.1 Monaural Quality
17.5.2 Perceptual Measures and Models for Monaural Audio Quality
17.5.3 Spatial Audio Quality
17.6 Quality of Speech Communication
17.6.1 Subjective Methods and Measures
17.6.2 Objective Methods and Measures
17.7 Measuring Speech Understandability with the Modulation Transfer Function
17.7.1 Modulation Transfer Function
17.7.2 Speech Transmission Index STI
17.7.3 STI and Speech Intelligibility
17.7.4 Practical Measurement of STI
17.8 Objective Speech Quality Measurement for Telecommunication
17.8.1 General Speech Quality Measurement Techniques
17.8.2 Measurement of the Perceptual Effect of Background Noise
17.8.3 Measurement of the Perceptual Effect of Echoes
17.9 Sound Quality in Auditoria and Concert Halls
17.9.1 Subjective Measures
17.9.2 Objective Measures
17.9.3 Percentage of Consonant Loss
17.10 Noise Quality
17.11 Product Sound Quality
Summary
Further Reading
References

18 Other Audio Applications
18.1 Virtual Reality and Game Audio Engines
18.2 Sonic Interaction Design
18.3 Computational Auditory Scene Analysis, CASA
18.4 Music Information Retrieval
18.5 Miscellaneous Applications
Summary
Further Reading
References

19 Technical Audiology
19.1 Hearing Impairments and Disabilities
19.1.1 Key Terminology
19.1.2 Classification of Hearing Impairments
19.1.3 Causes for Hearing Impairments
19.2 Symptoms and Consequences of Hearing Impairments
19.2.1 Hearing Threshold Shift
19.2.2 Distortion and Decrease in Discrimination
19.2.3 Speech Communication Problems
19.2.4 Tinnitus
19.3 The Effect of Noise on Hearing
19.3.1 Noise
19.3.2 Formation of Noise-Induced Hearing Loss
19.3.3 Temporary Threshold Shift
19.3.4 Hearing Protection
19.4 Audiometry
19.4.1 Pure-Tone Audiometry
19.4.2 Bone-Conduction Audiometry
19.4.3 Speech Audiometry
19.4.4 Sound-Field Audiometry
19.4.5 Tympanometry
19.4.6 Otoacoustic Emissions
19.4.7 Neural Responses
19.5 Hearing Aids
19.5.1 Types of Hearing Aids
19.5.2 Signal Processing in Hearing Aids
19.5.3 Transmission Systems and Assistive Listening Devices
19.6 Implantable Hearing Solutions
19.6.1 Cochlear Implants
19.6.2 Electric-Acoustic Stimulation
19.6.3 Bone-Anchored Hearing Aids
19.6.4 Middle-Ear Implants
Summary
Further Reading
References

Index