ITG-Fb. 298: Speech Communication
14th ITG Conference, 29 September to 1 October 2021, online event
Cite this publication
VDE ITG (ed.), ITG-Fb. 298: Speech Communication (2021), VDE Verlag, Berlin, ISBN: 9783800756285
Description / Abstract
The 14th ITG conference on Speech Communication solicits contributions on theory, algorithms, and applications in the following areas of speech, audio, and spoken language processing:
• Speech Enhancement and Separation
• Source Localization and Tracking
• Automatic Speech and Speaker Recognition
• Spoken Dialogue, Diarization, and Spoken Document Retrieval Systems
• Speech Synthesis
• Speech Modeling, Coding, and Transmission
• Speech Production and Perception
• Speech and Audio Quality Assessment
• Speech Intelligibility Assessment
• Paralinguistics, Speech Diagnostics, and Speech-related Biosignals
• Speech in Automotive, Mobile, and Multimodal Applications
• Acoustic Interfaces, Assistive Devices, and Hearing Aids
• Machine Learning for Speech Processing
• Hardware and Software Tools
• Emerging Topics and Applications
Table of Contents
- ITG-Fachbericht 298: Speech Communication
- Titlepage
- Imprint
- Scope
- Technical Program Committee
- Contents
- Keynotes
- Session I (Talks): Speech Recognition and Synthesis
- 01 Two-Dimensional Embeddings for Low-Resource Keyword Spotting Based on Dynamic Time Warping
- 02 Multi-Head Fusion Attention for Transformer-Based End-to-End Speech Recognition
- 03 Federated Learning in ASR: Not as Easy as You Think
- Session I (Poster): Speech Recognition and Synthesis
- 04 A Comparative Pronunciation Mapping Approach Using G2P Conversion for Anglicisms in German Speech Recognition
- 05 Bilingual I-Vector Extractor for DNN Hybrid Acoustic Model Training in German Speech Recognition Systems
- 06 New Restricted Boltzmann Machines and Deep Belief Networks for Audio Classification
- 07 A Lightweight Neural TTS System for High-quality German Speech Synthesis
- 08 Automatic Speech Recognition for Dementia Screening Using ILSE-Interviews
- Session II (Talks): Localisation, Tracking and Spatial Reproduction
- 09 On Source-Microphone Distance Estimation Using Convolutional Recurrent Neural Networks
- 10 On the Use of Additional Microphones in Binaural Cue Adaptation
- 11 Microphone Utility-based Weighting for Robust Acoustic Source Localization in Wireless Acoustic Sensor Networks
- Session II (Poster): Localisation, Tracking and Spatial Reproduction
- 12 Sound Source Localisation using Neural Networks with Circular Binary Classification
- 13 2D Acoustic Source Localisation Using Decentralised Deep Neural Networks on Distributed Microphone Arrays
- 14 Data-Dependent Initialization for ECM-Based Blind Geometry Estimation of a Microphone Array Using Reverberant Speech
- 15 Binaural Speaker Localization Based on Front/Back-Beamforming and Modulation-Domain Features
- Session III (Talks): Speech Enhancement and Separation
- 16 Plosive Enhancement Using Phase Linearization and Smoothing
- 17 Speaker-conditioned Target Speaker Extraction Based on Customized LSTM Cells
- 18 Mixed Analog-digital Speech Communication for Underwater Applications
- Session III (Poster): Speech Enhancement and Separation
- 19 Speeding Up Permutation Invariant Training for Source Separation
- 20 Comparison of Generalized Sidelobe Canceller Structures Incorporating External Microphones for Joint Noise and Interferer Reduction
- 21 Feedback Cancellation for IP-based Teleconferencing Systems
- 22 Joint Reduction of Ego-noise and Environmental Noise with a Partially-adaptive Dictionary
- 23 Beam-specific System Identification
- 24 Low-Complexity Multichannel Wiener Filtering Using Ambisonic Warping
- 25 A Comparison and Combination of Unsupervised Blind Source Separation Techniques
- 26 Robust and High Gain Acoustic Feedback Compensation in the Frequency Domain with a Simple Energy-decay Operator
- 27 An Integrated Deep Clustering-Based System for Speaker Count Agnostic Speech Separation
- 28 Joint Multi-Channel Dereverberation and Noise Reduction Using a Unified Convolutional Beamformer with Sparse Priors
- 29 Reinforcement Learning-based Microphone Selection in Wireless Acoustic Sensor Networks Considering Network and Acoustic Utilities
- Session IV (Talks): Medical Applications and Analytical Studies
- 30 Supervised Speech Representation Learning for Parkinson's Disease Classification
- 31 An Objective Evaluation Framework for Pathological Speech Synthesis
- 32 Informed Source Extraction with Application to Acoustic Echo Reduction
- Session IV (Poster): Medical Applications and Analytical Studies
- 33 A Cortical Model for θ-Oscillator Segmenting Syllables
- 34 Active Acoustic Equalization: Performance Bounds for Time-Invariant Systems
- 35 The Effect of Surprisal on Articulatory Gestures in Polish Consonant-to-Vowel Transitions: A Pilot EMA Study
- 36 A Data Generation Framework for Acoustic Drone Detection Algorithms
- Session V (Talks): Quality of Speech and Speech Communication Systems
- 37 Predicting Conversational Quality from Simulated Conversations with Transmission Delay
- 38 Impact of a Speaker Head Rotation on the Far-end Listening Situation
- 39 Towards Non-Intrusive Prediction of Speech Recognition Thresholds in Binaural Conditions
- Session V (Poster): Quality of Speech and Speech Communication Systems
- 40 A Database for Research on Detection and Enhancement of Speech Transmitted over HF links
- 41 Intelligibility Prediction of Speech Reconstructed From Its Magnitude or Phase
- 42 Acoustic Ambiance Simulation Using Orthogonal Loudspeaker Signals
- 43 Assessment of Listening Effort for Various Telecommunication Scenarios
- Overview of ITG-Fachberichte
- Your opinion matters!