Conference Information
INTERSPEECH 2025: Conference of the International Speech Communication Association
https://www.interspeech2025.org
Submission Date: 2025-02-12
Notification Date: 2025-05-21
Conference Date: 2025-08-17
Location: Rotterdam, the Netherlands
Years: 26
CCF: c

Call For Papers
It is with great pleasure that we invite you to the 26th edition of the Interspeech Conference, to be held August 17-21, 2025, in Rotterdam, The Netherlands.

The theme of Interspeech 2025 is "Fair and Inclusive Speech Science and Technology". It aims to celebrate and incorporate the vast diversity of speech, both within and between individuals and within and between languages. This theme emphasizes diversity as a source of richness and information, contributing to the development of fairer, more robust, and personalized speech technology applications and more accurate theories of human speech processing. By embracing this diversity, we can create theories and technologies that are inclusive of everyone, ensuring that speech science and technology benefit all individuals and communities.

We look forward to welcoming you to Rotterdam for a conference filled with insightful discussions, innovative research, and opportunities to advance the fields of speech science and technology and make them fairer, more inclusive, and more equitable.

Interspeech 2025 will delve into four specific strands, each addressing critical aspects of speech science and technology that foster inclusivity and fairness. These strands are:

    Factors Arising from the Individual in Human Speech Processing
        Exploration of individual differences in speech processing.
        Understanding how personal factors influence speech perception and production.
        Development of personalized speech technology applications.

    Under-Researched Languages, Dialects, and Accents
        Focus on linguistic diversity and the inclusion of under-researched languages and dialects.
        Efforts to develop speech technologies that accommodate a wide range of accents.
        Promotion of research that highlights the richness of global linguistic diversity.

    Inclusive Technology for Atypical Speech Communication
        Development of multi-modal and adaptive systems for atypical speech communication.
        Creation of technologies that assist individuals with speech impairments.
        Emphasis on inclusivity in speech technology to support all communication needs.

    Ethical Considerations about Fairness, Inclusion, and Democratization of Speech Technologies
        Examination of ethical issues related to speech technology development and deployment.
        Discussion on ensuring fairness and inclusion in speech technology.
        Strategies for democratizing access to speech technology to benefit all societal segments.

By addressing these strands, Interspeech 2025 aims to advance speech science and technology in a manner that is equitable, inclusive, and respectful of individual and linguistic diversity.
Last updated by Dou Sun on 2024-10-23
Acceptance Ratio
Year  Submitted  Accepted  Accepted (%)
2021  1990       963       48.4%
2019  1855       914       49.3%
2018  1320       749       56.7%
2017  1582       799       50.5%
2016  1541       779       50.6%
2015  1458       743       51%
Best Papers
Year  Best Papers
2019  Deep feature for text dependent speaker verification
2019  A survey on the application of recurrent neural networks to statistical language modeling
2019  A domain independent statistical methodology for dialog management in spoken dialog systems
2019  Off the cuff: Exploring extemporaneous speech delivery with TTS
2019  Language Modeling with Deep Transformers
2019  Evaluating Near End Listening Enhancement Algorithms in Realistic Environments
2019  Adversarially Trained End-to-end Korean Singing Voice Synthesis System
2018  Detecting Depression with Audio/Text Sequence Modeling of Interviews
2018  Multi-Modal Data Augmentation for End-to-end ASR
2018  Joint Learning of Interactive Spoken Content Retrieval and Trainable User Simulator
2018  Evaluating the intelligibility benefit of speech modifications in known noise conditions
2018  The PASCAL CHiME speech separation and recognition challenge
2018  Automatic speaker age and gender recognition using acoustic and prosodic level information fusion
2017  Ranked WordNet graph for Sentiment Polarity Classification in Twitter
2017  Characterization of atypical vocal source excitation, temporal dynamics and prosody for objective measurement of dysarthric word intelligibility
2017  Speech enhancement using a minimum mean-square error short-time spectral modulation magnitude estimator
2017  Vocal-Tract Model with Static Articulators: Lips, Teeth, Tongue, and More
2017  VoxCeleb: A large-scale speaker identification dataset
2017  Residual Memory Networks in Language Modeling: Improving the Reputation of Feed-Forward Networks
2017  Hidden Markov Model Variational Autoencoder for Acoustic Unit Discovery
2016  Multitaper MFCC and PLP features for speaker verification using i-vectors
2016  Expressive Singing Synthesis Based on Unit Selection for the Singing Synthesis Challenge 2016
2016  An Engine for Online Video Search in Large Archives of the Holocaust Testimonies
2016  Characterizing Vocal Tract Dynamics Across Speakers Using Real-Time MRI
2016  GlottDNN - A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis
2016  The Rhythmic Constraint on Prosodic Boundaries in Mandarin Chinese Based on Corpora of Silent Reading and Speech Perception
2016  Turn-taking cues in task-oriented dialogue
2015  A System for Automatic Broadcast News Summarization, Geolocation and Translation
2015  Remeetings - Get More Out of Meetings
2015  Adapting Machine Translation Models towards Misrecognized Speech with Text-to-Speech Pronunciation Rules and Acoustic Confusability
2015  A Time Delay Neural Network Architecture for Efficient Modeling of Long Temporal Contexts
2015  Objective Intelligibility Assessment of Text-to-Speech Systems through Utterance Verification
2015  Silent Speech Interfaces
2015  Bayesian update of dialogue state: A POMDP framework for spoken dialogue systems
2014  The subspace Gaussian mixture model—A structured model for speech recognition
2014  Benefits and challenges of real-time uncertainty detection and adaptation in a spoken dialogue computer tutor
2014  I2R Speech2Singing Perfects Everyone's Singing
2014  Acoustic Modeling with Deep Neural Networks Using Raw Time Signal for LVCSR
2014  Word-level Invariant Representations From Acoustic Waveforms
2014  Speech synthesis in various communicative situations: Impact of pronunciation variations
2013  Speaker and Noise Independent Voice Activity Detection
2013  The Hidden Information State model: A practical framework for POMDP-based spoken dialogue management
2013  Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
2013  Using text and acoustic features to diagnose progressive aphasia and its subtypes
2013  A two-step technique for MRI audio enhancement using dictionary learning and wavelet packet analysis
2012  Discriminative n-gram language modeling
2012  PEAKS – A system for the automatic evaluation of voice and speech disorders
2012  Age Estimation from Telephone Speech using i-vectors
2012  MAP Estimation of Whole-Word Acoustic Models with Dictionary Priors
2012  Discriminatively learning factorized finite state pronunciation models from dynamic Bayesian networks
2011  Low-Frequency Bandwidth Extension of Telephone Speech Using Sinusoidal Synthesis and Gaussian Mixture Model
2011  One-to-Many Voice Conversion Based on Tensor Representation of Speaker Space
2011  Modelling Novelty Preference in Word Learning
2011  Data Driven Emotion Conversion in Spoken English
2011  Support Vector Machine for Speaker and Language Recognition
2010  Partially observable Markov decision processes for spoken dialog systems
2010  Joint-sequence models for grapheme-to-phoneme conversion
2010  Reliable Tracking Based on Speech Sample Salience of Vocal Cycle Length Perturbations
2010  Using Non-Native Error Patterns to Improve Pronunciation Verification
2010  Did you say susi or shushi? Measuring the Emergence of Robust Fricative Contrasts in English- and Japanese-Acquiring Children
2009  Automatic speech recognition and speech variability: A Review
2009  On the Semi-Supervised Learning of Multi-Layered Perceptrons
2009  A Deterministic plus Stochastic Model of the Residual Signal for Improved Parametric Speech Synthesis
2009  Sequencing of Articulatory Gestures using Cost Optimization
2008  Combining Continuous Progressive Model Adaptation and Factor Analysis for Speaker Verification
2008  On the equivalence of Gaussian and log-linear HMMs
2008  Effect of Intonational Phrase Boundaries on Pitch-Accented Syllables in American English
2008  Decoding speech in the presence of other sources
2007  Combining active and semi-supervised learning for spoken language understanding
2007  An Empirical Investigation of the Nonuniqueness in the Acoustic-to-Articulatory Mapping
2007  Speech Recognition Techniques for a Sign Language Recognition System
2006  Statistical language model adaptation: review and perspectives
2006  Acoustic cues for the classification of regular and irregular phonation
2006  Soft Margin Estimation of Hidden Markov Model Parameters
2006  Detecting Question-Bearing Turns in Spoken Tutorial Dialogues
2005  On the Integration of Speech Recognition and Statistical Machine Translation
2005  Multi-class composite N-gram language model
2004  The LIMSI Broadcast News Transcription System
2004  Hot Discussion or Frosty Dialogue? Towards a Temperature Metric for Conversational Interactivity
2003  The auditory organization of speech and other sources in listeners and computational models
2003  CRA-BF: A Novel Combined Fixed/Adaptive Beamforming for Robust Speech Recognition in Real Car Environments
2002  Language-independent and language-adaptive acoustic modeling for speech recognition
2002  Motor Specifications of a Baby Robot via the Analysis of Infants' Vocalizations
2001  Network Optimizations for Large-Vocabulary Speech Recognition
2001  Split-band Perceptual Harmonic Cepstral Coefficients as Acoustic Features for Speech Recognition
1999  Multi-level decision trees for static and dynamic pronunciation models
1999  Combining nonlocal, syntactic and n-gram dependencies in language modeling
1999  Unsupervised training of a speech recognizer: recent experiments
1997  Statistical language modeling using the CMU-Cambridge toolkit
1997  Subword unit representations for spoken document retrieval