NSF DLI-DEL

Advancing Linguistic Research through Speech Technologies

NSF Award #2450839 • 2025-2027

This project develops speech recognition and processing tools specifically designed for French-based creole languages, with initial focus on Guadeloupean Kréyòl. The research addresses the technological disparity that leaves many creole languages without adequate digital tools, hindering both linguistic research and community access to speech technologies. By creating specialized ASR systems, we aim to accelerate language documentation processes and make creole languages more accessible in the digital age.

NSF #2450839 • 2025-2027 • $268,067

  • Target Languages: 3
  • Hours of Speech Data: 100+
  • WER (word error rate): 30%
  • Project Completion: 2027

Technology Focus: Automatic Speech Recognition (ASR)

This project addresses the critical gap in speech technologies for under-resourced languages by developing specialized ASR systems for French-based creole languages. By combining linguistic expertise with computational methods, we aim to create tools that support both language documentation and community access.

Technical Objectives

  • Develop ASR models specifically for French-based creoles
  • Create annotated speech corpora for training and testing
  • Optimize models for low-resource language conditions
  • Integrate linguistic knowledge into model architecture
  • Develop user-friendly interfaces for researchers and communities

Target Languages

  • Guadeloupean Kréyòl
  • Mauritian Kreol
  • Martinican Kréyòl

Expected Deliverables

  • Specialized ASR system for Guadeloupean Kréyòl with >85% accuracy
  • Annotated speech corpus of 100+ hours
  • Open-source software tools for creole language processing
  • Documentation and tutorials for researchers
  • Community workshops on using speech technologies
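
ASR quality figures such as accuracy and WER are conventionally derived from the word-level edit distance between a reference transcript and the system output. The following is a minimal sketch of that standard calculation, not the project's own evaluation code; the function name `word_error_rate` and the placeholder tokens are assumptions made for illustration, and it assumes "accuracy" is read as 1 minus WER.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; tokens are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not real transcripts: one substitution and one
# deletion against a four-word reference.
wer = word_error_rate("w1 w2 w3 w4", "w1 wX w3")
print(f"WER = {wer:.2f}, word accuracy = {1 - wer:.2f}")
```

Because insertions count as errors, WER can exceed 100%, so word accuracy computed as 1 minus WER can be negative on very poor output.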

Methodological Approach

  • Data collection from diverse speaker populations
  • Phonetic and phonological analysis for model tuning
  • Transfer learning from French ASR systems
  • Neural network architecture optimization
  • Community feedback and iterative development

Research Team

Dr. Fabiola Henri
Principal Investigator
University at Buffalo

Dr. Eric Le Ferrand
Postdoctoral Researcher, ASR Development
University at Buffalo

Project Timeline

Year 1 (2025-2026)

Speech data collection and corpus development. Initial ASR model prototyping. Community engagement and needs assessment in Guadeloupe.

Year 2 (2026-2027)

Model training and optimization. Development of user interfaces. Preliminary testing and evaluation with linguistic researchers. Expansion to Haitian Kreyòl.

Impact and Significance

This project addresses a critical need in both linguistic research and language preservation. By developing speech technologies specifically for creole languages, we:

  • Accelerate language documentation processes through automated transcription
  • Make creole languages more accessible in digital contexts
  • Support language preservation efforts in creole-speaking communities
  • Advance computational linguistics methods for low-resource languages
  • Create infrastructure that supports future research and applications