University at Buffalo, State University of New York

Psychology 719: Speech Perception

Syllabus

Fall 2011

Location: Park 223

Time: Thurs 12:30-3:20pm

Instructor: James R. Sawusch

Office: 360 Park Hall

Email: jsawusch@buffalo.edu

Phone: (716) 645-0238

Office Hours: Mon 1:00 - 3:00, or by appointment

Course Goals

Psychology 719 explores the nature of human speech perception. The course is intended to provide students with a detailed overview of speech perception and the technology used to investigate human speech processes. The course covers the basics of speech production and speech acoustics. As part of this material, students will record and analyze a corpus of syllables and use a speech synthesizer to produce synthetic versions of syllables and speech continua. The material on perception includes classic issues such as invariance and variability in the speech signal, the nature of the perceptual representation and the segmentation of the continuous speech signal into discrete units. Classic phenomena such as categorical perception, the McGurk effect (audio/visual speech), trading relations, and the biological significance of speech are covered along with contemporary issues and theories.

Tentative Course Outline:

DATE            TOPIC                                                       READING
Sept 1          Overview of Course, Introduction                            H2, #2
Sept 8          Articulatory Phonetics/Acoustic Phonetics, Speech Analysis  P, H1
Sept 15         Acoustic Phonetics (Hearing)                                P, H8, #7
Sept 22, Oct 6  Invariance & Variability; Vowel Space Analysis Due Oct 6    H8, #2, #7, #11, #17, #18, #19
Sept 29         No Class - Rosh Hashana
Oct 13          Theories of Speech Perception                               H6, #1, H26, #3, #6, #8, #19
Oct 20          Categorical Perception, Audio-Visual Speech; Paper 1 Due    H3, #13, #17
Oct 27          Talker Effects and Normalization, Speech Synthesis          H15, H16, #4, #14, #16
Nov 3           (Psychonomics) Synthesis (project due 12/8)
Nov 10          Speaking Rate                                               #5, #10
Nov 17          Segmentation & Units of Analysis                            H11, H10, #1, #2, #9, #12
Nov 24          No Class, Thanksgiving
Dec 1           Biological Specialization, Perceptual Learning              H4, H5, #8, #16
Dec 8           Higher Order Influences on Speech Perception                H24, H25, #12, #15
Dec 15          Paper 2 due

Readings

Two books are strongly recommended. The first is the handbook:

Pisoni, D. B. & Remez, R. E. (Eds.) (2005). The Handbook of Speech Perception. Malden, MA: Blackwell.

The second is a good introduction to acoustic phonetics. Two good choices are:

Johnson, K. (2012). Acoustic & Auditory Phonetics. (3rd Edition) Malden, MA: Wiley-Blackwell.

Reetz, H. and Jongman, A. (2009). Phonetics. Malden, MA: Wiley-Blackwell.

Together, the handbook (H) and one of the two phonetics books (P) will constitute about half the readings for the course. The other readings are listed below.

1. Elman, J. L. (1989). Connectionist approaches to acoustic/phonetic processing. In W. Marslen-Wilson (Ed.), Lexical Representation and Process. (pp. 227-260). Cambridge, MA: MIT Press.
2. Goldinger, S. D., Pisoni, D. B., & Luce, P. A. (1996). Speech perception and spoken word recognition: Research and theory. In N. J. Lass (Ed.), Principles of Experimental Phonetics. (pp. 277-327). St. Louis, MO: Mosby.
3. Hillenbrand, J. M. & Houde, R. A. (2003). A narrow band pattern-matching model of vowel perception. The Journal of the Acoustical Society of America, 113, 2, 1044-1055.
4. Johnson, K. (1997). Speech perception without speaker normalization: An exemplar model. In K. Johnson & J. W. Mullennix (Eds.), Talker Variability in Speech Processing. (pp. 145-166) San Diego, CA: Academic Press.
5. Kidd, G. R. (1989). Articulatory-rate context effects in phoneme identification. Journal of Experimental Psychology: Human Perception and Performance, 15, 736-748.
6. Klatt, D. H. (1989). Review of selected models of speech perception. In W. Marslen-Wilson (Ed.), Lexical Representation and Process. (pp. 169-226). Cambridge, MA: MIT Press.
7. Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74, 431-461.
8. Liberman, A. M., & Whalen, D. H. (2000). On the relation of speech to language. Trends in Cognitive Sciences, 4, 187-196.
9. Mattys, S. L., White, L., & Melhorn, J. F. (2005). Integration of multiple speech segmentation cues: A hierarchical framework. Journal of Experimental Psychology: General, 134, 477-500.
10. Miller, J. L. (1981). Effects of speaking rate on segmental distinctions. In P. D. Eimas & J. L. Miller (Eds.), Perspectives on the Study of Speech. (pp. 39-74). Hillsdale, NJ: Lawrence Erlbaum Associates.
11. Nearey, T. M. (1989). Static, dynamic, and relational properties in vowel perception. Journal of the Acoustical Society of America, 85, 2088-2113.
12. Newman, R. S., Sawusch, J. R., & Luce, P. A. (1997). Lexical neighborhood effects in phonetic processing. Journal of Experimental Psychology: Human Perception and Performance, 23, 873-889.
13. Pisoni, D. B. (1973). Auditory and phonetic memory codes in the discrimination of consonants and vowels. Perception & Psychophysics, 13, 253-260.
14. Pisoni, D. B. (1997). Some thoughts on "normalization" in speech perception. In K. Johnson & J. W. Mullennix (Eds.), Talker Variability in Speech Processing. (pp. 9-32) San Diego, CA: Academic Press.
15. Samuel, A. G. (1981). The role of bottom-up confirmation in the phonemic restoration illusion. Journal of Experimental Psychology: Human Perception and Performance, 7, 1124-1131.
16. Samuel, A. G. & Kraljic, T. (2009). Perceptual learning for speech. Attention, Perception & Psychophysics, 71, 6, 1207-1218.
17. Sawusch, J. R., & Gagnon, D. A. (1995). Auditory coding, cues, and coherence in phonetic perception. Journal of Experimental Psychology: Human Perception and Performance, 21, 635-652.
18. Strange, W. (1987). Information for vowels in formant transitions. Journal of Memory and Language, 26, 550-557.
19. Sussman, H. M., Fruchter, D., Hilbert, J., & Sirosh, J. (1998). Linear correlates in the speech signal: The orderly output constraint. Behavioral and Brain Sciences, 21, 241-299.

Class Requirements

The class has a lecture-discussion format. Material comes from a set of journal articles and book chapters. The course grade is the arithmetic mean of four equally weighted components: two projects (speech analysis 25%; synthesis 25%) and two papers (25% each).

One project focuses on the recording and analysis of speech. Students will record a set of syllables and determine the formant frequencies of the syllables at specified points. The second project involves speech synthesis. Students will synthesize a vowel to precisely match a particular talker. They will also synthesize a consonant continuum varying in voice onset time (VOT).

The topics for the two papers will be distributed at least two weeks in advance. For each paper, there will be two or three topics to choose from. Each paper is expected to be about 5-6 double-spaced pages.

Scores will be converted to percentages, and the fixed scale below will be used to determine grades. If the papers and projects prove overly difficult and scores are low (fewer than 25% of the class attains an average of 88% or higher), the fixed-scale cut points will be lowered; that is, the scale will be curved. Plus and minus grades will be given for scores in the upper and lower thirds of each grade range.

Scale:
 88 and up   A
 77 - 87     B
 66 - 76     C
 55 - 65     D
 54 and down F
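
As a worked illustration of the grading arithmetic described above, the following Python sketch computes a course average and maps it onto the fixed scale. It is only an illustration, not an official grade calculator. The function names are hypothetical. The sketch assumes that each cut point is an inclusive lower bound, that each grade range runs up to the next cut point (100 for an A), and that plus and minus modifiers apply to every passing letter; the syllabus does not state whether an A+ is awarded. The amount by which a curve would lower the cut points is not specified, so the sketch only checks whether the curving condition is met.

```python
# Cut points taken from the fixed scale; each is treated as an inclusive
# lower bound (assumption: fractional averages fall in the range below
# the next cut point, e.g. 87.5 is a B).
CUTS = [(88.0, "A"), (77.0, "B"), (66.0, "C"), (55.0, "D")]


def course_average(analysis, synthesis, paper1, paper2):
    """Arithmetic mean of the four equally weighted components (25% each)."""
    return (analysis + synthesis + paper1 + paper2) / 4.0


def letter_grade(average, cuts=CUTS):
    """Return the letter for a percentage average on the fixed scale."""
    for lower, letter in cuts:
        if average >= lower:
            return letter
    return "F"


def grade_with_modifier(average, cuts=CUTS, top=100.0):
    """Add + or - for the upper or lower third of a grade range.

    Assumption: a range spans from its cut point to the next higher cut
    point (or `top` for an A) and is split into three equal parts.
    """
    upper = top
    for lower, letter in cuts:
        if average >= lower:
            third = (upper - lower) / 3.0
            if average >= upper - third:
                return letter + "+"
            if average < lower + third:
                return letter + "-"
            return letter
        upper = lower
    return "F"  # no modifier is applied to a failing grade


def curve_applies(class_averages, a_cut=88.0, share=0.25):
    """True if fewer than 25% of the class reach the A cut point."""
    at_or_above = sum(1 for avg in class_averages if avg >= a_cut)
    return at_or_above / len(class_averages) < share


if __name__ == "__main__":
    avg = course_average(90, 80, 85, 95)
    print(avg, letter_grade(avg), grade_with_modifier(avg))
```

For example, component scores of 90, 80, 85, and 95 average to 87.5, which falls in the B range under these assumptions.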


For the two papers, students are expected to do their own work. For the synthesis project, students may work collaboratively in teams of two or three, and all members of a team will receive the same synthesis score. For the speech analysis project, students may also work collaboratively in teams of two or three, but the analysis results that each student turns in must be for his or her own voice and will be scored individually.

Students with Disabilities

If you have a disability which makes it difficult for you to carry out the course work as outlined and/or requires accommodations such as recruiting note takers, readers or extended time on exams, please contact the Office of Accessibility Resources, 25 Capen Hall, phone 645-2680. Also contact the instructor within the first two weeks of class. OAR will provide you with information and review appropriate arrangements for reasonable accommodations.

 

revised: 29-August-11