|
As is tradition, several half-day and full-day workshops will be offered on October 8, the day before the technical program and exhibition begin, providing an opportunity for continuing education in acoustics. If you are interested in giving a workshop, please contact the Convener. Workshops may also be offered on the afternoon of October 11.

Seminar: Sound and Fire Containment: Details that work
Tuesday, October 8, 2002, 8:45 a.m. - 4:30 p.m.

Workshop: AUTOMATIC RECOGNITION AND SYNTHESIS OF SPEECH
Tuesday, October 8, 2002, 9:00 a.m. - 5:00 p.m.

The automatic conversion of conversational speech into text (and vice versa) is an interdisciplinary task involving computer science, engineering, acoustics, linguistics, and psychology. This tutorial will discuss the modern techniques of automatic speech recognition and synthesis, emphasizing the breadth of knowledge needed to approach near-human performance in these complex tasks.

Morning session: Fundamentals
We will first briefly examine human speech production from an acoustic-phonetic view. The standard methods of speech analysis (e.g., FFT and mel-based cepstrum) will be presented and discussed in terms of efficiency and robustness. The differences in objectives between speech coding, analysis, synthesis, and recognition will be noted. We will present the modern stochastic approaches to speech recognition (i.e., hidden Markov models), with simple examples chosen to make them understandable to a non-expert audience. The issues of adequate training corpora and the many trade-offs for different practical applications will be discussed (e.g., continuous vs. isolated-word recognition; small vs. large vocabularies). The differences between read speech and conversational speech will be examined in terms of disfluencies, variable speaking rate, and increased use of function words. The added difficulties of recognizing speech over the telephone and with hands-free terminals will be explained.

Afternoon session: Recent Developments
The state of the art will be described, noting performance levels for both synthesis and recognition systems. Applications will be discussed in terms of cost, usability, trainability, and performance. Quality is the main performance criterion for synthesis, while recognition has many trade-offs (speaker dependence, memory size, real-time response). The importance of appropriate language models for recognition will be emphasized, with both basic N-gram models and more complex class-based and distance models discussed.
We will describe the current state of the art in recognition of natural speech, both commercial and research, noting where current systems do well and where they come up short. The possibilities of integrating knowledge-based sources (e.g., aspects of expert systems) into the current stochastic approaches to speech recognition will be examined. Predictions as to the future course of speech recognition research will be made.

INFORMATION ABOUT DR. O'SHAUGHNESSY:
Dr. O'Shaughnessy has worked in the speech communication field for 30 years, first as a student at MIT (BSc and MS in 1972, PhD in 1976), then as director of a research team at INRS-Télécommunications in Montreal (one of the research centres of the Institut National de la Recherche Scientifique of the Université du Québec), in the areas of speech analysis, coding, synthesis, recognition, and enhancement. After working on the MITalk synthesis project in the early 1970s, he developed one of the first French text-to-speech systems in the early 1980s. His textbook "Speech Communication: Human and Machine" (Addison-Wesley, 1987, and now in its second edition from IEEE Press, 2000) has been widely used. His most recent focus has been on speech recognition, where his research group publishes regularly in the ICASSP, ICSLP, and Eurospeech Proceedings. He is an associate editor for the Journal of the Acoustical Society of America and just completed a term as associate editor for the IEEE Transactions on Speech and Audio Processing. He also teaches each year as an adjunct professor in the electrical engineering department at McGill University. He is the General Chair for the International Conference on Acoustics, Speech, and Signal Processing (ICASSP-2004) in Montreal.

Fees: Morning session $50. Afternoon session $50. For those registering for the entire Conference (Oct. 8-11), the cost is only $25 for each session and $50 for both. Please contact acohen@upei.ca for further information.

Seminar / Workshop on
HUMDRUM software for music analysis

Complementing the music perception/cognition and audio aspects of the CAA-ACA meeting, and pending sufficient interest, Bret Aarden will offer a workshop on Music Information Processing Using the Humdrum Toolkit, to take place on the UPEI campus for hands-on computer access. According to its author, David Huron (Professor of Music, Ohio State University, formerly of the University of Waterloo), Humdrum is a general-purpose software system intended to assist users in a variety of music-related applications. An article by Huron reviewing the power of this tool, with the same title as the proposed workshop, appears in the most recent issue of the Computer Music Journal, Summer 2002, Vol. 26:2, pp. 11-26.

Mr. Aarden is highly familiar with the Humdrum Toolkit. The plan is to offer hands-on demonstrations of the software in a PC lab of 10 computers. Persons with Macintosh laptops would be able to participate as well; software would be provided to them.

To assist with funding this seminar, the workshop will cost $25 for registrants of the Canadian Acoustical Association meeting or $50 for those who have not otherwise registered for the CAA meeting. Before moving ahead, evidence of interest from at least 7 persons is needed. Please inform others who might be interested in this workshop possibility. Opportunities for such training in music retrieval and analysis software are rare in the Atlantic Canada region, and perhaps elsewhere, so it is hoped that there will be a positive response. Please contact acohen@upei.ca at your earliest convenience if interested.

Teaching the Science of Sound / Invitation to Charlottetown physics and science teachers and students
October 11, 2002 |