University of Washington
Department of Electrical Engineering

SSLI-LAB : Signal, Speech, and Language Interpretation Seminar

Spring Quarter, 2004
RM EE1-403 New EE Bldg (unless otherwise specified)
University of Washington, Seattle

Friday, 16 April 2004 (EE1 403, 1:00-2:00PM)
VOCABULARY-INDEPENDENT SEARCH IN SPONTANEOUS SPEECH
-- Dr. Frank Seide
Microsoft Research, Beijing

Abstract
For efficient organization of speech recordings - meetings, interviews, voice mails, lectures - the ability to search for spoken keywords is essential. Today, most spoken-document retrieval systems use large-vocabulary recognition. For the above scenarios, such systems suffer from both the unpredictable vocabulary/domain and generally high word-error rates (WER). In this paper, we present a vocabulary-independent system to index and rapidly search spontaneous speech. A speech recognizer generates lattices of phonetic word fragments, against which keywords are matched phonetically. We will first show the need to use recognition alternatives (lattices) in a high-WER context, on a word-based baseline. Then we will introduce our new method of phonetic word-fragment lattice generation, which uses longer-span language knowledge than a phoneme recognizer. Last, we will introduce heuristics to compact the lattices to feasible sizes that can be searched efficiently. On the LDC Voicemail corpus, we show that vocabulary/domain-independent phonetic search is as accurate as a vocabulary/domain-dependent word-lattice-based baseline system for in-vocabulary keywords (FOMs of 74-75%), but also nearly maintains this accuracy for OOV keywords.

Joint work with Peng Yu, Chengyuan Ma, and Eric Chang.

Friday, 23 April 2004 (Anderson 223, 1:30-2:00PM (1.5 hrs))
Stopping Spam
-- Dr. Joshua Goodman
Microsoft Research

Abstract
Spam is a huge and growing problem. I'll first survey solutions to spam, including filtering approaches (machine learning, fuzzy hashing, and blackhole lists) and "postage" approaches, including reverse Turing tests, computational puzzles, and monetary challenges. Our favorite technique is a machine learning/text classification approach combined with a challenge/response postage approach. I'll talk about problems and solutions we've had in practice, especially how we have gotten millions of messages of labeled training data, both good and spam. I'll also talk briefly about my research on personalizing spam filters, which turns out to be important, but harder than we thought. I'll show some analyses of those millions of messages, including where spam actually comes from, and why legal solutions can only stop a fraction of spam. Next, I'll talk about why email in general and spam in particular need their own new field, combining aspects of machine learning, networking, cryptography/security, HCI, and economics. Finally, I'll explain why everyone should attend the Conference on Email and Anti-Spam, July 30-31, in Mountain View, www.ceas.cc, immediately after AAAI. (Joint work with Geoff Hulten, Robert Rounthwaite, David Heckerman, and others.)

Bio: Joshua Goodman started his professional life as a developer at Dragon Systems, working on speech recognition. He then went to grad school at Harvard University, receiving a Ph.D. for his work in statistical natural language processing, especially statistical parsing. From there, he went to Microsoft Research, where he worked on language modeling. For the past 2 years, he has been working on stopping spam, including helping start Microsoft's Anti-Spam Technology Group. He is a Program Co-Chair for the Conference on Email and Anti-Spam, July 30-31, in Mountain View, www.ceas.cc, which everyone should attend.

Friday, 16 April 2004 (EE1 403, 1:00-2:00PM)
Statistical modeling for spontaneous speech recognition
-- Dr. Takahiro Shinozaki
University of Washington, Department of EE, SSLI-Lab

Abstract
Several large-scale spontaneous speech corpora have become available, and recognition performance for spontaneous speech has been greatly improved by training speech models on large amounts of spontaneous speech data. However, recognition rates are still insufficient for most applications. This is because spontaneous speech has many variations, not only between speakers but also from utterance to utterance within each speaker. In this talk, I will present my recent work that deals with these variations.


Past Quarter's Seminars


Last updated ($Date: 2004/05/27 06:54:55 $)