University of Washington
Department of Electrical Engineering

SSLI-LAB : Signal, Speech, and Language Interpretation Seminar

Winter Quarter, 2005
RM EE1-403 EE1 Bldg (unless otherwise specified)
University of Washington, Seattle

Thursday, 10th March 2005 (EE1 403, 2:00-4:00PM)
Customizing Sentiment Classifiers to New Domains: a Case Study
-- Michael Gamon and Anthony Aue
Microsoft Research, Redmond WA

Abstract
Sentiment classification is a very domain-specific problem: classifiers trained in one domain do not perform well in others. At the same time, large amounts of labeled data for fully-supervised learning approaches are not available for some domains, and a sentiment classifier needs to be customizable to new domains in order to be useful in practice. In this paper we survey four different approaches to customizing a sentiment classification system to a new target domain in the absence of large amounts of labeled data. We base our experiments on data from four different domains. After establishing that naïve cross-domain classification results in poor classification accuracy, we compare results obtained by using each of the four approaches and discuss their advantages, disadvantages and performance.

Thursday, 24th February 2005 (EE1 403, 2:00-4:00PM)
A Statistical Model of Structured Hidden Dynamics for Speech Coarticulation and Reduction
-- Li Deng
Microsoft Research, Redmond

Abstract
We describe our recent work on the development, implementation, and evaluation of the structured speech model with statistically characterized hidden trajectories. Bi-directional filtering (forward as well as backward in the temporal dimension) is developed on the hidden vocal tract resonance domain for all classes of speech sounds (including consonantal closure/constriction), offering strong power in parsimonious modeling of long-span speech co-articulation and capturing fine acoustic cues of CV and VC formant transitions. This statistical model, when appropriately implemented, also simultaneously exhibits the property of contextually assimilated phonetic reduction or phonetic target undershooting that is prevalent in casual, fluent speech (e.g., conversational speech). Experiments on large-scale N-best rescoring (N=1000) have demonstrated substantially lower TIMIT phone recognition errors achieved by the model compared with a context-dependent (triphone) HMM system built with HTK. When the "error propagation" effect of this long-span model is artificially removed in the N-best rescoring paradigm, the error bound is further cut down in a dramatic manner.

Thursday, 10th February 2005 (EE1 403, 2:00-4:00PM)
Part-of-Speech Tagging using Virtual Evidence and Negative Training
-- Sheila Reynolds
University of Washington, Seattle, Dept. of EE

Abstract
We present a part-of-speech tagger which introduces two new concepts: virtual evidence in the form of an "observed child" which is used to link its parents, and the use of negative training data in learning the conditional probabilities for the observed child. This model remains within the framework of Dynamic Bayesian Networks (DBNs) and is a conditionally structured model which resolves certain drawbacks inherent in the conditional Markov model (CMM).

Thursday, 3rd February 2005 (EE1 403, 2:00-4:00PM)
Reading Level Assessment Using Support Vector Machines and Statistical Language Models
-- Sarah Schwarm
University of Washington, Seattle, Dept. of CSE

Abstract
Reading proficiency is a fundamental component of language competency. However, finding topical texts at an appropriate reading level for foreign and second language learners is a challenge for teachers. This task can be addressed with natural language processing technology to assess reading level. Existing measures of reading level are not well suited to this task, but previous work and our own pilot experiments have shown the benefit of using statistical language models. In this paper, we also use support vector machines to combine features from traditional reading level measures, statistical language models, and other language processing tools to produce a better method of assessing reading level.


Last updated: 2005/03/12 02:27:47