Discriminatively Structured Dynamic Graphical Models for Speech Recognition
JHU CLSP Summer workshop for 2001

JHU CLSP and SSLI Laboratory, University of Washington, Dept. of Electrical Engineering
 

Jeff Bilmes <bilmes@ee.washington.edu> University of Washington, Dept. of EE
Geoff Zweig <gzweig@us.ibm.com> IBM Yorktown Heights Research Center
Thomas S. Richardson <tsr@stat.washington.edu> University of Washington, Dept. of Statistics
Johan Schalkwyk <johans@speechworks.com> Speechworks
Kirk Jackson <kirkjack@afterlife.ncsc.mil> NCSC
Karen Livescu <klivescu@sls.lcs.mit.edu> MIT
Peng Xu <xp@clsp.jhu.edu> JHU
Eva Holtz <eholtz@fas.harvard.edu> Harvard
Jerry Torres <jrey@stanford.edu> Stanford
Sanjeev Khudanpur <sanjeev@clsp.jhu.edu> JHU
Bill Byrne <byrne@clsp.jhu.edu> JHU









The state of the art in automatic speech recognition (ASR) has advanced significantly over the past 20 years. The underlying approach, however, still relies on hidden Markov models (HMMs). Most experts believe that we must move beyond the HMM in order to significantly advance the field. This project involves advancing beyond HMMs in careful, data-driven, and task-oriented ways, and applying these techniques to the problem of hands-free ASR in automobiles and to general conversational ASR.

We will apply these techniques to a new automobile speech corpus (speech recorded in a variety of automobile acoustic environments), and we will also use new signal-processing methods to extract acoustic features from both normal and array microphones. We will also apply these techniques to the Switchboard database, a collection of recordings of natural telephone conversations.

We will use a new graphical model (GM) toolkit, made available and optimized for ASR tasks, to conduct our research. Students will receive instruction in both the theory of graphical models and the use of the toolkit during the two preparatory weeks of the summer.

This research will enhance hands-free interactive voice-response capability in cellular telephones in cars, tourist information kiosks, etc., and lead to robust ASR for automatically transcribing group meetings, court proceedings, etc.
 

Outcome of 1st Planning Meeting

Outcome of 2nd Planning Meeting

Articulatory Reading List (thanks to Katrin Kirchhoff)

Miscellaneous Information:


The entire group can be reached via this email list.
 

Reading Lists:

The following is a list of papers that will be useful to read prior to attending the workshop.
 

Bilmes Papers
 
* J. Bilmes. Dynamic Bayesian Multinets. (pdf) The 16th Conference on Uncertainty in Artificial Intelligence, Stanford, July 2000.
* J. Bilmes. Factored Sparse Inverse Covariance Matrices. (gzipped ps or pdf) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, June 2000.
* J. Bilmes. Buried Markov Models for Speech Recognition (gzipped ps or pdf) IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, March 1999.
* J. Bilmes. Data-Driven Extensions to HMM Statistical Dependencies. (gzipped ps or pdf) Int. Conf. on Spoken Language Processing, Dec 1998.
* J. Bilmes. Graphical Models and Automatic Speech Recognition. (pdf) UWEETR-2001-0005, Dec 2001.
* J. Bilmes. Natural Statistical Models for Automatic Speech Recognition. Ph.D. Thesis, Dept. of EECS, CS Division, U.C. Berkeley 1999 (postscript or pdf).

 

Zweig Papers
 

* Speech Recognition with Dynamic Bayesian Networks. G. Zweig and S. Russell, AAAI98.

* Probabilistic Modeling with Bayesian Networks and Automatic Speech Recognition. G. Zweig and S. Russell, AJII.

* Dependency Modeling with Bayesian Networks in a Voicemail Transcription System. G. Zweig and M. Padmanabhan, Eurospeech99.

* Speech Recognition with Dynamic Bayesian Networks. G. Zweig, Ph.D. thesis.
 

Richardson Papers
 

* THE TETRAD PROJECT: CONSTRAINT BASED AIDS TO MODEL SPECIFICATION. R.Scheines, C.Glymour, P.Spirtes, C.Meek and T.Richardson, to appear in Multivariate Behavioral Research.

* A POLYNOMIAL-TIME ALGORITHM FOR DECIDING MARKOV EQUIVALENCE OF DIRECTED CYCLIC GRAPHICAL MODELS. In Proceedings of the 12th Conference on Uncertainty in Artificial Intelligence, Portland, Oregon, 1996. E.Horvitz and F.Jensen (eds.), Morgan Kaufmann, San Francisco, CA.

* A DISCOVERY ALGORITHM FOR DIRECTED CYCLIC GRAPHS. In Proceedings of the 12th Conference on Uncertainty in Artificial Intelligence, Portland, Oregon, 1996. E.Horvitz and F.Jensen (eds.), Morgan Kaufmann, San Francisco, CA.

* AUTOMATED DISCOVERY OF LINEAR FEEDBACK MODELS. T.Richardson, P.Spirtes, to appear in Causality and Computation, C.Glymour, (ed.), MIT Press.

* THE DIMENSIONALITY OF MIXED ANCESTRAL GRAPHS. P.Spirtes, T.Richardson, C.Meek. CMU-PHIL-83, Nov 1997.
 

General Graphical Models Papers
 

* K. Murphy's brief overview of Bayes nets. March, 1995 (revised November, 1996).

* D. Heckerman, D. Geiger, D. Chickering. Learning Bayesian networks: The Combination of Knowledge and Statistical Data. Technical Report MSR-TR-94-09, Microsoft Research, March, 1994 (revised December, 1994).

* Thiesson, Bo; Meek, Christopher; Chickering, David Maxwell; Heckerman, David. Learning Mixtures of DAG Models, December 1997 (revised May 1998).

* Bilmes, Jeff. A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models. ICSI-TR-97-021.

* Roweis, S.T. and Ghahramani, Z. A Unifying Review of Linear Gaussian Models.

* H. Attias. Independent Factor Analysis.

* Ross D. Shachter. Bayes-Ball: The Rational Pastime (for Determining Irrelevance and Requisite Information in Belief Networks and Influence

* D. Heckerman. A tutorial on learning with Bayesian networks. Technical Report MSR-TR-95-06, Microsoft Research, March, 1995 (revised November, 1996).

* Learning Probabilistic Networks by Paul J. Krause, manuscript, 1998.

* NIPS 95 Workshop on Learning in Bayesian Networks and Other Graphical Models.

* A Guide to the Literature on Learning Probabilistic Networks From Data. Literature review on learning graphical models, in IEEE Trans. on Knowledge and Data Engineering. Final draft submitted 29th Nov., '95.

* John Binder, Daphne Koller, Stuart Russell, Keiji Kanazawa. "Adaptive Probabilistic Networks with Hidden Variables." Machine Learning, 29, 213-244, 1997.

* N. Friedman. The Bayesian Structural EM Algorithm.

* N. Friedman. Learning belief networks in the presence of missing values and hidden variables.

* N. Friedman and D. Koller. Being Bayesian about Network Structure.

* H. Attias 1999. Inferring parameters and structure of latent variable models by variational Bayes. Proc. 15th Conference on Uncertainty in Artificial Intelligence.