Jeff A. Bilmes: Software and Data
Software
- The graphical models toolkit GMTK
- The Vocal Joystick software for Windows.
- The PhiPAC auto-tuning matrix-matrix multiply library (the first auto-tuning dense linear algebra library for matrix multiplication).
- The Buried Markov Model (BMM) code (includes mixtures of sparse linear conditional multi-time Gaussian models). Sorry, no documentation.
- Multi-party meeting scheduling with simple preference aggregation rules.
- Extensions to the old Berkeley parallel make software are available as pmake-3.0-alpha, which includes new features and gnuconf. This is an alpha release that basically works, but there are no plans for further work on it (at least by me or my group).
Data
- Vocal Joystick Vowel Corpus
- A small amount of hand-aligned French/English data, useful for statistical machine translation systems, done by Karim Filali.
- The COSINE multi-channel real-world in-situ noisy speech corpus (now available for download).
- The Semi-Supervised Switchboard Transcription (S3TP) project and its data. In the 1990s, the Switchboard Transcription Project gave us 1.5 hours of frame-by-frame phonetically transcribed Switchboard conversational speech data. Here, we have used a modern semi-supervised learning algorithm to phonetically label, at the frame level, the remaining 250 hours of SWB-I. The data and algorithms are available here.