Automatic Summarization of Recorded Lectures

In the past decade, the amount of information stored in large, publicly accessible databases has increased dramatically. Most of this data consists of text documents, which is why much effort in recent years has been devoted to text-based information extraction methods, such as the keyword-based document retrieval techniques familiar from web search engines. However, online data collections increasingly include not only written documents but also video and audio documents. For this reason, advanced tools for categorizing, indexing and extracting information from multimedia documents will become indispensable in the near future.

The goal of this project is to explore methods for automatically summarizing spoken documents, in particular recordings of academic lectures. We will employ automatic speech recognition technology in order to derive a representation of the spoken document which will serve as the basis for automatic information extraction and summarization techniques. Particular emphasis will be given to the use of prosodic information for highlighting relevant portions of the audio signal. Prosodic information includes aspects of the speaker's intonation, speaking rate, accentuation of individual syllables, etc. Along with methods for extracting these parameters, we will develop new scoring methods which integrate word-based relevance measures with prosody-based relevance measures and confidence values for the different information sources.
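
As a rough illustration of what integrating word-based and prosody-based relevance with confidence values might look like, the following minimal sketch scores each segment of a lecture by a confidence-weighted average of the two measures and keeps the top-scoring segments. It is not the project's scoring method, which is not specified here. The Segment fields, the [0, 1] score ranges, the weighting formula and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One stretch of the lecture. All fields are hypothetical, in [0, 1]."""
    text: str                  # recognizer output for the segment
    word_score: float          # word-based relevance (e.g. from term weights)
    prosody_score: float       # prosody-based relevance (e.g. pitch, rate)
    asr_confidence: float      # confidence in the recognized words
    prosody_confidence: float  # confidence in the prosodic features


def combined_score(seg: Segment) -> float:
    """Confidence-weighted average of the two relevance measures (assumed form)."""
    total = seg.asr_confidence + seg.prosody_confidence
    if total == 0:
        return 0.0  # neither information source can be trusted
    weighted = (seg.asr_confidence * seg.word_score
                + seg.prosody_confidence * seg.prosody_score)
    return weighted / total


def summarize(segments: list[Segment], k: int) -> list[Segment]:
    """Keep the k highest-scoring segments, returned in lecture order."""
    ranked = sorted(range(len(segments)),
                    key=lambda i: combined_score(segments[i]),
                    reverse=True)
    return [segments[i] for i in sorted(ranked[:k])]
```

In this sketch, a segment whose transcript is unreliable (low recognizer confidence) is judged mostly on its prosody, and vice versa. Any real system would need to estimate the scores and confidences from data rather than take them as given.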

This project will be carried out in cooperation with the program for Education at a Distance for Growth and Excellence (EDGE) at the College of Engineering of the University of Washington. This program provides educational resources to distance learning students, including streaming video of lectures given by Engineering faculty. We will use the audio portions of these recordings as data for system development and evaluation.

SPONSOR: University of Washington Royalty Research Fund

AWARD PERIOD: October 2001 - September 2002

TEAM MEMBERS:

