Multi-rate Modeling, Model Inference, and Estimation for Statistical Classifiers

Özgür Çetin

Abstract

Pattern classification problems arise in a wide variety of applications ranging from speech recognition to machine tool-wear condition monitoring. In the statistical approach to pattern classification, classification decisions are made according to probabilistic models, which in a typical application are not known and need to be determined from data. Inferring models from data involves estimation of an assumed model as well as selection of a model among hypotheses. This thesis addresses these two levels of inference, making three main contributions: the introduction of a new class of dynamic models for characterizing multi-scale stochastic processes, multi-rate hidden Markov models (multi-rate HMMs); the development of a new model selection criterion for statistical classifiers; and the development of a new mathematical approach to parameter estimation for exponential family distributions. First, multi-rate HMMs are a parsimonious multi-scale extension of HMMs for stochastic processes that exhibit scale-dependent characteristics and long-term temporal dependence. Multi-rate HMMs characterize a process by joint statistical modeling over multiple scales, and as such, they provide better class a posteriori probability estimates than HMMs or other single-rate approaches when combining multi-scale information sources in classification problems. Second, we develop a model selection criterion for classifier design based on a predictive statistic, the conditional likelihood. We apply this criterion to dependency structure selection in the graphical modeling formalism and illustrate that it provides intuitive and practical solutions to a number of statistical modeling problems in classification, including feature selection and dependency modeling. Lastly, we develop a new mathematical approach to parameter estimation in the exponential family with hidden data, with applications to both likelihood and conditional likelihood methods.
For conditional likelihood estimation, we present an iterative algorithm and its convergence analysis, which provides theoretical justification for the existing implementations of similar methods and suggests modifications for faster convergence. For maximum likelihood estimation, we analyze the relationship of the expectation-maximization algorithm to gradient-descent methods and propose simple variations with faster convergence.

We demonstrate the utility of the developed methods in a number of pattern classification tasks, including speech recognition, speaker verification, and machine tool-wear monitoring.

The full thesis is available in PDF format.

