Multi-rate Modeling, Model Inference, and Estimation
for Statistical Classifiers
Özgür Çetin
Abstract
Pattern classification problems arise in a wide variety of
applications ranging from speech recognition to machine tool-wear
condition monitoring. In the statistical approach to pattern
classification, classification decisions are made according to
probabilistic models, which in a typical application are not known and
need to be determined from data. Inferring models from data involves
estimation of an assumed model as well as selection of a model among
hypotheses. This thesis addresses these two levels of inference,
making three main contributions: introduction of a new class of
dynamic models for characterizing multi-scale stochastic processes,
multi-rate hidden Markov models (multi-rate HMMs); development of a
new criterion for model selection for statistical classifiers; and
development of a new mathematical approach to parameter estimation for
exponential family distributions. First, multi-rate HMMs are a
parsimonious multi-scale extension of HMMs for stochastic processes
that exhibit scale-dependent characteristics and long-term temporal
dependence. Multi-rate HMMs characterize a process by joint
statistical modeling over multiple scales, and as such, they provide
better class a posteriori probability estimates than HMMs or
other single-rate modeling approaches for combining multi-scale
information resources in classification problems. Second, we develop a
model selection criterion for classifier design based on a predictive
statistic, the conditional likelihood. We apply this criterion to
graph dependency structure selection in the graphical modeling
formalism and illustrate that it provides intuitive and practical
solutions to a number of statistical modeling problems in
classification, including feature selection and dependency
modeling. Lastly, we develop a new mathematical approach to parameter
estimation in the exponential family with hidden data, with
applications to both likelihood and conditional likelihood
methods. For conditional likelihood estimation, we present an
iterative algorithm and its convergence analysis, which provides
theoretical justification for the existing implementations of similar
methods and suggests modifications for faster convergence. For maximum
likelihood estimation, we analyze the relationship of the
expectation-maximization algorithm to gradient-descent methods and
propose simple variations with faster convergence.
We show the utility of the developed methods in a number of pattern
classification tasks, including speech recognition, speaker
verification, and machine tool-wear monitoring.