June 1997

Hidden State Modeling

**Peter N. Yianilos**

Abstract

Motivated by the goal of establishing stochastic and information theoretic foundations for the study of intelligence and synthesis of intelligent machines, this thesis probes several topics relating to hidden state stochastic models.

Finite Growth Models (FGM) are introduced. These are nonnegative
functionals that arise from parametrically-weighted directed acyclic
graphs and a tuple observation that affects these weights. Using FGMs
the parameters of a highly general form of stochastic transducer can
be learned from examples, and the particular case of stochastic string
edit distance is developed. Experiments are described that illustrate
the application of learned string edit distance to the problem of
recognizing a spoken word given a phonetic transcription of the
acoustic signal. With FGMs one may direct learning by criteria
beyond simple maximum-likelihood. The MAP (maximum *a posteriori
estimate*) and MDL (minimum description length) are discussed along
with the application to causal-context probability models and
unnormalized noncausal models. The FGM framework, algorithms, and
data structures describe hidden Markov models, stochastic context free
grammars, and many other conventional similar models while providing a
unified and natural way for computer scientists to learn and reason
about them and their many variations. A software system and scripting
language is proposed to serve as an assembly language or sorts for
many higher level model types.

This thesis also illuminates certain fundamental aspects of the nature of normal (Gaussian) mixtures and the reparameterization of related optimization problems. The use of conditional normal mixtures is proposed as a tool for image modeling, and issues relating to the estimation of their parameters are discussed.

Acknowledgments

I must begin by thanking my first mentor Joeseph Vargo, and my parents and grandparents who valued education and always encouraged me to follow my own interests. I also thank my advisor Bob Tarjan and Bruce Maggs for urging me to complete my graduate studies at Princeton.

Many of the results of this thesis arose from a collaboration with Eric Ristad and I acknowledge his important influence on and contributions to this research. I am also grateful for the participation and advice of my other committee members Andrew Appel, Bob Sedgewick, and Vince Poor.

Finally, this work would not have been possible without the support and patience of my wife Angela and our children Nicholas, Jonathan, Geoffrey, and Lauren.

A DISSERTATION

PRESENTED TO THE FACULTY

OF PRINCETON UNIVERSITY

IN CANDIDACY FOR THE DEGREE

OF DOCTOR OF PHILOSOPHY

PRESENTED TO THE FACULTY

OF PRINCETON UNIVERSITY

IN CANDIDACY FOR THE DEGREE

OF DOCTOR OF PHILOSOPHY

RECOMMENDED FOR ACCEPTANCE

BY THE DEPARTMENT OF COMPUTER SCIENCE

BY THE DEPARTMENT OF COMPUTER SCIENCE

June, 1997

- Introduction
- Finite Growth Models
- Introduction
- Finite Growth Models
- FGM-based Probability Models
- Beyond Finite Probability Models
- Stochastic Transducers
- New Perspectives on Existing Models
- Acknowledgments

- Learning String Edit Distance Costs
- Introduction
- The Basic Model and Approach
- The Switchboard Corpus
- Experimental Variations
- Experimental Results
- Future Work
- Postscript

- A Modeling Assembly Language

- Emphasis Reparameterization
- Introduction
*A Posteriori*Degeneracy- Emphasis Reparameterization
- The Limitations of EM-style Emphasis
- Concluding Remarks
- Acknowledgments

- Conditional Normal Mixtures
- Introduction
- Conditional Discrete Mixtures
- Application to Normal Densities
- Further discussion of the formulation of the MRCE problem
- Possible Applications

- Bibliography
- About this document ...

2002-06-24