Reliable and scalable variational inference for nonparametric mixtures, topics, and sequences

Description

Abstract:
We develop new algorithms for training nonparametric clustering models based on the Dirichlet process (DP), including DP mixture models, hierarchical Dirichlet process (HDP) topic models, and HDP hidden Markov models. These Bayesian nonparametric models allow coherent comparisons of different clusterings of a given dataset. The nonparametric approach is particularly promising for large-scale applications, where other model selection techniques like cross-validation are too expensive. However, existing training algorithms fail to live up to this promise: both Monte Carlo samplers and variational optimization methods are vulnerable to local optima and sensitive to initialization, especially the initial number of clusters. Our new algorithms can reliably escape poor initializations to discover interpretable clusters from millions of training examples.

For the DP mixture model, we pose a variational optimization problem in which the number of instantiated clusters assigned to data can be adapted during training. The focus of this optimization is an objective function that tightly lower-bounds the marginal likelihood and thus can be used for Bayesian model selection. Our algorithm maximizes this objective via block coordinate ascent interleaved with proposal moves that can add useful clusters to escape local optima while removing redundant or irrelevant clusters.

We further introduce an incremental algorithm that can exactly optimize our objective function on large datasets while processing only small batches at each step. Our approach uses cached, or memoized, sufficient statistics to make exact decisions for proposal acceptance or rejection. This memoized approach has the same runtime cost as previous stochastic methods, but allows exact acceptance decisions for cluster proposals and avoids learning rates entirely.

We later extend these algorithms to HDP topic models and HDP hidden Markov models. Previous methods for the HDP have used zero-variance point estimates with problematic model selection properties. Instead, we find sophisticated solutions to the non-conjugacy inherent in the HDP that still yield an optimization objective usable for Bayesian model selection. We demonstrate promising proposal moves for adapting the number of clusters during memoized training on millions of news articles, hundreds of motion capture sequences, and the human genome.
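To make the memoized training idea concrete, the sketch below caches per-batch sufficient statistics for a toy mixture model so that every global update uses exact whole-dataset totals, with no stochastic learning rate. This is an illustrative reconstruction, not the thesis implementation: the fixed unit-variance Gaussian likelihood, the truncation at K = 5 clusters, and helper names like local_step and summary_step are assumptions made for the example.

```python
# Minimal sketch of memoized (cached sufficient statistics) training for a
# toy fixed-variance Gaussian mixture. Illustrative only, not the thesis code.
import numpy as np

def local_step(X, means):
    """Soft-assign each row of X to K clusters (unit-variance Gaussians)."""
    logR = -0.5 * ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    logR -= logR.max(axis=1, keepdims=True)   # stabilize before exponentiating
    R = np.exp(logR)
    return R / R.sum(axis=1, keepdims=True)

def summary_step(X, R):
    """Per-batch sufficient statistics: soft counts and weighted data sums."""
    return R.sum(axis=0), R.T @ X

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 1.0, size=(500, 2)) for m in (-4.0, 0.0, 4.0)])
batches = np.array_split(rng.permutation(X), 10)

K = 5                                          # truncation level (assumption)
means = rng.normal(0.0, 3.0, size=(K, 2))

N_total = np.zeros(K)                          # whole-dataset soft counts
S_total = np.zeros((K, 2))                     # whole-dataset weighted sums
cache = {}                                     # per-batch statistics

for lap in range(5):                           # full passes over the data
    for b, Xb in enumerate(batches):
        Nb, Sb = summary_step(Xb, local_step(Xb, means))
        if b in cache:                         # replace this batch's old stats
            oldN, oldS = cache[b]
            N_total -= oldN
            S_total -= oldS
        N_total += Nb
        S_total += Sb
        cache[b] = (Nb, Sb)
        # The global step always sees exact whole-dataset statistics, so no
        # stochastic learning rate is needed.
        means = S_total / np.maximum(N_total, 1e-8)[:, None]

print(np.round(means, 2))   # means should concentrate near the true centers
```

In the thesis these cached statistics feed a full variational objective (including Dirichlet process prior terms omitted from this toy), but the update pattern is the same: subtract a batch's old summary, add its fresh one, and update global parameters from exact totals.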
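The same additivity is what enables the exact proposal decisions the abstract describes. Cluster k's whole-dataset statistics decompose over batches as S_k = Σ_b S_k^b, so a proposed merge of clusters j and k has summary S_j + S_k available directly from the cache, and the objective of the merged configuration can be scored, then accepted or rejected, without revisiting the data. (In the full model an additional cached entropy term is needed for this to be exact; that bookkeeping is omitted here.)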
Notes:
Thesis (Ph.D.) -- Brown University, 2016.

Access Conditions

Rights
In Copyright
Restrictions on Use
Collection is open for research.

Citation

Hughes, Michael C., "Reliable and scalable variational inference for nonparametric mixtures, topics, and sequences" (2016). Computer Science Theses and Dissertations. Brown Digital Repository. Brown University Library. https://doi.org/10.7301/Z05Q4TH1

Relations

Collection:
Computer Science Theses and Dissertations