Brown University

Bayesian Nonparametric Discovery of Layers and Parts from Scenes and Objects

Description

Abstract:
We develop statistical methods for analyzing natural images, videos, motion capture (MoCap) sequences, and three-dimensional (3D) representations of articulated objects. Our goal is to discover and characterize regions, objects, actions, and the parts composing them. Such data typically exhibit wide variability in complexity, with some instances containing only a few objects (parts) and others exhibiting complex structure. Further, images and 3D object representations have strong spatial correlations, while MoCap and video sequences additionally exhibit temporal dependencies. Effective models for such data must automatically reason about the number of constituent objects and parts, while simultaneously modeling strong spatio-temporal interactions. Motivated by these challenges, we study and extend flexible Bayesian nonparametric priors. Focusing first on images, we explore a family of models that generalize the Pitman-Yor (PY) process to produce decompositions of images into depth-ordered segments (layers). Spatial correlations are captured through an ordered set of Gaussian processes that encourage piecewise smooth allocation of pixels to segments. We develop variational methods for effective learning and robust inference, and demonstrate competitive performance on standard image segmentation benchmarks. Next, we explore the distance dependent Chinese restaurant process (ddCRP), a distribution over partitions that allows user-specified affinity functions to capture dependencies between data instances. We show that a statistical model endowed with a ddCRP prior, and an expressive likelihood for modeling deformations, produces state-of-the-art segmentations of articulated 3D objects. We then develop a family of hierarchical ddCRP priors that allow dependencies both between data instances and their latent clusters. Coupled with vector auto-regressive likelihoods, this hierarchical ddCRP successfully discovers activities from related MoCap sequences. 
The performance of the distance dependent models crucially depends on the choice of the affinity functions. Designing functions that capture appropriate domain-specific dependencies can be challenging. We develop extensions to the distance dependent models and borrow ideas from the approximate Bayesian computation (ABC) literature to develop algorithms for learning affinity functions from human-annotated data. Through extensive experiments on image and video segmentation corpora, we demonstrate that the learned models consistently outperform their hand-crafted counterparts.
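As an illustration of the ddCRP prior mentioned in the abstract, the following is a minimal sketch of drawing one partition from a generic ddCRP. It is not the thesis's model or code: the function name `sample_ddcrp_partition`, the exponential decay, and the toy one-dimensional data are assumptions made for this example. Each data instance links to another instance with probability proportional to an affinity (a decay function of their distance), or to itself with weight `alpha`; clusters are the connected components of the resulting link graph.

```python
import numpy as np


def sample_ddcrp_partition(distances, alpha, decay, rng):
    """Draw links and cluster labels from a generic ddCRP prior (illustrative sketch).

    distances: (n, n) array of pairwise distances.
    alpha:     self-link weight (larger values give more clusters).
    decay:     function mapping an array of distances to non-negative affinities.
    rng:       numpy random Generator.
    """
    n = distances.shape[0]
    links = np.empty(n, dtype=int)
    for i in range(n):
        # Affinity of instance i to every other instance; self-link gets weight alpha.
        weights = np.asarray(decay(distances[i]), dtype=float).copy()
        weights[i] = alpha
        links[i] = rng.choice(n, p=weights / weights.sum())

    # Clusters are the connected components of the link graph (union-find).
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in enumerate(links):
        ri, rj = find(i), find(int(j))
        if ri != rj:
            parent[ri] = rj

    # Relabel component roots as consecutive integers 0, 1, 2, ...
    label_of_root = {}
    labels = np.array(
        [label_of_root.setdefault(find(i), len(label_of_root)) for i in range(n)]
    )
    return links, labels


# Toy usage: 20 points on a line with an exponentially decaying affinity (assumed).
rng = np.random.default_rng(0)
points = np.linspace(0.0, 10.0, 20)
distances = np.abs(points[:, None] - points[None, :])
links, labels = sample_ddcrp_partition(
    distances, alpha=1.0, decay=lambda d: np.exp(-d), rng=rng
)
```

Because the affinity decays with distance, nearby points tend to link to one another and fall into the same cluster; swapping in a different `decay` changes which dependencies the prior favors, which is the role the abstract assigns to the affinity function.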
Notes:
Thesis (Ph.D.) -- Brown University (2015)

Access Conditions

Rights
In Copyright
Restrictions on Use
Collection is open for research.

Citation

Ghosh, Soumya, "Bayesian Nonparametric Discovery of Layers and Parts from Scenes and Objects" (2015). Computer Science Theses and Dissertations. Brown Digital Repository. Brown University Library. https://doi.org/10.7301/Z0NZ8621

Relations

Collection: