Brown University

Physically Plausible Human Pose and Control Estimation from Video

Description

Abstract:
We propose a new paradigm for vision-based human motion capture. This paradigm extends the traditional capture of poses by providing guarantees of physical plausibility for the motion reconstructions and mechanisms for adaptation of the estimated motions to new environments. We achieve these benefits by estimating control programs for simulated physics-based characters from (potentially monocular) images. The control programs encode motions implicitly, based on their "underlying physical principles" and reconstruct the motions through simulation. Feedback within the control allows application of the principles in modified environments, providing an ability to adapt the motion to external events and perturbations. We explore two control models: trajectory control and state-space control. The trajectory control model encodes the desired behavior of the character as a sequence of per-frame target poses tracked by the controller. We can recover this sequence incrementally and produce pose estimates that do not suffer from common visual artifacts. However, the inference process is prone to overfitting. To address this limitation, we then explore a more compact model that is less sensitive to the quality of observations. State-space controllers allow concise representation of motion dynamics through a sparse set of target poses and control parameters, in essence allowing a key-frame-like representation of the original motion. We represent state-space controllers using state machines that characterize the character behavior in terms of motion phases (states) and physical events that cause the phases to switch (transitions, e.g., a foot contact). Parameters of the controller encode the control programs that reproduce the individual phases in simulation.
Because this control representation is sparse, we are able to integrate information locally from multiple (tens of) image frames in inference, inducing smoothness in the resulting motion, resolving some of the ambiguities that arise in monocular video-based capture and enabling inference with weak likelihoods. We demonstrate our approach by capturing sequences of walking, jumping, and gymnastics. We evaluate our methods quantitatively and qualitatively and illustrate that we can produce motion interpretations that go beyond the state of the art in pose tracking and are physically plausible.
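Illustrative sketch (not taken from the thesis): the state-machine controller described in the abstract could be pictured roughly as below. Each phase holds one sparse target pose and its control parameters, and a physical event such as a foot contact switches the phase. The names Phase, StateSpaceController, the joint name "hip", the event names, and the proportional-derivative (PD) tracking rule are all assumptions made for this example; the abstract does not specify how target poses are tracked.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Phase:
    """One motion phase (a state of the machine). Hypothetical structure."""
    name: str
    target_pose: Dict[str, float]   # joint name -> target angle in radians
    kp: float                       # assumed PD stiffness gain
    kd: float                       # assumed PD damping gain
    # event name -> name of the phase to switch to
    transitions: Dict[str, str] = field(default_factory=dict)


class StateSpaceController:
    """Toy state machine: phases switch on physical events."""

    def __init__(self, phases, start):
        self.phases = {p.name: p for p in phases}
        self.current = self.phases[start]

    def on_event(self, event):
        # Switch phase only if the current phase reacts to this event.
        next_name = self.current.transitions.get(event)
        if next_name is not None:
            self.current = self.phases[next_name]

    def torques(self, pose, velocity):
        # Assumed PD rule: pull each joint toward the phase's target pose.
        p = self.current
        return {
            joint: p.kp * (target - pose[joint]) - p.kd * velocity[joint]
            for joint, target in p.target_pose.items()
        }


# A made-up two-phase walking cycle with one joint.
walk = StateSpaceController(
    phases=[
        Phase("left_stance", {"hip": 0.5}, kp=100.0, kd=10.0,
              transitions={"right_foot_contact": "right_stance"}),
        Phase("right_stance", {"hip": -0.5}, kp=100.0, kd=10.0,
              transitions={"left_foot_contact": "left_stance"}),
    ],
    start="left_stance",
)
```

In this toy form the whole motion is described by two target poses, four gains, and two transitions, which is the sense in which such a representation is sparse compared with one target pose per video frame.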
Notes:
Thesis (Ph.D.) -- Brown University (2013)

Access Conditions

Rights
In Copyright
Restrictions on Use
Collection is open for research.

Citation

Vondrak, Marek, "Physically Plausible Human Pose and Control Estimation from Video" (2013). Computer Science Theses and Dissertations. Brown Digital Repository. Brown University Library. https://doi.org/10.7301/Z0XP737C

Relations

Collection: