Brown University

Teaching Old Dogs New Tricks: Incremental Multimap Regression for Interactive Robot Learning from Demonstration

Description

Abstract:
We consider autonomous robots as having associated control policies that determine their actions in response to perceptions of the environment. Often, these controllers are explicitly transferred from a human via programmatic description or physical instantiation. Alternatively, Robot Learning from Demonstration (RLfD) can enable a robot to learn a policy from observing only demonstrations of the task itself. We focus on interactive, teleoperative teaching, where the user manually controls the robot and provides demonstrations while receiving learner feedback. With regression, the collected perception-actuation pairs are used to directly estimate the underlying policy mapping.

This dissertation contributes an RLfD methodology for interactive, mixed-initiative learning of unknown tasks. The goal of the technique is to enable users to implicitly instantiate autonomous robot controllers that perform desired tasks as well as the demonstrator, as measured by task-specific metrics. With standard regression techniques, we show that such "on-par" learning is restricted to policies typified by a many-to-one mapping (a unimap) from perception to actuation. Thus, controllers representable as multi-state Finite State Machines (FSMs), which exhibit a one-to-many mapping (a multimap), cannot be learned. To learn them we must address three issues: model selection (how many subtasks or FSM states), policy learning (for each subtask), and transitioning (between subtasks). Previous work in RLfD has assumed knowledge of the task decomposition and learned the subtask policies or the transitions between them in isolation. We instead address model selection and policy learning simultaneously.

Our technique uses an infinite mixture of experts and treats the multimap data from an FSM controller as being generated from overlapping unimaps. The algorithm automatically determines the number of unimap experts (model selection) and learns a unimap for each one (policy learning). On data from both synthetic and robot-soccer multimaps, we show that the discovered subtasks can be switched between to re-perform the original task. While not at the same level of skill as the demonstrator, the resulting approximations represent a significant improvement over policies learned for the same tasks with unimap regression.
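To illustrate why multimap data defeats ordinary regression, consider a toy 1-D policy built from two overlapping linear subtasks (y = x and y = -x). The sketch below is a deliberately simplified illustration, not the dissertation's infinite-mixture-of-experts algorithm: a single least-squares fit (a unimap) averages the two branches away to roughly y = 0, while an incremental mixture of linear experts spawns a new expert whenever no existing one explains a point within a tolerance (a crude stand-in for model selection), and so recovers both subtask policies. The function names and the threshold `tol` are illustrative choices, not part of the original work.

```python
# Toy sketch: unimap regression vs. an incremental mixture of linear experts
# on one-to-many (multimap) data. Purely illustrative; the dissertation's
# actual method is an infinite mixture of experts.

def fit_line(pts):
    """Ordinary least squares for y = a*x + b over a list of (x, y) pairs."""
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    denom = n * sxx - sx * sx
    if abs(denom) < 1e-12:            # degenerate (e.g. one point): constant fit
        return 0.0, sy / n
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b

def incremental_multimap(data, tol=0.15):
    """Assign each point to the expert that predicts it best; spawn a new
    expert when no existing one fits within `tol` (toy model selection)."""
    experts = []                      # each expert: list of (x, y) points
    for x, y in data:
        best, best_err = None, float("inf")
        for pts in experts:
            a, b = fit_line(pts)
            err = abs(y - (a * x + b))
            if err < best_err:
                best, best_err = pts, err
        if best is None or best_err > tol:
            experts.append([(x, y)])  # model selection: new subtask expert
        else:
            best.append((x, y))       # policy learning: refine this expert
    return [fit_line(pts) for pts in experts]

# Multimap demonstration data: each x has two valid actions, +x and -x.
data = []
for i in range(1, 10):
    x = i / 10
    data.append((x, x))
    data.append((x, -x))

unimap = fit_line(data)               # single regressor: slope collapses to ~0
experts = incremental_multimap(data)  # two experts: slopes ~ +1 and ~ -1
```

Running this, the unimap's slope is near zero (the two subtasks cancel), while the incremental version discovers exactly two experts whose fits match the underlying branches, mirroring the abstract's point that multimap controllers cannot be recovered by unimap regression alone.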
Notes:
Thesis (Ph.D.) -- Brown University (2010)

Access Conditions

Rights
In Copyright
Restrictions on Use
Collection is open for research.

Citation

Grollman, Daniel H., "Teaching Old Dogs New Tricks: Incremental Multimap Regression for Interactive Robot Learning from Demonstration" (2009). Computer Science Theses and Dissertations. Brown Digital Repository. Brown University Library. https://doi.org/10.7301/Z09P2ZX6

Relations

Collection: