Finding and Transferring Policies Using Stored Behaviors

Martin Stolle
doctoral dissertation, tech. report CMU-RI-TR-08-27, Robotics Institute, Carnegie Mellon University, May, 2008


We present several algorithms that aim to advance the state of the art in reinforcement learning and planning. One key idea is to transfer knowledge across problems by representing it using local features. This idea is used to speed up generalized policy iteration based on dynamic programming.
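As a rough illustration of representing knowledge with local features, the sketch below keys actions on what the agent sees immediately around it rather than on its global position, so that actions recorded in one maze can seed an initial guess in another. The grid encoding, the four-neighbor wall feature, and the names `local_feature`, `record_knowledge`, and `initial_guess` are all assumptions made for this example; the abstract does not specify the features or the transfer mechanism used in the dissertation.

```python
def local_feature(grid, row, col):
    """Hypothetical local feature: which of the four neighboring cells are walls."""
    def is_wall(r, c):
        inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
        return (not inside) or grid[r][c] == "#"
    return (is_wall(row - 1, col), is_wall(row + 1, col),
            is_wall(row, col - 1), is_wall(row, col + 1))

def record_knowledge(grid, solved_actions):
    """Index the actions of an already solved problem by local feature."""
    return {local_feature(grid, r, c): action
            for (r, c), action in solved_actions.items()}

def initial_guess(grid, row, col, knowledge, default=None):
    """Look up a starting action for a cell of a new problem, if one is known."""
    return knowledge.get(local_feature(grid, row, col), default)

# Source problem (assumed): a short corridor where the solved action is "east".
source = ["####",
          "#..#",
          "####"]
knowledge = record_knowledge(source, {(1, 1): "east"})

# Target problem (assumed): a different maze with a locally identical cell.
target = ["#####",
          "##..#",
          "#####"]
seeded = initial_guess(target, 1, 2, knowledge)    # same surroundings as the source cell
unseeded = initial_guess(target, 1, 3, knowledge)  # surroundings never seen before
```

In this reading, a seeded action would only serve as a starting point that a policy iteration procedure then refines; cells with unfamiliar surroundings fall back to the default.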

We then present a control approach that uses a library of trajectories to establish a control law or policy. This approach is an alternative both to value-function methods that find policies with dynamic programming and to plans based on a single desired trajectory. Our method produces reasonable policies much faster than dynamic programming, and the resulting policies are more robust and more global than following a single desired trajectory.
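A minimal sketch of how a library of trajectories can act as a policy: store the state-action pairs from every trajectory, and at run time return the action stored with the state closest to the current one. The nearest-neighbor lookup, the Euclidean distance, the 2-D states, and the names `build_library` and `library_policy` are assumptions chosen for illustration, not details stated in the abstract.

```python
import math

def build_library(trajectories):
    """Flatten trajectories into one list of (state, action) pairs."""
    return [(tuple(state), action)
            for trajectory in trajectories
            for state, action in trajectory]

def library_policy(library, state):
    """Return the action stored with the library state nearest to `state`."""
    _, action = min(library, key=lambda entry: math.dist(entry[0], state))
    return action

# Two hypothetical trajectories through a 2-D state space.
traj_a = [((0.0, 0.0), "right"), ((1.0, 0.0), "right"), ((2.0, 0.0), "up")]
traj_b = [((0.0, 2.0), "down"), ((0.0, 1.0), "right")]
library = build_library([traj_a, traj_b])

# States off both trajectories still receive an action from the closest stored state.
action_near_a = library_policy(library, (1.1, 0.2))
action_near_b = library_policy(library, (0.1, 1.6))
```

Because every stored state contributes an action, the lookup covers a region around all trajectories rather than a single path, which is one way to read the robustness claim above.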

Finally, we show how local features can be used to transfer libraries of trajectories between similar problems. Transfer makes it useful to store special-purpose behaviors in the library for solving tricky situations in new environments. By adapting the behaviors in the library, we increase their applicability. Our approach can be viewed as a method that allows planning algorithms to make use of special-purpose behaviors or actions that are only applicable in certain situations.

Results are shown for the "Labyrinth" marble maze and the Little Dog quadruped robot. The marble maze is a difficult task that requires both fast control and planning ahead. In the Little Dog task, a quadruped robot has to navigate quickly across rough terrain.

Associated Center(s) / Consortia: Center for the Foundations of Robotics
Associated Lab(s) / Group(s): Planning and Autonomy Lab
Associated Project(s): Learning Locomotion

Text Reference
Martin Stolle, "Finding and Transferring Policies Using Stored Behaviors," doctoral dissertation, tech. report CMU-RI-TR-08-27, Robotics Institute, Carnegie Mellon University, May, 2008

BibTeX Reference
@phdthesis{CMU-RI-TR-08-27,
   author = "Martin Stolle",
   title = "Finding and Transferring Policies Using Stored Behaviors",
   school = "Robotics Institute, Carnegie Mellon University",
   month = "May",
   year = "2008",
   number = "CMU-RI-TR-08-27",
   address = "Pittsburgh, PA",
}