Abstract
This paper introduces HyGIPO, a novel gradient-based iterative policy optimization technique designed for efficient policy learning in autonomous systems, especially in the presence of modeling errors. The performance of control algorithms for autonomous systems is often limited by mismatches between a simplified nominal model and the complex real system. To counter this degradation, HyGIPO uses a hybrid gradient optimization approach that combines gradients of the dynamics, taken from the nominal model, with real-world data to optimize control policies. We apply the method to the quadcopter waypoint tracking problem, with the controller parameterized by a neural network, and demonstrate its effectiveness in both simulation and hardware experiments. In simulation, HyGIPO rapidly learns the policy within a hundred samples, showing orders of magnitude higher sample efficiency than reinforcement learning methods. The hardware experiments further validate the method, achieving successful tracking in just tens of samples.
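The hybrid-gradient idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm, whose details are not given here. It is one plausible reading: trajectories are collected from the "real" system, while the sensitivities needed for the policy gradient come from a mismatched nominal model. The scalar linear dynamics, the linear policy `u = -k * x`, the quadratic cost, and all constants and function names are assumptions made purely for illustration.

```python
# Toy illustration only: scalar system, linear policy u = -k * x.
# All names and constants are hypothetical, not taken from the paper.

A_REAL, B_REAL = 1.0, 0.5   # "real" dynamics, treated as a black box
A_NOM, B_NOM = 0.9, 0.4     # mismatched nominal model (its Jacobians are known)
R = 0.1                     # control-effort weight in the cost
T = 20                      # rollout horizon


def rollout_real(k, x0=1.0):
    """Run the policy on the real system and record states and inputs."""
    xs, us = [x0], []
    for _ in range(T):
        u = -k * xs[-1]
        us.append(u)
        xs.append(A_REAL * xs[-1] + B_REAL * u)
    return xs, us


def cost(xs, us):
    """Quadratic cost over all states (including the final one) and inputs."""
    return sum(x * x for x in xs) + R * sum(u * u for u in us)


def hybrid_gradient(k, xs, us):
    """Estimate dJ/dk from real data (xs, us) and nominal-model Jacobians."""
    s = 0.0  # sensitivity dx_t/dk, propagated with the nominal model
    g = 0.0
    for t in range(T):
        du = -xs[t] - k * s                  # du_t/dk along the real trajectory
        g += 2.0 * xs[t] * s + 2.0 * R * us[t] * du
        s = A_NOM * s + B_NOM * du           # nominal Jacobians, not the real ones
    g += 2.0 * xs[T] * s                     # contribution of the final state
    return g


def optimize(k=0.0, lr=0.01, iters=50):
    """Gradient descent on k, one real rollout per iteration."""
    costs = []
    for _ in range(iters):
        xs, us = rollout_real(k)
        costs.append(cost(xs, us))
        k -= lr * hybrid_gradient(k, xs, us)
    return k, costs


if __name__ == "__main__":
    k_final, costs = optimize()
    print("final gain:", k_final)
    print("first cost:", costs[0], "last cost:", costs[-1])
```

Each iteration consumes a single rollout of the real system, which suggests why an approach of this kind could be sample-efficient: the nominal model supplies the gradient structure, and the real data corrects where that gradient is evaluated. In the paper the policy is a neural network and the system is a quadcopter, so this scalar example conveys only the general shape of the idea.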
| Original language | English (US) |
|---|---|
| Title of host publication | 2025 American Control Conference, ACC 2025 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 3035-3040 |
| Number of pages | 6 |
| ISBN (Electronic) | 9798331569372 |
| State | Published - 2025 |
| Event | 2025 American Control Conference, ACC 2025 - Denver, United States. Duration: Jul 8 2025 → Jul 10 2025 |
Publication series
| Name | Proceedings of the American Control Conference |
|---|---|
| ISSN (Print) | 0743-1619 |
Conference
| Conference | 2025 American Control Conference, ACC 2025 |
|---|---|
| Country/Territory | United States |
| City | Denver |
| Period | 7/8/25 → 7/10/25 |
Bibliographical note
Publisher Copyright: © 2025 AACC.