Instance segmentation of unseen objects is a challenging problem in unstructured environments. To address it, we propose a robot learning approach that actively interacts with novel objects and collects a training label for each object, which is then used to fine-tune the segmentation model and improve its performance without the time-consuming process of manually labeling a dataset. Given a cluttered pile of objects, our approach chooses pushing and grasping motions to break up the clutter and performs object-agnostic grasping; the Singulation-and-Grasping (SaG) policy takes visual observations and imperfect segmentation as input. We decompose the problem into three subtasks: (1) the object singulation subtask separates the objects from each other, creating more space and thereby easing (2) the collision-free grasping subtask; (3) the mask generation subtask obtains self-labeled ground truth masks for transfer learning by using an optical flow-based binary classifier and motion cue post-processing. Our system achieves a 70% singulation success rate in simulated cluttered scenes. The interactive segmentation of our system achieves 87.8%, 73.9%, and 69.3% average precision for toy blocks, YCB objects in simulation, and real-world novel objects, respectively, outperforming the compared baselines. Please refer to our project page for more information: https://z.umn.edu/sag-interactive-segmentation.
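The abstract does not detail the optical flow-based classifier, so the following is only a rough illustration of how a motion cue can yield a self-labeled mask. It assumes a dense flow field between the images before and after an interaction is already available, and it swaps the paper's classifier for a plain magnitude threshold. The names `motion_mask` and `mask_iou` and the 1-pixel threshold are hypothetical choices, not taken from the paper.

```python
import numpy as np


def motion_mask(flow, threshold=1.0):
    """Mark pixels whose flow magnitude exceeds `threshold` (in pixels) as moved.

    `flow` is an (H, W, 2) array of per-pixel displacements.
    The threshold value is an assumption made for this sketch.
    """
    magnitude = np.linalg.norm(flow, axis=-1)
    return magnitude > threshold


def mask_iou(a, b):
    """Intersection over union of two boolean masks of equal shape."""
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


# Synthetic example: a 3x3 object is pushed 2 px to the right in a 6x6 image.
flow = np.zeros((6, 6, 2), dtype=np.float32)
flow[1:4, 1:4, 0] = 2.0
mask = motion_mask(flow)
```

A mask produced this way could be compared against the segmentation model's prediction with `mask_iou`, which is the overlap measure that average precision scores are typically built on.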
|Original language||English (US)|
|Title of host publication||Computer Vision – ECCV 2022 - 17th European Conference, Proceedings|
|Editors||Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner|
|Publisher||Springer Science and Business Media Deutschland GmbH|
|Number of pages||17|
|State||Published - 2022|
|Event||17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel (Duration: Oct 23 2022 → Oct 27 2022)|
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
Bibliographical note
Funding Information:
Acknowledgements. This work was supported in part by the Sony Research Award Program and NSF Award 2143730.
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Keywords
- Interactive segmentation
- Reinforcement learning
- Robot manipulation