Self-supervised Interactive Object Segmentation Through a Singulation-and-Grasping Approach

Houjian Yu, Changhyun Choi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations


Instance segmentation with unseen objects is a challenging problem in unstructured environments. To solve this problem, we propose a robot learning approach to actively interact with novel objects and collect each object’s training label for further fine-tuning to improve the segmentation model performance, while avoiding the time-consuming process of manually labeling a dataset. Given a cluttered pile of objects, our approach chooses pushing and grasping motions to break the clutter and conducts object-agnostic grasping, for which the Singulation-and-Grasping (SaG) policy takes as input the visual observations and imperfect segmentation. We decompose the problem into three subtasks: (1) the object singulation subtask aims to separate the objects from each other, which creates more space that alleviates the difficulty of (2) the collision-free grasping subtask; (3) the mask generation subtask obtains the self-labeled ground truth masks by using an optical flow-based binary classifier and motion cue post-processing for transfer learning. Our system achieves a 70% singulation success rate in simulated cluttered scenes. The interactive segmentation of our system achieves 87.8%, 73.9%, and 69.3% average precision for toy blocks, YCB objects in simulation, and real-world novel objects, respectively, which outperforms the compared baselines. Please refer to our project page for more information:

Original language: English (US)
Title of host publication: Computer Vision – ECCV 2022 - 17th European Conference, Proceedings
Editors: Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 17
ISBN (Print): 9783031198410
State: Published - 2022
Event: 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel
Duration: Oct 23 2022 → Oct 27 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13699 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 17th European Conference on Computer Vision, ECCV 2022
City: Tel Aviv

Bibliographical note

Funding Information:
Acknowledgements. This work was supported in part by the Sony Research Award Program and NSF Award 2143730.

Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.


  • Interactive segmentation
  • Reinforcement learning
  • Robot manipulation

