KINet: Unsupervised Forward Models for Robotic Pushing Manipulation

Alireza Rezazadeh, Changhyun Choi

Research output: Contribution to journal, Article, peer-review

3 Scopus citations

Abstract

Object-centric representation is an essential abstraction for forward prediction. Most existing forward models learn this representation through extensive supervision (e.g., object class and bounding box), although such ground-truth information is not readily accessible in reality. To address this, we introduce KINet (Keypoint Interaction Network), an end-to-end unsupervised framework to reason about object interactions based on a keypoint representation. Using visual observations, our model learns to associate objects with keypoint coordinates and discovers a graph representation of the system as a set of keypoint embeddings and their relations. It then learns an action-conditioned forward model using contrastive estimation to predict future keypoint states. By learning to perform physical reasoning in the keypoint space, our model automatically generalizes to scenarios with a different number of objects, novel backgrounds, and unseen object geometries. Experiments demonstrate the effectiveness of our model in accurately performing forward prediction and learning plannable object-centric representations for downstream robotic pushing manipulation tasks.
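
The pipeline described in the abstract (a graph over keypoints, an action-conditioned forward step, a contrastive objective) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the fully connected graph, the single message-passing step, the weight shapes, the names `build_edges`, `forward_step`, and `info_nce`, and the InfoNCE-style loss with negative squared distance as the similarity are all assumptions chosen for illustration.

```python
import numpy as np


def build_edges(num_keypoints):
    """All directed pairs (i, j) with i != j: a fully connected keypoint graph (assumed)."""
    return [(i, j) for i in range(num_keypoints)
            for j in range(num_keypoints) if i != j]


def forward_step(keypoints, action, w_edge, w_node):
    """One hypothetical message-passing step that predicts next keypoint positions.

    keypoints: (K, 2) image coordinates, action: (A,) push parameters,
    w_edge: (4, H) edge weights, w_node: (2 + H + A, 2) node weights.
    The weights are shared across keypoints, so K may vary between calls.
    """
    num_keypoints = keypoints.shape[0]
    messages = np.zeros((num_keypoints, w_edge.shape[1]))
    for i, j in build_edges(num_keypoints):
        pair = np.concatenate([keypoints[i], keypoints[j]])
        messages[i] += np.tanh(pair @ w_edge)  # aggregate messages for keypoint i
    actions = np.tile(action, (num_keypoints, 1))  # same action seen by every node
    node_input = np.concatenate([keypoints, messages, actions], axis=1)
    return keypoints + node_input @ w_node  # residual update of the coordinates


def info_nce(pred, target, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss (assumed form).

    pred, target: (D,) flattened states, negatives: (N, D).
    The true next state should score higher than every negative sample.
    """
    candidates = np.vstack([target[None, :], negatives])
    logits = -np.sum((candidates - pred) ** 2, axis=1) / temperature
    logits = logits - logits.max()  # numerical stability
    log_prob_positive = logits[0] - np.log(np.sum(np.exp(logits)))
    return -log_prob_positive
```

Because the edge and node weights are shared across keypoints, the same `forward_step` applies unchanged to scenes with a different number of keypoints, which is one plausible reading of why reasoning in keypoint space transfers to a different number of objects.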

Original language: English (US)
Pages (from-to): 6195-6202
Number of pages: 8
Journal: IEEE Robotics and Automation Letters
Volume: 8
Issue number: 10
DOIs
State: Published - Oct 1 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2016 IEEE.

Keywords

  • Representation learning
  • deep learning methods
  • manipulation planning
