Talk Through It: End User Directed Manipulation Learning

Carl Winge, Adam Imdieke, Bahaa Aldeeb, Dongyeop Kang, Karthik Desingh

Research output: Contribution to journal › Article › peer-review

Abstract

Training robots to perform a huge range of tasks in many different environments is immensely difficult. Instead, we propose selectively training robots based on end-user preferences. Given a factory model that lets an end user instruct a robot to perform lower-level actions (e.g. 'Move left'), we show that end users can collect demonstrations using language to train their home model for higher-level tasks specific to their needs (e.g. 'Open the top drawer and put the block inside'). We demonstrate this framework on robot manipulation tasks using RLBench environments. Our method results in a 13% improvement in task success rates compared to a baseline method. We also explore the use of the large vision-language model (VLM), Bard, to automatically break down tasks into sequences of lower-level instructions, aiming to bypass end-user involvement. The VLM is unable to break tasks down to our lowest level, but does achieve good results breaking high-level tasks into mid-level skills.
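The abstract describes a hierarchy of instruction levels: high-level tasks (end-user goals) decompose into mid-level skills, which in turn map to low-level actions the factory model already executes. The sketch below is a minimal, illustrative rendering of that idea only; the data structure, level names, and hand-written decomposition are assumptions for exposition and are not the paper's implementation or the VLM prompting pipeline.

```python
# Minimal sketch of a three-level instruction hierarchy: a high-level task
# is decomposed into mid-level skills, each grounded in low-level actions
# a factory-trained policy could execute. All names here are hypothetical
# illustrations, not the paper's code.

from dataclasses import dataclass, field
from typing import List


@dataclass
class Instruction:
    text: str                                   # natural-language command at this level
    children: List["Instruction"] = field(default_factory=list)


def decompose_task(task: str) -> Instruction:
    """Hand-written decomposition standing in for an end user (or a VLM)
    breaking a high-level task into mid-level skills and low-level actions."""
    return Instruction(task, [
        Instruction("Open the top drawer", [
            Instruction("Move to the drawer handle"),
            Instruction("Close the gripper"),
            Instruction("Move back"),
        ]),
        Instruction("Put the block inside", [
            Instruction("Move to the block"),
            Instruction("Close the gripper"),
            Instruction("Move above the open drawer"),
            Instruction("Open the gripper"),
        ]),
    ])


def flatten(instr: Instruction) -> List[str]:
    """Collect the low-level commands a factory-level policy would run, in order."""
    if not instr.children:
        return [instr.text]
    return [cmd for child in instr.children for cmd in flatten(child)]


if __name__ == "__main__":
    plan = decompose_task("Open the top drawer and put the block inside")
    for cmd in flatten(plan):
        print(cmd)
```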

Original language: English (US)
Pages (from-to): 8051-8058
Number of pages: 8
Journal: IEEE Robotics and Automation Letters
Volume: 9
Issue number: 9
DOIs
State: Published - 2024

Bibliographical note

Publisher Copyright:
© 2016 IEEE.

Keywords

  • Human-centered robotics
  • incremental learning
  • learning from demonstration
