Abstract
Learning from reinforcements is a promising approach for creating intelligent agents. However, reinforcement learning usually requires a large number of training episodes. We present an approach that addresses this shortcoming by allowing a connectionist Q-learner to accept advice given, at any time and in a natural manner, by an external observer. In our approach, the advice-giver watches the learner and occasionally makes suggestions, expressed as instructions in a simple programming language. Using techniques from knowledge-based neural networks, we insert these programs directly into the agent's utility function. Subsequent reinforcement learning further integrates and refines the advice. We present empirical evidence that shows our approach leads to statistically significant gains in expected reward. Importantly, the advice improves the expected reward regardless of the stage of training at which it is given.
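The core mechanism the abstract describes, translating a symbolic piece of advice into network structure so that gradient-based Q-learning can later refine it, can be illustrated with a small sketch. The code below is not the paper's implementation or its advice language; the network sizes, the sigmoid units, and the example rule "IF features 0 and 2 are active THEN prefer action 1" are all illustrative assumptions. It shows the knowledge-based-neural-network style step: a rule becomes a new hidden unit whose incoming weights encode the rule's antecedents and whose outgoing weight raises the advised action's utility.

```python
# A minimal sketch (not the authors' code) of compiling advice into a
# connectionist Q-function, KBANN-style.  All names, sizes, and the rule
# itself are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES, N_HIDDEN, N_ACTIONS = 4, 3, 2   # assumed toy dimensions

# Randomly initialized one-hidden-layer Q-network:
#   Q(s) = W2 @ sigmoid(W1 @ s + b1) + b2
W1 = rng.normal(scale=0.1, size=(N_HIDDEN, N_FEATURES))
b1 = np.zeros(N_HIDDEN)
W2 = rng.normal(scale=0.1, size=(N_ACTIONS, N_HIDDEN))
b2 = np.zeros(N_ACTIONS)

def q_values(s, W1, b1, W2, b2):
    h = 1.0 / (1.0 + np.exp(-(W1 @ s + b1)))   # sigmoid hidden layer
    return W2 @ h + b2

def install_advice(W1, b1, W2, condition_features, advised_action, strength=5.0):
    """Add one hidden unit encoding 'IF all condition features are active
    THEN raise the utility of advised_action': large weights on the
    antecedents, bias set so the unit fires only when the whole
    condition holds (the standard rule-to-network translation)."""
    w_new = np.zeros(W1.shape[1])
    w_new[condition_features] = strength
    # Bias so the unit activates only when every antecedent is (near) 1.
    bias_new = -strength * (len(condition_features) - 0.5)
    W1 = np.vstack([W1, w_new])
    b1 = np.append(b1, bias_new)
    w_out = np.zeros((W2.shape[0], 1))
    w_out[advised_action, 0] = strength        # boost the advised action's Q-value
    W2 = np.hstack([W2, w_out])
    return W1, b1, W2

# Assumed advice: "IF features 0 and 2 are active THEN take action 1".
W1, b1, W2 = install_advice(W1, b1, W2, condition_features=[0, 2], advised_action=1)

s = np.array([1.0, 0.0, 1.0, 0.0])            # a state satisfying the condition
print(q_values(s, W1, b1, W2, b2))            # action 1 is now clearly preferred
```

Because the advice enters as ordinary weights rather than as a fixed rule, subsequent reinforcement learning can strengthen, weaken, or specialize it, which is what allows advice given at any stage of training to be integrated and refined.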
| Original language | English (US) |
| --- | --- |
| Title of host publication | Proceedings of the National Conference on Artificial Intelligence |
| Publisher | AAAI |
| Pages | 694-699 |
| Number of pages | 6 |
| Volume | 1 |
| State | Published - Dec 1 1994 |
| Event | Proceedings of the 12th National Conference on Artificial Intelligence. Part 1 (of 2) - Seattle, WA, USA; Jul 31 1994 → Aug 4 1994 |
Other

| Other | Proceedings of the 12th National Conference on Artificial Intelligence. Part 1 (of 2) |
| --- | --- |
| City | Seattle, WA, USA |
| Period | 7/31/94 → 8/4/94 |