Implicit imitation assumes that learning agents observe only the state transitions of a mentor agent, and try to recreate them based on their own abilities and knowledge of their environment. In this paper, we put forward a deep implicit imitation Q-network (DIIQN) model, which incorporates ideas from three well-known Deep Q-Network (DQN) variants. As such, we enable a novel implicit imitation method for online, model-free deep reinforcement learning. Our thorough experimentation in the complex environment of the emerging lane-free traffic paradigm verifies the benefits of our approach. Specifically, we show that deep implicit imitation RL dramatically accelerates the learning process when compared to a “vanilla” DQN method; and, unlike explicit imitation reinforcement learning, it is able to surpass the mentor’s performance without resorting to additional information, such as the mentor’s actions.
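To make the notion of implicit imitation concrete, the following is a minimal tabular sketch in Python. It is not the DIIQN model and makes no claim about its architecture. It only illustrates the general idea that an agent sees a mentor's state transition (never the action), guesses which of its own actions best reproduces that transition from its own past experience, and then applies an ordinary Q-learning update. All names (`own_step`, `mentor_step`, `infer_action`, `reward_fn`), the two-action set, and the hyperparameter values are assumptions made for illustration.

```python
from collections import defaultdict

GAMMA = 0.9        # discount factor (assumed value)
ALPHA = 0.5        # learning rate (assumed value)
ACTIONS = (0, 1)   # hypothetical action set, e.g. 0 = left, 1 = right

q = defaultdict(float)     # Q-values keyed by (state, action)
counts = defaultdict(int)  # times the agent's own (s, a) led to s_next


def q_update(s, a, r, s_next):
    """Standard one-step Q-learning update."""
    best_next = max(q[(s_next, b)] for b in ACTIONS)
    q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])


def own_step(s, a, r, s_next):
    """Learn from the agent's own experience and remember the transition."""
    counts[(s, a, s_next)] += 1
    q_update(s, a, r, s_next)


def infer_action(s, s_next):
    """Guess which of the agent's own actions best reproduces s -> s_next.

    Returns None if the agent has never produced this transition itself.
    """
    best = max(ACTIONS, key=lambda a: counts[(s, a, s_next)])
    return best if counts[(s, best, s_next)] > 0 else None


def mentor_step(s, s_next, reward_fn):
    """Learn from an observed mentor transition; the mentor's action is unseen.

    The reward comes from the agent's own reward function (an assumption
    of this sketch), not from the mentor.
    """
    a = infer_action(s, s_next)
    if a is None:
        return False  # the agent cannot yet reproduce this transition
    q_update(s, a, reward_fn(s, s_next), s_next)
    return True
```

Because the update still takes a maximum over the agent's own Q-values, the agent in this sketch is not tied to the mentor's behaviour, which gives some intuition for how an implicit imitator can end up surpassing its mentor.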