In recent years, Deep Reinforcement Learning (DRL) has been widely regarded as a promising route toward strong (human-level) artificial intelligence. The success of AlphaGo in particular motivated us to focus on modifying DQN algorithms and evaluating them with game simulators.

DRL combines Deep Learning (DL) with Reinforcement Learning (RL). From our perspective, DL is essentially a powerful function-fitting method, while the core idea of RL is to learn from the rewards returned by the environment, based on the agent's observations, which closely resembles how animals learn. At the same time, the fitting capability of DL makes it an excellent tool for extracting experience without human intervention.
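The observe, act, receive-reward loop described above can be illustrated with a minimal sketch. It is not our Rainbow DQN agent: it uses plain tabular Q-learning instead of a deep network, and the ChainEnv corridor, the train_q_table function, and all hyperparameter values are hypothetical names and numbers chosen only for illustration.

```python
import random


class ChainEnv:
    """Hypothetical toy corridor: start in state 0, reward 1.0 on reaching state 4."""

    n_states = 5
    n_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        # Clamp the position so the agent stays inside the corridor.
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train_q_table(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning: improve action values using only observed rewards."""
    rng = random.Random(seed)
    env = ChainEnv()
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.randrange(env.n_actions)
            else:
                # Exploit: pick the best-valued action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(env.n_actions) if q[state][a] == best]
                )
            next_state, reward, done = env.step(action)
            # Temporal-difference target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    for s, values in enumerate(train_q_table()):
        print(s, [round(v, 3) for v in values])
```

In a DRL agent such as DQN, the table q is replaced by a neural network that maps observations (for example, game frames) to action values, which is where the fitting ability of DL comes in.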

Our current research focuses on modifying Rainbow DQN (DeepMind, 2017 [1]). We build on gym, an open-source Python library that provides Atari game simulators. Recently we have been exploring ideas to make the agent work across multiple games with as little retraining as possible, and to learn faster from fewer samples.

DRL agent playing Pac-Man

 

Main Reference:

[1] Hessel, M., Modayil, J., et al. “Rainbow: Combining Improvements in Deep Reinforcement Learning.” 32nd AAAI Conference on Artificial Intelligence (AAAI 2018), 2018, pp. 3215–3222.