Skip to content

Latest commit

 

History

History
41 lines (27 loc) · 1.44 KB

README.md

File metadata and controls

41 lines (27 loc) · 1.44 KB

Policy_Gradients_to_beat_Pong

This is the code for the "How to Beat Pong Using Policy Gradients (LIVE)" by Siraj Raval on Youtube I only adapted it for Python 3 (Siraj's code works on Python 2).

Overview

This is the code for this video by Siraj Raval on Youtube. We're going to beat the game of Pong using Policy Gradients (a type of reinforcement algo). PG outperformed DeepMind's Deep Q Network, so its a worthy algo to look into.

Dependencies

Install dependencies with pip

I also put some conda environment file (env.yml). One can install it with:

conda env create -f env.yml

I also had to install cmake (on ubuntu 17.10).

sudo apt-get install cmake

Usage

Run python demo.py and the AI will start playing the game

Credits

Credits go to AndrejK from whom the initial code comes from. Credits go to Siraj Raval I just modified some bits of the code to make it compatible with python 3.

After a few days training

After 6 days of training on Dell XPS 9560 15'' Nvidia GTX 1050, the trained agent could compete... (see below: positive rewards = wins, negative rewards = losses) screenshot from 2018-04-26 08-39-43