RL agent that learns to play Ping Pong

Implemented the system using policy gradient by creating two layer neural network and applied RMSProp.

Analysis

Rewards before training:

Rewards after training:

Requirements

Open Gym: loads game environment
Pickle: stores the parameter values and rewards