RL agent that learns to play Ping Pong Implemented the system using policy gradient by creating two layer neural network and applied RMSProp. Analysis Rewards before training: Rewards after training: Requirements Open Gym: loads game environment Pickle: stores the parameter values and rewards