Skip to content

A simple framework for experimenting with Reinforcement Learning in Python.

License

Notifications You must be signed in to change notification settings

abagaria/simple_rl

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

simple_rl

A simple framework for experimenting with Reinforcement Learning in Python.

There are loads of other great libraries out there for RL. The aim of this one is twofold:

  1. Simplicity.
  2. Reproducibility of results.

A brief tutorial for a slightly earlier version is available here. As of version 0.77, the library should work with both Python 2 and Python 3. Please let me know if you find that is not the case!

simple_rl requires numpy and matplotlib. Some MDPs have visuals, too, which requires pygame. Also includes support for hooking into any of the Open AI Gym environments. I recently added a basic test script, contained in the tests directory.

Installation

The easiest way to install is with pip. Just run:

pip install simple_rl

Alternatively, you can download simple_rl here.

Example

Some examples showcasing basic functionality are included in the examples directory.

To run a simple experiment, import the run_agents_on_mdp(agent_list, mdp) method from simple_rl.run_experiments and call it with some agents for a given MDP. For example:

# Imports
from simple_rl.run_experiments import run_agents_on_mdp
from simple_rl.tasks import GridWorldMDP
from simple_rl.agents import QLearningAgent

# Run Experiment
mdp = GridWorldMDP()
agent = QLearningAgent(mdp.get_actions())
run_agents_on_mdp([agent], mdp)

Running the above code will run unleash Q-learning on a simple GridWorld. When it finishes it will store the results in cur_dir/results/* and open the following plot:

For a slightly more complicated example, take a look at the code of simple_example.py. Here we run three few agents on the grid world from the Russell-Norvig AI textbook:

from simple_rl.agents import QLearningAgent, RandomAgent, RMaxAgent
from simple_rl.tasks import GridWorldMDP
from simple_rl.run_experiments import run_agents_on_mdp

# Setup MDP.
mdp = GridWorldMDP(width=4, height=3, init_loc=(1, 1), goal_locs=[(4, 3)], lava_locs=[(4, 2)], gamma=0.95, walls=[(2, 2)])

# Setup Agents.
ql_agent = QLearningAgent(actions=mdp.get_actions())
rmax_agent = RMaxAgent(actions=mdp.get_actions())
rand_agent = RandomAgent(actions=mdp.get_actions())

# Run experiment and make plot.
run_agents_on_mdp([ql_agent, rmax_agent, rand_agent], mdp, instances=5, episodes=50, steps=10)

The above code will generate the following plot:

Overview

  • (agents): Code for some basic agents (a random actor, Q-learning, [R-Max], Q-learning with a Linear Approximator, and so on).

  • (experiments): Code for an Experiment class to track parameters and reproduce results.

  • (mdp): Code for a basic MDP and MDPState class, and an MDPDistribution class (for lifelong learning). Also contains OO-MDP implementation [Diuk et al. 2008].

  • (planning): Implementations for planning algorithms, includes ValueIteration and MCTS [Couloum 2006], the latter being still in development.

  • (tasks): Implementations for a few standard MDPs (grid world, N-chain, Taxi [Dietterich 2000], and the OpenAI Gym).

  • (utils): Code for charting and other utilities.

Making a New MDP

Make an MDP subclass, which needs:

  • A static variable, ACTIONS, which is a list of strings denoting each action.

  • Implement a reward and transition function and pass them to MDP constructor (along with ACTIONS).

  • I also suggest overwriting the "__str__" method of the class, and adding a "__init__.py" file to the directory.

  • Create a State subclass for your MDP (if necessary). I suggest overwriting the "__hash__", "__eq__", and "__str__" for the class to play along well with the agents.

Making a New Agent

Make an Agent subclass, which requires:

  • A method, act(self, state, reward), that returns an action.

  • A method, reset(), that puts the agent back to its tabula rasa state.

In Development

I'm hoping to add the following features during Summer-Fall 2018:

  • Planning: MCTS [Coloum 2006] and RTDP [Barto et al. 1995]
  • Deep RL: Polish the DQN [Mnih et al. 2015], others (DDPG).
  • Efficiency: Convert most defaultdict/dict uses to numpy.
  • Docs: Tutorials, contribution policy, and thorough documentation.
  • Visuals: Unify MDP visualization.
  • POMDP: Add an abstract POMDPClass and some basic solvers.
  • Misc: Additional testing, reproducibility checks (store more in params file, rerun experiment from params file).

If you'd like to help out, I'll be making a mailing list for the library soon -- shoot me an email. Let me know if you have any questions or suggestions.

Cheers,

-Dave

About

A simple framework for experimenting with Reinforcement Learning in Python.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 92.4%
  • Jupyter Notebook 7.6%