Purpose: Solve the CartPole-v0 environment from OpenAI Gym using Q-learning with experience resampling.

The agent learns with Q-learning and resamples past experience during training. The experience is stored in a reservoir list.
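The storage scheme described above can be sketched as follows. This is a minimal illustration, not the code in train.py: it assumes that "reservoir" means classic reservoir sampling (Algorithm R) over a fixed-capacity list, and the class name `ExperienceBuffer` and its methods are hypothetical.

```python
import random


class ExperienceBuffer:
    """Hypothetical fixed-capacity store of transitions (not taken from train.py)."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.items = []   # the reservoir list
        self.seen = 0     # total number of transitions offered so far
        self.rng = random.Random(seed)

    def add(self, transition):
        """Offer one transition, e.g. (state, action, reward, next_state, done)."""
        self.seen += 1
        if len(self.items) < self.capacity:
            # Reservoir not full yet: always keep the transition.
            self.items.append(transition)
        else:
            # Assumed policy (Algorithm R): keep the new transition with
            # probability capacity / seen, overwriting a random slot.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = transition

    def sample(self, batch_size):
        """Resample a batch uniformly, without replacement, from the reservoir."""
        k = min(batch_size, len(self.items))
        return self.rng.sample(self.items, k)
```

With this policy every transition seen so far has the same chance of being in the reservoir, so old and recent experience are represented equally in each resampled batch. A plain first-in, first-out buffer would be an equally plausible reading of the description.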

From the terminal, run:

python train.py

And then:

python evaluate.py

Training and hyperparameter tuning can be done in a Jupyter Notebook, as shown below.

import train

train.main(
    render=False,
    gamma=0.95,
    epsilon=0.1,
    n_episodes=5,
    training_size=10000,
    experience_size=10000,
    batch_size=64,
    epochs=50,
)
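For readers new to Q-learning, the sketch below shows the roles that `gamma` and `epsilon` conventionally play. It assumes the standard epsilon-greedy policy and the standard one-step Q-learning target; whether train.py uses them in exactly this way is not shown here, and the function names `choose_action` and `q_target` are hypothetical.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_target(reward, next_q_values, done, gamma):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        # No future reward after a terminal state.
        return reward
    return reward + gamma * max(next_q_values)
```

Under these assumptions, `epsilon=0.1` means roughly one action in ten is random, and `gamma=0.95` discounts a reward received one step later by 5%.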
