Action
In this project, we will use imitation learning or reinforcement learning to train an agent to drive in SuperTuxKart.
Install SuperTuxKart
Please refer to the official documentation for installation instructions.
Papers of Choice
You may choose to implement one of the four algorithms:
- Behavior Cloning: ALVINN: An Autonomous Land Vehicle in a Neural Network (include the 1-neuron skip connection)
- DAgger: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
- PPO: Proximal Policy Optimization Algorithms
- SAC: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
For the latter two algorithms, it is fine to use an off-the-shelf implementation of the RL algorithm and tailor it to SuperTuxKart. Please implement behavior cloning and DAgger from scratch.
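As a rough illustration of how DAgger wraps a behavior cloning step, here is a minimal sketch on a toy one-dimensional steering problem. It does not use SuperTuxKart: the world, the linear expert, and every function name below are hypothetical stand-ins chosen so the loop runs with the standard library alone. A real agent would replace them with the simulator, an expert controller, and a neural network policy.

```python
import random


def expert_action(offset):
    """Hypothetical expert: steer back toward the track centre."""
    return -0.5 * offset


def rollout(policy, steps, rng):
    """Drive with `policy` in a toy 1-D world and return the visited states."""
    offset = rng.uniform(-1.0, 1.0)
    states = []
    for _ in range(steps):
        states.append(offset)
        # Toy dynamics (an assumption): the action shifts the lateral offset.
        offset = offset + policy(offset) + rng.gauss(0.0, 0.05)
    return states


def fit_linear(states, actions):
    """Behavior cloning step: least-squares fit of action = w * state + b."""
    n = len(states)
    mean_s = sum(states) / n
    mean_a = sum(actions) / n
    var_s = sum((s - mean_s) ** 2 for s in states)
    cov = sum((s - mean_s) * (a - mean_a) for s, a in zip(states, actions))
    w = cov / var_s if var_s > 0 else 0.0
    b = mean_a - w * mean_s
    return w, b


def dagger(iterations=5, steps=50, seed=0):
    rng = random.Random(seed)
    data_states, data_actions = [], []
    policy = expert_action  # the first iteration rolls out the expert
    w, b = 0.0, 0.0
    for _ in range(iterations):
        # 1. Collect the states visited by the current policy.
        visited = rollout(policy, steps, rng)
        # 2. Label those states with the expert and aggregate the dataset.
        data_states.extend(visited)
        data_actions.extend(expert_action(s) for s in visited)
        # 3. Refit the learner on everything collected so far.
        w, b = fit_linear(data_states, data_actions)
        policy = lambda s, w=w, b=b: w * s + b
    return w, b


w, b = dagger()
```

Running only the first iteration corresponds to plain behavior cloning on expert rollouts. The later iterations are what DAgger adds: the learner drives, and the expert only supplies labels for the states the learner actually reaches.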
Submission
Please explicitly specify which algorithm you chose to implement, and include the training code in your .zip submission file.
Evaluation
Your agent will be evaluated on the lighthouse track, and performance is measured as the distance traveled down the track within 2 minutes. If your agent finishes the track in under 2 minutes, you will get full credit.
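To track this metric during your own experiments, a small helper along the following lines may be useful. It is only a sketch: the log format, the function name, and the track length in the example are assumptions, and it is not the grader's code.

```python
TIME_LIMIT_S = 120.0  # the 2-minute limit stated above


def measure_run(samples, track_length, time_limit=TIME_LIMIT_S):
    """Return (distance reached within the time limit, finished flag).

    `samples` is a hypothetical log of (elapsed_seconds, distance_down_track)
    pairs in chronological order; `track_length` uses the same distance units.
    """
    best = 0.0
    for elapsed, distance in samples:
        if elapsed > time_limit:
            break  # progress made after the limit does not count
        best = max(best, distance)
    finished = best >= track_length
    return min(best, track_length), finished


# Made-up example log: the kart reaches the end only after the limit.
log = [(0.0, 0.0), (60.0, 400.0), (110.0, 900.0), (130.0, 1000.0)]
distance, finished = measure_run(log, track_length=1000.0)
```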
Note
If you choose to implement RL (PPO or SAC), you can optionally use the sampling function we provide in project/utils.py. It depends on the ray library, which you can install with pip install ray.
You can test your solution against the grader using python -m val_grader project -v