Action

starter code

In this project, we will implement imitation learning/reinforcement learning to train an agent to drive in SuperTuxKart.

Install SuperTuxKart

Please refer to the official documentation for installation instructions.

Papers of Choice

You may choose to implement one of the following four algorithms:

Behavior Cloning: ALVINN: An Autonomous Land Vehicle in a Neural Network (include the 1-neuron skip connection)
DAgger: A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
PPO: Proximal Policy Optimization Algorithms
SAC: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

For the latter two algorithms (PPO and SAC), it is fine to use an off-the-shelf implementation of the RL algorithm and tailor it to SuperTuxKart. Behavior Cloning and DAgger must be implemented from scratch.
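
As a rough illustration of what "from scratch" involves for DAgger, here is a minimal sketch of the data-aggregation loop. It does not use SuperTuxKart: the 1-D environment, the expert (expert_policy), the linear learner, and all function names are hypothetical stand-ins chosen so the example runs on its own with NumPy. The first iteration is plain behavior cloning on expert rollouts; later iterations roll out the learner and have the expert relabel the states it visits.

```python
import numpy as np

def expert_policy(obs):
    """Hypothetical expert: steer back toward the track center."""
    return np.clip(-2.0 * obs, -1.0, 1.0)

def rollout(policy, steps=50, seed=0):
    """Toy 1-D stand-in for the simulator; obs is the lateral offset."""
    rng = np.random.default_rng(seed)
    obs = rng.uniform(-0.4, 0.4)
    visited = []
    for _ in range(steps):
        visited.append(obs)
        action = policy(obs)
        # Assumed dynamics: steering shifts the offset, plus small noise.
        obs = obs + 0.1 * action + rng.normal(scale=0.01)
    return np.array(visited)

def fit_linear(observations, actions):
    """Supervised step: least-squares fit of action = w * obs."""
    return float(np.dot(observations, actions) / np.dot(observations, observations))

def dagger(iterations=5):
    # Iteration 0 is behavior cloning: states come from the expert's own rollouts.
    obs_data = rollout(expert_policy, seed=0)
    act_data = expert_policy(obs_data)
    w = fit_linear(obs_data, act_data)
    for i in range(1, iterations):
        # Roll out the current learner, then label its states with the expert.
        learner = lambda o, w=w: np.clip(w * o, -1.0, 1.0)
        new_obs = rollout(learner, seed=i)
        # Aggregate: keep all earlier data and retrain on the union.
        obs_data = np.concatenate([obs_data, new_obs])
        act_data = np.concatenate([act_data, expert_policy(new_obs)])
        w = fit_linear(obs_data, act_data)
    return w, len(obs_data)

w, n_samples = dagger()
```

In a real submission the rollout would step the simulator, the observation would be an image or track features, and the learner would be a neural network; the aggregate-and-retrain structure stays the same.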

Submission

Please explicitly specify which algorithm you chose to implement, and include the training code in your .zip submission file.

Evaluation

Your agent will be evaluated on the lighthouse track. Performance is measured as the distance traveled down the track within 2 minutes. If your agent finishes the track in under 2 minutes, you will get full credit.

Note

If you choose to implement RL (PPO or SAC), you can optionally use the sampling function we provide in project/utils.py. It depends on the ray library, which you can install with pip install ray.

You can test your solution against the grader using python -m val_grader project -v