Free Supervision from Video Games
Data
The data provided is for research and educational use only. Commercial use is prohibited. If you use the data, please buy the game(s).
About the data
Both train and test data are split into roughly 30s continuous clips.
For each frame (recorded or not), a `*_state.json` file contains basic information about the frame (including the player position, heading, control signal, weather, …).
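Until the official reader code is released, here is a minimal sketch of peeking at one of these state files. The clip path and frame name below are hypothetical, and the exact field names are best discovered by inspecting a real file:

```python
import json
from pathlib import Path

# Hypothetical path; the actual clip/frame layout may differ.
state_path = Path("train/clip_0000/000042_state.json")
state = json.loads(state_path.read_text())

# The list above names player position, heading, control signal and weather
# among the contents; print the keys to see what a real file provides.
print(sorted(state.keys()))
```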
We record all image modalities at 6 FPS.
All modalities are stored as compressed images (png or webp); a sketch of how to decode them follows this list:
- `images` and `albedo` are stored directly as color images (lossy webp or lossless png).
- The `segmentation` map contains both instance and semantic segmentation. The R channel corresponds to the object type, while GB correspond to a 16-bit integer identifying the object id. The id is persistent across time (it tracks objects).
- The `flow` image is a 24-bit color image: the first 12 bits hold the horizontal (x/u) component, the second 12 bits the vertical (y/v) component. To convert a 12-bit integer x to the actual flow value, use x / 4 - 512. The flow range is -512 .. 512 at quarter-pixel accuracy (0.25, 0.5, 0.75). Flow outside that range is clipped (this happens in less than 0.1% of all pixels). A flow of 0 (as the 12-bit number) means the flow is not defined.
- The `disparity` image is a 24-bit color image. To convert the 24-bit integer x to the actual disparity (1/depth) value, use x / 8192. The disparity is currently clipped at 7 (and is by definition greater than 0). A disparity of 0 means that there was likely no object drawn at that location.
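Official reader code is announced below; until then, here is a minimal numpy sketch of the conversions above. The bit packing across the R/G/B channels (R as the most significant byte, G before B) is an assumption and should be verified against a real file:

```python
import numpy as np
from PIL import Image

def decode_segmentation(path):
    rgb = np.asarray(Image.open(path)).astype(np.uint16)
    semantic = rgb[..., 0]                        # R: object type
    instance = (rgb[..., 1] << 8) | rgb[..., 2]   # GB: 16-bit object id (G assumed high byte)
    return semantic, instance

def decode_flow(path):
    rgb = np.asarray(Image.open(path)).astype(np.uint32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    u_raw = (r << 4) | (g >> 4)          # first 12 bits: horizontal (x/u) component
    v_raw = ((g & 0xF) << 8) | b         # second 12 bits: vertical (y/v) component
    valid = (u_raw != 0) & (v_raw != 0)  # a raw value of 0 marks undefined flow
    flow = np.stack([u_raw, v_raw], -1).astype(np.float32) / 4.0 - 512.0
    flow[~valid] = 0.0
    return flow, valid

def decode_disparity(path):
    rgb = np.asarray(Image.open(path)).astype(np.uint32)
    x = (rgb[..., 0] << 16) | (rgb[..., 1] << 8) | rgb[..., 2]
    return x.astype(np.float32) / 8192.0  # disparity = 1/depth, 0 where nothing was drawn
```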
Code to read these images will be released soon.
Overfitting to the test data
Please don’t train on the test data! Out of scientific curiosity I collected an additional test set, slated to be released in 2021 (or whenever the dataset becomes irrelevant). I hope to benchmark the top methods on that held-out dataset at that time. Feel free to overfit as you see fit…