Semantic segmentation
In this project, we will implement a semantic segmentation network. Your `segment` function should take as input a uint8 numpy array of size (w, h, 3) and output an integer semantic label map of size (w, h). The integers should follow this mapping:
ID | Class | ID | Class |
---|---|---|---|
-1 | Unknown | 10 | bicycle |
0 | building | 11 | flower |
1 | grass | 12 | sign |
2 | tree | 13 | bird |
3 | cow | 14 | book |
4 | sheep | 15 | chair |
5 | sky | 16 | road |
6 | airplane | 17 | cat |
7 | water | 18 | dog |
8 | face | 19 | body |
9 | car | 20 | boat |
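As a concrete sketch of the expected input/output contract, here is a stub with the required shapes and dtypes; the model call is omitted, and a real solution would run the image through a trained network:

```python
import numpy as np

def segment(image: np.ndarray) -> np.ndarray:
    """Map a (w, h, 3) uint8 image to a (w, h) integer label map.

    Stub illustrating the I/O contract only; a trained segmentation
    network would replace the placeholder below.
    """
    assert image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] == 3
    w, h, _ = image.shape
    # Placeholder: label every pixel as unknown (-1).
    labels = np.full((w, h), -1, dtype=np.int64)
    return labels
```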
The dataset we will work on is the MSRC segmentation dataset. The train/test/val split is available here. You are NOT allowed to train on the test images.
You may choose to implement your architecture from one of the following four papers:
- Fully Convolutional Networks for Semantic Segmentation
- Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation
- Deep Layer Aggregation
- U-Net: Convolutional Networks for Biomedical Image Segmentation
Some architectures can be pre-trained (e.g., ImageNet-pretrained ResNets and VGGs). You may download pre-trained ImageNet classification weights, but NOT (semantic) segmentation weights. In short, you need to make your network fully convolutional yourself; please do not use someone else's code for this.
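The core of making a classification network fully convolutional is replacing its final fully connected layer with an equivalent 1×1 convolution, so the network accepts inputs of arbitrary spatial size and emits per-pixel scores. A minimal numpy illustration of that equivalence (the weights and sizes here are made up for demonstration, not tied to any particular backbone):

```python
import numpy as np

rng = np.random.default_rng(0)
C_in, C_out = 8, 21  # e.g. 21 output classes, 8 feature channels

# Weights of a fully connected classifier head.
W = rng.standard_normal((C_out, C_in))
b = rng.standard_normal(C_out)

# A 1x1 convolution with the same weights applies that classifier
# independently at every spatial position of a feature map.
feat = rng.standard_normal((C_in, 4, 5))  # (channels, h, w)
conv_out = np.einsum('oc,chw->ohw', W, feat) + b[:, None, None]

# Check: at any single pixel, the result matches the FC layer.
pixel = feat[:, 2, 3]
fc_out = W @ pixel + b
assert np.allclose(conv_out[:, 2, 3], fc_out)
```

In a framework like PyTorch this amounts to reshaping the FC weight matrix into a `(C_out, C_in, 1, 1)` convolution kernel.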
We provide Python starter code, which contains a `project` module and a `val_grader`. During testing we will evaluate your segmentation model using the following two metrics:
- Per-pixel accuracy (global accuracy)
- Per-class accuracy (average accuracy)
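A possible implementation of these two metrics is sketched below; ignoring pixels labeled -1 (unknown) in the ground truth is an assumption here, so check the grader for its exact convention:

```python
import numpy as np

def accuracy_metrics(pred, gt, num_classes=21):
    """Return (per-pixel accuracy, per-class accuracy).

    pred, gt: integer label maps of the same shape.
    Ground-truth pixels labeled -1 are ignored (assumed convention).
    """
    valid = gt >= 0
    pred, gt = pred[valid], gt[valid]
    # Global accuracy: fraction of valid pixels labeled correctly.
    global_acc = float((pred == gt).mean())
    # Average accuracy: mean of per-class recalls over classes
    # that actually appear in the ground truth.
    per_class = []
    for c in range(num_classes):
        mask = gt == c
        if mask.any():
            per_class.append((pred[mask] == c).mean())
    avg_acc = float(np.mean(per_class))
    return global_acc, avg_acc
```

Note that per-class accuracy weights rare classes (e.g. bird, boat) as heavily as common ones (e.g. sky, grass), so a model that ignores small classes will score well on the global metric but poorly on the average one.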
You can use `python -m val_grader project -v` to test your solution against the grader.
Please explicitly mention your choice of paper in your submission.