Cityscapes is a great dataset for semantic image segmentation which is widely used in academia in the context of automated driving. This dataset provides pixel-precise class annotations on the full image from a vehicle’s perspective. However, sometimes you are only interested in the 2D bounding boxes of specific objects, such as cars or pedestrians, in order to perform 2D object detection on the image.
The annotations in Cityscapes also consider segmentation instances. That means a single object is defined by its segmentation mask and a unique instance ID. We can use that information to extract the extent of such an instance, in short: the 2D bounding box. Furthermore, we can also determine the area that is covered by that instance, which is called the mask. Together, these give us labels for object segmentation as shown in the head image above.
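To make the idea concrete, here is a minimal sketch of how a bounding box and a pixel area could be derived from an instance ID map. It is not the conversion tool's code: the function name `instance_to_box_and_area`, the toy array and the ID values 26001 and 26002 are assumptions chosen for illustration, and the box is written in the Coco convention `[x, y, width, height]`.

```python
import numpy as np


def instance_to_box_and_area(instance_map, instance_id):
    """Return ([x, y, width, height], area) for one instance, or None if absent.

    Hypothetical helper: instance_map is a 2D integer array in which every
    pixel holds the ID of the instance it belongs to.
    """
    mask = instance_map == instance_id  # boolean mask of this instance
    if not mask.any():
        return None
    rows = np.flatnonzero(mask.any(axis=1))  # rows containing the instance
    cols = np.flatnonzero(mask.any(axis=0))  # columns containing the instance
    x_min, x_max = int(cols[0]), int(cols[-1])
    y_min, y_max = int(rows[0]), int(rows[-1])
    bbox = [x_min, y_min, x_max - x_min + 1, y_max - y_min + 1]
    area = int(mask.sum())  # number of pixels covered by the mask
    return bbox, area


# Toy 6x8 instance map with two made-up instance IDs.
instance_map = np.zeros((6, 8), dtype=np.int32)
instance_map[1:4, 2:6] = 26001
instance_map[4, 6:8] = 26002

print(instance_to_box_and_area(instance_map, 26001))
print(instance_to_box_and_area(instance_map, 26002))
```

The area here is a plain pixel count. A tool that works on polygon contours may report slightly different values for the same object.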
Cityscapes to Coco Conversion Tool
To convert the Cityscapes dataset into a Coco-format dataset, you can use my Cityscapes Coco conversion tool as described in the following.
Usage
Clone the repository
git clone https://github.com/TillBeemelmanns/cityscapes-to-coco-conversion
and install the requirements:
pip install -r requirements.txt
You may set up a virtual environment to do so.
Download the Cityscapes dataset
Download the Cityscapes dataset: you need gtFine_trainvaltest.zip and also leftImg8bit_trainvaltest.zip. You may have to register in order to download them. Then set up the following file structure.
data/
└── cityscapes
    ├── annotations
    ├── gtFine
    │   ├── test
    │   ├── train
    │   └── val
    └── leftImg8bit
        ├── test
        ├── train
        └── val
main.py
inspect_coco.py
README.md
requirements.txt
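Before starting the conversion it can help to verify that the extracted archives ended up in the expected places. The following is a small sketch, not part of the repository: the helper `missing_subdirs` and the list `EXPECTED_SUBDIRS` are hypothetical names, and the check only covers the gtFine and leftImg8bit splits shown in the tree above.

```python
from pathlib import Path

# Sub-directories expected below the dataset root (taken from the tree above).
EXPECTED_SUBDIRS = [
    f"{kind}/{split}"
    for kind in ("gtFine", "leftImg8bit")
    for split in ("train", "val", "test")
]


def missing_subdirs(datadir):
    """Return the expected sub-directories that do not exist below datadir."""
    root = Path(datadir)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


missing = missing_subdirs("data/cityscapes")
if missing:
    print("Missing directories:", ", ".join(missing))
else:
    print("Directory layout looks complete.")
```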
Now you can start the conversion script by calling
python main.py --dataset cityscapes --datadir="data/cityscapes" --outdir="data/cityscapes/annotations"
The script will create the files
instancesonly_filtered_gtFine_train.json
instancesonly_filtered_gtFine_val.json
in the directory annotations for the train and val splits. These files contain the Coco annotations.
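As a quick sanity check of the result, you can count images and annotations per category. The sketch below assumes the generated files follow the standard Coco layout with the top-level keys `images`, `annotations` and `categories`. The helper `summarize_coco` and the toy dictionary are made up for illustration and do not come from the repository.

```python
import json
from collections import Counter


def summarize_coco(coco):
    """Count images, annotations and annotations per category name."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    per_category = Counter(names[ann["category_id"]] for ann in coco["annotations"])
    return {
        "images": len(coco["images"]),
        "annotations": len(coco["annotations"]),
        "per_category": dict(per_category),
    }


# Toy stand-in for a converted annotation file (values are invented).
toy = {
    "images": [{"id": 0, "file_name": "a.png", "width": 2048, "height": 1024}],
    "categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "car"}],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 2, "bbox": [10, 20, 30, 40]},
        {"id": 1, "image_id": 0, "category_id": 2, "bbox": [50, 60, 20, 10]},
        {"id": 2, "image_id": 0, "category_id": 1, "bbox": [5, 5, 8, 20]},
    ],
}

summary = summarize_coco(toy)
print(json.dumps(summary, indent=2))

# With a real file, something along these lines should work:
# with open("data/cityscapes/annotations/instancesonly_filtered_gtFine_val.json") as f:
#     print(json.dumps(summarize_coco(json.load(f)), indent=2))
```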
Filter certain classes
The Cityscapes dataset contains about 30 different classes, and not all of them may be relevant for you. The variable category_instancesonly defines which classes are considered in the conversion process.
category_instancesonly = [
'person',
'rider',
'car',
'truck',
'bus',
'train',
'motorcycle',
'bicycle',
]
Only these classes will be converted into Coco annotations.
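Conceptually, such a class list acts as a whitelist. The sketch below shows one way this filtering could look. It is not the tool's internal code: the object dictionaries with a `label` key, the helpers `filter_objects` and `build_categories`, and the category IDs starting at 1 are all assumptions made for illustration.

```python
category_instancesonly = [
    'person',
    'rider',
    'car',
    'truck',
    'bus',
    'train',
    'motorcycle',
    'bicycle',
]


def filter_objects(objects, allowed=category_instancesonly):
    """Keep only the objects whose 'label' is in the allowed class list."""
    return [obj for obj in objects if obj["label"] in allowed]


def build_categories(allowed=category_instancesonly):
    """Build a Coco-style category list with consecutive IDs starting at 1."""
    return [{"id": i, "name": name} for i, name in enumerate(allowed, start=1)]


# Toy objects: 'road' and 'sky' are not in the list and are dropped.
toy_objects = [
    {"label": "car"},
    {"label": "road"},
    {"label": "person"},
    {"label": "sky"},
]
kept = filter_objects(toy_objects)
print([obj["label"] for obj in kept])
```

Removing an entry from the list, for example `'train'`, would drop that class from the output in this sketch.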
Sometimes the segmentation annotations are so small that no reasonably sized object can be created. In this case, the object is skipped and you will see the message:
Warning: invalid contours.
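One plausible way to detect such degenerate annotations is to check the polygon itself. The following sketch is an assumption about how a check like this could be written, not the tool's actual rule: it treats a polygon as a flat Coco-style list `[x1, y1, x2, y2, ...]` and rejects it if it has fewer than three points or encloses no area. The names `polygon_area` and `is_valid_polygon` and the `min_area` parameter are hypothetical.

```python
def polygon_area(flat):
    """Area of a polygon given as [x1, y1, x2, y2, ...] (shoelace formula)."""
    xs, ys = flat[0::2], flat[1::2]
    n = len(xs)
    total = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping around to the first
        total += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(total) / 2.0


def is_valid_polygon(flat, min_area=0.0):
    """A polygon needs an even coordinate count, >= 3 points and some area."""
    if len(flat) % 2 != 0 or len(flat) < 6:
        return False
    return polygon_area(flat) > min_area


rectangle = [0, 0, 4, 0, 4, 3, 0, 3]  # 4 x 3 rectangle
two_points = [0, 0, 1, 1]             # too few points
collinear = [0, 0, 1, 1, 2, 2]        # three points on one line, zero area

for name, poly in [("rectangle", rectangle), ("two_points", two_points), ("collinear", collinear)]:
    if is_valid_polygon(poly):
        print(name, "ok, area", polygon_area(poly))
    else:
        print(name, "skipped: invalid contour")
```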
Output
You can visualize the final object segmentation annotations with the inspection script:
python inspect_coco.py --coco_dir data/cityscapes
This produces pictures like the following:
Wrap-up
- You converted an image segmentation dataset into an object segmentation dataset
- You can now use the new dataset with network architectures such as Mask R-CNN or DETR, or with frameworks such as Detectron2