Towards Training Person Detectors for Mobile Robots using Synthetically Generated RGB-D Data


Timm Linder, Michael Johan Hernandez Leon, Narunas Vaskevicius, and Kai Oliver Arras
CVPR 2019: Workshop on 3D Scene Generation

 

Abstract

We explore how synthetically generated RGB-D training data from a near-photorealistic game engine can be used to train modality-specific person detectors. We perform ablation studies on a challenging real-world dataset that we recorded with a Kinect v2 RGB-D sensor in multiple warehouse environments. Through extensive use of domain randomization techniques, we synthesize a realistic and highly varied training set of challenging intralogistics scenarios as observed from a mobile robot, comprising persons in confined and cluttered indoor spaces. We then train the detector layers of a YOLOv3 model from scratch on our synthetic RGB and jet-encoded depth images. While in the RGB case we still observe a domain gap of 6 points in mAP compared to a pretrained COCO model, the results indicate that, by exploiting simulation, the immense manual labeling effort needed to prepare large-scale datasets such as MS COCO might be unnecessary for the depth modality. We further find that filtering out highly occluded ground-truth bounding boxes during training and modeling time-of-flight sensor noise characteristics both have a positive impact on model performance. We also provide an initial set of qualitative results on our real-world dataset.
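
To illustrate the jet encoding of depth images mentioned in the abstract, here is a minimal Python sketch. It is not the authors' implementation. It maps a metric depth image to a three-channel color image using a common piecewise-linear approximation of the jet colormap, so that depth can be fed to a detector that expects RGB-like input. The function name `jet_encode_depth`, the depth range `d_min`/`d_max`, and the choice to render invalid pixels as black are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def jet_encode_depth(depth_m, d_min=0.5, d_max=8.0):
    """Encode a depth image (meters) as a jet-like uint8 color image.

    d_min and d_max are assumed clipping bounds, not values from the paper.
    Pixels with non-finite or non-positive depth are treated as invalid
    and rendered black (also an assumption).
    """
    depth = np.asarray(depth_m, dtype=np.float32)
    valid = np.isfinite(depth) & (depth > 0)

    # Replace invalid readings before normalizing so no NaNs propagate.
    depth = np.where(valid, depth, d_min)

    # Normalize to [0, 1]: 0 is nearest, 1 is farthest.
    x = np.clip((depth - d_min) / (d_max - d_min), 0.0, 1.0)

    # Piecewise-linear approximation of the jet colormap:
    # near depths map towards blue, mid-range to green, far to red.
    r = np.clip(1.5 - np.abs(4.0 * x - 3.0), 0.0, 1.0)
    g = np.clip(1.5 - np.abs(4.0 * x - 2.0), 0.0, 1.0)
    b = np.clip(1.5 - np.abs(4.0 * x - 1.0), 0.0, 1.0)

    rgb = np.stack([r, g, b], axis=-1)
    rgb[~valid] = 0.0  # invalid pixels become black
    return np.round(rgb * 255.0).astype(np.uint8)


if __name__ == "__main__":
    # Tiny synthetic depth map: invalid, near, mid-range, and far pixels.
    demo = np.array([[0.0, 0.5], [4.25, 8.0]])
    print(jet_encode_depth(demo))
```

The result has shape `(H, W, 3)` and can be saved or batched like an ordinary color image. In practice the clipping range would be chosen to match the working range of the sensor in use.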

 

@inproceedings{LinderCVPRWS19,
author = {Timm Linder and Michael Johan Hernandez Leon and Narunas Vaskevicius and Kai Oliver Arras},
title = {Towards Training Person Detectors for Mobile Robots using Synthetically Generated {RGB-D} Data},
booktitle = {Computer Vision and Pattern Recognition ({CVPR}) 2019 Workshop on 3{D} Scene Generation},
year = {2019}
}