Dense RGB-D Semantic Mapping with Pixel-Voxel Neural Network


Cheng Zhao, Li Sun, Pulak Purkait, Tom Duckett and Rustam Stolkin
Dense RGB-D Semantic Mapping with Pixel-Voxel Neural Network
Sensors (Volume: 18, Issue: 9, 2018)

Abstract

In this paper, a novel Pixel-Voxel network is proposed for dense 3D semantic mapping, which can perform dense 3D mapping while simultaneously recognizing and labelling the semantic category of each point in the 3D map. Our approach fully leverages the advantages of the different modalities: the PixelNet learns high-level contextual information from 2D RGB images, and the VoxelNet learns 3D geometrical shapes from the 3D point cloud. Unlike existing architectures that fuse the score maps from different modalities with equal weights, we propose a softmax weighted fusion stack that adaptively learns the varying contributions of PixelNet and VoxelNet and fuses the score maps according to their respective confidence levels. Our approach achieves competitive results on both the SUN RGB-D and NYU V2 benchmarks, while the runtime of the proposed system is boosted to around 13 Hz, enabling near-real-time performance on a PC with an eight-core i7 CPU and a single Titan X GPU.
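To illustrate the fusion step described above, the following is a minimal PyTorch sketch of one plausible softmax weighted fusion: a learned 1x1 convolution predicts a per-location confidence logit for each modality, and the two score maps are combined as a softmax-weighted sum. The names (SoftmaxWeightedFusion, weight_conv) and the 1x1-convolution weight predictor are illustrative assumptions, not the paper's exact architecture.

import torch
import torch.nn as nn

class SoftmaxWeightedFusion(nn.Module):
    # Hypothetical sketch: fuse PixelNet and VoxelNet score maps with
    # adaptively learned, per-location softmax weights.
    def __init__(self, num_classes, num_modalities=2):
        super().__init__()
        # A 1x1 convolution (an assumption, not necessarily the paper's
        # design) maps the concatenated score maps to one confidence
        # logit per modality at every pixel.
        self.weight_conv = nn.Conv2d(num_modalities * num_classes,
                                     num_modalities, kernel_size=1)

    def forward(self, pixel_scores, voxel_scores):
        # Both inputs: (B, num_classes, H, W) class score maps.
        logits = self.weight_conv(torch.cat([pixel_scores, voxel_scores], dim=1))
        weights = torch.softmax(logits, dim=1)  # (B, 2, H, W), sums to 1 per pixel
        stacked = torch.stack([pixel_scores, voxel_scores], dim=1)  # (B, 2, C, H, W)
        return (weights.unsqueeze(2) * stacked).sum(dim=1)  # (B, C, H, W)

# Example: fuse 37-class score maps (the SUN RGB-D label set) at VGA resolution.
fusion = SoftmaxWeightedFusion(num_classes=37)
fused = fusion(torch.randn(1, 37, 480, 640), torch.randn(1, 37, 480, 640))

Because the weights come from a softmax, the fusion is a convex combination of the two score maps, so the output stays on the same scale as the inputs while letting the more confident modality dominate at each location.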

@article{Zhao2018,
doi = {10.3390/s18093099},
url = {https://doi.org/10.3390/s18093099},
year = {2018},
month = {sep},
publisher = {{MDPI} {AG}},
volume = {18},
number = {9},
pages = {3099},
author = {Cheng Zhao and Li Sun and Pulak Purkait and Tom Duckett and Rustam Stolkin},
title = {Dense {RGB}-D Semantic Mapping with Pixel-Voxel Neural Network},
journal = {Sensors}
}