Learning Sparse Representations with CNNs for Efficient Object Detection in 3D Point Clouds

Abstract – Convolutional neural networks (CNNs) have exhibited state-of-the-art performance across a number of domains, but have yet to realise the same success when applied to 3D point cloud data. This is in part due to the third spatial dimension, which renders the processing of large inputs computationally impractical. In the context of object detection, we circumvent this issue by exploiting the sparsity inherent in 3D point clouds to learn a hierarchy of sparse representations.

By leveraging efficient sparse convolution operations implemented through a feature-centric voting scheme, our approach, called Vote3Deep, processes large-scale 3D point clouds at fast detection speeds. Sparsity in the intermediate representations is maintained through the use of Rectified Linear Units (ReLUs) and by constraining filter biases to be non-positive. We additionally propose an L1 sparsity loss on the post-ReLU activations, which encourages sparsity throughout the feature hierarchy so that every layer benefits from efficient sparse convolutions.
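The voting idea described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the names `sparse_conv_by_voting` and `l1_sparsity_penalty`, the dictionary-based cell storage, the stride of 1, and the odd cubic kernel are all assumptions made for illustration. Each non-zero input cell casts votes into its neighbouring output cells, so the cost scales with the number of occupied cells rather than the grid volume. Because the biases are non-positive, a cell that receives no votes has a pre-activation of at most zero and stays exactly zero after the ReLU, which is why skipping it loses nothing.

```python
import itertools
import numpy as np


def sparse_conv_by_voting(cells, weights, bias):
    """Sparse 3D convolution (stride 1) followed by a ReLU, computed by voting.

    cells:   dict mapping (x, y, z) -> feature vector (C_in,), non-zero cells only
    weights: array of shape (k, k, k, C_in, C_out) with odd k
    bias:    array of shape (C_out,), every entry <= 0
    (Illustrative interface, assumed for this sketch.)
    """
    if np.any(bias > 0):
        raise ValueError("biases must be non-positive to preserve sparsity")
    k, c_out = weights.shape[0], weights.shape[-1]
    r = k // 2
    votes = {}
    for (x, y, z), feat in cells.items():
        for dx, dy, dz in itertools.product(range(-r, r + 1), repeat=3):
            # An input at p reaches the output at p + d through the filter
            # weight that a dense sliding window would apply at offset -d,
            # hence the flipped index.
            w = weights[r - dx, r - dy, r - dz]
            acc = votes.setdefault((x + dx, y + dy, z + dz), np.zeros(c_out))
            acc += feat @ w
    out = {}
    for key, pre in votes.items():
        act = np.maximum(pre + bias, 0.0)  # ReLU
        if np.any(act > 0):  # store only cells that remain active
            out[key] = act
    return out


def l1_sparsity_penalty(cells):
    """Sum of absolute post-ReLU activations over all stored cells."""
    return float(sum(np.abs(a).sum() for a in cells.values()))


# Two occupied cells in an otherwise empty grid.
rng = np.random.default_rng(0)
points = {(0, 0, 0): rng.normal(size=2), (5, 1, 2): rng.normal(size=2)}
layer = sparse_conv_by_voting(points, rng.normal(size=(3, 3, 3, 2, 4)), -np.ones(4))
print(len(layer), "active output cells, L1 penalty", l1_sparsity_penalty(layer))
```

In this sketch the penalty would be added to the detection loss with some weighting factor during training. The value of that factor is not given in the text above.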

Following a comprehensive model comparison and an analysis of the sparsity in the learned representations, we benchmark Vote3Deep models on the KITTI Object Detection Benchmark. These models have up to three layers and significantly fewer parameters than are typically encountered in image-based CNN architectures. Vote3Deep outperforms prior art for laser-only as well as laser-camera based object detection in terms of accuracy by considerable margins at competitive inference speeds.

 

  • M. Engelcke, D. Rao, D. Zeng Wang, C. Hay Tong, and I. Posner, “Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2017.

    @inproceedings{EngelckeICRA2017,
    author = {Engelcke, M. and Rao, D. and Zeng Wang, D. and Hay Tong, C. and Posner, I.},
    title = {{Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks}},
    booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation (ICRA)},
    month = {June},
    year = {2017},
    pdf = {https://arxiv.org/abs/1609.06666}
    }