Deep neural networks for semantic segmentation are most often trained with RGB color images, which encode the radiation visible to the human eye. In this paper, we study whether additional physical scene information, specifically Near-Infrared (NIR) images, improves the performance of neural networks. NIR information can be captured with conventional silicon-based cameras and provides information complementary to visible images regarding object boundaries and materials. In addition, extending the networks’ input layer from three to four channels is trivial in terms of both architectural changes and additional parameters. We perform experiments on several state-of-the-art neural networks trained both on RGB alone and on RGB plus NIR, and show that the additional image channel consistently improves semantic segmentation accuracy over conventional RGB input, even for powerful architectures.
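To give a sense of scale for the claim about additional parameters, the following minimal sketch counts the weights of a first convolution layer with three and with four input channels. The layer shape (64 filters of size 7x7, no bias) and the helper name `conv_params` are illustrative assumptions, not values taken from the networks evaluated in the paper.

```python
def conv_params(in_channels, out_channels, kernel_size, bias=True):
    """Count the learnable parameters of a 2D convolution layer."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


# Hypothetical first layer (assumed for illustration only):
# 64 filters of size 7x7, no bias term.
rgb = conv_params(in_channels=3, out_channels=64, kernel_size=7, bias=False)
rgb_nir = conv_params(in_channels=4, out_channels=64, kernel_size=7, bias=False)

# Only the first layer grows; every later layer is unchanged.
extra = rgb_nir - rgb
print(rgb, rgb_nir, extra)  # 9408 12544 3136
```

Under these assumptions the fourth channel adds one 7x7 kernel slice per filter, so the overhead depends only on the first layer's shape and stays small relative to a network with millions of parameters.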
The 26th IEEE International Conference on Image Processing - IEEE ICIP, Taipei (CN), September 2019.