Accurate semantic image segmentation requires the joint consideration of local appearance, semantic information, and global scene context. In today’s age of pre-trained deep networks and their powerful convolutional features, state-of-the-art semantic segmentation approaches differ mostly in how they combine these different kinds of information. In this work, we propose a novel scheme for aggregating features from different scales, which we refer to as Multi-Scale Context Intertwining (MSCI). In contrast to previous approaches, which typically propagate information between scales in a one-directional manner, we merge pairs of feature maps in a bidirectional and recurrent fashion, via connections between two LSTM chains. By training the parameters of the LSTM units on the segmentation task, this approach learns to extract effective features for pixel-level semantic segmentation, which are then combined hierarchically. Furthermore, rather than using fixed information propagation routes, we subdivide images into super-pixels and use the spatial relationships between them to perform image-adapted context aggregation. Our extensive evaluation on public benchmarks indicates that all of the aforementioned components of our approach increase the effectiveness of information propagation throughout the network and significantly improve its final segmentation accuracy.
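The idea of image-adapted context aggregation over super-pixels can be illustrated with a small sketch. This is not the MSCI implementation: the abstract states that aggregation is learned with LSTM units, whereas the sketch below replaces that with plain averaging over adjacent super-pixels. The function name `superpixel_context`, the single-channel feature map, and the precomputed label map are all assumptions made for illustration.

```python
from collections import defaultdict


def superpixel_context(features, labels):
    """Add neighbor-averaged context to a single-channel feature map.

    features: 2D list of floats (H x W), a stand-in for one feature channel.
    labels:   2D list of ints (H x W), a precomputed super-pixel id per pixel.
    Illustrative only: plain averaging stands in for the learned LSTM-based
    aggregation described in the abstract.
    """
    height, width = len(labels), len(labels[0])

    # 1. Mean feature of every super-pixel.
    sums, counts = defaultdict(float), defaultdict(int)
    for y in range(height):
        for x in range(width):
            sums[labels[y][x]] += features[y][x]
            counts[labels[y][x]] += 1
    means = {sp: sums[sp] / counts[sp] for sp in sums}

    # 2. Adjacency: super-pixels that share a horizontal or vertical border.
    neighbors = defaultdict(set)
    for y in range(height):
        for x in range(width):
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < height and nx < width:
                    a, b = labels[y][x], labels[ny][nx]
                    if a != b:
                        neighbors[a].add(b)
                        neighbors[b].add(a)

    # 3. Context of a super-pixel = average of its neighbors' mean features
    #    (0.0 when the super-pixel has no neighbors).
    context = {
        sp: (sum(means[n] for n in neighbors[sp]) / len(neighbors[sp])
             if neighbors[sp] else 0.0)
        for sp in means
    }

    # 4. Broadcast the context back to pixels and add it to the input features.
    return [
        [features[y][x] + context[labels[y][x]] for x in range(width)]
        for y in range(height)
    ]
```

Because the propagation routes follow super-pixel adjacency rather than a fixed grid, the set of regions that exchange information changes with the image content, which is the property the abstract refers to as image-adapted.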
|Original language||American English|
|Title of host publication||Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings|
|Editors||Vittorio Ferrari, Cristian Sminchisescu, Martial Hebert, Yair Weiss|
|Number of pages||17|
|State||Published - 2018|
|Event||15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany, 8 Sep 2018 → 14 Sep 2018|
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
Bibliographical note
Funding Information:
Acknowledgments. We thank the anonymous reviewers for their constructive comments. This work was supported in part by NSFC (61702338, 61522213, 61761146002, 61861130365), 973 Program (2015CB352501), Guangdong Science and Technology Program (2015A030312015), Shenzhen Innovation Program (KQJSCX20170727101233642, JCYJ20151015151249564), and ISF-NSFC Joint Research Program (2472/17).
© 2018, Springer Nature Switzerland AG.
Keywords
- Convolutional neural network
- Deep learning
- Long short-term memory
- Semantic segmentation