Abstract
Semantic segmentation is a key computer vision task that has been actively researched for decades. In recent years, supervised methods have reached unprecedented accuracy; however, obtaining pixel-level annotation is very time-consuming and expensive. In this paper, we propose a novel open-vocabulary approach to creating semantic segmentation masks, without the need for training segmentation networks or seeing any segmentation masks. At test time, our method takes as input the image-level labels of the categories present in the image. We utilize a vision-language embedding model to create a rough segmentation map for each class via model interpretability methods and refine the maps using a test-time augmentation technique. The output of this stage provides pixel-level pseudo-labels, which are utilized by single-image segmentation techniques to obtain high-quality output segmentations. Our method is shown quantitatively and qualitatively to outperform methods that use a similar amount of supervision, and to be competitive with weakly-supervised semantic-segmentation techniques.
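As an illustration of the refinement and pseudo-labeling steps described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes horizontal flipping as the test-time augmentation and a fixed confidence threshold, neither of which the abstract specifies. `relevance_fn` is a hypothetical callable standing in for the vision-language model plus interpretability method, and `toy_relevance` is a made-up stand-in so the example runs without any model.

```python
import numpy as np

def refine_with_flip_tta(image, label, relevance_fn):
    """Average a per-class relevance map over the image and its horizontal flip.

    Assumption: flipping is used here only as an example augmentation.
    """
    direct = relevance_fn(image, label)
    # Compute the map on the flipped image, then flip it back so pixels align.
    flipped_back = relevance_fn(image[:, ::-1], label)[:, ::-1]
    return (direct + flipped_back) / 2.0

def pseudo_labels(image, labels, relevance_fn, threshold=0.5):
    """Give each pixel the index of its highest-scoring label, or -1 if unsure.

    Assumption: a fixed threshold decides which pixels stay unlabeled.
    """
    maps = np.stack([refine_with_flip_tta(image, lab, relevance_fn) for lab in labels])
    best = maps.argmax(axis=0)
    confident = maps.max(axis=0) >= threshold
    return np.where(confident, best, -1)

def toy_relevance(image, label):
    """Hypothetical stand-in for a real relevance method: scores by brightness."""
    gray = image.mean(axis=-1)
    return gray if label == "bright" else 1.0 - gray

# Usage: a 2x4 RGB image whose left half is white and right half is black.
image = np.zeros((2, 4, 3))
image[:, :2] = 1.0
labels_map = pseudo_labels(image, ["bright", "dark"], toy_relevance)
```

In a full pipeline of the kind the abstract outlines, a map like `labels_map` would serve as the pixel-level pseudo-labels handed to a single-image segmentation technique; pixels marked -1 would be left for that stage to resolve.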
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2022 Workshops, Proceedings |
Editors | Leonid Karlinsky, Tomer Michaeli, Ko Nishino |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 56-72 |
Number of pages | 17 |
ISBN (Print) | 9783031250620 |
State | Published - 2023 |
Event | 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel. Duration: 23 Oct 2022 → 27 Oct 2022 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 13802 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 17th European Conference on Computer Vision, ECCV 2022 |
---|---|
Country/Territory | Israel |
City | Tel Aviv |
Period | 23/10/22 → 27/10/22 |
Bibliographical note
Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Keywords
- Language-based segmentation
- Open-vocabulary segmentation
- Semantic segmentation