DreamMask: Boosting Open-vocabulary Panoptic Segmentation with Synthetic Data

¹The University of Hong Kong, ²University of Central Florida
Teaser image.
We find that the advances of previous works are concentrated mainly on classes that overlap with the training set, while only marginal progress has been made on novel categories. Empirically, training with samples from larger-vocabulary data, such as the data generated by DreamMask, yields better performance on novel classes. We therefore turn to synthetic data, which previous works have shown to be effective for boosting segmentation models.

Abstract

Open-vocabulary panoptic segmentation has received significant attention due to its applicability in the real world. Despite claims of robust generalization, we find that the advancements of previous works are attributed mainly to trained categories, exposing a lack of generalization to novel classes. In this paper, we explore boosting existing models from a data-centric perspective. We propose DreamMask, which systematically explores how to generate training data in the open-vocabulary setting and how to train the model with both real and synthetic data. For the first part, we propose an automatic data generation pipeline built from off-the-shelf models, with crucial designs for vocabulary expansion, layout arrangement, and data filtering, among others. Equipped with these techniques, our generated data can significantly outperform manually collected web data. To train the model with the generated data, we design a synthetic-real alignment loss that bridges the representation gap, bringing noticeable improvements across multiple benchmarks. In general, DreamMask significantly simplifies the collection of large-scale training data and serves as a plug-and-play enhancement for existing methods. For instance, when trained on COCO and tested on ADE20K, the model equipped with DreamMask outperforms the previous state-of-the-art by a substantial margin of 2.1% mIoU.
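For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed for semantic labels. It is not taken from the DreamMask code: the function name `mean_iou`, the flattened-list input format, and the `ignore_label=255` convention are assumptions made for illustration. Benchmark evaluations usually accumulate intersections and unions over the whole dataset before averaging, rather than scoring a single image.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over the classes that appear in pred or target.

    pred and target are flat sequences of integer class labels of equal
    length (hypothetical input format chosen for this sketch).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            # Pixels marked as "ignore" in the ground truth are skipped.
            continue
        if p == t:
            # A correct pixel adds one to both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # A wrong pixel enlarges the union of the true class and,
            # if the prediction is a valid class, of the predicted class.
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1
    # Average only over classes that occur at least once.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
    print(f"mIoU = {100 * score:.1f}%")
```

Scores are typically reported multiplied by 100, so a margin such as the one quoted above is usually read as a difference between two such percentages.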






Quantitative Comparison




Qualitative Comparison

Retained Samples

Algorithm

BibTeX

@inproceedings{tu2024seg,
    title={DreamMask: Boosting Open-vocabulary Panoptic Segmentation with Synthetic Data},
    author={Tu, Yuanpeng and Chen, Xi and Lim, Ser-Nam and Zhao, Hengshuang},
    booktitle={arXiv},
    year={2024}
}