Weakly-supervised instance co-segmentation via tensor-based salient co-peak search

doi:10.1007/s11704-022-2468-8

Front. Comput. Sci., 2024, Vol. 18, Issue (2): 182305, https://doi.org/10.1007/s11704-022-2468-8

Artificial Intelligence

Wuxiu QUAN¹,², Yu HU², Tingting DAN², Junyu LI², Yue ZHANG¹, Hongmin CAI²

¹. School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou 510665, China
². School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China


Abstract

Instance co-segmentation aims to segment the co-occurrent instances in a pair of images. The task relies heavily on instance-related cues provided by co-peaks, which are generally estimated by exhaustively examining all paired candidates in a point-to-point pattern. However, such a pattern can yield many false-positive co-peaks, resulting in over-segmentation whenever there are mutual occlusions. To tackle this issue, this paper proposes an instance co-segmentation method via tensor-based salient co-peak search (TSCPS-ICS). The proposed method explores high-order correlations via triple-to-triple matching among feature maps to find reliable co-peaks with the help of co-saliency detection. It is shown to capture more accurate intra-peaks and inter-peaks among feature maps, reducing the false-positive rate of co-peak search. Given accurate co-peaks, the responses of the target instance can be inferred efficiently. Experiments on four benchmark datasets validate the superior performance of the proposed method.

Keywords: weakly-supervised; co-segmentation; co-peak; tensor matching; deep network; instance segmentation

Corresponding Author(s): Yue ZHANG

Just Accepted Date: 14 December 2022 Issue Date: 15 March 2023

Cite this article:

Wuxiu QUAN, Yu HU, Tingting DAN, et al. Weakly-supervised instance co-segmentation via tensor-based salient co-peak search[J]. Front. Comput. Sci., 2024, 18(2): 182305.

URL:

https://academic.hep.com.cn/fcs/EN/10.1007/s11704-022-2468-8
https://academic.hep.com.cn/fcs/EN/Y2024/V18/I2/182305

Fig.1 Matching between the point-to-point pattern and the third-order pattern. In the point-to-point pattern, points from different instances are easily mismatched and assigned to the same instance mask. For example, point $a$ is matched with $e$, and $c$ with $f$. However, point $b$ may be matched with $g'$ instead of $g$ because of the occlusion between the two target instances in the second image. The point pair $(b, g')$ is therefore a false-positive co-peak. Such pairs mismatch points from two co-occurrent instances and cause over-segmentation. By contrast, the high-order pattern estimates the connections among intra-peaks, so highly instance-related tuples are correlated as a whole. The tuple $(a, b, c)$ will therefore match either the tuple $(e, g, f)$ or the tuple $(e', g', f')$, which reduces the false-positive co-peaks
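The contrast described in the Fig.1 caption can be illustrated with a toy sketch. This is not the paper's tensor-based algorithm: the peak coordinates, the scalar appearance values, the cost function, and the brute-force search over triples are all assumptions made up for illustration. The sketch only shows why scoring a triple jointly (appearance plus triangle geometry) can reject a match that looks best when each point is matched on its own.

```python
from itertools import permutations
from math import dist

# Hypothetical peaks: name -> ((x, y), appearance value). All numbers are made up.
image1 = {"a": ((0.0, 0.0), 1.0), "b": ((2.0, 0.0), 2.0), "c": ((1.0, 2.0), 3.0)}
image2 = {
    # first instance; the appearance of g is disturbed, as if partly occluded
    "e": ((10.0, 0.0), 1.0), "g": ((12.0, 0.0), 2.4), "f": ((11.0, 2.0), 3.0),
    # second, nearby instance of the same category
    "e'": ((13.0, 1.0), 1.3), "g'": ((15.0, 1.0), 2.1), "f'": ((14.0, 3.0), 3.3),
}

def point_to_point(src, dst):
    """Match each source peak independently to the most similar target peak."""
    matches = []
    for _, feature in src.values():
        matches.append(min(dst, key=lambda name: abs(dst[name][1] - feature)))
    return tuple(matches)

def triple_cost(src_names, dst_names, src, dst):
    """Appearance difference plus difference in pairwise distances (assumed cost)."""
    appearance = sum(abs(src[s][1] - dst[d][1]) for s, d in zip(src_names, dst_names))
    geometry = sum(
        abs(dist(src[src_names[i]][0], src[src_names[j]][0])
            - dist(dst[dst_names[i]][0], dst[dst_names[j]][0]))
        for i, j in ((0, 1), (1, 2), (0, 2))
    )
    return appearance + geometry

def triple_to_triple(src, dst):
    """Match the source triple as a whole by brute force over ordered target triples."""
    src_names = tuple(src)
    return min(permutations(dst, 3),
               key=lambda cand: triple_cost(src_names, cand, src, dst))

pointwise_match = point_to_point(image1, image2)
triple_match = triple_to_triple(image1, image2)
print(pointwise_match)  # b is paired with g', mixing the two instances
print(triple_match)     # the whole tuple lands on a single instance
```

In this toy setting the independent matcher pairs $b$ with $g'$, while the joint matcher keeps $(a, b, c)$ on $(e, g, f)$ because the mixed triple distorts the triangle formed by the peaks.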

Fig.2 Overview of the proposed TSCPS-ICS. In the training phase, the feature maps are first extracted from paired images by a pre-trained fully convolutional network (FCN). Then two complementary modules, namely the co-peak search module and the co-saliency detection module, are introduced to explore instance-related cues, including peaks and fine saliency maps. Finally, in the inference phase, the instance-related cues are aggregated to infer the instance masks

Fig.3 A triangle described by a six-dimensional feature. The first three features $(f_1, f_2, f_3)$ are the angles of the triangle, and the other three features $(f_4, f_5, f_6)$ are the angles between the edges and the vertical
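A minimal sketch of such a six-dimensional descriptor follows. The caption does not say how the vertices and edges are ordered, or whether the edge angles are signed, so the ordering used here and the choice of unsigned angles in $[0, \pi/2]$ are assumptions; the function names are hypothetical.

```python
import math

def interior_angle(p, q, r):
    """Interior angle (radians) at vertex p of the triangle (p, q, r)."""
    v1 = (q[0] - p[0], q[1] - p[1])
    v2 = (r[0] - p[0], r[1] - p[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return abs(math.atan2(cross, dot))

def edge_to_vertical(p, q):
    """Unsigned angle in [0, pi/2] between edge pq and the vertical axis (assumed convention)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return math.atan2(abs(dx), abs(dy))

def triangle_descriptor(p1, p2, p3):
    """Return (f1, ..., f6): three interior angles, then three edge-to-vertical angles."""
    return (
        interior_angle(p1, p2, p3),
        interior_angle(p2, p3, p1),
        interior_angle(p3, p1, p2),
        edge_to_vertical(p1, p2),
        edge_to_vertical(p2, p3),
        edge_to_vertical(p3, p1),
    )

# A right triangle and a translated, scaled copy of it.
descriptor = triangle_descriptor((0, 0), (1, 0), (0, 1))
scaled = triangle_descriptor((5, 5), (8, 5), (5, 8))
print(descriptor)
print(scaled)
```

Because every component is an angle, the descriptor is unchanged by translating or uniformly scaling the triangle, which is why the two calls above agree. The edge-to-vertical components do change under rotation, so they encode the orientation of the triangle in the image.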

Tab.1 Statistics on tested datasets

Dataset	SOC		VOC12		COCO-VOC		COCO-NONVOC
Method	$mAP_{0.25}$	$mAP_{0.5}$	$mAP_{0.25}$	$mAP_{0.5}$	$mAP_{0.25}$	$mAP_{0.5}$	$mAP_{0.25}$	$mAP_{0.5}$
CLRW [36]	34.9	15.6	29.2	10.5	33.3	13.7	24.6	10.7
UODL [37]	11.0	2.7	9.4	2.0	9.6	2.2	8.5	1.8
DDT [29]	43.0	25.7	30.7	8.8	31.4	10.1	25.7	9.7
DDT+ [29]	39.6	22.4	33.6	9.4	31.7	10.6	26.0	10.1
DFF [38]	42.3	17.0	27.7	13.7	30.8	11.6	22.6	7.3
NLDF [39]	49.5	21.6	34.3	12.7	39.1	18.2	23.9	8.5
C2S-Net [40]	37.0	12.5	30.1	10.7	39.6	13.4	25.1	7.6
SCG [41]	46.6	20.8	38.4	12.9	46.8	15.5	29.7	11.2
BCS [42]	45.8	20.3	37.4	11.5	47.3	16.2	32.1	11.5
PRM [30]	-	-	45.3	14.8	44.9	14.6	-	-
DeepCo3 [14]	54.2	26.0	45.6	16.7	52.6	21.1	35.3	12.3
TSCPS-ICS	56.2	24.2	47.4	16.5	53.4	21.1	36.8	12.4

Tab.2 Performance of instance co-segmentation by our method and eleven competing methods on four tested datasets. A higher value of $mAP_{0.25}$ or $mAP_{0.5}$ indicates better performance. The optimal and sub-optimal results are highlighted in blue and green, respectively
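To make the two metrics concrete, the sketch below computes average precision at a mask-IoU threshold for a single image and category. It is an illustrative assumption, not the evaluation protocol used to produce the table: masks are represented as sets of pixel coordinates, predictions are matched greedily in descending score order, and the non-interpolated form of AP is used. Averaging such values over categories gives a mean AP.

```python
def mask_iou(a, b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def average_precision(predictions, ground_truths, iou_threshold):
    """Non-interpolated AP. predictions: list of (score, mask); ground_truths: list of masks."""
    matched = set()
    hits = []
    for _, mask in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue  # each ground-truth mask may be claimed only once
            iou = mask_iou(mask, gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(True)
        else:
            hits.append(False)
    true_positives, precision_sum = 0, 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / len(ground_truths) if ground_truths else 0.0

def rect(r0, r1, c0, c1):
    """Helper for the toy example: a filled rectangle as a set of pixels."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}

# Toy data: the first prediction overlaps its target with IoU 1/3, the second is exact.
ground_truths = [rect(0, 4, 0, 4), rect(10, 14, 10, 14)]
predictions = [(0.9, rect(0, 4, 2, 6)), (0.8, rect(10, 14, 10, 14))]

ap_25 = average_precision(predictions, ground_truths, 0.25)
ap_50 = average_precision(predictions, ground_truths, 0.5)
print(ap_25, ap_50)
```

The loosely overlapping prediction counts as a hit at the 0.25 threshold but as a false positive at 0.5, which is why scores at the stricter threshold are consistently lower in the table.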

Fig.4 Visualization of co-peak response maps (CRMs). From left to right: (a) original image, (b) CRMs extracted by DeepCo3, (c) CRMs extracted by TSCPS-ICS. The identified false-positive (FP) and true-positive (TP) co-peaks are denoted by green and yellow stars, respectively. The FP CRMs suffer from over-segmentation in scenes with mutual occlusions. In comparison, the TP CRMs are shown to eliminate the false-positive co-peaks

Fig.5 Instance co-segmentation results on five object categories from the COCO-VOC dataset: (a) sheep, (b) dog, (c) horse, (d) train, and (e) airplane. The rows show (i) the original images, (ii) the ground truth used to quantify the performance of each method, and the results of (iii) TSCPS-ICS, (iv) DeepCo3, (v) PRM, (vi) NLDF, (vii) DFF, and (viii) CLRW. The segmentations of the proposed TSCPS-ICS match the ground truth more closely than those of the competing methods

$\ell_m$	$\ell_p$	$\ell_a$	$\ell_s$	SOC	VOC12	COCO-VOC	COCO-NONVOC
$√$	$×$	$×$	$×$	36.5	33.2	32.7	25.8
$√$	$√$	$×$	$×$	49.8	43.3	49.3	33.2
$√$	$×$	$√$	$×$	38.2	34.7	34.2	26.4
$√$	$×$	$×$	$√$	45.7	40.8	42.3	29.5
$√$	$√$	$×$	$√$	54.3	46.3	52.7	35.5
Only tensor matching module				50.3	44.2	49.8	34.0
Only co-saliency detection module				39.8	36.6	37.2	27.5
All losses/modules				56.2	47.4	53.4	36.8

Tab.3 Ablation studies measured by $mAP_{0.25}$

Fig.6 Energy of the false-positive co-peaks on four datasets. The higher the energy, the better the method performs. The results of DeepCo3 and the proposed method are shown in different colors

1	Chen H Y, Lin Y Y, Chen B Y. Co-segmentation guided Hough transform for robust feature matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(12): 2388–2401
2	Subramaniam A, Nambiar A, Mittal A. Co-segmentation inspired attention networks for video-based person re-identification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019, 562–572
3	Mustafa A, Hilton A. Semantically coherent co-segmentation and reconstruction of dynamic scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017, 5583–5592
4	Rother C, Minka T, Blake A, Kolmogorov V. Cosegmentation of image pairs by histogram matching - incorporating a global constraint into MRFs. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2006, 993–1000
5	Wang F, Huang Q, Guibas L J. Image co-segmentation via consistent functional maps. In: Proceedings of the IEEE International Conference on Computer Vision. 2013, 849–856
6	Taniai T, Sinha S N, Sato Y. Joint recovery of dense correspondence and cosegmentation in two images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 4246–4255
7	Li L, Liu Z, Zhang J. Unsupervised image co-segmentation via guidance of simple images. Neurocomputing, 2018, 275: 1650–1661
8	Tao G, Li H, Huang J, Han C, Chen J, Ruan G, Huang W, Hu Y, Dan T, Zhang B, He S, Liu L, Cai H. SeqSeg: A sequential method to achieve nasopharyngeal carcinoma segmentation free from background dominance. Medical Image Analysis, 2022, 78: 102381
9	Li Y, Dan T, Li H, Chen J, Peng H, Liu L, Cai H. NPCNet: Jointly segment primary nasopharyngeal carcinoma tumors and metastatic lymph nodes in MR images. IEEE Transactions on Medical Imaging, 2022, 41(7): 1639–1650
10	Li Y, Peng H, Dan T, Hu Y, Tao G, Cai H. Coarse-to-fine nasopharyngeal carcinoma segmentation in MRI via multi-stage rendering. In: Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine. 2020, 623–628
11	Han J, Quan R, Zhang D, Nie F. Robust object co-segmentation using background prior. IEEE Transactions on Image Processing, 2018, 27(4): 1639–1651
12	Li W, Hosseini Jafari O, Rother C. Deep object co-segmentation. In: Proceedings of the 14th Asian Conference on Computer Vision. 2018, 638–653
13	Zhang K, Chen J, Liu B, Liu Q. Deep object co-segmentation via spatial-semantic network modulation. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence. 2020, 12813–12820
14	Hsu K J, Lin Y Y, Chuang Y Y. DeepCO3: Deep instance co-segmentation by co-peak search and co-saliency detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019, 8838–8847
15	Papandreou G, Chen L C, Murphy K P, Yuille A L. Weakly-and semi-supervised learning of a deep convolutional network for semantic image segmentation. In: Proceedings of the IEEE International Conference on Computer Vision. 2015, 1742–1750
16	Lin D, Dai J, Jia J, He K, Sun J. ScribbleSup: Scribble-supervised convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 3159–3167
17	Roy A, Todorovic S. Combining bottom-up, top-down, and smoothness cues for weakly supervised image segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017, 7282–7291
18	Lai B, Gong X. Saliency guided dictionary learning for weakly-supervised image parsing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 3630–3639
19	Kolesnikov A, Lampert C H. Seed, expand and constrain: three principles for weakly-supervised image segmentation. In: Proceedings of the 14th European Conference on Computer Vision. 2016, 695–711
20	Zitnick C L, Dollár P. Edge boxes: Locating object proposals from edges. In: Proceedings of the 13th European Conference on Computer Vision. 2014, 391–405
21	Pont-Tuset J, Arbeláez P, Barron J T, Marques F, Malik J. Multiscale combinatorial grouping for image segmentation and object proposal generation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(1): 128–140
22	Wan F, Wei P, Jiao J, Han Z, Ye Q. Min-entropy latent model for weakly supervised object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 1297–1306
23	Wei Y, Feng J, Liang X, Cheng M M, Zhao Y, Yan S. Object region mining with adversarial erasing: A simple classification to semantic segmentation approach. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017, 6488–6496
24	Khan R A, Meyer A, Konik H, Bouakaz S. Saliency-based framework for facial expression recognition. Frontiers of Computer Science, 2019, 13(1): 183–198
25	Li T, Zhang K, Shen S, Liu B, Liu Q, Li Z. Image co-saliency detection and instance co-segmentation using attention graph clustering based graph convolutional network. IEEE Transactions on Multimedia, 2021, 24: 492–505
26	Meng F, Luo K, Li H, Wu Q, Xu X. Weakly supervised semantic segmentation by a class-level multiple group cosegmentation and foreground fusion strategy. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(12): 4823–4836
27	Gong X, Liu X, Li Y, Li H. A novel co-attention computation block for deep learning based image co-segmentation. Image and Vision Computing, 2020, 101: 103973
28	Wei X S, Zhang C L, Li Y, Xie C W, Wu J, Shen C, Zhou Z H. Deep descriptor transforming for image co-localization. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence. 2017, 3048–3054
29	Wei X S, Zhang C L, Wu J, Shen C, Zhou Z H. Unsupervised object discovery and co-localization by deep descriptor transformation. Pattern Recognition, 2019, 88: 113–126
30	Zhou Y, Zhu Y, Ye Q, Qiu Q, Jiao J. Weakly supervised instance segmentation using class peak response. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 3791–3800
31	Fan R, Cheng M M, Hou Q, Mu T J, Wang J, Hu S M. S4Net: single stage salient-instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019, 6096–6105
32	Li G, Xie Y, Lin L, Yu Y. Instance-level salient object segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017, 247–256
33	Aumüller M, Bernhardsson E, Faithfull A. ANN-benchmarks: a benchmarking tool for approximate nearest neighbor algorithms. In: Proceedings of the 10th International Conference on Similarity Search and Applications. 2017, 34–49
34	Zass R, Shashua A. Probabilistic graph and hypergraph matching. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2008, 1–8
35	Zhang D, Han J, Zhang Y. Supervision by fusion: Towards unsupervised learning of deep salient object detector. In: Proceedings of the IEEE International Conference on Computer Vision. 2017, 4068–4076
36	Tang K, Joulin A, Li L J, Li F F. Co-localization in real-world images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014, 1464–1471
37	Cho M, Kwak S, Schmid C, Ponce J. Unsupervised object discovery and localization in the wild: part-based matching with bottom-up region proposals. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015, 1201–1210
38	Collins E, Achanta R, Susstrunk S. Deep feature factorization for concept discovery. In: Proceedings of the 15th European Conference on Computer Vision. 2018, 352–368
39	Luo Z, Mishra A, Achkar A, Eichel J, Li S, Jodoin P M. Non-local deep features for salient object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017, 6593–6601
40	Li X, Yang F, Cheng H, Liu W, Shen D. Contour knowledge transfer for salient object detection. In: Proceedings of the 15th European Conference on Computer Vision. 2018, 370–385
41	Liu N, Zhao W, Shao L, Han J. SCG: Saliency and contour guided salient instance segmentation. IEEE Transactions on Image Processing, 2021, 30: 5862–5874
42	Wang X, Feng J, Hu B, Ding Q, Ran L, Chen X, Liu W. Weakly-supervised instance segmentation via class-agnostic learning with salient images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021, 10220–10230
