Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2022, Vol. 16 Issue (5) : 165331    https://doi.org/10.1007/s11704-021-1188-9
RESEARCH ARTICLE
Improving meta-learning model via meta-contrastive loss
Pinzhuo TIAN, Yang GAO()
Department of Computer Science and Technology, Nanjing University, Jiangsu 210023, China
Abstract

Recently, addressing the few-shot learning problem within the meta-learning framework has achieved great success. Regularization is a powerful technique that is widely used to improve machine learning algorithms; however, little research has focused on designing appropriate meta-regularizations to further improve the generalization of meta-learning models in few-shot learning. In this paper, we propose a novel meta-contrastive loss that can be regarded as such a regularization, filling this gap. Our method is motivated by the observation that the limited data in few-shot learning is only a small sample drawn from the whole data distribution, and different sampled subsets can therefore yield different biased representations of that distribution. As a result, the models trained on the few training data (support set) and the test data (query set) may be misaligned in the model space, so the model learned on the support set cannot generalize well to the query data. The proposed meta-contrastive loss is designed to align the models of the support and query sets to overcome this problem, improving the performance of the meta-learning model in few-shot learning. Extensive experiments demonstrate that our method improves the performance of different gradient-based meta-learning models on various learning problems, e.g., few-shot regression and classification.

Keywords meta-learning      few-shot learning      meta-regularization      deep learning
Corresponding Author(s): Yang GAO   
Just Accepted Date: 11 June 2021   Issue Date: 31 December 2021
 Cite this article:   
Pinzhuo TIAN, Yang GAO. Improving meta-learning model via meta-contrastive loss[J]. Front. Comput. Sci., 2022, 16(5): 165331.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-021-1188-9
https://academic.hep.com.cn/fcs/EN/Y2022/V16/I5/165331
Fig.1  The motivation of our method. The very limited data in few-shot learning causes a discrepancy between the support and query sets, because they are sampled from different parts of the data distribution. As a result, the model learned on the support set cannot generalize well to the query set
Fig.3  A simple illustration of the meta-contrastive loss. In few-shot learning, each task T contains a support set and a query set. Our method uses the meta-contrastive loss to align the models of the support and query sets, eliminating the influence of the biased representations caused by the limited data. Unlike the traditional contrastive loss, which aligns feature vectors, our method must maximize the agreement between models represented by their parameter matrices
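The alignment idea described in the caption above can be sketched as an InfoNCE-style contrastive loss applied to model parameters rather than to feature vectors: the support-adapted and query-adapted parameters of the same task form a positive pair, while parameters from other tasks act as negatives. The following is a minimal illustrative sketch, not the authors' implementation; the function name, the flattening of each task's adapted parameters into one vector per row, and the temperature value are all assumptions:

```python
import numpy as np

def meta_contrastive_loss(support_params, query_params, temperature=0.5):
    """Contrastive (InfoNCE-style) loss over model parameters.

    support_params, query_params: arrays of shape (num_tasks, num_params),
    where row i holds the flattened parameters adapted on task i's support
    set and query set, respectively. Same-task rows are positive pairs.
    """
    # L2-normalize each task's flattened parameter vector
    s = support_params / np.linalg.norm(support_params, axis=1, keepdims=True)
    q = query_params / np.linalg.norm(query_params, axis=1, keepdims=True)
    # Pairwise cosine similarities, scaled by the temperature
    logits = s @ q.T / temperature
    # InfoNCE: the matching (same-task) pair lies on the diagonal
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Minimizing this quantity pushes each task's support-set model toward its own query-set model and away from the models of other tasks, which is one concrete way to realize the "maximize agreement of the models" objective in the figure.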
Methods     5-shot          10-shot
ANIL        0.746 ± 0.044   0.354 ± 0.018
ANIL-ours   0.744 ± 0.044   0.345 ± 0.018
Tab.1  Mean squared error of few-shot regression. Lower is better
Methods        Embedding   miniImageNet 5-shot
ANIL           ConvNet     58.51 ± 0.46
ANIL-ours      ConvNet     60.11 ± 0.46
R2D2           ConvNet     56.79 ± 0.41
R2D2-ours      ConvNet     61.70 ± 0.41
MetaOpt        ConvNet     64.06 ± 0.41
MetaOpt-ours   ConvNet     65.80 ± 0.40
R2D2           ResNet12    58.48 ± 0.43
R2D2-ours      ResNet12    70.04 ± 0.40
MetaOpt        ResNet12    66.64 ± 0.41
MetaOpt-ours   ResNet12    68.49 ± 0.42
Tab.2  Accuracy (%) of 5-way classification on miniImageNet
Methods        Embedding   tieredImageNet 5-shot
ANIL           ConvNet     58.64 ± 0.49
ANIL-ours      ConvNet     61.72 ± 0.50
R2D2           ConvNet     59.53 ± 0.45
R2D2-ours      ConvNet     64.21 ± 0.45
MetaOpt        ConvNet     63.97 ± 0.46
MetaOpt-ours   ConvNet     65.75 ± 0.45
R2D2           ResNet12    60.10 ± 0.46
R2D2-ours      ResNet12    70.21 ± 0.45
MetaOpt        ResNet12    66.02 ± 0.46
MetaOpt-ours   ResNet12    71.82 ± 0.47
Tab.3  Accuracy (%) of 5-way classification on tieredImageNet
Methods        CUB2011 5-shot
ANIL           71.82 ± 0.49
ANIL-ours      73.91 ± 0.47
R2D2           75.73 ± 0.40
R2D2-ours      77.23 ± 0.38
MetaOpt        74.88 ± 0.42
MetaOpt-ours   75.92 ± 0.41
Tab.4  Accuracy (%) of 5-way classification on CUB2011
Method   Our method   Scale factor   5-shot   10-shot
ANIL                                 0.746    0.354
ANIL     ✓                           0.744    0.345
ANIL                  ✓              2.561    1.811
Tab.5  Mean squared error of different methods in few-shot regression
1 C Finn, P Abbeel, S Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In: Proceedings of the 34th International Conference on Machine Learning. 2017, 1126−1135
2 J Snell, K Swersky, R S Zemel. Prototypical networks for few-shot learning. 2017, arXiv preprint arXiv: 1703.05175
3 P Tian, Z Wu, L Qi, L Wang, Y Shi, Y Gao. Differentiable meta-learning model for few-shot semantic segmentation. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 12087−12094
4 I Goodfellow, Y Bengio, A Courville. Deep Learning. Cambridge: MIT Press, 2016
5 D P Kingma, M Welling. Auto-encoding variational bayes. 2014, arXiv preprint arXiv: 1312.6114
6 N Srivastava, G Hinton, A Krizhevsky, I Sutskever, R Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 2014, 15(1): 1929−1958
7 T Chen, S Kornblith, M Norouzi, G E Hinton. A simple framework for contrastive learning of visual representations. In: Proceedings of the 37th International Conference on Machine Learning. 2020, 1597−1607
8 K He, H Fan, Y Wu, S Xie, R Girshick. Momentum contrast for unsupervised visual representation learning. In: Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, 9726−9735
9 O Vinyals, C Blundell, T Lillicrap, K Kavukcuoglu, D Wierstra. Matching networks for one shot learning. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016, 3637−3645
10 F Sung, Y Yang, L Zhang, T Xiang, P H S Torr, T M Hospedales. Learning to compare: relation network for few-shot learning. In: Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 1199−1208
11 S Ravi, H Larochelle. Optimization as a model for few-shot learning. In: Proceedings of the 5th International Conference on Learning Representations. 2017
12 A Santoro, S Bartunov, M Botvinick, D Wierstra, T Lillicrap. One-shot learning with memory-augmented neural networks. 2016, arXiv preprint arXiv: 1605.06065
13 H B Lee, H Lee, D Na, S Kim, M Park, E Yang, S J Hwang. Learning to balance: bayesian meta-learning for imbalanced and out-of-distribution tasks. 2020, arXiv preprint arXiv: 1905.12917
14 A Raghu, M Raghu, S Bengio, O Vinyals. Rapid learning or feature reuse? towards understanding the effectiveness of MAML. In: Proceedings of the 33rd Conference on Neural Information Processing Systems. 2019
15 K Lee, S Maji, A Ravichandran, S Soatto. Meta-learning with differentiable convex optimization. In: Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019, 10649−10657
16 L Bertinetto, J F Henriques, P H S Torr, A Vedaldi. Meta-learning with differentiable closed-form solvers. In: Proceedings of the 7th International Conference on Learning Representations. 2019
17 A Sinha, P Malo, K Deb. A review on bilevel optimization: from classical to evolutionary approaches and applications. IEEE Transactions on Evolutionary Computation, 2018, 22(2): 276−295
18 Y Balaji, S Sankaranarayanan, R Chellappa. MetaReg: towards domain generalization using meta-regularization. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. 2018, 1006−1016
19 H Y Tseng, Y W Chen, Y H Tsai, S Liu, Y Y Lin, M H Yang. Regularizing meta-learning via gradient dropout. In: Proceedings of the 15th Asian Conference on Computer Vision. 2020, 218−234
20 A Jaiswal, A R Babu, M Z Zadeh, D Banerjee, F Makedon. A survey on contrastive self-supervised learning. Technologies, 2021, 9(1): 2
21 A van den Oord, Y Li, O Vinyals. Representation learning with contrastive predictive coding. 2018, arXiv preprint arXiv: 1807.03748
22 M Tschannen, J Djolonga, P K Rubenstein, S Gelly, M Lucic. On mutual information maximization for representation learning. In: Proceedings of the 8th International Conference on Learning Representations. 2020
23 L Franceschi, P Frasconi, S Salzo, R Grazzi, M Pontil. Bilevel programming for hyperparameter optimization and meta-learning. In: Proceedings of the 35th International Conference on Machine Learning. 2018, 1568−1577
24 C Cortes, V Vapnik. Support-vector networks. Machine Learning, 1995, 20(3): 273−297
25 D Kingma, J Ba. Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference for Learning Representations. 2015
26 M Ren, E Triantafillou, S Ravi, J Snell, K Swersky, J B Tenenbaum, H Larochelle, R S Zemel. Meta-learning for semi-supervised few-shot classification. In: Proceedings of the 6th International Conference on Learning Representations. 2018
27 A Krizhevsky, I Sutskever, G E Hinton. ImageNet classification with deep convolutional neural networks. In: Proceedings of 26th Annual Conference on Neural Information Processing Systems. 2012, 1106−1114
28 B N Oreshkin, P Rodriguez, A Lacoste. TADAM: task dependent adaptive metric for improved few-shot learning. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. 2018, 719−729
29 P Welinder, S Branson, T Mita, C Wah, F Schroff, S Belongie, P Perona. Caltech-UCSD birds 200. CNS-TR-2010-001. Pasadena: California Institute of Technology, 2010
30 W Y Chen, Y C Liu, Z Kira, Y C F Wang, J B Huang. A closer look at few-shot classification. In: Proceedings of the 7th International Conference on Learning Representations. 2019