|
|
Teachers cooperation: team-knowledge distillation for multiple cross-domain few-shot learning |
Zhong JI1,2,3, Jingwei NI1,3, Xiyao LIU1,3(), Yanwei PANG1,3 |
1. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China 2. Science and Technology on Electro-Optical Information Security Control Laboratory, Tianjin 300308, China 3. Tianjin Key Laboratory of Brain-Inspired Intelligence Technology, Tianjin 300072, China |
|
|
Abstract Although few-shot learning (FSL) has made great progress, it remains an enormous challenge, especially when the source and target sets come from different domains, a setting known as cross-domain few-shot learning (CD-FSL). Utilizing more source-domain data is an effective way to improve the performance of CD-FSL. However, knowledge from different source domains may entangle and confuse each other, which hurts performance on the target domain. Therefore, we propose team-knowledge distillation networks (TKD-Net) to tackle this problem by exploring a strategy that helps multiple teachers cooperate. Specifically, we distill knowledge from the cooperation of teacher networks into a single student network within a meta-learning framework. TKD-Net incorporates task-oriented knowledge distillation and multiple forms of cooperation among teachers to train an efficient student with better generalization ability on unseen tasks. Moreover, it employs both response-based knowledge and relation-based knowledge to transfer more comprehensive and effective knowledge. Extensive experiments on four fine-grained datasets demonstrate the effectiveness and superiority of the proposed TKD-Net.
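The response-based knowledge mentioned above follows the general logit-distillation recipe: soften each teacher's predictions with a temperature, combine them, and train the student to match the combined distribution alongside the ground-truth labels. The following is only a minimal illustrative sketch of that idea, not the authors' implementation; the function and parameter names (`multi_teacher_kd_loss`, `T`, `alpha`) are hypothetical, and here the teachers cooperate by simple averaging.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_teacher_kd_loss(student_logits, teacher_logits_list, labels,
                          T=4.0, alpha=0.5):
    """Sketch of response-based multi-teacher distillation.

    student_logits: (batch, classes) array from the student network.
    teacher_logits_list: list of (batch, classes) arrays, one per teacher.
    labels: (batch,) integer ground-truth labels.
    """
    # Cooperation of teachers (here: plain averaging of their
    # temperature-softened class distributions)
    teacher_probs = np.mean([softmax(t / T) for t in teacher_logits_list],
                            axis=0)
    student_log_probs = np.log(softmax(student_logits / T))
    # KL(teachers || student), scaled by T^2 as is standard in distillation
    kd = np.mean(np.sum(teacher_probs * (np.log(teacher_probs)
                                         - student_log_probs), axis=-1)) * T * T
    # Supervised cross-entropy on the ground-truth labels
    ce = -np.mean(np.log(softmax(student_logits)[np.arange(len(labels)),
                                                 labels]))
    # Blend the distillation and supervised terms
    return alpha * kd + (1 - alpha) * ce
```

In a meta-learning setting such as the one described, this loss would be computed per episode (task), and the relation-based term, which the sketch omits, would additionally match pairwise sample relations between teacher and student embeddings.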
|
Keywords
cross-domain few-shot learning
meta-learning
knowledge distillation
multiple teachers
|
Corresponding Author(s):
Xiyao LIU
|
Just Accepted Date: 09 September 2021
Issue Date: 01 August 2022
|
|
|