Frontiers of Computer Science

Front. Comput. Sci.    2023, Vol. 17 Issue (1) : 171302    https://doi.org/10.1007/s11704-021-1237-4
RESEARCH ARTICLE
Meta-BN Net for few-shot learning
Wei GAO1, Mingwen SHAO1, Jun SHU2, Xinkai ZHUANG1
1. School of Computer Science, China University of Petroleum, Qingdao 266580, China
2. School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an 710049, China
Abstract

In this paper, we propose a lightweight network with an adaptive batch normalization module, called Meta-BN Net, for few-shot classification. Unlike existing few-shot learning methods, which rely on complex models or algorithms, our approach extends batch normalization, an essential component of current deep neural network training whose potential has not been fully explored. In particular, a meta-module is introduced to learn to adaptively generate more powerful affine transformation parameters, known as γ and β, in the batch normalization layer, so that the representation ability of batch normalization can be activated. Experimental results on miniImageNet demonstrate that Meta-BN Net not only outperforms the baseline methods by a large margin but is also competitive with recent state-of-the-art few-shot learning methods. We also conduct experiments on the Fewshot-CIFAR100 and CUB datasets, and the results show that our approach effectively boosts the performance of weak baseline networks. We believe our findings can motivate further exploration of the untapped capacity of basic components in neural networks, as well as more efficient few-shot learning methods.
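To make the mechanism concrete, below is a minimal PyTorch-style sketch of a batch normalization layer whose affine parameters γ and β are supplied per sample by an external meta-module rather than being fixed learnable weights. The class name, shapes, and interface are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class AdaptiveBN2d(nn.Module):
    """Batch norm whose affine parameters (gamma, beta) are predicted
    per channel by an external meta-module, instead of being fixed
    learnable weights. Shapes and naming are illustrative assumptions."""

    def __init__(self, num_channels: int, eps: float = 1e-5):
        super().__init__()
        # Normalization only; the affine transform is applied externally.
        self.bn = nn.BatchNorm2d(num_channels, affine=False, eps=eps)

    def forward(self, x: torch.Tensor, gamma: torch.Tensor, beta: torch.Tensor):
        # x: (N, C, H, W); gamma, beta: (N, C), produced by the meta-module.
        x_hat = self.bn(x)
        gamma = gamma.unsqueeze(-1).unsqueeze(-1)  # -> (N, C, 1, 1)
        beta = beta.unsqueeze(-1).unsqueeze(-1)
        return gamma * x_hat + beta
```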

Keywords: meta-learning; few-shot learning; batch normalization
Corresponding Author(s): Mingwen SHAO   
Just Accepted Date: 06 August 2021   Issue Date: 01 March 2022
 Cite this article:   
Wei GAO, Mingwen SHAO, Jun SHU, et al. Meta-BN Net for few-shot learning[J]. Front. Comput. Sci., 2023, 17(1): 171302.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-021-1237-4
https://academic.hep.com.cn/fcs/EN/Y2023/V17/I1/171302
Fig.2  The architecture of Meta-BN Net (left) and the structure of the Meta-BN Generator (right). Given a batch of images, the Feature Extractor first extracts feature representations layer by layer. Then, conditioned on the output of each convolutional layer, the Meta-BN Generator produces adaptive affine transformation parameters to adjust the feature distribution. Finally, the modified feature representation is fed into the Classifier for classification. Each Meta-BN Generator consists of four convolution blocks followed by an average pooling layer for the extraction of meta-information
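Based only on the caption above, a plausible rendering of the Meta-BN Generator is four convolution blocks followed by global average pooling, with two heads emitting per-channel γ and β. The hidden width, kernel size, and head design are assumptions for illustration, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class MetaBNGenerator(nn.Module):
    """Predicts per-sample, per-channel (gamma, beta) from an intermediate
    feature map: four conv blocks + global average pooling, as in the Fig.2
    caption. Hidden width and kernel size are illustrative assumptions."""

    def __init__(self, in_channels: int, out_channels: int, hidden: int = 64):
        super().__init__()

        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            )

        self.encoder = nn.Sequential(
            block(in_channels, hidden),
            block(hidden, hidden),
            block(hidden, hidden),
            block(hidden, hidden),
            nn.AdaptiveAvgPool2d(1),  # global average pool -> (N, hidden, 1, 1)
        )
        # Two heads emit the affine parameters for the target BN layer.
        self.to_gamma = nn.Linear(hidden, out_channels)
        self.to_beta = nn.Linear(hidden, out_channels)

    def forward(self, feat: torch.Tensor):
        z = self.encoder(feat).flatten(1)          # (N, hidden)
        return self.to_gamma(z), self.to_beta(z)   # each (N, out_channels)
```

Paired with the AdaptiveBN2d sketch above, one such generator per convolution layer would adjust that layer's feature distribution.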
| Method | Setting | Scheme | Backbone | 5-way 1-shot | 5-way 5-shot |
| --- | --- | --- | --- | --- | --- |
| Meta LSTM [41] | Inductive | Meta-based | Conv-4 | 43.44±0.77 | 60.60±0.71 |
| MatchingNet [7] | Inductive | Meta-based | Conv-4 | 43.56±0.84 | 55.31±0.73 |
| MAML [21] | Inductive | Meta-based | Conv-4 | 48.70±1.84 | 63.11±0.92 |
| ProtoNet [25] | Inductive | Meta-based | Conv-4 | 49.42±0.78 | 68.20±0.66 |
| Reptile [23] | Inductive | Meta-based | Conv-4 | 49.97±0.32 | 49.97±0.32 |
| GNN [44] | Inductive | Meta-based | Conv-4 | 50.33±0.36 | 66.41±0.63 |
| RelationNet [45] | Inductive | Meta-based | Conv-4 | 50.44±0.82 | 65.32±0.70 |
| Meta SGD [22] | Inductive | Meta-based | Conv-4 | 50.47±1.87 | 64.03±0.94 |
| Qiao et al. [46] | Inductive | Meta-based | Conv-4 | 54.53±0.40 | 67.87±0.20 |
| FEAT [47] | Inductive | Meta-based | Conv-4 | 55.15±0.20 | 71.61±0.16 |
| CAML [48] | Inductive | Meta-based | ResNet-12 | 59.23±0.99 | 72.35±0.71 |
| TADAM [33] | Inductive | Meta-based | ResNet-12 | 58.50±0.30 | 76.70±0.30 |
| AM3 [49] | Inductive | Meta-based | ResNet-12 | 65.30±0.49 | 78.10±0.36 |
| DeepEMD [50] | Inductive | Meta-based | ResNet-12 | 65.91±0.82 | 82.41±0.56 |
| LEO [51] | Inductive | Meta-based | WRN-28-10 | 65.10±0.20 | 81.11±0.14 |
| EPNet [52] | Transductive | Meta-based | WRN-28-10 | 70.74±0.85 | 79.20±0.40 |
| SIB [53] | Transductive | Meta-based | WRN-28-10 | 70.00±0.60 | 79.20±0.40 |
| SimpleShot (CL2N) [28] | Inductive | Non-Meta | Conv-4 | 49.69±0.19 | 66.92±0.17 |
| Baseline++ [13] | Inductive | Non-Meta | Conv-4 | 48.24±0.75 | 66.43±0.63 |
| Baseline++ [13] | Inductive | Non-Meta | ResNet-18 | 51.87±0.77 | 75.68±0.63 |
| Transductive fine-tuning [27] | Transductive | Non-Meta | Conv-4 | 52.30±0.61 | 68.78±0.53 |
| Meta-BN Net | Inductive | Mixed | Conv-4 | 71.73±0.23 | 82.58±0.17 |
Tab.1  Average accuracy (%) of 5-way 1-shot and 5-shot classification on miniImageNet
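For context, the "5-way 1-shot" and "5-way 5-shot" columns follow the standard episodic evaluation protocol: each test episode samples 5 classes, with 1 or 5 labeled support images per class and a separate set of query images to classify. Below is a minimal sketch of such an episode sampler; the dataset layout, query count, and episode count are assumptions, not the paper's exact setup:

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way=5, k_shot=1, n_query=15):
    """Sample one N-way K-shot episode from a list of (index, class) pairs.
    Returns support and query index lists; the layout is an illustrative
    assumption (each class must hold at least k_shot + n_query samples)."""
    by_class = defaultdict(list)
    for idx, cls in labels:
        by_class[cls].append(idx)
    classes = random.sample(sorted(by_class), n_way)
    support, query = [], []
    for cls in classes:
        picked = random.sample(by_class[cls], k_shot + n_query)
        support += picked[:k_shot]
        query += picked[k_shot:]
    return support, query

# Reported accuracies are averaged over many such episodes
# (commonly 600 or more), with a 95% confidence interval.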
| γ & β | 5-way 1-shot (%) | 5-way 5-shot (%) | DB index |
| --- | --- | --- | --- |
| Naive | 33.75±0.17 | 63.23±0.17 | 0.31 |
| Fixed | 40.29±0.18 | 63.18±0.18 | 0.26 |
| Meta-BN | 71.73±0.23 | 82.58±0.17 | 0.04 |
Tab.2  Ablation study on the miniImageNet dataset
| γ & β | 5-way 1-shot (%) | 5-way 5-shot (%) | DB index |
| --- | --- | --- | --- |
| Naive | 36.73±0.17 | 63.98±0.17 | 0.24 |
| Fixed | 38.64±0.17 | 63.36±0.18 | 0.25 |
| Meta-BN | 40.89±0.18 | 68.57±0.17 | 0.21 |
Tab.3  Ablation study on the CUB dataset
| γ & β | 5-way 1-shot (%) | 5-way 5-shot (%) | DB index |
| --- | --- | --- | --- |
| Naive | 29.56±0.14 | 48.18±0.18 | 0.25 |
| Fixed | 31.50±0.15 | 48.92±0.17 | 0.22 |
| Meta-BN | 32.29±0.15 | 50.49±0.18 | 0.19 |
Tab.4  Ablation study on the FC100 dataset
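The DB index column in the ablation tables is the Davies–Bouldin cluster separation measure [54]; lower values indicate tighter, better-separated class clusters in feature space. As an illustration, it can be computed from extracted feature embeddings with scikit-learn's davies_bouldin_score; the feature matrix and labels below are random placeholders, not the paper's data:

```python
import numpy as np
from sklearn.metrics import davies_bouldin_score

# Placeholder inputs: rows are feature embeddings, one class label per row.
features = np.random.randn(500, 64)          # e.g., 500 samples, 64-dim features
labels = np.random.randint(0, 5, size=500)   # e.g., 5 classes

# Davies-Bouldin index: for each cluster, take the worst-case ratio of summed
# within-cluster scatter to between-centroid distance, then average. Lower is better.
print(davies_bouldin_score(features, labels))
```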
Fig.3  Qualitative visualization of Meta-BN Net: without Meta-BN generator (left) and with Meta-BN generator (right)
1 A Krizhevsky, I Sutskever, G E Hinton. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 26th Conference on Neural Information Processing Systems. 2012, 1106−1114
2 K Simonyan, A Zisserman. Very deep convolutional networks for large-scale image recognition. In: Proceedings of the 3rd International Conference on Learning Representations. 2015
3 C Szegedy, W Liu, Y Jia, P Sermanet, S Reed, D Anguelov, D Erhan, V Vanhoucke, A Rabinovich. Going deeper with convolutions. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2015, 1−9
4 K He, X Zhang, S Ren, J Sun. Deep residual learning for image recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2016, 770−778
5 B M Lake, R Salakhutdinov, J B Tenenbaum. Human-level concept learning through probabilistic program induction. Science, 2015, 350(6266): 1332−1338
6 L Fei-Fei, R Fergus, P Perona. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(4): 594−611
7 O Vinyals, C Blundell, T Lillicrap, K Kavukcuoglu, D Wierstra. Matching networks for one shot learning. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016, 3637−3645
8 Y Tian, Y Wang, D Krishnan, J B Tenenbaum, P Isola. Rethinking few-shot image classification: a good embedding is all you need? In: Proceedings of the 16th European Conference on Computer Vision. 2020, 266– 282
9 M Goldblum, S Reich, L Fowl, R Ni, V Cherepanova, T Goldstein. Unraveling meta-learning: understanding feature representations for few-shot tasks. In: Proceedings of the 37th International Conference on Machine Learning. 2020, 3607−3616
10 N Bendre, H T Marín, P Najafirad. Learning from few samples: a survey. 2020, arXiv preprint arXiv: 2007.15484
11 Y Bengio, S Bengio, J Cloutier. Learning a synaptic learning rule. In: Proceedings of IJCNN-91-Seattle International Joint Conference on Neural Networks. 1991, 969
12 J Schmidhuber. Evolutionary principles in self-referential learning. On learning how to learn: the meta-meta-... hook. Technische Universität München, Dissertation, 1987
13 W Y Chen, Y C Liu, Z Kira, Y C F Wang, J B Huang. A closer look at few-shot classification. In: Proceedings of the 7th International Conference on Learning Representations. 2019
14 J Bronskill, J Gordon, J Requeima, S Nowozin, R E Turner. TaskNorm: rethinking batch normalization for meta-learning. In: Proceedings of the 37th International Conference on Machine Learning. 2020, 1153−1164
15 J L Ba, J R Kiros, G E Hinton. Layer normalization. 2016, arXiv preprint arXiv: 1607.06450
16 D Ulyanov, A Vedaldi, V Lempitsky. Instance normalization: the missing ingredient for fast stylization. 2016, arXiv preprint arXiv: 1607.08022
17 S Ioffe, C Szegedy. Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of the 32nd International Conference on Machine Learning. 2015, 448−456
18 A Santoro, S Bartunov, M Botvinick, D Wierstra, T Lillicrap. One-shot learning with memory-augmented neural networks. 2016, arXiv preprint arXiv: 1605.06065
19 A Graves, G Wayne, I Danihelka. Neural turing machines. 2014, arXiv preprint arXiv: 1410.5401
20 N Mishra, M Rohaninejad, X Chen, P Abbeel. A simple neural attentive meta-learner. In: Proceedings of the 6th International Conference on Learning Representations. 2018
21 C Finn, P Abbeel, S Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In: Proceedings of the 34th International Conference on Machine Learning. 2017, 1126−1135
22 Z G Li, F W Zhou, F Chen, H Li. Meta-SGD: learning to learn quickly for few-shot learning. 2017, arXiv preprint arXiv: 1707.09835
23 A Nichol, J Achiam, J Schulman. On first-order meta-learning algorithms. 2018, arXiv preprint arXiv: 1803.02999
24 M A Jamal, G J Qi. Task agnostic meta-learning for few-shot learning. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019, 11711−11719
25 J Snell, K Swersky, R Zemel. Prototypical networks for few-shot learning. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 4080−4090
26 S Gidaris, N Komodakis. Dynamic few-shot visual learning without forgetting. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 4367−4375
27 G S Dhillon, P Chaudhari, A Ravichandran, S Soatto. A baseline for few-shot image classification. In: Proceedings of the 8th International Conference on Learning Representations. 2020
28 Y Wang, W L Chao, K Q Weinberger, L van der Maaten. SimpleShot: revisiting nearest-neighbor classification for few-shot learning. 2019, arXiv preprint arXiv: 1911.04623
29 H de Vries, F Strub, J Mary, H Larochelle, O Pietquin, A Courville. Modulating early visual processing by language. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 6597−6607
30 V Dumoulin, J Shlens, M Kudlur. A learned representation for artistic style. In: Proceedings of the 5th International Conference on Learning Representations. 2017
31 X Huang, S Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of IEEE International Conference on Computer Vision. 2017, 1510−1519
32 T Park, M Y Liu, T C Wang, J Y Zhu. Semantic image synthesis with spatially-adaptive normalization. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019, 2332−2341
33 B N Oreshkin, P Rodriguez, A Lacoste. TADAM: task dependent adaptive metric for improved few-shot learning. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. 2018, 719−729
34 E Perez, F Strub, H de Vries, V Dumoulin, A Courville. FiLM: visual reasoning with a general conditioning layer. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 2018, 3942−3951
35 H Y Tseng, H Y Lee, J B Huang, M H Yang. Cross-domain few-shot classification via learned feature-wise transformation. In: Proceedings of International Conference on Learning Representations. 2020
36 Y Guo, N C Codella, L Karlinsky, J V Codella, J R Smith, K Saenko, T Rosing, R Feris. A broader study of cross-domain few-shot learning. In: Proceedings of the 16th European Conference on Computer Vision. 2020, 124– 141
37 J Requeima, J Gordon, J Bronskill, S Nowozin, R E Turner. Fast and flexible multi-task classification using conditional neural adaptive processes. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems. 2019, 7957−7968
38 A G Howard, M L Zhu, B Chen, D Kalenichenko, W J Wang, T Weyand, M Andreetto, H Adam. MobileNets: efficient convolutional neural networks for mobile vision applications. 2017, arXiv preprint arXiv: 1704.04861
39 C Wah, S Branson, P Welinder, P Perona, S Belongie. The Caltech-UCSD birds-200-2011 dataset. Technical Report CNS-TR-2011-001. Pasadena, CA, USA: California Institute of Technology, 2011
40 O Russakovsky, J Deng, H Su, J Krause, S Satheesh, S Ma, Z Huang, A Karpathy, A Khosla, M Bernstein, A C Berg, L Fei-Fei. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 2015, 115(3): 211−252
41 S Ravi, H Larochelle. Optimization as a model for few-shot learning. In: Proceedings of the 5th International Conference on Learning Representations. 2017
42 A Krizhevsky, G Hinton. Learning multiple layers of features from tiny images. University of Toronto, Technical Report, 2009
43 N Hilliard, L Phillips, S Howland, A Yankov, C D Corley, N O Hodas. Few-shot learning with metric-agnostic conditional embeddings. 2018, arXiv preprint arXiv: 1802.04376
44 V G Satorras, J B Estrach. Few-shot learning with graph neural networks. In: Proceedings of the 6th International Conference on Learning Representations. 2018
45 F Sung, Y Yang, L Zhang, T Xiang, P H S Torr, T M Hospedales. Learning to compare: relation network for few-shot learning. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 1199−1208
46 S Qiao, C Liu, W Shen, A Yuille. Few-shot image recognition by predicting parameters from activations. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 7229−7238
47 H J Ye, H Hu, D C Zhan, F Sha. Few-shot learning via embedding adaptation with set-to-set functions. 2018, arXiv preprint arXiv: 1812.03664
48 X Jiang, M Havaei, F Varno, G Chartrand, N Chapados, S Matwin. Learning to learn with conditional class dependencies. In: Proceedings of the 7th International Conference on Learning Representations. 2019
49 C Xing, N Rostamzadeh, B N Oreshkin, P O Pinheiro. Adaptive cross-modal few-shot learning. In: Proceedings of International Conference on Neural Information Processing Systems. 2019, 4848−4858
50 C Zhang, Y Cai, G Lin, C Shen. DeepEMD: few-shot image classification with differentiable earth mover’s distance and structured classifiers. In: Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, 12200−12210
51 A A Rusu, D Rao, J Sygnowski, O Vinyals, R Pascanu, S Osindero, R Hadsell. Meta-learning with latent embedding optimization. In: Proceedings of the 7th International Conference on Learning Representations. 2019
52 P Rodríguez, I Laradji, A Drouin, A Lacoste. Embedding propagation: smoother manifold for few-shot classification. In: Proceedings of the 16th European Conference on Computer Vision. 2020, 121−138
53 S X Hu, P G Moreno, Y Xiao, X Shen, G Obozinski, N D Lawrence, A C Damianou. Empirical Bayes transductive meta-learning with synthetic gradients. In: Proceedings of the 8th International Conference on Learning Representations. 2020
54 D L Davies, D W Bouldin. A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1979, 1(2): 224−227