1. Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China 2. University of Chinese Academy of Sciences, Beijing 100049, China
Robust face representation is imperative to highly accurate face recognition. In this work, we propose an open source face recognition method with deep representation named as VIPLFaceNet, which is a 10-layer deep convolutional neural network with seven convolutional layers and three fully-connected layers. Compared with the well-known AlexNet, our VIPLFaceNet takes only 20% training time and 60% testing time, but achieves 40% drop in error rate on the real-world face recognition benchmark LFW. Our VIPLFaceNet achieves 98.60% mean accuracy on LFW using one single network. An open-source C++ SDK based on VIPLFaceNet is released under BSD license. The SDK takes about 150ms to process one face image in a single thread on an i7 desktop CPU. VIPLFaceNet provides a state-of-the-art start point for both academic and industrial face recognition applications.
Zhao W Y, Chellappa R, Phillips P J, Rosenfeld A. Face recognition: a literature survey. ACM Computing Surveys, 2003, 35(4): 399–458 https://doi.org/10.1145/954339.954342
2
Liu C, Wechsler H. Gabor feature based classification using the enhanced fisher linear discriminant model for face recognition. IEEE Transactions on Image Processing, 2002, 11(4): 467–476 https://doi.org/10.1109/TIP.2002.999679
3
Ahonen T, Hadid A, Pietikainen M. Face description with local binary patterns: application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(12): 2037–2041 https://doi.org/10.1109/TPAMI.2006.244
4
Chen D, Cao X D, Wen F, Sun J. Blessing of dimensionality: highdimensional feature and its efficient compression for face verification. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2013, 3025–3032 https://doi.org/10.1109/cvpr.2013.389
5
Albiol A, Monzo D, Martin A, Sastre J, Albiol A. Face recognition using HOG-EBGM. Pattern Recognition Letters, 2008, 29(10): 1537–1543 https://doi.org/10.1016/j.patrec.2008.03.017
6
Vu N S, Caplier A. Enhanced patterns of oriented edge magnitudes for face recognition and image matching. IEEE Transactions on Image Processing, 2012, 21(3): 1352–1365 https://doi.org/10.1109/TIP.2011.2166974
7
Hussain S U, Napoléon T, Jurie F. Face recognition using local quantized patterns. In: Proceedings of British Machive Vision Conference. 2012, 11–20 https://doi.org/10.5244/c.26.99
8
Bicego M, Lagorio A, Grosso E, Tistarelli M. On the use of SIFT features for face authentication. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2006, 35–35 https://doi.org/10.1109/cvprw.2006.149
9
Kumar R, Banerjee A, Vemuri B C, Pfister H. Trainable convolution filters and their application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(7): 1423–1436 https://doi.org/10.1109/TPAMI.2011.225
10
Lei Z, Yi D, Li S Z. Discriminant image filter learning for face recognition with local binary pattern like representation. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2012, 2512–2517
11
Xie S F, Shan S G, Chen X L, Meng X, Gao W. Learned local gabor patterns for face representation and recognition. Signal Processing, 2009, 89(12): 2333–2344 https://doi.org/10.1016/j.sigpro.2009.02.016
12
Cao Z M, Yin Q, Tang X O, Sun J. Face recognition with learningbased descriptor. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2010, 2707–2714
13
Cui Z, Li W, Xu D, Shan S G, Chen X. Fusing robust face region descriptors via multiple metric learning for face recognition in the wild. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2013, 3554–3561 https://doi.org/10.1109/cvpr.2013.456
14
Berg T, Belhumeur P N. Tom-vs-Pete classifiers and identitypreserving alignment for face verification. In: Proceedings of British Machine Vision Conference. 2012, 5
15
Taigman Y, Yang M, Ranzato M, Wolf L. Deepface: closing the gap to human-level performance in face verification. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2014, 1701–1708 https://doi.org/10.1109/cvpr.2014.220
16
Sun Y, Wang X, Tang X. Deep learning face representation from predicting 10,000 classes. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2014, 1891–1898 https://doi.org/10.1109/cvpr.2014.244
17
Sun Y, Chen Y H, Wang X G, Tang X O. Deep learning face representation by joint identification-verification. In: Proceedings of Advances in Neural Information Processing Systems. 2014, 1988–1996
18
Sun Y, Wang X G, Tang X O. Deeply learned face representations are sparse, selective, and robust. 2014, arXiv:1412.1265
19
Schroff F, Kalenichenko D, Philbin J. Facenet: a unified embedding for face recognition and clustering. 2015, arXiv:1503.03832
20
Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks. In: Proceedings of Advances in Neural Information Processing Systems. 2012, 1097–1105
21
Liu X, Shan S G, Li S X, Hauptmann A G. Everything is in the face? represent faces with object bank. In: Proceedings of Asian Conference on Computer Vision Workshops. 2014, 180–193
22
Simonyan K, Parkhi O M, Vedaldi A, Zisserman A. Fisher vector faces in the wild. In: Proceedings of British Machive Vision Conference. 2013 https://doi.org/10.5244/c.27.8
23
Kumar N, Berg A C, Belhumeur P N, Nayar S K. Attribute and simile classifiers for face verification. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2009, 365–372 https://doi.org/10.1109/iccv.2009.5459250
24
Yi D, Lei Z, Liao S C, Li S Z. Learning face representation from scratch, 2014, arXiv:1411.7923
25
Chen D, Cao X, Wang L, Wen F, Sun J. Bayesian face revisited: a joint formulation. In: Proceedings of European Conference on Computer Vision. 2012, 566–579 https://doi.org/10.1007/978-3-642-33712-3_41
26
Samaria F S, Harter A C. Parameterisation of a stochastic model for human face identification. In: Proceedings of IEEE Workshop on Applications of Computer Vision. 1994, 138–142 https://doi.org/10.1109/acv.1994.341300
27
Martinez A M. The AR face database. CVC Technical Report, 1998, 24
28
Phillips P J, Moon H, Rizvi S A, Rauss P J. The feret evaluation methodology for face-recognition algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(10): 1090–1104 https://doi.org/10.1109/34.879790
29
Sim T, Baker S, Bsat M. The CMU pose, illumination, and expression (PIE) database. In: Proceedings of IEEE International Conference on Automatic Face and Gesture Recognition. 2002, 46–51 https://doi.org/10.1109/AFGR.2002.1004130
30
Phillips P J, Flynn P J, Scruggs T, Bowyer K W, Chang J, Hoffman K, Marques J, Min J, Worek W. Overview of the face recognition grand challenge. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2005, 947–954 https://doi.org/10.1109/cvpr.2005.268
31
Lee K C, Ho J, Kriegman D. Acquiring linear subspaces for face recognition under variable lighting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(5): 684–698 https://doi.org/10.1109/TPAMI.2005.92
32
Gao W, Cao B, Shan S G, Chen X L, Zhou D L, Zhang X H, Zhao D B. The CAS-PEAL large-scale Chinese face database and baseline evaluations. IEEE Transactions on Systems, Man and Cybernetics Part A System and Humans, 2008, 38(1): 149–161 https://doi.org/10.1109/TSMCA.2007.909557
Huang G B, Ramesh M, Berg T, Learned-Miller E. Labeled faces in the wild: a database for studying face recognition in unconstrained environments. Technical Report 07-49, 2007
35
Chen B C, Chen C S, Hsu W H. Cross-age reference coding for ageinvariant face recognition and retrieval. In: Proceedings of European Conference on Computer Vision. 2014, 768–783
36
Wang D Y, Hoi S C H, Zhu J K. WLFDB: weakly labeled face databases. Technical Report, 2014
37
Zhang X, Zhang L, Wang X J, Shum H Y. Finding celebrities in billions of web images. IEEE Transactions on Multimedia, 2012, 14(4): 995–1007 https://doi.org/10.1109/TMM.2012.2186121
38
Best-Rowden L, Han H, Otto C, Klare B F, Jain A K. Unconstrained face recognition: identifying a person of interest from a media collection. IEEE Transactions on Information Forensics and Security, 2014, 9(12): 2144–2157 https://doi.org/10.1109/TIFS.2014.2359577
39
Guillaumin M, Verbeek J, Schmid C. Is that you? metric learning approaches for face identification. In: Proceedings of the 12th IEEE International Conference on Computer Vision. 2009, 498–505 https://doi.org/10.1109/iccv.2009.5459197
40
Taigman Y, Wolf L, Hassner T. Multiple one-shots for utilizing class label information. In: Proceedings of British Machive Vision Conference. 2009, 1–12 https://doi.org/10.5244/c.23.77
41
Yin Q, Tang X O, Sun J. An associate-predict model for face recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2011, 497–504 https://doi.org/10.1109/cvpr.2011.5995494
42
Cao X, Wipf D, Wen F, Duan G Q, Sun J. A practical transfer learning algorithm for face verification. In: Proceedings of IEEE International Conference on Computer Vision. 2013, 3208–3215 https://doi.org/10.1109/iccv.2013.398
43
Lu C C, Tang X O. Surpassing human-level face verification performance on LFW with gaussianface. 2014, arXiv:1404.3840
44
Parkhi O M, Vedaldi A, Zisserman A. Deep face recognition. Proceedings of the British Machine Vision, 2015, 1(3): 6 https://doi.org/10.5244/c.29.41
45
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. 2014, arXiv:1409.4842
46
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2014, arXiv:1409.1556
47
He K M, Zhang X Y, Ren S Q, Sun J. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. 2015, arXiv:1502.01852
48
He K M, Sun J. Convolutional neural networks at constrained time cost. 2014, arXiv:1412.1710
49
Zhang S S, Zhang C, You Z, Zheng R, Xu B. Asynchronous stochastic gradient descent for DNN training. In: Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing. 2013, 6660–6663 https://doi.org/10.1109/icassp.2013.6638950
50
Chatfield K, Simonyan K, Vedaldi A, Zisserman A. Return of the devil in the details: delving deep into convolutional nets. 2014, arXiv:1405.3531
51
LeCun Y, Bottou L, Orr G B, Müller K R. Efficient backprop. In: Montavon G, Orr G B, Müller K R, eds. Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, Vol 7700. Berlin: Springer, 2012, 9–48 https://doi.org/10.1007/978-3-642-35289-8_3
52
Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of Internatioal Conference on Machine Learning. 2015, 448–456
53
Yan S, Shan S G, Chen X, Gao W. Locally assembled binary (LAB) feature with feature-centric cascade for fast and accurate face detection. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2008, 1–7
54
Zhang J, Shan S G, Kan M N, Chen X L. Coarse-to-fine auto-encoder networks (CFAN) for real-time face alignment. In: Proceedings of European Conference on Computer Vision. 2014, 1–16 https://doi.org/10.1007/978-3-319-10605-2_1
55
Jia Y Q, Shelhamer E, Donahue J, Karayev S, Long J, Girshick R, Guadarrama S, Darrell T. Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the ACM International Conference on Multimedia. 2014, 675–678 https://doi.org/10.1145/2647868.2654889
56
Zou Q, Zeng J C, Cao L J, Ji R R. A novel features ranking metric with application to scalable visual and bioinformatics data classification. Neurocomputing, 2016, 173: 346–354 https://doi.org/10.1016/j.neucom.2014.12.123
57
Lin C, Chen W Q, Qiu C, Wu Y F, Krishnan S, Zou Q. LibD3C: ensemble classifiers with a clustering and dynamic selection strategy. Neurocomputing, 2014, 123: 424–435 https://doi.org/10.1016/j.neucom.2013.08.004
58
Taigman Y, Yang M, Ranzato M A, Wolf L. Web-scale training for face identification. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. 2015, 2746–2754 https://doi.org/10.1109/cvpr.2015.7298891
59
Liu X, Li S X, Kan M N, Zhang J, Wu S Z, Liu W X, Han H, Shan S G, Chen X L. Agenet: deeply learned regressor and classifier for robust apparent age estimation. In: Proceedings of IEEE International Conference on Computer Vision Workshops. 2015, 258–266 https://doi.org/10.1109/iccvw.2015.42