Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal distribution code 80-970

2019 Impact Factor: 1.275

Frontiers of Computer Science  2023, Vol. 17 Issue (4): 174706   https://doi.org/10.1007/s11704-022-2059-8
ARCosmetics: a real-time augmented reality cosmetics try-on system
Shan AN1,2, Jianye CHEN1, Zhaoqi ZHU1, Fangru ZHOU1, Yuxing YANG1, Yuqing MA2, Xianglong LIU2, Haogang ZHU2
1. Tech. & Data Center, JD.COM Inc., Beijing 100176, China
2. State Key Lab of Software Development Environment, Beihang University, Beijing 100191, China
Abstract

A virtual cosmetics try-on system provides a realistic try-on experience for consumers and helps them choose suitable cosmetics efficiently. In this article, we propose ARCosmetics, a real-time augmented reality virtual cosmetics try-on system for smartphones, which takes speed, accuracy, and stability into consideration at each step to ensure a better user experience. A novel and very fast face tracking method uses the face detection box and the average position of the facial landmarks to estimate the face location in consecutive frames. A dynamic weight Wing loss is introduced that assigns a dynamic weight to every landmark according to its estimated error during training. It balances the attention paid to small-, medium-, and large-range errors and thus increases accuracy and robustness. We also design a weighted average method that uses information from the adjacent frame for landmark refinement, guaranteeing the stability of the generated landmarks. Extensive experiments on a large 106-point facial landmark dataset and the 300-VW dataset demonstrate the superior performance of the proposed method compared with other state-of-the-art methods. We also conducted user satisfaction studies to further verify the efficiency and effectiveness of our ARCosmetics system.

Key words: facial landmark localization; face tracking; stabilization; augmented reality; virtual try-on
Received: 2022-01-30      Published: 2022-12-25
Corresponding Author(s): Haogang ZHU   
Cite this article:
Shan AN, Jianye CHEN, Zhaoqi ZHU, Fangru ZHOU, Yuxing YANG, Yuqing MA, Xianglong LIU, Haogang ZHU. ARCosmetics: a real-time augmented reality cosmetics try-on system. Front. Comput. Sci., 2023, 17(4): 174706.
Link to this article:
https://academic.hep.com.cn/fcs/CN/10.1007/s11704-022-2059-8
https://academic.hep.com.cn/fcs/CN/Y2023/V17/I4/174706
Fig.1  
  
Fig.2  
  
Fig.3  
| Parts | Part 1 | Part 2 | Part 3 | Part 4 |
|---|---|---|---|---|
| Dataset | Training set of JD Landmark Dataset [37] | A subset of VGGFace2 Dataset [38] | Micro-video Dataset 1 | Micro-video Dataset 2 |
| Constitution | Based on 300W [39] | Images from Google Image Search | 6,152 selfie videos | 183 selfie videos |
| #images | 10,865 | 18,619 | 68,399 | 9,042 |

Tab.1
| Model | PFLD 1X | PFLD 0.25X | Ours |
|---|---|---|---|
| Size (MB) | 8.71 | 3.27 | 0.93 |
| Calculations (MFLOPs) | 274.27 | 52.31 | 9.61 |
| CPU (ms) | 10.50 | 3.23 | 1.06 |
| ARM (ms) | 63.21 | 13.98 | 4.28 |

Tab.2
| Module | Speed (ms) |
|---|---|
| Face Detection | 11.80 |
| Face Tracking | 0.02 |
| Facial Landmark Localization | 4.28 |
| Landmark Stabilization | 0.48 |

Tab.3
| Processing a frame | Speed (ms) |
|---|---|
| using tracking | 4.84 |
| using detection | 14.47 |
| on average | 7.35 |

Tab.4
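If the average in Tab.4 is read as a simple mix of tracked and detected frames (an assumption; the page does not say how the average was obtained), the implied share of tracked frames and the corresponding processing rate can be worked out as below. The function and variable names are illustrative only.

```python
# Per-frame processing times from Tab.4, in milliseconds.
T_TRACK = 4.84    # frame handled with tracking
T_DETECT = 14.47  # frame handled with full detection
T_AVG = 7.35      # reported average


def implied_tracking_fraction(t_track: float, t_detect: float, t_avg: float) -> float:
    """Solve t_avg = p * t_track + (1 - p) * t_detect for p.

    Assumes the reported average is a plain weighted mix of the two cases.
    """
    return (t_detect - t_avg) / (t_detect - t_track)


def frames_per_second(ms_per_frame: float) -> float:
    """Convert a per-frame time in milliseconds into frames per second."""
    return 1000.0 / ms_per_frame


p = implied_tracking_fraction(T_TRACK, T_DETECT, T_AVG)
fps = frames_per_second(T_AVG)
print(f"implied share of tracked frames: {p:.3f}")
print(f"average processing rate: {fps:.1f} frames/s")
```

The rate covers only the processing steps timed in the table; camera capture and rendering are not included.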
| Face tracking method | Category 1 | Category 2 | Category 3 | Speed |
|---|---|---|---|---|
| KCF [9] | 0.697 | 0.937 | 0.822 | 155.35 |
| SiamRPN [12] | 0.843 | 0.892 | 0.885 | 634.52 |
| Ours | 0.874 | 0.873 | 0.944 | 3.23 |

Tab.5
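The abstract says the tracker combines the face detection box with the average position of the facial landmarks. The exact update rule is not given on this page, so the following is only a minimal sketch under one assumed rule: keep the previous box size and re-centre the box on the mean of the previous frame's landmarks. `track_box` and `mean_point` are hypothetical names, not the authors' API.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, width, height)


def mean_point(landmarks: List[Point]) -> Point:
    """Average position of a non-empty list of (x, y) landmarks."""
    if not landmarks:
        raise ValueError("landmarks must not be empty")
    n = len(landmarks)
    return (sum(x for x, _ in landmarks) / n, sum(y for _, y in landmarks) / n)


def track_box(prev_box: Box, prev_landmarks: List[Point]) -> Box:
    """Assumed rule: keep the box size, re-centre it on the landmark mean."""
    _, _, width, height = prev_box
    cx, cy = mean_point(prev_landmarks)
    return (cx - width / 2.0, cy - height / 2.0, width, height)


# Example: the landmarks have drifted to the right of the old box centre.
old_box = (100.0, 100.0, 200.0, 200.0)
new_box = track_box(old_box, [(210.0, 190.0), (230.0, 210.0)])
print(new_box)
```

A rule of this kind needs only a few additions per frame, which fits the very small tracking cost listed in Tab.3, but the real method may also rescale or validate the box.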
Method AR-Landmark107k Part 1,2 AR-Landmark107k
PFLD 1X 3.16 2.76
PFLD 0.25X 3.78 3.22
Ours 3.72 3.19
Tab.6  
Fig.4  
Fig.5  
L2 Loss Wing loss DW OHKM AR-L Part 1,2 AR-Landmark107k
× × × 4.06 3.50
× × × 3.78 3.33
× × 3.77 3.23
× × 3.74 3.27
× 3.72 3.19
Tab.7  
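For orientation, the sketch below implements the standard Wing loss of [28] and wraps it in a per-landmark weighting. The weighting rule (weights proportional to the current error, normalised to mean 1) is a placeholder assumption, since this page does not give the authors' dynamic weight formula; the defaults `w=10` and `eps=2` are taken from common Wing loss settings, not from this article.

```python
import math
from typing import List


def wing(x: float, w: float = 10.0, eps: float = 2.0) -> float:
    """Wing loss of [28]: logarithmic for small errors, linear for large ones."""
    ax = abs(x)
    if ax < w:
        return w * math.log(1.0 + ax / eps)
    # The constant makes the two branches meet at |x| = w.
    c = w - w * math.log(1.0 + w / eps)
    return ax - c


def dynamic_weights(errors: List[float]) -> List[float]:
    """Placeholder rule (assumption): weight each landmark by its share of
    the total error, scaled so that the weights average to 1."""
    mags = [abs(e) for e in errors]
    total = sum(mags)
    if total == 0.0:
        return [1.0] * len(errors)
    n = len(errors)
    return [n * m / total for m in mags]


def weighted_wing_loss(errors: List[float], w: float = 10.0, eps: float = 2.0) -> float:
    """Mean of the per-landmark Wing losses, each scaled by its weight."""
    weights = dynamic_weights(errors)
    return sum(wt * wing(e, w, eps) for wt, e in zip(weights, errors)) / len(errors)


print(weighted_wing_loss([0.5, 2.0, 12.0]))
```

When all landmarks have the same error the weights are all 1 and the result reduces to the plain Wing loss, so the placeholder only shifts emphasis between landmarks.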
| Method | Category 1 | Category 2 | Category 3 | Speed |
|---|---|---|---|---|
| dlib [44] | 0.00527 | 0.00658 | 0.00701 | |
| dlib [44] + optical flow [42] | 0.00524 | 0.00655 | 0.00762 | 3.69 |
| dlib [44] + our method | 0.00493 | 0.00649 | 0.00694 | 1.01 |

Tab.8
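The abstract describes the stabilization step as a weighted average that uses the adjacent frame. A minimal sketch of that idea follows, assuming a single fixed blending factor `alpha` between the current landmarks and those of the previous frame; the actual weights used by the authors are not stated on this page, and `stabilize` is a hypothetical name.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def stabilize(current: List[Point], previous: List[Point], alpha: float = 0.6) -> List[Point]:
    """Blend each current landmark with the same landmark in the previous frame.

    alpha = 1.0 returns the current landmarks unchanged; smaller values
    smooth more strongly but lag further behind fast motion.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(current) != len(previous):
        raise ValueError("landmark lists must have the same length")
    return [
        (alpha * cx + (1.0 - alpha) * px, alpha * cy + (1.0 - alpha) * py)
        for (cx, cy), (px, py) in zip(current, previous)
    ]


smoothed = stabilize([(10.0, 20.0)], [(0.0, 0.0)], alpha=0.5)
print(smoothed)
```

A fixed `alpha` trades jitter against lag; a per-landmark or motion-dependent weight would be a natural refinement, but that goes beyond what the page specifies.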
Fig.6  
Fig.7  
| Try-on products | Meitu | YoucamMakeup | Modiface | Our ARCosmetics system |
|---|---|---|---|---|
| Lipstick | 3.8500 | 3.1125 | 2.9250 | 3.8125 |
| Blusher | 3.5500 | 3.0875 | 3.0625 | 3.7500 |
| Eyebrow Pencil | 3.7375 | 3.3125 | 3.0250 | 3.7250 |
| Eye Shadow | 3.3250 | 2.7000 | 2.7375 | 3.5625 |
| Eyeliner | 3.4750 | 3.1000 | 2.6375 | 3.7125 |
| Mascara | 3.4125 | 2.8375 | 2.4625 | 3.4375 |
| Powder Foundation | 3.5500 | 3.1625 | 3.2250 | 3.7250 |
| Cosmetics Try-on Style | 3.7250 | 2.8500 | 2.7125 | 3.5875 |

Tab.9
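To compare the systems in Tab.9 at a glance, the scores can be summarised per system. The sketch below computes an unweighted mean over all eight rows and counts the rows in which each system scores highest; both summaries are derived here for illustration and are not figures reported by the authors.

```python
from statistics import mean
from typing import Dict, List

SYSTEMS = ["Meitu", "YoucamMakeup", "Modiface", "ARCosmetics"]

# User satisfaction scores from Tab.9, one list per row in SYSTEMS order.
SCORES: Dict[str, List[float]] = {
    "Lipstick": [3.8500, 3.1125, 2.9250, 3.8125],
    "Blusher": [3.5500, 3.0875, 3.0625, 3.7500],
    "Eyebrow Pencil": [3.7375, 3.3125, 3.0250, 3.7250],
    "Eye Shadow": [3.3250, 2.7000, 2.7375, 3.5625],
    "Eyeliner": [3.4750, 3.1000, 2.6375, 3.7125],
    "Mascara": [3.4125, 2.8375, 2.4625, 3.4375],
    "Powder Foundation": [3.5500, 3.1625, 3.2250, 3.7250],
    "Cosmetics Try-on Style": [3.7250, 2.8500, 2.7125, 3.5875],
}


def mean_per_system(scores: Dict[str, List[float]]) -> Dict[str, float]:
    """Unweighted mean of each system's column."""
    columns = zip(*scores.values())
    return {name: mean(col) for name, col in zip(SYSTEMS, columns)}


def wins_per_system(scores: Dict[str, List[float]]) -> Dict[str, int]:
    """Number of rows in which each system has the highest score."""
    wins = {name: 0 for name in SYSTEMS}
    for row in scores.values():
        best = max(range(len(row)), key=row.__getitem__)
        wins[SYSTEMS[best]] += 1
    return wins


means = mean_per_system(SCORES)
wins = wins_per_system(SCORES)
for name in SYSTEMS:
    print(f"{name}: mean {means[name]:.4f}, best in {wins[name]} of {len(SCORES)} rows")
```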
  
  
  
  
  
  
  
  
1 Chen H J, Hui K M, Wang S Y, Tsao L W, Shuai H H, Cheng W H. BeautyGlow: on-demand makeup transfer framework with reversible generative network. In: Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019, 10034–10042
2 Jiang W, Liu S, Gao C, Cao J, He R, Feng J, Yan S. PSGAN: pose and expression robust spatial-aware GAN for customizable makeup transfer. In: Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2020, 5193–5201
3 Kingma D P, Dhariwal P. Glow: generative flow with invertible 1×1 convolutions. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. 2018, 10236–10245
4 Viola P, Jones M J. Robust real-time face detection. International Journal of Computer Vision, 2004, 57(2): 137–154
5 Li H, Lin Z, Shen X, Brandt J, Hua G. A convolutional neural network cascade for face detection. In: Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2015, 5325–5334
6 Zhang K, Zhang Z, Li Z, Qiao Y. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 2016, 23(10): 1499–1503
7 Tang X, Du D K, He Z, Liu J. PyramidBox: a context-assisted single shot face detector. In: Proceedings of the 15th European Conference on Computer Vision (ECCV). 2018, 812–828
8 Deng J, Guo J, Zhou Y, Yu J, Kotsia I, Zafeiriou S. RetinaFace: single-stage dense face localisation in the wild. 2019, arXiv preprint arXiv:1905.00641
9 Henriques J F, Caseiro R, Martins P, Batista J. High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583–596
10 Held D, Thrun S, Savarese S. Learning to track at 100 FPS with deep regression networks. In: Proceedings of the 14th European Conference on Computer Vision (ECCV). 2016, 749–765
11 Li B, Wu W, Wang Q, Zhang F, Xing J, Yan J. SiamRPN++: evolution of Siamese visual tracking with very deep networks. In: Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2019, 4277–4286
12 Li B, Yan J, Wu W, Zhu Z, Hu X. High performance visual tracking with Siamese region proposal network. In: Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 8971–8980
13 Wang Q, Teng Z, Xing J, Gao J, Hu W, Maybank S. Learning attentions: residual attentional Siamese network for high performance online visual tracking. In: Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 4854–4863
14 Zhu Z, Wang Q, Li B, Wu W, Yan J, Hu W. Distractor-aware Siamese networks for visual object tracking. In: Proceedings of the 15th European Conference on Computer Vision (ECCV). 2018, 103–119
15 Yan S, Liu C, Li S Z, Zhang H, Shum H Y, Cheng Q. Face alignment using texture-constrained active shape models. Image and Vision Computing, 2003, 21(1): 69–75
16 Cootes T F, Edwards G J, Taylor C J. Active appearance models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23(6): 681–685
17 Bulat A, Tzimiropoulos G. How far are we from solving the 2D & 3D face alignment problem? (and a dataset of 230,000 3D facial landmarks). In: Proceedings of 2017 IEEE International Conference on Computer Vision (ICCV). 2017, 1021–1030
18 Dong X, Yan Y, Ouyang W, Yang Y. Style aggregated network for facial landmark detection. In: Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 379–388
19 Kowalski M, Naruniec J, Trzcinski T. Deep alignment network: a convolutional neural network for robust face alignment. In: Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). 2017, 2034–2043
20 Newell A, Yang K, Deng J. Stacked hourglass networks for human pose estimation. In: Proceedings of the 14th European Conference on Computer Vision (ECCV). 2016, 483–499
21 Wang X, Bo L, Li F. Adaptive wing loss for robust face alignment via heatmap regression. In: Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 2019, 6970–6980
22 Wu W, Qian C, Yang S, Wang Q, Cai Y, Zhou Q. Look at boundary: a boundary-aware face alignment algorithm. In: Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 2129–2138
23 Guo X, Li S, Yu J, Zhang J, Ma J, Ma L, Liu W, Ling H. PFLD: a practical facial landmark detector. 2019, arXiv preprint arXiv:1902.10859
24 Lv J, Shao X, Xing J, Cheng C, Zhou X. A deep regression architecture with two-stage re-initialization for high performance facial landmark detection. In: Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017, 3691–3700
25 Sun Y, Wang X, Tang X. Deep convolutional network cascade for facial point detection. In: Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. 2013, 3476–3483
26 Valle R, Buenaposada J M, Valdés A, Baumela L. A deeply-initialized coarse-to-fine ensemble of regression trees for face alignment. In: Proceedings of the 15th European Conference on Computer Vision (ECCV). 2018, 609–624
27 Zhang Z, Luo P, Loy C C, Tang X. Facial landmark detection by deep multi-task learning. In: Proceedings of the 13th European Conference on Computer Vision (ECCV). 2014, 94–108
28 Feng Z H, Kittler J, Awais M, Huber P, Wu X J. Wing loss for robust facial landmark localisation with convolutional neural networks. In: Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 2235–2245
29 Shen J, Zafeiriou S, Chrysos G G, Kossaifi J, Tzimiropoulos G, Pantic M. The first facial landmark tracking in-the-wild challenge: benchmark and results. In: Proceedings of 2015 IEEE International Conference on Computer Vision Workshop (ICCVW). 2015, 1003–1011
30 Asthana A, Zafeiriou S, Cheng S, Pantic M. Incremental face alignment in the wild. In: Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. 2014, 1859–1866
31 Sánchez-Lozano E, Martinez B, Tzimiropoulos G, Valstar M. Cascaded continuous regression for real-time incremental face tracking. In: Proceedings of the 14th European Conference on Computer Vision (ECCV). 2016, 645–661
32 Dong X, Yu S I, Weng X, Wei S E, Yang Y, Sheikh Y. Supervision-by-registration: an unsupervised approach to improve the precision of facial landmark detectors. In: Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 360–368
33 Jin Y, Guo X, Li Y, Xing J, Tian H. Towards stabilizing facial landmark detection and tracking via hierarchical filtering: a new method. Journal of the Franklin Institute, 2020, 357(5): 3019–3037
34 Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y, Berg A C. SSD: single shot MultiBox detector. In: Proceedings of the 14th European Conference on Computer Vision (ECCV). 2016, 21–37
35 Howard A G, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H. MobileNets: efficient convolutional neural networks for mobile vision applications. 2017, arXiv preprint arXiv:1704.04861
36 Chen Y, Wang Z, Peng Y, Zhang Z, Yu G, Sun J. Cascaded pyramid network for multi-person pose estimation. In: Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 7103–7112
37 Liu Y, Shen H, Si Y, Wang X, Zhu X, Shi H, Hong Z, Guo H, Guo Z, Chen Y, Li B, Xi T, Yu J, Xie H, Xie G, Li M, Lu Q, Wang Z, Lai S, Chai Z, Wei X. Grand challenge of 106-point facial landmark localization. In: Proceedings of 2019 IEEE International Conference on Multimedia & Expo Workshops (ICMEW). 2019, 613–616
38 Cao Q, Shen L, Xie W, Parkhi O M, Zisserman A. VGGFace2: a dataset for recognising faces across pose and age. In: Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition. 2018, 67–74
39 Sagonas C, Antonakos E, Tzimiropoulos G, Zafeiriou S, Pantic M. 300 faces in-the-wild challenge: database and results. Image and Vision Computing, 2016, 47: 3–18
40 Kingma D P, Ba J. Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations. 2015
41 Kumar A, Chellappa R. Disentangling 3D pose in a dendritic CNN for unconstrained 2D face alignment. In: Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 430–439
42 Lucas B D, Kanade T. An iterative image registration technique with an application to stereo vision. In: Proceedings of the 7th International Joint Conference on Artificial Intelligence. 1981, 674–679
43 Kazemi V, Sullivan J. One millisecond face alignment with an ensemble of regression trees. In: Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. 2014, 1867–1874
44 King D E. Dlib-ml: a machine learning toolkit. The Journal of Machine Learning Research, 2009, 10: 1755–1758