ARCosmetics: a real-time augmented reality cosmetics try-on system

doi:10.1007/s11704-022-2059-8

Frontiers of Computer Science

2023, Vol. 17, Issue 4: 174706 https://doi.org/10.1007/s11704-022-2059-8


Shan AN¹,², Jianye CHEN¹, Zhaoqi ZHU¹, Fangru ZHOU¹, Yuxing YANG¹, Yuqing MA², Xianglong LIU², Haogang ZHU²

¹. Tech. & Data Center, JD.COM Inc., Beijing 100176, China
². State Key Lab of Software Development Environment, Beihang University, Beijing 100191, China


Abstract:

A virtual cosmetics try-on system gives consumers a realistic try-on experience and helps them choose suitable cosmetics efficiently. In this article, we propose ARCosmetics, a real-time augmented reality virtual cosmetics try-on system for smartphones that takes speed, accuracy, and stability into account at each step to ensure a better user experience. A novel and very fast face tracking method uses the face detection box and the average position of the facial landmarks to estimate face positions in consecutive frames. A dynamic weight Wing loss is introduced, which assigns a dynamic weight to every landmark according to its estimated error during training. It balances attention among small-, medium-, and large-range errors and thus increases accuracy and robustness. We also design a weighted average method that uses information from the adjacent frame for landmark refinement, which stabilizes the generated landmarks. Extensive experiments on a large 106-point facial landmark dataset and the 300-VW dataset demonstrate the superior performance of the proposed method compared with other state-of-the-art methods. We also conducted user satisfaction studies to further verify the efficiency and effectiveness of the ARCosmetics system.
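As a rough illustration of two ideas named in the abstract, the sketch below combines the standard Wing loss (Feng et al., 2018) with a per-landmark weight derived from the current error, and blends the landmarks of adjacent frames with a weighted average. The abstract does not give the exact formulas, so the weighting rule in `dynamic_weights`, the blending factor `alpha`, and all function names are assumptions made for illustration only, not the authors' formulation.

```python
import math

def wing(x, w=10.0, eps=2.0):
    """Standard Wing loss for one scalar residual x (w, eps are the usual defaults)."""
    ax = abs(x)
    if ax < w:
        return w * math.log(1.0 + ax / eps)
    # Constant that makes the logarithmic and linear pieces meet at |x| = w.
    c = w - w * math.log(1.0 + w / eps)
    return ax - c

def dynamic_weights(errors):
    """ASSUMPTION: weight each landmark by its error relative to the mean error."""
    mean = sum(errors) / len(errors)
    if mean == 0.0:
        return [1.0] * len(errors)
    return [e / mean for e in errors]

def weighted_wing_loss(pred, target):
    """pred, target: lists of (x, y) landmarks. Returns the mean weighted loss."""
    errors = [math.hypot(px - tx, py - ty)
              for (px, py), (tx, ty) in zip(pred, target)]
    # In a training framework these weights would be treated as constants.
    weights = dynamic_weights(errors)
    per_point = [wing(px - tx) + wing(py - ty)
                 for (px, py), (tx, ty) in zip(pred, target)]
    return sum(wt * loss for wt, loss in zip(weights, per_point)) / len(pred)

def smooth_landmarks(prev, curr, alpha=0.6):
    """ASSUMPTION: blend current landmarks with the previous frame's (alpha is hypothetical)."""
    return [(alpha * cx + (1.0 - alpha) * px, alpha * cy + (1.0 - alpha) * py)
            for (px, py), (cx, cy) in zip(prev, curr)]
```

A larger `alpha` follows the current frame more closely, which means less lag but more jitter; a smaller `alpha` favors stability.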

Key words: facial landmark localization, face tracking, stabilization, augmented reality, virtual try-on

Received: 2022-01-30    Published: 2022-12-25

Corresponding Author(s): Haogang ZHU

Cite this article:

Shan AN, Jianye CHEN, Zhaoqi ZHU, Fangru ZHOU, Yuxing YANG, Yuqing MA, Xianglong LIU, Haogang ZHU. ARCosmetics: a real-time augmented reality cosmetics try-on system. Front. Comput. Sci., 2023, 17(4): 174706.

Link to this article:

https://academic.hep.com.cn/fcs/CN/10.1007/s11704-022-2059-8
https://academic.hep.com.cn/fcs/CN/Y2023/V17/I4/174706


