Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2024, Vol. 18 Issue (3) : 183313    https://doi.org/10.1007/s11704-023-2570-6
Artificial Intelligence
Scattering-based hybrid network for facial attribute classification
Na LIU, Fan ZHANG, Liang CHANG, Fuqing DUAN
School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China
Abstract

Facial attribute classification (FAC) is a high-profile problem in biometric verification and face retrieval. Although recent research has focused on extracting finer-grained attribute features and exploiting inter-attribute correlations, significant challenges remain. The wavelet scattering transform (WST) is a promising non-learned feature extractor: it has been shown to yield more discriminative representations and to outperform learned representations in certain tasks. Applied to image classification, WST can enhance subtle texture information and provide stability to local deformations. This paper designs a scattering-based hybrid block, termed the WS-SE block, that combines frequency-domain (WST) and image-domain features through channel attention (Squeeze-and-Excitation, SE). Compared with a CNN, WS-SE achieves more efficient FAC performance and compensates for the model's sensitivity to small-scale affine transforms. In addition, to further exploit the relationships among attribute labels, we propose a learning strategy from a causal view: cause attributes, defined using causality-related information, are used to infer effect attributes with a high confidence level. Ablation experiments demonstrate the effectiveness of our model, and the hybrid model obtains state-of-the-art results on two public datasets.
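To make the channel-attention idea concrete, the following is a minimal NumPy sketch of a Squeeze-and-Excitation gate applied to stacked image-domain and WST features. It is not the WS-SE block itself: the concatenation step, the tensor shapes, the reduction ratio, and the names `se_channel_attention`, `cnn_feats`, and `wst_feats` are assumptions made for illustration, and the weights are random rather than learned.

```python
import numpy as np

def se_channel_attention(features, w1, w2):
    """Squeeze-and-Excitation gate over a (C, H, W) feature tensor."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = features.mean(axis=(1, 2))            # shape (C,)
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate in (0, 1).
    hidden = np.maximum(w1 @ squeezed, 0.0)          # shape (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # shape (C,)
    # Scale: reweight every channel by its gate value.
    return features * gate[:, None, None], gate

rng = np.random.default_rng(0)
cnn_feats = rng.standard_normal((8, 7, 7))   # stand-in for image-domain features
wst_feats = rng.standard_normal((4, 7, 7))   # stand-in for scattering coefficients

# Assumed fusion: stack both feature sets along the channel axis.
fused = np.concatenate([cnn_feats, wst_feats], axis=0)   # shape (12, 7, 7)
channels, reduction = fused.shape[0], 4
w1 = rng.standard_normal((channels // reduction, channels)) * 0.1
w2 = rng.standard_normal((channels, channels // reduction)) * 0.1

reweighted, gate = se_channel_attention(fused, w1, w2)
print(reweighted.shape, gate.shape)
```

Because the gate is computed from all channels jointly, the scattering channels can raise or lower the weight given to the image-domain channels and vice versa, which is the property a hybrid block of this kind relies on.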

Keywords: wavelet scattering transform, causality-related learning, facial attribute classification
Corresponding Author(s): Fuqing DUAN   
Just Accepted Date: 16 March 2023   Issue Date: 22 May 2023
 Cite this article:   
Na LIU, Fan ZHANG, Liang CHANG, et al. Scattering-based hybrid network for facial attribute classification[J]. Front. Comput. Sci., 2024, 18(3): 183313.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-023-2570-6
https://academic.hep.com.cn/fcs/EN/Y2024/V18/I3/183313
Fig.1  Illustration of three challenges in facial attribute classification. (a) Challenge #1: the local texture features extracted by existing methods are not fine enough; (b) Challenge #2: current models are not robust to small-scale affine transformations; (c) Challenge #3: complex inter-attribute relationships allow current multi-label estimation models to exploit redundant correlations
Fig.2  (a) Feature maps of WST for a random sample. Different scales and rotation angles are set to show the wavelet scattering coefficients of different nodes. (b) Hierarchical representation of the WST network, with J=2, γ=2, and m=2 for easier presentation. Nodes and curves of different colors represent different scales and rotations, respectively. The numbers in the nodes correspond to the level-order traversal of WST. The traversal order shows that we adopt a depth-first manner
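As a rough illustration of how the node count of such a scattering tree grows, the sketch below enumerates scattering paths up to order m. It assumes the common convention in which the scale index strictly increases along a path, giving 1 + LJ + L²J(J−1)/2 nodes for m=2; whether the tree in Fig.2 applies the same pruning is an assumption, L stands in for the rotation factor γ, and the function name `scattering_paths` is hypothetical. The enumeration order is by path length and is not meant to reproduce the node numbering in the figure.

```python
from itertools import combinations, product

def scattering_paths(J, L, m):
    """List scattering paths up to order m as tuples of (scale, rotation)."""
    paths = [()]  # order 0: the low-pass-only (root) node
    for order in range(1, m + 1):
        # Keep only paths whose scale index strictly increases along the path.
        for scales in combinations(range(J), order):
            for rotations in product(range(L), repeat=order):
                paths.append(tuple(zip(scales, rotations)))
    return paths

paths = scattering_paths(J=2, L=2, m=2)
print(len(paths))  # 1 + 4 + 4 = 9
```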
Fig.3  The schema of SE (a) and the proposed WS-SE Block (b)
Index Semantic Index Semantic
1 5_o_Clock_Shadow 21 Male
2 Arched_Eyebrows 22 Mouth_Slightly_Open
3 Attractive 23 Mustache
4 Bags_Under_Eyes 24 Narrow_Eyes
5 Bald 25 No_Beard
6 Bangs 26 Oval_Face
7 Big_Lips 27 Pale_Skin
8 Big_Nose 28 Pointy_Nose
9 Black_Hair 29 Receding_Hairline
10 Blond_Hair 30 Rosy_Cheeks
11 Blurry 31 Sideburns
12 Brown_Hair 32 Smiling
13 Bushy_Eyebrows 33 Straight_Hair
14 Chubby 34 Wavy_Hair
15 Double_Chin 35 Wearing_Earrings
16 Eyeglasses 36 Wearing_Hat
17 Goatee 37 Wearing_Lipstick
18 Gray_Hair 38 Wearing_Necklace
19 Heavy_Makeup 39 Wearing_Necktie
20 High_Cheekbones 40 Young
Tab.1  Summary of the 40 face attributes provided with the CelebA and LFWA datasets
Method CelebA-A CelebA-W LFWA
LNets+ANet [54] - 87 84
MOON [55] 90.94 - -
AFFACT [56] 91.67 91.45 -
kT-MTL [30] 91.19 - -
Nian et al. [4] 92.1 - 87.3
SA [57] - 91.47 87.13
DMM-CNN [5] 91.70 - 86.56
Lingenfelter et al. [58] 90.88 90.43 -
SSPL [6] 91.77 - 86.53
Hybrid network 92.16 91.50 87.41
Tab.2  Comparisons with the state-of-the-art. (measured by MA(%))
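For readers unfamiliar with the metric, here is a small sketch of how a mean accuracy over attributes can be computed, assuming MA denotes per-attribute classification accuracy averaged over all attributes. The function name `mean_accuracy` and the toy labels are illustrative and are not taken from the paper.

```python
def mean_accuracy(y_true, y_pred):
    """Average per-attribute accuracy (in %) over binary attribute labels.

    y_true, y_pred: lists of samples, each a list of 0/1 labels per attribute.
    """
    n_samples = len(y_true)
    n_attrs = len(y_true[0])
    per_attr = []
    for a in range(n_attrs):
        correct = sum(t[a] == p[a] for t, p in zip(y_true, y_pred))
        per_attr.append(100.0 * correct / n_samples)
    return sum(per_attr) / n_attrs, per_attr

# Toy data: 4 faces, 3 attributes (hypothetical values, not from the paper).
y_true = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 1, 1]]
ma, per_attr = mean_accuracy(y_true, y_pred)
print(per_attr, ma)  # [100.0, 75.0, 50.0] 75.0
```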
Fig.4  Illustration of the proposed WS-SE integration designs. (a) Block of WS-A; (b) Block of WS-R
Variants L2 L3 L4 L5 CelebA-A CelebA-W
WS-R_1 89.88 89.10
WS-R_2 89.84 88.95
WS-R_3 89.80 88.92
WS-R_4 88.71 87.60
WS-R_5 90.09 89.77
WS-R_6 88.23 87.57
WS-R_7 88.32 87.63
WS-R_8 88.45 87.88
WS-R_9 90.01 88.24
BA - - - - 90.03 89.23
WST - - - - 83.69 82.76
WS-A - - - - 91.64 90.72
Tab.3  Results of different variants of WS-A and WS-R on the CelebA-A and CelebA-W datasets (measured by MA(%))
Variants γ CelebA-A CelebA-W
WS-A_1 2 91.52 90.37
WS-A_2 3 91.64 90.72
WS-A_3 4 91.53 90.51
WS-A_4 5 91.53 90.53
WS-A_5 6 91.53 90.60
Tab.4  Selection of different rotation factors γ in WS-A (measured by MA(%))
Fig.5  Representative feature maps in the compared networks
Fig.6  The hybrid structure of PF and FM. (a) Parallel fusion architecture; (b) feature map fusion architecture
Variants CNN WST CelebA-A CelebA-W
PF - - 89.12 88.73
FM_1 64*56*56 27*56*56 88.34 88.01
FM_2 128*28*28 57*28*28 90.81 90.26
FM_3 256*14*14 99*14*14 90.82 90.35
FM_4 512*7*7 153*7*7 90.92 90.41
Tab.5  Results of different variants of PF and FM on the CelebA-A and CelebA-W datasets (measured by MA(%))
τ R C E CelebA-A CelebA-W
0.6 103 32 11 91.33 91.00
0.7 56 28 10 91.91 91.40
0.8 26 16 9 92.16 91.50
0.9 4 4 2 91.70 91.43
Tab.6  Comparison with different τ. The columns of R, C, and E indicate the number of variables (measured by MA(%))
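The effect of τ can be pictured as a confidence threshold on candidate cause-effect pairs. The sketch below is one plausible reading, not the authors' procedure: it assumes that each candidate pair carries a confidence score, that R, C, and E count the retained relationships, distinct cause attributes, and distinct effect attributes, and the scores shown are hypothetical values chosen only for illustration.

```python
def select_relations(confidence, tau):
    """Keep cause -> effect pairs whose confidence is at least tau."""
    kept = {pair: s for pair, s in confidence.items() if s >= tau}
    causes = {cause for cause, _ in kept}
    effects = {effect for _, effect in kept}
    return kept, causes, effects

# Hypothetical confidence scores, for illustration only (not from the paper).
confidence = {
    ("Goatee", "Male"): 0.95,
    ("Mustache", "Male"): 0.92,
    ("Heavy_Makeup", "Wearing_Lipstick"): 0.85,
    ("Smiling", "High_Cheekbones"): 0.72,
    ("Bald", "Male"): 0.65,
}

for tau in (0.6, 0.7, 0.8, 0.9):
    kept, causes, effects = select_relations(confidence, tau)
    print(tau, len(kept), len(causes), len(effects))
```

Raising the threshold shrinks all three counts, which matches the trend in the table: a stricter τ keeps fewer but more reliable relationships.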
Fig.7  The MA on the CelebA-A and CelebA-W datasets with different weights α
Fig.8  The detailed attribute results of the three representative models when α is 0, 0.6, and 1 on the CelebA-A dataset
Fig.9  Causality-related attribute representations of the CelebA and LFWA datasets for τ=0.8
Datasets Causal relationship
CelebA LFWA -
CelebA-A 92.16 91.97 91.64
LFWA 87.36 87.41 87.21
Tab.7  The results of the CelebA-A and LFWA datasets with/without applying causal relationships (measured by MA(%))
Fig.10  The MA change under small-scale affine transformations on the CelebA-A and CelebA-W datasets, where (a), (b), (c), and (d) represent rotation, scaling, left-right translation, and up-down translation, respectively
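The four perturbations in this figure are all affine maps. As a minimal sketch, assuming 2x3 matrices acting on pixel coordinates about the origin (image libraries usually rotate and scale about the image center instead), they can be written as follows. The helper names and the perturbation magnitudes are illustrative and are not the values used in the experiments.

```python
import math

def rotation(deg):
    """2x3 affine matrix rotating about the origin by deg degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0]]

def scaling(factor):
    """2x3 affine matrix scaling both axes by the same factor."""
    return [[factor, 0.0, 0.0], [0.0, factor, 0.0]]

def translation(dx, dy):
    """2x3 affine matrix shifting by dx (left-right) and dy (up-down)."""
    return [[1.0, 0.0, dx], [0.0, 1.0, dy]]

def apply(matrix, point):
    """Map an (x, y) point through a 2x3 affine matrix."""
    x, y = point
    return tuple(row[0] * x + row[1] * y + row[2] for row in matrix)

point = (10.0, 0.0)
rotated = apply(rotation(5), point)        # small rotation
scaled = apply(scaling(1.1), point)        # small zoom
shifted = apply(translation(2, -3), point) # small shift
print(rotated, scaled, shifted)
```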
Fig.11  The comparison of our method with other robust FAC methods on the CelebA-A dataset
References
1 Samangouei P, Patel V M, Chellappa R. Facial attributes for active authentication on mobile devices. Image and Vision Computing, 2017, 58: 181–192
2 Liu Y, Li Q, Sun Z. Attribute-aware face aging with wavelet-based generative adversarial networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019, 11869–11878
3 Cao C, Lu F, Li C, Lin S, Shen X. Makeup removal via bidirectional tunable de-makeup network. IEEE Transactions on Multimedia, 2019, 21(11): 2750–2761
4 Nian F, Chen X, Yang S, Lv G. Facial attribute recognition with feature decoupling and graph convolutional networks. IEEE Access, 2019, 7: 85500–85512
5 Mao L, Yan Y, Xue J H, Wang H. Deep multi-task multi-label CNN for effective facial attribute classification. IEEE Transactions on Affective Computing, 2022, 13(2): 818–828
6 Shu Y, Yan Y, Chen S, Xue J H, Shen C, Wang H. Learning spatial-semantic relationship for facial attribute recognition with limited labeled data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021, 11911–11920
7 Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I J, Fergus R. Intriguing properties of neural networks. In: Proceedings of the 2nd International Conference on Learning Representations. 2014
8 Azulay A, Weiss Y. Why do deep convolutional networks generalize so poorly to small image transformations? Journal of Machine Learning Research, 2019, 20(184): 1–25
9 Zhang R. Making convolutional networks shift-invariant again. In: Proceedings of the 36th International Conference on Machine Learning. 2019, 7324–7334
10 Suchitra S, Chitrakala S, Nithya J. A robust face recognition using automatically detected facial attributes. In: Proceedings of 2014 International Conference on Science Engineering and Management Research (ICSEMR). 2014, 1–5
11 Yang S, Wang Z, Liu J, Guo Z. Controllable sketch-to-image translation for robust face synthesis. IEEE Transactions on Image Processing, 2021, 30: 8797–8810
12 Zhang N, Zhao J, Duan F, Pan Z, Wu Z, Zhou M, Gu X. An end-to-end conditional generative adversarial network based on depth map for 3D craniofacial reconstruction. In: Proceedings of the 30th ACM International Conference on Multimedia. 2022, 759–768
13 Rozsa A, Günther M, Rudd E M, Boult T E. Are facial attributes adversarially robust? In: Proceedings of the 23rd International Conference on Pattern Recognition (ICPR). 2016, 3121–3127
14 Rozsa A, Günther M, Rudd E M, Boult T E. Facial attributes: accuracy and adversarial robustness. Pattern Recognition Letters, 2019, 124: 100–108
15 Mallat S. Wavelets for a vision. Proceedings of the IEEE, 1996, 84(4): 604–614
16 Liang H, Gao J, Qiang N. A novel framework based on wavelet transform and principal component for face recognition under varying illumination. Applied Intelligence, 2021, 51(3): 1762–1783
17 Kar N B, Nayak D R, Babu K S, Zhang Y D. A hybrid feature descriptor with Jaya optimised least squares SVM for facial expression recognition. IET Image Processing, 2021, 15(7): 1471–1483
18 Huang H, He R, Sun Z, Tan T. Wavelet-SRNet: a wavelet-based CNN for multi-scale face super resolution. In: Proceedings of the IEEE International Conference on Computer Vision. 2017, 1698–1706
19 Li P, Hu Y, He R, Sun Z. Global and local consistent wavelet-domain age synthesis. IEEE Transactions on Information Forensics and Security, 2019, 14(11): 2943–2957
20 Mallat S. Group invariant scattering. Communications on Pure and Applied Mathematics, 2012, 65(10): 1331–1398
21 Bruna J, Mallat S. Invariant scattering convolution networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(8): 1872–1886
22 Bruna J, Mallat S. Classification with scattering operators. In: Proceedings of the CVPR 2011. 2011, 1561–1566
23 Singh A, Kingsbury N. Efficient convolutional network learning using parametric log based dual-tree wavelet ScatterNet. In: Proceedings of the IEEE International Conference on Computer Vision Workshops. 2017, 1140–1147
24 Oyallon E, Belilovsky E, Zagoruyko S. Scaling the scattering transform: Deep hybrid networks. In: Proceedings of the IEEE International Conference on Computer Vision. 2017, 5619–5628
25 Cotter F, Kingsbury N. A learnable scatternet: locally invariant convolutional layers. In: Proceedings of 2019 IEEE International Conference on Image Processing (ICIP). 2019, 350–354
26 Minskiy D, Bober M. Scattering-based hybrid networks: an evaluation and design guide. In: Proceedings of 2021 IEEE International Conference on Image Processing (ICIP). 2021, 2793–2797
27 Gauthier S, Thérien B, Alséne-Racicot L, Chaudhary M, Rish I, Belilovsky E, Eickenberg M, Wolf G. Parametric scattering networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022, 5739–5748
28 Hand E M, Chellappa R. Attributes for improved attributes: a multi-task network utilizing implicit and explicit relationships for facial attribute classification. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence. 2017, 4068–4074
29 Cao J, Li Y, Zhang Z. Partially shared multi-task convolutional neural network with local constraint for face attribute learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 4290–4299
30 Fanhe X, Guo J, Huang Z, Qiu W, Zhang Y. Multi-task learning with knowledge transfer for facial attribute classification. In: Proceedings of 2019 IEEE International Conference on Industrial Technology (ICIT). 2019, 877–882
31 Lai X, Chen S, Wang D H, Zhu S. Multi-task learning with deep dual-path network for facial attribute recognition. In: Proceedings of the 9th International Conference on Computing and Pattern Recognition. 2020, 161–167
32 Hu J, Shen L, Sun G. Squeeze-and-excitation networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018, 7132–7141
33 Lanchantin J, Wang T, Ordonez V, Qi Y. General multi-label image classification with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021, 16473–16483
34 Wiatowski T, Tschannen M, Stanic A, Grohs P, Bölcskei H. Discrete deep feature extraction: a theory and new architectures. In: Proceedings of the 33rd International Conference on Machine Learning. 2016, 2149–2158
35 Wiatowski T, Bölcskei H. A mathematical theory of deep convolutional neural networks for feature extraction. IEEE Transactions on Information Theory, 2018, 64(3): 1845–1866
36 Andén J, Mallat S. Deep scattering spectrum. IEEE Transactions on Signal Processing, 2014, 62(16): 4114–4128
37 Angles T, Mallat S. Generative networks as inverse problems with scattering transforms. In: Proceedings of the 6th International Conference on Learning Representations. 2018
38 Wu J, Qiu X, Zhang J, Wu F, Kong Y, Yang G, Senhadji L, Shu H. Fractional wavelet-based generative scattering networks. Frontiers in Neurorobotics, 2021, 15: 752752
39 Minskiy D, Bober M. Efficient hybrid network: inducting scattering features. In: Proceedings of the 26th International Conference on Pattern Recognition (ICPR). 2022, 2300–2306
40 Woodward J. Causation in biology: stability, specificity, and the choice of levels of explanation. Biology & Philosophy, 2010, 25(3): 287–318
41 Pearl J. Theoretical impediments to machine learning with seven sparks from the causal revolution. In: Proceedings of the 11th ACM International Conference on Web Search and Data Mining. 2018, 3
42 Han H, Jain A K, Wang F, Shan S, Chen X. Heterogeneous face attribute estimation: A deep multi-task learning approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(11): 2597–2609
43 He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016, 770–778
44 Tola E, Lepetit V, Fua P. DAISY: an efficient dense descriptor applied to wide-baseline stereo. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(5): 815–830
45 Cotter F, Kingsbury N. Deep learning in the wavelet domain. 2018, arXiv preprint arXiv: 1811.06115
46 Andreux M, Angles T, Exarchakis G, Leonarduzzi R, Rochette G, Thiry L, Zarka J, Mallat S, Andén J, Belilovsky E, Bruna J, Lostanlen V, Chaudhary M, Hirn M J, Oyallon E, Zhang S, Cella C, Eickenberg M. Kymatio: scattering transforms in Python. Journal of Machine Learning Research, 2020, 21(60): 1–6
47 Selesnick I W, Baraniuk R G, Kingsbury N C. The dual-tree complex wavelet transform. IEEE Signal Processing Magazine, 2005, 22(6): 123–151
48 Oyallon E, Zagoruyko S, Huang G, Komodakis N, Lacoste-Julien S, Blaschko M, Belilovsky E. Scattering networks for hybrid representation learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(9): 2208–2221
49 Fan S, Wang X, Shi C, Kuang K, Liu N, Wang B. Debiased graph neural networks with agnostic label selection bias. IEEE Transactions on Neural Networks and Learning Systems, 2022, https://doi.org/10.1109/TNNLS.2022.3141260
50 Pearl J, Glymour M, Jewell N P. Causal Inference in Statistics: A Primer. Chichester: John Wiley & Sons, 2016
51 Sun Y, Chen Y, Wang X, Tang X. Deep learning face representation by joint identification-verification. In: Proceedings of the 27th International Conference on Neural Information Processing Systems. 2014, 1988–1996
52 Huang G B, Mattar M, Berg T, Learned-Miller E. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. In: Proceedings of the Workshop on Faces in ‘Real-Life’ Images: Detection, Alignment, and Recognition. 2008
53 De Boer P T, Kroese D P, Mannor S, Rubinstein R Y. A tutorial on the cross-entropy method. Annals of Operations Research, 2005, 134(1): 19–67
54 Liu Z, Luo P, Wang X, Tang X. Deep learning face attributes in the wild. In: Proceedings of the IEEE International Conference on Computer Vision. 2015, 3730–3738
55 Rudd E M, Günther M, Boult T E. MOON: A mixed objective optimization network for the recognition of facial attributes. In: Proceedings of the 14th European Conference on Computer Vision. 2016, 19–35
56 Günther M, Rozsa A, Boult T E. AFFACT: Alignment-free facial attribute classification technique. In: Proceedings of 2017 IEEE International Joint Conference on Biometrics (IJCB). 2017, 90–99
57 Kalayeh M M, Shah M. On symbiosis of attribute prediction and semantic segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(5): 1620–1635
58 Lingenfelter B, Hand E M. Improving evaluation of facial attribute prediction models. In: Proceedings of the 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021). 2021, 1–7
59 Singh A, Kingsbury N. Multi-resolution dual-tree wavelet scattering network for signal classification. 2017, arXiv preprint arXiv: 1702.03345
60 Singh A, Kingsbury N. Dual-tree wavelet scattering network with parametric log transformation for object classification. In: Proceedings of 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2017, 2622–2626
61 Oyallon E, Mallat S. Deep roto-translation scattering for object classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015, 2865–2873
62 Kang S, Kim G, Yoo C D. Fair facial attribute classification via causal graph-based attribute translation. Sensors, 2022, 22(14): 5271
[1] FCS-22570-OF-NL_suppl_1 Download