Frontiers of Computer Science

Front. Comput. Sci.    2024, Vol. 18 Issue (6) : 186604    https://doi.org/10.1007/s11704-023-2791-8
RESEARCH ARTICLE
Federated learning-outcome prediction with multi-layer privacy protection
Yupei ZHANG1,2, Yuxin LI1,2, Yifei WANG1,2, Shuangshuang WEI1,2, Yunan XU1,2, Xuequn SHANG1,2
1. School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
2. MIIT Lab of Big Data Storage and Management, Xi’an 710129, China
Abstract

Learning-outcome prediction (LOP) is a long-standing and critical problem in education. Many studies have developed effective models, yet these models often suffer from data shortages and poor generalization across institutions because of privacy-protection constraints. To this end, this study proposes a distributed grade-prediction model, dubbed FecMap, built on the federated learning (FL) framework, which keeps each client's data private and communicates with other clients only through a globally generalized model. FecMap introduces local subspace learning (LSL), which explicitly learns local features against the global features, and multi-layer privacy protection (MPP), which hierarchically protects private features by distinguishing model-shareable features from features that may not be shared, so as to obtain client-specific classifiers with high LOP performance at each institution. FecMap is trained iteratively over datasets distributed across clients: each client trains a local neural network composed of a global part, a local part, and a classification head, while the server averages the global parts collected from the clients. To evaluate FecMap, we collected three higher-education datasets of student academic records from engineering majors. Experimental results show that FecMap benefits from the proposed LSL and MPP and achieves stable performance on the LOP task compared with state-of-the-art models. This study makes a fresh attempt at applying federated learning to learning-analytics tasks, potentially paving the way to personalized education with privacy protection.
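The training scheme summarized above can be made concrete with a short sketch. The following is a minimal, illustrative implementation, not the authors' released code: all class and function names (ClientNet, local_update, server_average, communication_round) and hyperparameters are assumptions. It shows a client network split into a shareable global part, a private local part, and a classification head, with the server averaging only the global parts.

```python
# A minimal, illustrative sketch of a FecMap-style training loop (NOT the authors' code).
# Each client trains a network with a shareable global part, a private local part, and a
# classification head; the server averages only the global parts.
import copy
import torch
import torch.nn as nn


class ClientNet(nn.Module):
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.global_part = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())  # shared via the server
        self.local_part = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())   # kept private on the client
        self.head = nn.Linear(2 * hid_dim, n_classes)                            # client-specific classifier

    def forward(self, x):
        z = torch.cat([self.global_part(x), self.local_part(x)], dim=1)
        return self.head(z)


def local_update(model, loader, epochs=1, lr=1e-2):
    """One client's local training pass; returns the state of its shareable global part."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            criterion(model(x), y).backward()
            opt.step()
    return copy.deepcopy(model.global_part.state_dict())


def server_average(global_states):
    """FedAvg-style averaging restricted to the uploaded global parts."""
    avg = copy.deepcopy(global_states[0])
    for key in avg:
        avg[key] = torch.stack([s[key].float() for s in global_states]).mean(dim=0)
    return avg


def communication_round(client_models, client_loaders):
    """One FL round: local training, upload, server averaging, local update."""
    uploaded = [local_update(m, dl) for m, dl in zip(client_models, client_loaders)]
    new_global = server_average(uploaded)
    for m in client_models:  # every client reloads the averaged global part
        m.global_part.load_state_dict(new_global)
```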

Keywords: federated learning; local subspace learning; hierarchical privacy protection; learning outcome prediction; privacy-protected representation learning
Corresponding Author(s): Xuequn SHANG   
Issue Date: 28 September 2023
 Cite this article:   
Yupei ZHANG, Yuxin LI, Yifei WANG, et al. Federated learning-outcome prediction with multi-layer privacy protection[J]. Front. Comput. Sci., 2024, 18(6): 186604.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-023-2791-8
https://academic.hep.com.cn/fcs/EN/Y2024/V18/I6/186604
Fig.1  The formulated data point for a student. Past achievements refer to the student's grades in other courses, where gi is the grade in the i-th course. Student demographics and the course description refer to relevant information about the student and the course, where si and ci are the corresponding features, and gt is the target grade to be predicted
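As a hedged illustration of how one such data point might be assembled (the field names and the helper build_data_point are assumptions, not the authors' schema), the label follows the five grade intervals used in Tab.1:

```python
# Illustrative only: assembling a single training example in the form of Fig. 1.
import numpy as np

def build_data_point(past_grades, demographics, course_desc, target_grade):
    """Concatenate past grades g_i, student features s_i, and course features c_i
    into one input vector; the label is the level of the target grade g_t."""
    x = np.concatenate([past_grades, demographics, course_desc]).astype(np.float32)
    bins = [60, 70, 80, 90]  # grade intervals: [0,60), [60,70), [70,80), [80,90), [90,100]
    y = int(np.digitize(target_grade, bins))
    return x, y

x, y = build_data_point(
    past_grades=np.array([85, 72, 90]),      # g_i: grades in previously taken courses
    demographics=np.array([1.0, 0.0, 3.2]),  # s_i: student features
    course_desc=np.array([0.0, 1.0]),        # c_i: course description features
    target_grade=88,                         # g_t -> level 3, i.e., [80, 90)
)
```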
Fig.2  The FecMap model trained in an iterative manner. An FL communication round is completed by (1) training the local model, (2) uploading it to the server, (3) computing the global model, and (4) updating the local model
Fig.3  Schematic diagram of the FecMap client neural network model. Circle points are network nodes; the green box is shareable; the orange boxes are non-shareable; FCN stands for fully connected network. gi is the grade, and dist is the distance between the two sub-networks. The learned representations are composed of student features si and course features ci. The outputs are the levels of the course scores
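Fig.3 suggests that the client objective couples the classification loss with a distance ("dist") between the global and local sub-networks so that the private local subspace learns features the global one does not capture. Below is a minimal sketch of one plausible reading, reusing the ClientNet from the earlier sketch; the form, sign, and weight of the distance term are assumptions, since the exact LSL objective is defined in the paper itself.

```python
# A plausible LSL-style client loss, assuming the ClientNet sketch above
# (global_part, local_part, head). The distance term and its sign/weight are
# assumptions, not the paper's exact objective.
import torch
import torch.nn.functional as F

def client_loss(model, x, y, beta=0.1):
    z_g = model.global_part(x)                       # shareable (global) representation
    z_l = model.local_part(x)                        # private (local) representation
    logits = model.head(torch.cat([z_g, z_l], dim=1))
    ce = F.cross_entropy(logits, y)                  # classification loss
    dist = torch.mean((z_g - z_l) ** 2)              # "dist" between the two sub-networks
    # Push the local subspace away from the global one so they capture different features.
    return ce - beta * dist
```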
Fig.4  The online FecMap flowchart
  
Major   #Students   #Courses   #Records   0≤grade<60   60≤grade<70   70≤grade<80   80≤grade<90   90≤grade≤100
CST     1463        44         46512      1874         7434          10755         15720        10729
SE      1621        46         37952      1445         7050          9437          13048        6972
EIE     1937        43         49546      1917         9128          10655         16715        11131
Tab.1  Data set information, where the values in brackets are the ratios
Fig.5  Visualization of data representation. The first column is the global representations, the second is the local representations, the third is the combined representations, and the fourth is the discriminative representations. Each row represents a client, and each color represents a category
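Representation plots of this kind are typically produced with t-SNE [40]; a minimal plotting sketch follows, with scikit-learn and matplotlib as assumed tooling rather than the authors' actual scripts.

```python
# Minimal t-SNE view of learned representations, in the spirit of Fig. 5.
# The tooling and the function name are assumptions.
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_representations(features: np.ndarray, labels: np.ndarray, title: str):
    """Project representations to 2-D with t-SNE and color points by grade level."""
    emb = TSNE(n_components=2, init="pca", random_state=0).fit_transform(features)
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, s=5, cmap="tab10")
    plt.title(title)
    plt.show()

# Example: plot_representations(global_feats, grade_levels, "global representations")
```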
Methods         CST     SE      EIE
FedAvg [17]     80.46   76.82   77.86
FedProx [10]    79.23   78.23   78.19
LG-Fed [38]     76.50   75.85   78.51
FedPer [39]     81.58   78.29   77.52
FedRep [11]     81.87   83.95   82.99
FecMap (Ours)   83.10   84.92   85.41
Tab.2  Accuracy comparison of various methods
Fig.6  Confusion matrix for FecMap (a) and FedRep (b)
Fig.7  The accuracy of FecMap-LSL and FecMap-MPP against the number of communication rounds, compared with FedRep. (a) FecMap-LSL; (b) FecMap-MPP
Methods      CST     SE      EIE
FecMap-LSL   82.58   84.56   85.30
FecMap-MPP   82.01   84.36   83.18
FedRep       81.87   83.95   82.99
FecMap       83.10   84.92   85.41
Tab.3  Ablation study
Fig.8  Parameter analysis of the number of clients n (a) and the dataset size d, i.e., the number of samples per client (b), for the proposed model and the comparison methods
Fig.9  Loss function (a) and accuracy (b) for FecMap in the case study with 20 communication rounds
1 Zhang Y, An R, Liu S, Cui J, Shang X. Predicting and understanding student learning performance using multi-source sparse attention convolutional neural networks. IEEE Transactions on Big Data, 2023, 9(1): 118–132
2 Zhang Y, Dai H, Yun Y, Liu S, Lan A, Shang X. Meta-knowledge dictionary learning on 1-bit response data for student knowledge diagnosis. Knowledge-Based Systems, 2020, 205: 106290
3 Symeonidis P, Malakoudis D. Multi-modal matrix factorization with side information for recommending massive open online courses. Expert Systems with Applications, 2019, 118: 261–271
4 Zhang Y, Yun Y, Dai H, Cui J, Shang X. Graphs regularized robust matrix factorization and its application on student grade prediction. Applied Sciences, 2020, 10(5): 1755
5 Bydžovská H. Student performance prediction using collaborative filtering methods. In: Proceedings of the 17th International Conference on Artificial Intelligence in Education. 2015, 550−553
6 Al-Shehri H, Al-Qarni A, Al-Saati L, Batoaq A, Badukhen H, Alrashed S, Alhiyafi J, Olatunji S O. Student performance prediction using support vector machine and k-nearest neighbor. In: Proceedings of the 30th IEEE Canadian Conference on Electrical and Computer Engineering. 2017, 1−4
7 Polyzou A, Karypis G. Feature extraction for next-term prediction of poor student performance. IEEE Transactions on Learning Technologies, 2019, 12(2): 237–248
8 Zhang Y, Yun Y, An R, Cui J, Dai H, Shang X. Educational data mining techniques for student performance prediction: method review and comparison analysis. Frontiers in Psychology, 2021, 12: 698490
9 Li T, Hu S, Beirami A, Smith V. Ditto: fair and robust federated learning through personalization. In: Proceedings of the 38th International Conference on Machine Learning. 2021, 6357−6368
10 Li T, Sahu A K, Zaheer M, Sanjabi M, Talwalkar A, Smith V. Federated optimization in heterogeneous networks. In: Proceedings of Machine Learning and Systems. 2020, 429−450
11 Collins L, Hassani H, Mokhtari A, Shakkottai S. Exploiting shared representations for personalized federated learning. In: Proceedings of the 38th International Conference on Machine Learning. 2021, 2089−2099
12 Haddadpour F, Mahdavi M. On the convergence of local descent methods in federated learning. 2019, arXiv preprint arXiv: 1910.14425
13 Li T, Sahu A K, Talwalkar A, Smith V. Federated learning: challenges, methods, and future directions. IEEE Signal Processing Magazine, 2020, 37(3): 50–60
14 Li Q, Wen Z, Wu Z, Hu S, Wang N, Li Y, Liu X, He B. A survey on federated learning systems: vision, hype and reality for data privacy and protection. IEEE Transactions on Knowledge and Data Engineering, 2023, 35(4): 3347–3366
15 Tan A Z, Yu H, Cui L, Yang Q. Towards personalized federated learning. IEEE Transactions on Neural Networks and Learning Systems, 2022, 1–17
16 Li Y, Liu X, Zhang X, Shao Y, Wang Q, Geng Y. Personalized federated learning via maximizing correlation with sparse and hierarchical extensions. 2021, arXiv preprint arXiv: 2107.05330
17 McMahan B, Moore E, Ramage D, Hampson S, Agüera y Arcas B. Communication-efficient learning of deep networks from decentralized data. In: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. 2017, 1273−1282
18 Chen Y R, Rezapour A, Tzeng W G. Privacy-preserving ridge regression on distributed data. Information Sciences, 2018, 451-452: 34–49
19 Dennis D K, Li T, Smith V. Heterogeneity for the win: one-shot federated clustering. In: Proceedings of the 38th International Conference on Machine Learning. 2021, 2611−2620
20 Ribero M, Henderson J, Williamson S, Vikalo H. Federating recommendations using differentially private prototypes. Pattern Recognition, 2022, 129: 108746
21 Zhou P, Wang K, Guo L, Gong S, Zheng B. A privacy-preserving distributed contextual federated online learning framework with big data support in social recommender systems. IEEE Transactions on Knowledge and Data Engineering, 2021, 33(3): 824–838
22 Zhang Y, Xu Y, Wei S, Wang Y, Li Y, Shang X. Doubly contrastive representation learning for federated image recognition. Pattern Recognition, 2023, 139: 109507
23 Ma X, Zhang J, Guo S, Xu W. Layer-wised model aggregation for personalized federated learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022, 10082–10091
24 Li X C, Zhan D C, Shao Y, Li B, Song S. FedPHP: federated personalization with inherited private models. In: Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. 2021, 587−602
25 Zhang X, Li Y, Li W, Guo K, Shao Y. Personalized federated learning via variational Bayesian inference. In: Proceedings of the International Conference on Machine Learning. 2022, 26293−26310
26 Smith V, Chiang C K, Sanjabi M, Talwalkar A. Federated multi-task learning. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 4427–4437
27 Duan M, Liu D, Chen X, Liu R, Tan Y, Liang L. Self-balancing federated learning with global imbalanced data in mobile systems. IEEE Transactions on Parallel and Distributed Systems, 2021, 32(1): 59–71
28 Bercea C I, Wiestler B, Rueckert D, Albarqouni S. Federated disentangled representation learning for unsupervised brain anomaly detection. Nature Machine Intelligence, 2022, 4(8): 685–695
29 Wu Q, Chen X, Zhou Z, Zhang J. FedHome: cloud-edge based personalized federated learning for in-home health monitoring. IEEE Transactions on Mobile Computing, 2022, 21(8): 2818–2832
30 Wang N, Chen Y, Hu Y, Lou W, Hou Y T. FeCo: boosting intrusion detection capability in IoT networks via contrastive learning. In: Proceedings of IEEE INFOCOM 2022 - IEEE Conference on Computer Communications. 2022, 1409−1418
31 Long G, Xie M, Shen T, Zhou T, Wang X, Jiang J. Multi-center federated learning: clients clustering for better personalization. World Wide Web, 2023, 26(1): 481–500
32 Sattler F, Müller K R, Samek W. Clustered federated learning: model-agnostic distributed multitask optimization under privacy constraints. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(8): 3710–3722
33 Li X, Jiang M, Zhang X, Kamp M, Dou Q. FedBN: federated learning on non-IID features via local batch normalization. In: Proceedings of the 9th International Conference on Learning Representations. 2021
34 Zhang Y, Wei S, Liu S, Wang Y, Xu Y, Li Y, Shang X. Graph-regularized federated learning with shareable side information. Knowledge-Based Systems, 2022, 257: 109960
35 Yang L, Huang J, Lin W, Cao J. Personalized federated learning on non-IID data via group-based meta-learning. ACM Transactions on Knowledge Discovery from Data, 2023, 17(4): 49
36 Bonawitz K A, Eichner H, Grieskamp W, Huba D, Ingerman A, Ivanov V, Kiddon C, Konecný J, Mazzocchi S, McMahan B, Van Overveldt T, Petrou D, Ramage D, Roselander J. Towards federated learning at scale: system design. In: Proceedings of Machine Learning and Systems. 2019, 374−388
37 Li X C, Gan L, Zhan D C, Shao Y, Li B, Song S. Aggregate or not? Exploring where to privatize in DNN based federated learning under different non-IID scenes. 2021, arXiv preprint arXiv: 2107.11954
38 Liang P P, Liu T, Ziyin L, Allen N B, Auerbach R P, Brent D, Salakhutdinov R, Morency L P. Think locally, act globally: federated learning with local and global representations. 2020, arXiv preprint arXiv: 2001.01523
39 Arivazhagan M G, Aggarwal V, Singh A K, Choudhary S. Federated learning with personalization layers. 2019, arXiv preprint arXiv: 1912.00818
40 van der Maaten L, Hinton G. Visualizing data using t-SNE. Journal of Machine Learning Research, 2008, 9(86): 2579–2605