Frontiers of Computer Science


Front. Comput. Sci.    2024, Vol. 18 Issue (5) : 185810    https://doi.org/10.1007/s11704-023-3142-5
RESEARCH ARTICLE
BVDFed: Byzantine-resilient and verifiable aggregation for differentially private federated learning
Xinwen GAO, Shaojing FU, Lin LIU, Yuchuan LUO
College of Computer, National University of Defense Technology, Changsha 410000, China
Abstract

Federated Learning (FL) has emerged as a powerful technology for collaborative training between multiple clients and a server while preserving the privacy of clients' data. To further strengthen privacy in FL, Differentially Private Federated Learning (DPFL) has gradually become one of the most effective approaches. Because DPFL operates in a distributed setting, adversaries may corrupt some clients together with the aggregation server to produce malicious parameters and disturb the learned model. However, existing aggregation protocols for DPFL consider either corrupted clients (Byzantines) or a corrupted server; owing to the complexity of this threat model, they cannot eliminate the effects of corrupted clients and a corrupted server when both exist simultaneously. In this paper, we elaborate this adversarial threat model and propose BVDFed, to the best of our knowledge the first Byzantine-resilient and Verifiable aggregation scheme for Differentially private FEDerated learning. Specifically, we propose the Differentially Private Federated Averaging algorithm (DPFA) as the primary workflow of BVDFed, which is more lightweight and more easily portable than traditional DPFL algorithms. We then introduce the Loss Score to indicate the trustworthiness of the disguised gradients in DPFL. Based on the Loss Score, we propose an aggregation rule, DPLoss, that eliminates faulty gradients from Byzantine clients during server aggregation while preserving the privacy of clients' data. Additionally, we design a secure verification scheme, DPVeri, compatible with DPFA and DPLoss, that enables honest clients to verify the integrity of the received aggregated results. DPVeri also provides resistance to collusion attacks involving no more than t participants. Theoretical analysis and experimental results demonstrate that our aggregation is feasible and effective in practice.
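To make the aggregation workflow concrete, the following is a minimal Python sketch of the two ideas described above: clients disguise their gradients with local differential privacy before uploading, and the server ranks the disguised gradients by a loss-based trust score, keeping only the most trustworthy ones. This is an illustrative reading rather than the paper's implementation: the Laplace mechanism, the step size, the server-side evaluation function eval_fn, and this particular Zeno-style form of the Loss Score are all assumptions here.

```python
import numpy as np

rng = np.random.default_rng(0)

def disguise(grad, sensitivity, epsilon):
    """Client side: disguise a gradient with Laplace noise calibrated to
    sensitivity/epsilon (one common LDP mechanism, assumed here)."""
    return grad + rng.laplace(0.0, sensitivity / epsilon, size=grad.shape)

def loss_score(theta, grad, eval_fn, lr=0.1):
    """Server side: score a disguised gradient by the evaluation loss the
    model reaches after a trial step in that direction (a Zeno-style
    proxy; the paper's exact Loss Score may be defined differently)."""
    return eval_fn(theta - lr * grad)

def dploss_aggregate(theta, disguised_grads, eval_fn, b):
    """Keep the N - b lowest-scoring gradients, where b upper-bounds the
    number of Byzantine clients, and average them (DPLoss-style)."""
    scores = [loss_score(theta, g, eval_fn) for g in disguised_grads]
    keep = np.argsort(scores)[: len(disguised_grads) - b]
    return np.mean([disguised_grads[i] for i in keep], axis=0)
```

Because the server consumes each gradient only through its evaluation loss, it never needs the raw, un-noised gradients, which is what allows Byzantine filtering to coexist with local differential privacy.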

Keywords: federated learning; differential privacy; verifiable aggregation; Byzantine fault-tolerance
Corresponding Author(s): Shaojing FU   
Just Accepted Date: 05 June 2023   Issue Date: 31 July 2023
 Cite this article:   
Xinwen GAO, Shaojing FU, Lin LIU, et al. BVDFed: Byzantine-resilient and verifiable aggregation for differentially private federated learning[J]. Front. Comput. Sci., 2024, 18(5): 185810.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-023-3142-5
https://academic.hep.com.cn/fcs/EN/Y2024/V18/I5/185810
Fig.1  The architecture of the adversarial model in federated learning
Entities
AS    the aggregation server
C    a client
[C_N]    the set of N clients {C_1, C_2, ..., C_N}
U, V, H    sets of multiple clients
Indices
b    the number of Byzantine clients among all clients
T    the total number of iterations
i, j, s    indices of clients, i, j, s ∈ {1, 2, ..., N}
k    the index of iterations, k ∈ {1, 2, ..., T}
n    the batch size of clients' local datasets
Global variables
d, m    dimensions of the input
t    the threshold for secret sharing
pk, sk    a pair of public and private keys for key agreement
θ    the global model parameter
E_q    an elliptic curve over the finite field Z_q
Local variables
g    a local gradient
h    a hash value
r    a random value for commitment
c    the commitment to h
Interactive variables
K_ij    the agreed key generated by key agreement between clients C_i and C_j
C_ij    the ciphertext that client C_i encrypts under the agreed key K_ij and sends to C_j
[[i]]_j    the secret share that client C_i distributes to C_j
Tab.1  Involved notations
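The commitment and secret-sharing symbols above (h, r, c, t, and [[i]]_j) are the standard building blocks of Pedersen [25] and Shamir [26]. The sketch below shows how they fit together: a commitment binds a client to its hash value h, and threshold secret sharing splits a value into n shares so that any t of them reconstruct it. For brevity it works in a multiplicative group modulo a toy prime, whereas the paper commits over the elliptic curve E_q; the constants P, G1, and G2 are placeholders, not parameters from the paper.

```python
import random

P = 2**127 - 1   # toy prime modulus standing in for the group behind E_q
G1, G2 = 5, 7    # placeholder generators

def commit(h, r):
    """Pedersen-style commitment c = G1^h * G2^r (mod P): binding to the
    hash h, hiding thanks to the fresh randomness r."""
    return (pow(G1, h, P) * pow(G2, r, P)) % P

def shamir_share(secret, t, n, p=P):
    """Split `secret` into n shares ([[i]]_j in Tab.1) with threshold t:
    any t shares reconstruct it, while any t - 1 reveal nothing."""
    coeffs = [secret] + [random.randrange(p) for _ in range(t - 1)]
    return [(j, sum(c * pow(j, k, p) for k, c in enumerate(coeffs)) % p)
            for j in range(1, n + 1)]

def shamir_reconstruct(shares, p=P):
    """Lagrange interpolation at x = 0 over any t of the shares."""
    secret = 0
    for xj, yj in shares:
        num, den = 1, 1
        for xm, _ in shares:
            if xm != xj:
                num = num * -xm % p
                den = den * (xj - xm) % p
        secret = (secret + yj * num * pow(den, -1, p)) % p
    return secret
```

The threshold t here is the same t that governs the collusion bound quoted in the abstract: a coalition holding fewer than t shares learns nothing from them.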
Fig.2  High-level view of BVDFed
Fig.3  Full view of BVDFed
Scheme | Privacy preservation | Byzantine tolerance | Verifiability | Collusion resilience
LDP-Fed | ✓ | × | × | ×
FLTrust | × | ✓ | × | ×
VeriFL | ✓ | × | ✓ | ×
DPBFL | ✓ | ✓ | × | ×
BVDFed | ✓ | ✓ | ✓ | ✓
Tab.2  Detailed comparison of functionalities
Fig.4  The experimental results on the feasibility of verification. (a) Comparison of the sum of gradients before and after completing the protocol. A point (x, y) in this figure indicates that the value of the element at index x is y. Note that all the blue square points and the red circular points overlap, which indicates that they are identical; (b) comparison of the numbers of honest clients accepting or rejecting the aggregated results in different settings, where b denotes the number of Byzantine clients
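The overlap in panel (a), that is, the exact equality of the sums before and after the protocol, is what a commitment-based verification certifies. As a toy illustration in the same placeholder group as the sketch after Tab.1 (not the paper's DPVeri protocol), Pedersen-style commitments are additively homomorphic, so a client can check a claimed aggregate against the product of the individual commitments:

```python
P = 2**127 - 1   # toy prime modulus, as in the sketch after Tab.1
G1, G2 = 5, 7    # placeholder generators, not parameters from the paper

def commit(v, r):
    """Pedersen-style commitment to value v with randomness r."""
    return (pow(G1, v, P) * pow(G2, r, P)) % P

values = [12, 34, 56]    # stand-ins for (quantized) gradient entries
rands = [111, 222, 333]  # per-client commitment randomness

# Homomorphism: the product of commitments is a commitment to the sum.
product = 1
for v, r in zip(values, rands):
    product = product * commit(v, r) % P

assert product == commit(sum(values), sum(rands))  # the check passes
```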
Fig.5  The experiments on classification accuracy. Note that b denotes the number of Byzantine clients and ε denotes the privacy budget. (a) The baseline classification accuracy with small noise and no Byzantines; (b) the classification accuracy with privacy budgets of different sizes and no Byzantines; (c) the classification accuracy with different numbers of Byzantines and small noise; (d) the classification accuracy with both noise and Byzantines in different settings
LDP privacy budget | Byzantine clients | Overhead | Initialization | Round 1 | Round 2 | Round 3 | Total
ε = 1000 | b = 0 | computation | 30 ms | 5993 ms | 0 ms | 19555 ms | 25578 ms
ε = 1000 | b = 0 | communication | 0.22 KB | 15185 KB | 0.46 KB | − | 15186 KB
ε = 1000 | b = 10 | computation | 31 ms | 5992 ms | 0 ms | 17607 ms | 23630 ms
ε = 1000 | b = 10 | communication | 0.22 KB | 15187 KB | 0.46 KB | − | 15188 KB
ε = 1000 | b = 20 | computation | 31 ms | 5955 ms | 0 ms | 15760 ms | 21746 ms
ε = 1000 | b = 20 | communication | 0.22 KB | 15188 KB | 0.46 KB | − | 15189 KB
ε = 1000 | b = 30 | computation | 30 ms | 6126 ms | 0 ms | 13939 ms | 20095 ms
ε = 1000 | b = 30 | communication | 0.22 KB | 15189 KB | 0.46 KB | − | 15190 KB
ε = 1000 | b = 40 | computation | 32 ms | 6269 ms | 0 ms | 11880 ms | 18181 ms
ε = 1000 | b = 40 | communication | 0.22 KB | 15191 KB | 0.46 KB | − | 15192 KB
ε = 1000 | b = 50 | computation | 32 ms | 6335 ms | 0 ms | 9972 ms | 16339 ms
ε = 1000 | b = 50 | communication | 0.22 KB | 15193 KB | 0.46 KB | − | 15194 KB
ε = 100 | b = 0 | computation | 31 ms | 6318 ms | 0 ms | 19677 ms | 26026 ms
ε = 100 | b = 0 | communication | 0.22 KB | 15928 KB | 0.46 KB | − | 15929 KB
ε = 10 | b = 0 | computation | 32 ms | 6338 ms | 0 ms | 19815 ms | 26185 ms
ε = 10 | b = 0 | communication | 0.22 KB | 17316 KB | 0.46 KB | − | 17317 KB
ε = 1 | b = 0 | computation | 31 ms | 6102 ms | 0 ms | 19172 ms | 25305 ms
ε = 1 | b = 0 | communication | 0.22 KB | 18928 KB | 0.46 KB | − | 18929 KB
ε = 0.1 | b = 0 | computation | 34 ms | 6099 ms | 0 ms | 19145 ms | 25278 ms
ε = 0.1 | b = 0 | communication | 0.22 KB | 20553 KB | 0.46 KB | − | 20554 KB
ε = 0.01 | b = 0 | computation | 32 ms | 6028 ms | 0 ms | 19367 ms | 25427 ms
ε = 0.01 | b = 0 | communication | 0.22 KB | 22177 KB | 0.46 KB | − | 22178 KB
Tab.3  Overheads of each client in different settings with N=100 clients
LDP privacy budget | Byzantine clients | Overhead | Initialization | Round 1 | Round 2 | Round 3 | Total
ε = 1000 | b = 0 | computation | 1 ms | 80003 ms | 720 ms | − | 80723 ms
ε = 1000 | b = 0 | communication | 22 KB | 22359 KB | 25 KB | − | 22406 KB
ε = 1000 | b = 10 | computation | 1 ms | 78194 ms | 641 ms | − | 78836 ms
ε = 1000 | b = 10 | communication | 22 KB | 22297 KB | 22 KB | − | 22341 KB
ε = 1000 | b = 20 | computation | 1 ms | 79704 ms | 563 ms | − | 80267 ms
ε = 1000 | b = 20 | communication | 22 KB | 22225 KB | 20 KB | − | 22267 KB
ε = 1000 | b = 30 | computation | 1 ms | 83506 ms | 538 ms | − | 84044 ms
ε = 1000 | b = 30 | communication | 22 KB | 22142 KB | 17 KB | − | 22181 KB
ε = 1000 | b = 40 | computation | 1 ms | 87102 ms | 456 ms | − | 87558 ms
ε = 1000 | b = 40 | communication | 22 KB | 22048 KB | 15 KB | − | 22085 KB
ε = 1000 | b = 50 | computation | 1 ms | 83193 ms | 363 ms | − | 83556 ms
ε = 1000 | b = 50 | communication | 22 KB | 21936 KB | 12 KB | − | 21970 KB
ε = 100 | b = 0 | computation | 1 ms | 86114 ms | 738 ms | − | 86852 ms
ε = 100 | b = 0 | communication | 22 KB | 22755 KB | 25 KB | − | 22802 KB
ε = 10 | b = 0 | computation | 1 ms | 88788 ms | 739 ms | − | 89527 ms
ε = 10 | b = 0 | communication | 22 KB | 23574 KB | 25 KB | − | 23621 KB
ε = 1 | b = 0 | computation | 1 ms | 93862 ms | 751 ms | − | 94613 ms
ε = 1 | b = 0 | communication | 22 KB | 25071 KB | 25 KB | − | 25118 KB
ε = 0.1 | b = 0 | computation | 1 ms | 85029 ms | 713 ms | − | 85742 ms
ε = 0.1 | b = 0 | communication | 22 KB | 26698 KB | 25 KB | − | 26745 KB
ε = 0.01 | b = 0 | computation | 1 ms | 83182 ms | 734 ms | − | 83917 ms
ε = 0.01 | b = 0 | communication | 22 KB | 28322 KB | 25 KB | − | 28369 KB
Tab.4  Overheads of server in different settings with N=100 clients
Fig.6  The specific impacts of different settings on the computation and communication overheads in Round 1. Note that b denotes the number of Byzantine clients and ε denotes the privacy budget. (a) The computation overhead of each client in Round 1 under different privacy budgets and numbers of Byzantine clients; local training, noise and sign generation, and cipher generation are the main components; (b) the communication overhead of each client in Round 1 under different settings; the disguised gradient and the sign are the main components; (c) the computation overhead of the server in Round 1 under different settings; the DPLoss procedure and aggregation are the main components; (d) the communication overhead of the server in Round 1 under different settings; the aggregated result and the ACS set are the main components
1 McMahan B, Moore E, Ramage D, Hampson S, Arcas B A Y. Communication-efficient learning of deep networks from decentralized data. In: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. 2017, 1273−1282
2 Zhu L, Liu Z, Han S. Deep leakage from gradients. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems. 2019, 1323
3 Zhao B, Mopuri K R, Bilen H. iDLG: improved deep leakage from gradients. 2020, arXiv preprint arXiv: 2001.02610
4 Geiping J, Bauermeister H, Dröge H, Moeller M. Inverting gradients - how easy is it to break privacy in federated learning? In: Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020, 1421
5 Geyer R C, Klein T, Nabi M. Differentially private federated learning: a client level perspective. 2017, arXiv preprint arXiv: 1712.07557
6 Hitaj B, Ateniese G, Perez-Cruz F. Deep models under the GAN: information leakage from collaborative deep learning. In: Proceedings of 2017 ACM SIGSAC Conference on Computer and Communications Security. 2017, 603−618
7 Wei W, Liu L. Gradient leakage attack resilient deep learning. IEEE Transactions on Information Forensics and Security, 2022, 17: 303−316
8 Shejwalkar V, Houmansadr A. Manipulating the byzantine: optimizing model poisoning attacks and defenses for federated learning. In: Proceedings of the 28th Annual Network and Distributed System Security Symposium. 2021
9 Xu G, Li H, Liu S, Yang K, Lin X. VerifyNet: secure and verifiable federated learning. IEEE Transactions on Information Forensics and Security, 2020, 15: 911−926
10 Li M, Xiao D, Liang J, Huang H. Communication-efficient and Byzantine-robust differentially private federated learning. IEEE Communications Letters, 2022, 26(8): 1725−1729
11 Zhou J, Wu N, Wang Y, Gu S, Cao Z, Dong X, Choo K K R. A differentially private federated learning model against poisoning attacks in edge computing. IEEE Transactions on Dependable and Secure Computing, 2023, 20(3): 1941−1958
12 Ma X, Sun X, Wu Y, Liu Z, Chen X, Dong C. Differentially private Byzantine-robust federated learning. IEEE Transactions on Parallel and Distributed Systems, 2022, 33(12): 3690−3701
13 Xiang M, Su L. β-stochastic sign SGD: a byzantine resilient and differentially private gradient compressor for federated learning. 2022, arXiv preprint arXiv: 2210.00665
14 Guo X, Liu Z, Li J, Gao J, Hou B, Dong C, Baker T. VeriFL: communication-efficient and fast verifiable aggregation for federated learning. IEEE Transactions on Information Forensics and Security, 2021, 16: 1736−1751
15 Abadi M, Chu A, Goodfellow I, McMahan H B, Mironov I, Talwar K, Zhang L. Deep learning with differential privacy. In: Proceedings of 2016 ACM SIGSAC Conference on Computer and Communications Security. 2016, 308−318
16 Yang Q, Liu Y, Chen T, Tong Y. Federated machine learning: concept and applications. ACM Transactions on Intelligent Systems and Technology, 2019, 10(2): 12
17 Kairouz P, McMahan H B, Avent B, Bellet A, Bennis M, et al. Advances and open problems in federated learning. Foundations and Trends in Machine Learning, 2021, 14(1−2): 210
18 Tolpegin V, Truex S, Gursoy M E, Liu L. Data poisoning attacks against federated learning systems. In: Proceedings of the 25th European Symposium on Research in Computer Security. 2020, 480−501
19 Xia G, Chen J, Yu C, Ma J. Poisoning attacks in federated learning: a survey. IEEE Access, 2023, 11: 10708−10722
20 Dwork C. Differential privacy. In: Proceedings of the 33rd International Conference on Automata, Languages and Programming. 2006, 1−12
21 Dwork C, Roth A. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 2014, 9(3−4): 211−407
22 Dwork C, McSherry F, Nissim K, Smith A. Calibrating noise to sensitivity in private data analysis. In: Proceedings of the 3rd Theory of Cryptography Conference. 2006, 265−284
23 Dwork C. A firm foundation for private data analysis. Communications of the ACM, 2011, 54(1): 86−95
24 Krohn M N, Freedman M J, Mazieres D. On-the-fly verification of rateless erasure codes for efficient content distribution. In: Proceedings of the IEEE Symposium on Security and Privacy. 2004, 226−240
25 Pedersen T P. Non-interactive and information-theoretic secure verifiable secret sharing. In: Proceedings of the Annual International Cryptology Conference. 1992, 129−140
26 Shamir A. How to share a secret. Communications of the ACM, 1979, 22(11): 612−613
27 Lyu L, Yu H, Ma X, Chen C, Sun L, Zhao J, Yang Q, Yu P S. Privacy and robustness in federated learning: attacks and defenses. IEEE Transactions on Neural Networks and Learning Systems, 2022, doi: 10.1109/TNNLS.2022.3216981.
28 McMahan H B, Ramage D, Talwar K, Zhang L. Learning differentially private recurrent language models. In: Proceedings of the 6th International Conference on Learning Representations. 2018
29 Lyu L, Nandakumar K, Rubinstein B, Jin J, Bedo J, Palaniswami M. PPFA: privacy preserving fog-enabled aggregation in smart grid. IEEE Transactions on Industrial Informatics, 2018, 14(8): 3733−3744
30 Rastogi V, Nath S. Differentially private aggregation of distributed time-series with transformation and encryption. In: Proceedings of 2010 ACM SIGMOD International Conference on Management of Data. 2010, 735−746
31 Agarwal N, Suresh A T, Yu F, Kumar S, McMahan H B. cpSGD: communication-efficient and differentially-private distributed SGD. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. 2018, 7575−7586
32 Duchi J C, Jordan M I, Wainwright M J. Local privacy and statistical minimax rates. In: Proceedings of the 54th IEEE Annual Symposium on Foundations of Computer Science. 2013, 429−438
33 Wu N, Farokhi F, Smith D, Kaafar M A. The value of collaboration in convex machine learning with differential privacy. In: Proceedings of 2020 IEEE Symposium on Security and Privacy. 2020, 304−317
34 Zhou Y, Liu X, Fu Y, Wu D, Li C, Yu S. Optimizing the numbers of queries and replies in federated learning with differential privacy. 2021, arXiv preprint arXiv: 2107.01895
35 Xie C, Koyejo S, Gupta I. Zeno: distributed stochastic gradient descent with suspicion-based fault-tolerance. In: Proceedings of the 36th International Conference on Machine Learning. 2019, 6893−6901
36 Wilcox-O’Hearn Z. Bitcoin privacy technologies - zerocash and confidential transactions. weusecoins.com/bitcoin-privacy-technologies-zerocash-confidential-transactions/. 2015
37 Truex S, Liu L, Chow K H, Gursoy M E, Wei W. LDP-Fed: federated learning with local differential privacy. In: Proceedings of the 3rd ACM International Workshop on Edge Systems, Analytics and Networking. 2020, 61−66
38 Cao X, Fang M, Liu J, Gong N Z. FLTrust: Byzantine-robust federated learning via trust bootstrapping. In: Proceedings of the 28th Annual Network and Distributed System Security Symposium. 2021
39 Xu Y, Peng C, Tan W, Tian Y, Ma M, Niu K. Non-interactive verifiable privacy-preserving federated learning. Future Generation Computer Systems, 2022, 128: 365−380
40 Blanchard P, Mhamdi E M E, Guerraoui R, Stainer J. Machine learning with adversaries: Byzantine tolerant gradient descent. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 118−128
41 Mhamdi E M E, Guerraoui R, Rouault S. The hidden vulnerability of distributed learning in Byzantium. In: Proceedings of the 35th International Conference on Machine Learning. 2018, 3518−3527
42 Yin D, Chen Y, Ramchandran K, Bartlett P L. Byzantine-robust distributed learning: towards optimal statistical rates. In: Proceedings of the 35th International Conference on Machine Learning. 2018, 5636−5645
43 Xie C, Koyejo O, Gupta I. Generalized Byzantine-tolerant SGD. 2018, arXiv preprint arXiv: 1802.10116
44 Ma Z, Ma J, Miao Y, Li Y, Deng R H. ShieldFL: mitigating model poisoning attacks in privacy-preserving federated learning. IEEE Transactions on Information Forensics and Security, 2022, 17: 1639−1654
45 Gu Z, Yang Y. Detecting malicious model updates from federated learning on conditional variational autoencoder. In: Proceedings of 2021 IEEE International Parallel and Distributed Processing Symposium. 2021, 671−680