Federated Learning (FL) has emerged as a powerful technique for collaborative training between multiple clients and a server while preserving the privacy of clients’ data. To strengthen privacy in FL, Differentially Private Federated Learning (DPFL) has gradually become one of the most effective approaches. Because DPFL operates in a distributed setting, malicious adversaries may control some clients and the aggregation server in order to produce malicious parameters and disrupt the learning model. However, existing aggregation protocols for DPFL consider either corrupted clients (Byzantines) or a corrupted server, but not both. Under the more complicated threat model in which both are present simultaneously, such protocols cannot eliminate the effects of the corrupted clients and server. In this paper, we elaborate this adversarial threat model and propose BVDFed. To the best of our knowledge, it is the first Byzantine-resilient and Verifiable aggregation for Differentially private FEDerated learning. Specifically, we propose the Differentially Private Federated Averaging algorithm (DPFA) as the primary workflow of BVDFed, which is more lightweight and more easily portable than traditional DPFL algorithms. We then introduce the Loss Score to indicate the trustworthiness of disguised gradients in DPFL. Based on the Loss Score, we propose an aggregation rule, DPLoss, to eliminate faulty gradients from Byzantine clients during server aggregation while preserving the privacy of clients’ data. Additionally, we design a secure verification scheme, DPVeri, that is compatible with DPFA and DPLoss and supports honest clients in verifying the integrity of the received aggregated results. DPVeri also makes our aggregation resistant to collusion attacks involving no more than t participants. Theoretical analysis and experimental results demonstrate that our aggregation is feasible and effective in practice.
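To make the workflow in the abstract more concrete, the following is a minimal sketch of two ideas it names: clients disguise their gradients with clipping and noise, and the server ranks the disguised gradients by a loss-based score before averaging. The abstract does not define the Loss Score or DPLoss, so everything here is an assumption for illustration: the score is taken to be the loss decrease on a small server-held batch after a trial step, the model is a toy linear regressor, and the names `perturb`, `loss_score`, and `aggregate` are hypothetical. How the noise scale `sigma` maps to a privacy budget is not covered.

```python
import math
import random

def clip(grad, max_norm):
    """Scale grad so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def perturb(grad, max_norm, sigma, rng):
    """Client side: clip, then add Gaussian noise with std sigma * max_norm.
    The noise calibration is an assumption, not the paper's mechanism."""
    clipped = clip(grad, max_norm)
    return [g + rng.gauss(0.0, sigma * max_norm) for g in clipped]

def loss(w, data):
    """Mean squared error of a linear model w on (x, y) pairs."""
    total = 0.0
    for x, y in data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        total += (pred - y) ** 2
    return total / len(data)

def loss_score(w, grad, lr, data):
    """Assumed score: how much the loss drops after a trial step along grad.
    Higher means the gradient looks more trustworthy."""
    trial = [wi - lr * gi for wi, gi in zip(w, grad)]
    return loss(w, data) - loss(trial, data)

def aggregate(w, grads, lr, data, keep):
    """Server side: average the `keep` highest-scoring disguised gradients."""
    ranked = sorted(grads, key=lambda g: loss_score(w, g, lr, data), reverse=True)
    kept = ranked[:keep]
    return [sum(col) / len(kept) for col in zip(*kept)]
```

In this sketch a gradient that points away from the descent direction receives a negative score and is dropped as long as `keep` does not exceed the number of honest clients. The server only ever sees clipped and noised gradients.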
a public/private key pair for key agreement
global model parameter
elliptic curve over finite field
Local variables
local gradient
hash value
random value for commitment
commitment value
Interactive variables
agreed key generated by key agreement between client and
ciphertext that the client encrypts with the agreed key and sends to
secret shares that the client distributes to
Tab.1
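The notation above lists secret shares exchanged between clients, and the abstract bounds collusion at no more than t participants. One standard primitive with exactly that property is Shamir's threshold secret sharing, where any t + 1 shares recover the secret and any t shares reveal nothing. The sketch below illustrates that primitive only; whether DPVeri uses it in this form is an assumption, and the field modulus `PRIME` and the function names are hypothetical choices.

```python
import secrets

PRIME = 2**127 - 1  # Mersenne prime used as the field modulus (an assumption)

def make_shares(secret, t, n):
    """Split `secret` into n shares of a random degree-t polynomial.
    Any t + 1 shares reconstruct it; t or fewer do not determine it."""
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(t)]

    def evaluate(x):
        # Horner's rule, evaluated modulo PRIME
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % PRIME
        return acc

    return [(i, evaluate(i)) for i in range(1, n + 1)]

def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * (-xm)) % PRIME
                den = (den * (xj - xm)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+)
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Shares are additively homomorphic: adding two clients' shares point by point yields valid shares of the sum of their secrets, which is why this primitive fits aggregation protocols.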
Fig.2
Fig.3
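| Scheme | Privacy preservation | Byzantine tolerance | Verifiability | Collusion resilience |
|---|---|---|---|---|
| LDP-Fed | √ | × | × | × |
| FLTrust | × | √ | × | × |
| VeriFL | √ | × | √ | × |
| DPBFL | √ | √ | × | × |
| BVDFed | √ | √ | √ | √ |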
Tab.2
Fig.4
Fig.5
| Privacy budgets of LDP | Numbers of Byzantine clients | Types of overheads | Initialization | Round 1 | Round 2 | Round 3 | Total |
|---|---|---|---|---|---|---|---|
| | | computation | 30 ms | 5993 ms | 0 ms | 19555 ms | 25578 ms |
| | | communication | 0.22 KB | 15185 KB | 0.46 KB | ? | 15186 KB |
| | | computation | 31 ms | 5992 ms | 0 ms | 17607 ms | 23630 ms |
| | | communication | 0.22 KB | 15187 KB | 0.46 KB | ? | 15188 KB |
| | | computation | 31 ms | 5955 ms | 0 ms | 15760 ms | 21746 ms |
| | | communication | 0.22 KB | 15188 KB | 0.46 KB | ? | 15189 KB |
| | | computation | 30 ms | 6126 ms | 0 ms | 13939 ms | 20095 ms |
| | | communication | 0.22 KB | 15189 KB | 0.46 KB | ? | 15190 KB |
| | | computation | 32 ms | 6269 ms | 0 ms | 11880 ms | 18181 ms |
| | | communication | 0.22 KB | 15191 KB | 0.46 KB | ? | 15192 KB |
| | | computation | 32 ms | 6335 ms | 0 ms | 9972 ms | 16339 ms |
| | | communication | 0.22 KB | 15193 KB | 0.46 KB | ? | 15194 KB |
| | | computation | 31 ms | 6318 ms | 0 ms | 19677 ms | 26026 ms |
| | | communication | 0.22 KB | 15928 KB | 0.46 KB | ? | 15929 KB |
| | | computation | 32 ms | 6338 ms | 0 ms | 19815 ms | 26185 ms |
| | | communication | 0.22 KB | 17316 KB | 0.46 KB | ? | 17317 KB |
| | | computation | 31 ms | 6102 ms | 0 ms | 19172 ms | 25305 ms |
| | | communication | 0.22 KB | 18928 KB | 0.46 KB | ? | 18929 KB |
| | | computation | 34 ms | 6099 ms | 0 ms | 19145 ms | 25278 ms |
| | | communication | 0.22 KB | 20553 KB | 0.46 KB | ? | 20554 KB |
| | | computation | 32 ms | 6028 ms | 0 ms | 19367 ms | 25427 ms |
| | | communication | 0.22 KB | 22177 KB | 0.46 KB | ? | 22178 KB |
Tab.3
| Privacy budgets of LDP | Numbers of Byzantine clients | Types of overheads | Initialization | Round 1 | Round 2 | Round 3 | Total |
|---|---|---|---|---|---|---|---|
| | | computation | 1 ms | 80003 ms | 720 ms | ? | 80723 ms |
| | | communication | 22 KB | 22359 KB | 25 KB | ? | 22406 KB |
| | | computation | 1 ms | 78194 ms | 641 ms | ? | 78836 ms |
| | | communication | 22 KB | 22297 KB | 22 KB | ? | 22341 KB |
| | | computation | 1 ms | 79704 ms | 563 ms | ? | 80267 ms |
| | | communication | 22 KB | 22225 KB | 20 KB | ? | 22267 KB |
| | | computation | 1 ms | 83506 ms | 538 ms | ? | 84044 ms |
| | | communication | 22 KB | 22142 KB | 17 KB | ? | 22181 KB |
| | | computation | 1 ms | 87102 ms | 456 ms | ? | 87558 ms |
| | | communication | 22 KB | 22048 KB | 15 KB | ? | 22085 KB |
| | | computation | 1 ms | 83193 ms | 363 ms | ? | 83556 ms |
| | | communication | 22 KB | 21936 KB | 12 KB | ? | 21970 KB |
| | | computation | 1 ms | 86114 ms | 738 ms | ? | 86852 ms |
| | | communication | 22 KB | 22755 KB | 25 KB | ? | 22802 KB |
| | | computation | 1 ms | 88788 ms | 739 ms | ? | 89527 ms |
| | | communication | 22 KB | 23574 KB | 25 KB | ? | 23621 KB |
| | | computation | 1 ms | 93862 ms | 751 ms | ? | 94613 ms |
| | | communication | 22 KB | 25071 KB | 25 KB | ? | 25118 KB |
| | | computation | 1 ms | 85029 ms | 713 ms | ? | 85742 ms |
| | | communication | 22 KB | 26698 KB | 25 KB | ? | 26745 KB |
| | | computation | 1 ms | 83182 ms | 734 ms | ? | 83917 ms |
| | | communication | 22 KB | 28322 KB | 25 KB | ? | 28369 KB |
Tab.4
Fig.6