Federated Learning for Privacy-Preserving Big Data Analytics in Distributed Systems

Authors

  • Ahmed Gheni Dawood, University of Diyala
  • Ekhlas Muthanna Turki, University of Diyala

DOI:

https://doi.org/10.21070/joincs.v9i1.1699

Keywords:

Federated Learning, Privacy-Preserving Analytics, Distributed Systems, Big Data, Data Heterogeneity, Communication Efficiency, Differential Privacy, Secure Multi-Party Computation, Scalability, Machine Learning, Fairness, Energy Efficiency, Edge Computing, Blockchain, Quantum Computing, Federated Generative Models

Abstract

Federated Learning (FL) has changed how collaborative model training is carried out in big data analytics: models are trained across decentralized devices while preserving user privacy, an essential requirement in evidence-based, regulated environments governed by GDPR, HIPAA, CCPA, and emerging data-sovereignty laws. This paper analyzes FL in depth, covering its foundational concepts, architectural and algorithmic approaches, practical applications, and challenges in distributed systems. It discusses key issues, namely communication overhead, data heterogeneity, security risks, fairness, scalability, energy efficiency, and regulatory compliance, and analyzes their implications for FL performance. Seven tables give overviews of algorithms, datasets, performance metrics, and applications, and nine figures visualize trends and comparisons. Applications in healthcare, IoT, finance, smart cities, and autonomous systems provide evidence of FL's promise as a technology for privacy-preserving analytics. Future directions highlight potential synergies between FL and emerging technologies such as quantum computing, blockchain, edge artificial intelligence, and federated generative models. The work is intended as a comprehensive reference for researchers and practitioners advancing distributed machine learning in sensitive settings, in support of secure, scalable, ethical, and privacy-preserving analytics.

Author Biographies

Ahmed Gheni Dawood, University of Diyala

College of Education for Human Sciences

Ekhlas Muthanna Turki, University of Diyala

College of Education for Human Sciences

References

[1]. McMahan, H. B., Moore, E., Ramage, D., Hampson, S., & Aguera y Arcas, B. (2017). Communication-efficient learning of deep networks from decentralized data. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), 54, 1273–1282. https://doi.org/10.48550/arXiv.1602.05629

[2]. Hard, A., Rao, K., Mathews, R., Ramaswamy, S., Beaufays, F., Augenstein, S., Eichner, H., Kiddon, C., & Ramage, D. (2018). Federated learning for mobile keyboard prediction. arXiv. https://doi.org/10.48550/arXiv.1811.03604

[3]. Bonawitz, K., Eichner, H., Grieskamp, W., Huba, D., Ingerman, A., Ivanov, V., Kiddon, C., Konečný, J., Mazzawi, H., McMahan, H. B., Ramage, D., Roselander, J., & Van Overveldt, T. (2019). Towards federated learning at scale: System design. Proceedings of Machine Learning and Systems (MLSys). https://doi.org/10.48550/arXiv.1902.01046

[4]. Li, T., Sahu, A. K., Zaheer, M., Sanjabi, M., Talwalkar, A., & Smith, V. (2020). Federated optimization in heterogeneous networks. Proceedings of Machine Learning and Systems (MLSys). https://doi.org/10.48550/arXiv.1812.06127

[5]. Wang, J., Liu, Q., Liang, H., Joshi, G., & Poor, H. V. (2020). Tackling the objective inconsistency problem in heterogeneous federated optimization. Advances in Neural Information Processing Systems (NeurIPS), 33, 7611–7623. https://doi.org/10.48550/arXiv.2002.02503

[6]. Karimireddy, S. P., Kale, S., Mohri, M., Reddi, S. J., Stich, S. U., & Suresh, A. T. (2020). SCAFFOLD: Stochastic controlled averaging for federated learning. Proceedings of the 37th International Conference on Machine Learning (ICML), 119, 5132–5143. https://doi.org/10.48550/arXiv.1910.06378

[7]. Kairouz, P., McMahan, H. B., Avent, B., Bellet, A., Bennis, M., Bhagoji, A. N., Bonawitz, K., Charles, Z., Cormode, G., Cummings, R., D'Oliveira, R. G. L., Eichner, H., El Rouayheb, S., Evans, D., Feldman, J., Fouque, P.-A., Gardner, J., Garrett, Z., Gascón, A., ... Zhao, S. (2021). Advances and open problems in federated learning. Foundations and Trends in Machine Learning, 14(1–2), 1–210. https://doi.org/10.1561/2200000083

[8]. Mothukuri, V., Parizi, R. M., Pouriyeh, S., Huang, Y., Dehghantanha, A., & Srivastava, G. (2021). A survey on security and privacy of federated learning. Future Generation Computer Systems, 115, 619–640. https://doi.org/10.1016/j.future.2020.10.007

[9]. Zhang, C., Li, S., Xia, J., Wang, W., Yan, F., & Liu, Y. (2020). BatchCrypt: Efficient homomorphic encryption for cross-silo federated learning. Proceedings of the 2020 USENIX Annual Technical Conference. https://doi.org/10.48550/arXiv.2007.07634

[10]. Li, Q., Wen, Z., Wu, Z., Hu, S., Wang, Y., & He, B. (2021). A survey on federated learning systems: Vision, hype and reality for data privacy and protection. IEEE Transactions on Knowledge and Data Engineering, 35(4), 3347–3366. https://doi.org/10.1109/TKDE.2021.3124599

[11]. Yang, Q., Liu, Y., Chen, T., & Tong, Y. (2019). Federated machine learning: Concept and applications. ACM Transactions on Intelligent Systems and Technology, 10(2), Article 12. https://doi.org/10.1145/3298981

Published

2026-03-02

How to Cite

Gheni Dawood, A., & Muthanna Turki, E. (2026). Federated Learning for Privacy-Preserving Big Data Analytics in Distributed Systems. JOINCS (Journal of Informatics, Network, and Computer Science), 9(1), 11–23. https://doi.org/10.21070/joincs.v9i1.1699