DC8 Project: Privacy-Preserving Federated Generative Models for Decentralised Data Synthesis

 

Doctoral Candidate


Lingyu Qiu, MSc


Main Supervisor: Francesco Piccialli (UNINA)

Co-Supervisors: David Camacho (UPM), Jia-Chun Lin (NTNU), Dariusz Mrozek (SUT) 

R&D cooperation: CONFORM

 

Objectives: To develop novel methodologies for data synthesis in distributed and privacy-sensitive environments.

The research addresses one of the most pressing challenges in modern Artificial Intelligence: enabling collaborative learning across decentralised data sources (e.g., hospitals, financial institutions, or IoT devices) while ensuring the protection of sensitive information. The core idea is to combine generative modelling techniques (e.g., GANs, VAEs, Diffusion Models) with the principles of federated learning, in order to design frameworks capable of generating realistic, high-quality synthetic data without the need for centralised data aggregation. This will allow researchers and practitioners to train and validate AI models in scenarios where direct data sharing is not possible due to privacy, regulatory, or ethical constraints.
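As an illustration of training without centralised data aggregation, the sketch below shows one common aggregation rule, federated averaging (FedAvg), applied to generator parameters: each node trains locally and only its parameters, never its raw data, are combined. The project description does not state which aggregation rule will be used, so the function name `federated_average`, the flat `Params` format, and the weighting by local dataset size are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical parameter format: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def federated_average(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Combine client models into one global model (FedAvg-style).

    Each client's parameters are weighted by the number of local samples
    it trained on. Only parameters are exchanged; raw data stays local.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Params = {}
    for name, reference in client_params[0].items():
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Two hypothetical nodes (e.g., two hospitals) with different data volumes.
node_a = {"generator.w": [1.0, 2.0]}
node_b = {"generator.w": [3.0, 6.0]}
global_params = federated_average([node_a, node_b], client_sizes=[1, 3])
```

In a full system this step would run once per communication round, with the averaged parameters sent back to every node before the next round of local training.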

The doctoral candidate will focus on several key research objectives:

  1. Design of federated generative architectures that operate over heterogeneous and non-IID datasets, ensuring stability and robustness of training across distributed nodes.
  2. Integration of privacy-preserving mechanisms, such as differential privacy, secure multi-party computation, and homomorphic encryption, to guarantee that no sensitive information is leaked during model training or synthetic data generation.
  3. Evaluation of the quality and authenticity of the generated data by benchmarking against centralised generative models, assessing both fidelity (similarity to real data) and utility (effectiveness for downstream ML tasks).
  4. Development of deployment strategies for real-world applications in privacy-sensitive sectors (e.g., healthcare, finance, and critical infrastructure), tackling practical challenges such as data heterogeneity, communication bottlenecks, and regulatory compliance.
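Of the privacy-preserving mechanisms named in objective 2, differential privacy is the simplest to sketch. A common pattern, assumed here and not taken from the project description, is to clip each node's model update to a maximum L2 norm and add Gaussian noise before the update leaves the node. The names `clip_and_noise`, `clip_norm`, and `noise_multiplier` are hypothetical, and the sketch does not compute the resulting privacy budget, which would require a separate privacy accountant.

```python
import math
import random
from typing import List, Optional


def clip_and_noise(
    update: List[float],
    clip_norm: float,
    noise_multiplier: float,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Clip an update to L2 norm <= clip_norm, then add Gaussian noise.

    The noise standard deviation is noise_multiplier * clip_norm, the usual
    scaling for the Gaussian mechanism. This is an illustrative sketch only;
    it does not track or report a privacy budget.
    """
    if clip_norm <= 0:
        raise ValueError("clip_norm must be positive")
    if rng is None:
        rng = random.Random()

    norm = math.sqrt(sum(x * x for x in update))
    # Scale down only when the update exceeds the clipping threshold.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [x * scale + rng.gauss(0.0, sigma) for x in update]


# A hypothetical local update with L2 norm 5, clipped to norm 1 and then noised.
private_update = clip_and_noise(
    [3.0, 4.0], clip_norm=1.0, noise_multiplier=1.1, rng=random.Random(0)
)
```

Clipping bounds how much any single node can influence the global model, which is what allows the added noise to be calibrated. A larger `noise_multiplier` gives stronger privacy at the cost of synthetic data quality, which is the trade-off that objective 3 is meant to measure.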

 

Expected Results: 

  1. The development of a robust prototype federated generative model validated on real-world datasets.
  2. A comprehensive framework of protocols and best practices for designing, training, and deploying privacy-preserving generative models in decentralised settings.
  3. High-impact scientific publications in international conferences and journals, disseminating both theoretical contributions and applied results.
  4. Concrete contributions to the industrial partners of the TUAI network (e.g., CONFORM), providing them with tools for privacy-preserving data analysis and simulation.
  5. Active involvement in secondments and collaborations with partner universities and companies (UPM, SUT, NTNU), enabling interdisciplinary knowledge transfer and strengthening the candidate’s expertise.


Applied research: This project's applied research focuses on developing a federated generative model prototype for data synthesis in privacy-sensitive sectors such as healthcare and finance. It involves creating, testing, and refining the model on real-world datasets so that the synthetic data is realistic and complies with privacy regulations. The project also aims to establish deployment strategies and best practices for integrating these models into industry settings, addressing challenges such as network latency and data heterogeneity to ensure practical viability and scalability.

 

Planned secondments: UPM (4 months); SUT (4 months); NTNU (4 months)

 

Enrolment in Doctoral degree: UNINA