Synthetic data generation (SDG) for structured health data is increasingly promoted as a solution to longstanding barriers to health data access, promising privacy-preserving data reuse for research, innovation, and policy.
Despite rapid technical advances, the adoption of synthetic health data in real-world settings remains limited. Challenges related to data quality, representativeness, infrastructure readiness, trust, and legal uncertainty continue to shape its implementation in practice.
This viewpoint draws on the experiences of seven European research initiatives within the HealthData4EU cluster to explore how SDG is being operationalized across different contexts. It synthesizes cross-project insights to highlight recurring methodological and governance tensions, while examining their implications for trust and responsible use.
The analysis argues that trustworthy SDG cannot be achieved through technical optimization alone. Instead, it requires alignment between evaluation practices, upstream data stewardship, regulatory clarity, and sustained stakeholder engagement. Addressing these conditions is essential for moving synthetic data beyond experimental pilots and establishing it as a credible and sustainable component of European health research ecosystems.


