Methods and Applications of Synthetic Data in Pharmacoepidemiology (Nov 15, 2023)

Synthetic data is model-generated data that reproduces the patterns and statistical characteristics of real data. Applications of synthetic data in pharmacoepidemiology are still emerging, but they offer new ways for us to:

1. Enable internal reuse of datasets and data sharing with external parties in a privacy-preserving manner (i.e., it can be seen as anonymization 2.0)

2. Augment and expand datasets that are too small for training machine learning models

3. Mitigate bias in datasets by simulating observations from under-represented groups

4. Simulate patients for clinical trials that are experiencing problems, including under-recruitment

Synthetic data can be created by training a generative AI model on source data, such as a claims database or an electronic medical record (EMR) database. The AI-generated data does not have a one-to-one mapping to the original records, which gives it strong privacy-preserving characteristics, and the generated dataset can be much larger than the original. Some of these use cases have already been applied in practice, while others are still in the formative stage of development. This webinar will give an overview of synthetic data generation and walk through some of the applications above.
This webinar is sponsored by the Digital Epidemiology Special Interest Group.

This webinar is aimed at industry and service providers, academia, government and regulatory agencies, and students.

