Conference Posters

Generating Synthetic Control Subjects Using Machine Learning for Clinical Trials in Alzheimer's Disease (DIA 2019)

Charles K. Fisher, Yannick Pouliot, Aaron M. Smith, Jonathan R. Walsh

Objective: To develop a method for modeling disease progression that simulates detailed patient trajectories, and to apply it to subjects in the control arms of Alzheimer's disease clinical trials. Methods: We used a robust data processing framework to build a machine learning dataset from a database of subjects in the control arms of a diverse set of 28 clinical trials on Alzheimer's disease. From this dataset, we selected 1908 subjects with 18-month trajectories of 44 variables and trained 5 cross-validated models capable of simulating disease progression in 3-month intervals across all variables. Results: A statistical analysis comparing data from actual and simulated patients indicates that the model generates accurate patient-level distributions across variables and through time. Focusing on a common clinical trial endpoint for Alzheimer's disease (ADAS-Cog), we show that the model can predict disease progression as accurately as several supervised models. Our model also predicts the outcome of a clinical trial whose data are distinct from the training and test datasets. Conclusion: The ability to simulate dozens of patient characteristics simultaneously is a powerful tool for modeling disease progression. Such models have useful applications for clinical trials, from analyzing control groups to supplementing real subject data in control arms.