The goal of this series thus far is to explore how we find a match between control and treated subjects with many covariates. Previous posts have shown two methods that are commonly used (direct matching from historical trial data and propensity score matching). However, these methods have limitations. Matched subjects may be quite different from one another, and as a result, the variance of an estimated treatment effect may be quite large.

This post describes a more effective approach to matching: a machine-learning approach that can generate digital twins (digital control subjects that exactly match actual treated subjects at baseline). This paired approach to matching overcomes the limitations of the two previous methods and serves as a powerful tool for clinical trials.

The purpose of finding an exact match between the external control and treated subjects with many covariates is to determine more precisely how effective a drug is (i.e., the average treatment effect). Previous posts have shown two methods that are commonly used to find matches:

- direct matching (matching subjects that have the same baseline covariates)
- propensity score matching or PSM (using the propensity score to match subjects).

However, neither direct matching nor PSM is entirely effective. Direct matching is good for estimating the average treatment effect but becomes infeasible for a large number of covariates. Propensity score matching, on the other hand, makes matching on a large number of covariates easier, but the resulting estimate of the average treatment effect has a large variance (fig. 1).
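To make the direct-matching problem concrete, here is a minimal sketch (not taken from the earlier posts) that simulates subjects with independent binary covariates and counts how many treated subjects have a control who agrees on every covariate. The function name `exact_match_rate`, the group sizes, and the coin-flip covariates are all assumptions chosen for illustration.

```python
import random

def exact_match_rate(n_covariates, n_treated=200, n_controls=200, seed=0):
    """Fraction of treated subjects with a control that shares every binary covariate."""
    rng = random.Random(seed)

    def draw():
        # One simulated subject: a tuple of independent 0/1 covariates (an assumption).
        return tuple(rng.randint(0, 1) for _ in range(n_covariates))

    controls = {draw() for _ in range(n_controls)}
    treated = [draw() for _ in range(n_treated)]
    return sum(t in controls for t in treated) / n_treated

rates = {k: exact_match_rate(k) for k in (2, 5, 10, 20)}
for k, rate in rates.items():
    print(f"{k:>2} covariates: {rate:.0%} of treated subjects have an exact match")
```

With k binary covariates there are 2^k possible profiles, so a fixed pool of controls covers a rapidly shrinking share of them as k grows.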

Given that these methods have two distinct problems, **is there another matching method that solves both these problems?**

The answer is yes. Using a machine-learning model, we can *create* exact matches for actual subjects at baseline, which are called **digital twins**.

Let’s unpack the concept of digital twins. In the rest of this post, we’ll learn more about what this machine-learning model is; how it makes predictions; and how it creates digital twins that exactly match treated subjects.

At a very high level, this machine-learning model is a machine that can predict a patient’s future state.

We want to be able to predict probabilistically how measurements of a patient’s variables will change over time. Let’s say there’s an equation [1] for patient X that tells us how all of these variables will change over time. We can create a machine-learning model that learns this equation from patient X’s data. After learning this equation, the model can then predict values for all of the variables, in a way that reflects the variability in real data.

So how does the model predict values? If we have information about what patient X looks like both in the past and today (visit #1 and visit #2), we can predict what X will look like at the next visit (visit #3). We can then take this prediction and figure out what X will look like at visit #4, and so forth.

What does this prediction look like mathematically? Let’s think of the math in terms of inputs and outputs of the model. In order for the model to make a prediction, it needs data as input. So let’s say that we measure demographic, lab test, and clinical test variables today. We’ll call these baseline measurements, shown in green at t=0. Then we’ll give the model these baseline measurements as input, and it will produce new values at a future time point as output (shown in blue, t=1). We can repeat this process using values at t=1 as input to get values at t=2 as output, etc. The animation below illustrates this process in action.

Now let’s look at the process of generating predictions more closely. Because patient outcomes are stochastic, or random, the model will generate a distribution of values for every variable (notated as v(1)-v(n)), rather than a single value per variable. The model then samples a value for each variable from this distribution (the output) and uses these new values as input for the next prediction (fig. 2).

In summary, we can think of the predictive process as iterative. We take values from previous time points to make predictions about what will happen in the future.
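The iterative process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_next_visit_distribution` is a hypothetical stand-in for a trained model (it simply returns a Gaussian around the current value plus a made-up drift), and the variable names and baseline values are invented.

```python
import random

def toy_next_visit_distribution(current):
    """Hypothetical stand-in for a trained model.

    Returns, for each variable, the (mean, standard deviation) of a Gaussian
    over that variable's value at the next visit.
    """
    return {name: (value + 1.0, 0.5) for name, value in current.items()}

def rollout(baseline, n_steps, rng):
    """Sample a trajectory by feeding each sampled visit back in as the next input."""
    record = [dict(baseline)]
    for _ in range(n_steps):
        dist = toy_next_visit_distribution(record[-1])
        sampled = {name: rng.gauss(mu, sigma) for name, (mu, sigma) in dist.items()}
        record.append(sampled)
    return record

rng = random.Random(42)
baseline = {"v1": 20.0, "v2": 5.0}  # made-up baseline measurements at t=0
trajectory = rollout(baseline, n_steps=4, rng=rng)
for t, visit in enumerate(trajectory):
    print(t, {name: round(value, 2) for name, value in visit.items()})
```

Because each step draws from a distribution rather than returning a single number, running the rollout again with a different seed gives a different, equally plausible trajectory from the same baseline.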

To see how the model creates a digital twin, we’ll go through an example from this recent Alzheimer’s paper.

In order to predict how a patient’s variables will change over time, we have to create a model that learns the equation describing how these variables will change.

In this example, the model learns and predicts how Alzheimer’s patients progress over time. Using this model, we can generate a digital Alzheimer’s patient that looks just like an actual Alzheimer’s patient. Let’s go through this short video to explain what that means: https://youtu.be/JZRTTwmeZ98.

In the beginning of the video, we see a complete record of an actual patient in an Alzheimer’s clinical trial, containing measurements at 3-month intervals (in red). We will call this patient an **actual control subject** because she was part of the group that didn’t take the drug in the clinical trial. Running an Alzheimer’s clinical trial requires measuring different variables related to Alzheimer’s, including the ADAS-Cog component scores. (ADAS-Cog tests measure different aspects of cognitive dysfunction in a clinical trial; the higher the score, the worse the dysfunction.) So in this video, we are tracking the progression of these different variables in a subject’s record.

How do we use such a record to generate a digital control subject? As mentioned in the previous section, we can take the baseline measurements from this actual control subject and give them to our model as input. The model will use these measurements to predict how they will change at 3 months. Then it can use the measurements at 3 months to predict what they will be at 6 months, and so forth. These measurements make up the record of a digital control subject.

Now that we’ve seen how the model makes a digital control subject, we can learn how it makes an exact match of an actual treated subject. In the previous section, the model is able to make a digital control subject that looks just like an actual control subject. We can use this same model to create a digital control subject that matches a treated subject at baseline. This is basically taking a treated subject’s record right before the clinical trial and making a digital copy.

To create an exact match, we again start out with a record containing just the baseline values from the treated subject. If we input the relevant baseline measurements into our model, the model will be able to create a synthetic record that shows how a treated subject would have progressed without taking the drug. We can call this generated record a digital twin because it is an exact match for the treated subject.

There are two important points to emphasize:

- Because the model learns to reproduce the variability in real subject data, it is able to create a digital control subject with measurements that statistically resemble measurements from an actual control subject. **In other words, the digital control subject is realistic.**
- It can create a digital twin, or counterpart, of a treated subject. Essentially, we can use this counterpart to ask: if the treated subject didn’t take the drug, how would he/she have progressed? **The digital twin is thus the exact match for the treated subject.**
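One way to picture how a digital twin answers that counterfactual question is the sketch below. It is an illustration, not the method from the paper: `simulate_untreated_score` is a hypothetical stand-in for the trained model, the scores are made up, and averaging many twins per subject is an assumption made here to smooth out sampling noise.

```python
import random
from statistics import mean

def simulate_untreated_score(baseline_score, n_visits, rng):
    """Hypothetical stand-in for a trained model: roll one score forward with no drug effect."""
    score = baseline_score
    for _ in range(n_visits):
        score = rng.gauss(score + 1.5, 1.0)  # made-up worsening per visit
    return score

def paired_differences(treated, n_visits, n_twins, rng):
    """Observed final score minus the mean final score of that subject's digital twins."""
    diffs = []
    for baseline_score, observed_final in treated:
        twins = [simulate_untreated_score(baseline_score, n_visits, rng)
                 for _ in range(n_twins)]
        diffs.append(observed_final - mean(twins))
    return diffs

rng = random.Random(0)
# Made-up (baseline score, observed final score) pairs for three treated subjects
treated = [(20.0, 24.0), (25.0, 30.5), (18.0, 21.0)]
diffs = paired_differences(treated, n_visits=4, n_twins=500, rng=rng)
print("per-subject differences:", [round(d, 2) for d in diffs])
print("mean difference:", round(mean(diffs), 2))
```

Each twin starts from exactly the same baseline as its treated subject, so the comparison is paired. For a score where higher means worse, a negative difference would suggest the treated subject worsened less than the twins did.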

As shown earlier, direct matching and propensity score matching have two major flaws. Either finding a match is challenging, or estimating the average treatment effect is not precise enough. However, a machine-learning approach overcomes both of these challenges by creating a digital twin (an exact and statistically realistic match) for a subject in the treatment group (fig. 3).

Digital twins have two major benefits: they increase a clinical trial’s statistical power, and they reduce the number of human subjects that need to be recruited.

Recall that a standard clinical trial compares two groups of human subjects: placebo control subjects who don’t take the drug and treated subjects who do. Usually, recruited subjects are randomly assigned to either group so that they are similar enough to be compared.

Using digital twins, we can run trials with unequal randomization that still have high statistical power. For example, we can run a trial in which 5 subjects receive the experimental treatment for every 1 subject receiving the placebo, and add digital twins to increase the power to the desired level. As a result, we need fewer human subjects for a trial that incorporates digital twins.
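To see why an unequal design needs help reaching the desired power, here is a small sketch using the textbook standard error of a difference in group means with a common outcome standard deviation. The total of 120 subjects is a made-up number, and the sketch deliberately leaves out how digital twins enter the analysis, since that is not described here.

```python
from math import sqrt

def se_mean_difference(n_treated, n_control, sigma=1.0):
    """Standard error of a difference in group means with common outcome SD sigma."""
    return sigma * sqrt(1.0 / n_treated + 1.0 / n_control)

# Same total of 120 human subjects (a made-up number), split two ways
se_equal = se_mean_difference(60, 60)     # 1:1 randomization
se_unequal = se_mean_difference(100, 20)  # 5:1 randomization
print(f"1:1 -> SE = {se_equal:.3f} sigma")
print(f"5:1 -> SE = {se_unequal:.3f} sigma")
```

With the same number of recruited subjects, the 5:1 split gives a noticeably larger standard error than the 1:1 split because the small placebo group dominates the uncertainty. That lost precision is the gap that added digital twins are meant to close.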

Digital twins thus represent a powerful tool for clinical trials. In a future post, we will further explore how digital twins can supplement and increase efficiency of clinical trials.

Gif animation created by Charles Fisher; video by Jon Walsh and Graham Siegel

1. Because human biology is so complex, there isn’t a simple equation *per se* that dictates how variables change over time. We are using the term “equation” to illustrate that the model must learn how these variables change over time, expressed using some mathematical formula.