In June 2024, the U.S. Food and Drug Administration (FDA) issued draft guidance outlining the need for “Diversity Action Plans” as a way for clinical trial sponsors to demonstrate their consideration of trial representation. Affecting phase 3 studies, and others as appropriate, the guidance builds on a previous draft from 2022 and implements requirements of the Food and Drug Omnibus Reform Act (FDORA), enacted later that same year.
As sponsors plan their compliance with this near-final guidance, they’re likely to rethink their protocol development process more systemically—or at least, they should, experts say.
For sponsors, including big pharma brands, that tend to work from libraries of global protocol templates with pre-established inclusion and exclusion criteria, the guidance will likely prompt a rewrite of those templates with a more inclusive approach.
But it will take the right data—as well as the capacity to interpret that data at the necessary scale. That’s why experts like Justin North, Director of Product Management at TriNetX, expect electronic health record (EHR) inputs plus artificial intelligence (AI) to play a critical role in Diversity Action Plan development in the future.
EHR data and the value of self-reported identity
Many clinical trial sponsors assess claims data to scope out their potential participant populations, as well as those participants’ identities, as part of a Diversity Action Plan. While claims data increasingly incorporates self-reported race/ethnicity data, those fields are not always filled as reliably as they are in data sourced directly from a health system’s EHR. That’s especially the case for patients who have been seen within the past few years, North adds.
“As protocols become more complex with more nuanced exclusion criteria, it can be hard to understand how subtle changes to the protocol might improve representation,” he says. “But if you’re working from an EHR source that has high fill rates of self-reported race/ethnicity data, you can tap into that to inform those inclusion decisions.”
With its expansive medical record history, EHR data may also capture attributes beyond race and ethnicity, such as comorbidities, sexual orientation and pregnancy status, among others. Importantly, all of those descriptors have been included in the FDA’s guidance, which changed its title from “underrepresented racial and ethnic populations” in 2022 to simply “underrepresented populations” in the 2024 version.
AI as a tool to inform protocol decision-making
Even if sponsors are working from the right datasets, there could be thousands of possible permutations of a protocol change. That’s an order of magnitude above what human protocol writers could feasibly review, which presents a classic use case for AI, North says.
“For a single human to try to evaluate all the complexities of a protocol to understand how it’s impacting inclusivity, that’s just not realistic … and it’s going to remain a huge burden across the industry,” he says. “What AI can do for us is run thousands of permutations of a protocol at once and identify which criteria could have the most impact on inclusion.”
In that role, however, AI should be an enabler, not the ultimate decider.
“There’s a balance,” North says. “There’s a great opportunity for AI and automation to step in where it most makes sense but then still have the human come in at the last mile. They’re the ones ultimately making the final clinical decision on how they want to shape the protocol to improve diversity. The technology is just a tool so that they can make those choices from a more informed place.”
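As a rough illustration of the kind of permutation screening North describes, the Python sketch below tries dropping every combination of eligibility criteria against a small cohort and ranks the combinations by how much they raise one group’s share of the eligible pool. Everything in it is an assumption made for illustration: the patient records are synthetic, the criteria names and thresholds are hypothetical, and `rank_relaxations` is not drawn from any vendor’s product. A real analysis would run against EHR data and weigh clinical safety, which this sketch ignores.

```python
from itertools import combinations

# Synthetic, hypothetical patient records (not real EHR data).
PATIENTS = [
    {"group": "A", "age": 72, "egfr": 65, "hba1c": 7.0},
    {"group": "A", "age": 60, "egfr": 50, "hba1c": 7.5},
    {"group": "A", "age": 55, "egfr": 70, "hba1c": 7.2},
    {"group": "B", "age": 50, "egfr": 80, "hba1c": 7.0},
    {"group": "B", "age": 65, "egfr": 75, "hba1c": 7.5},
    {"group": "B", "age": 58, "egfr": 90, "hba1c": 8.5},
    {"group": "B", "age": 45, "egfr": 85, "hba1c": 6.9},
    {"group": "A", "age": 75, "egfr": 45, "hba1c": 7.1},
]

# Hypothetical eligibility criteria: name -> test a patient must pass.
CRITERIA = {
    "age_under_70": lambda p: p["age"] < 70,
    "egfr_at_least_60": lambda p: p["egfr"] >= 60,
    "hba1c_below_8": lambda p: p["hba1c"] < 8.0,
}


def eligible(patients, active):
    """Return the patients who pass every criterion named in `active`."""
    return [p for p in patients if all(CRITERIA[name](p) for name in active)]


def share(patients, group):
    """Fraction of `patients` belonging to `group` (0.0 for an empty pool)."""
    if not patients:
        return 0.0
    return sum(p["group"] == group for p in patients) / len(patients)


def rank_relaxations(patients, group, max_dropped=2):
    """Drop each combination of up to `max_dropped` criteria and rank the
    results by the focus group's share of the resulting eligible pool.

    Returns a list of (dropped_criteria, pool_size, group_share) tuples,
    highest share first. The empty tuple is the unmodified protocol.
    """
    names = list(CRITERIA)
    results = []
    for k in range(max_dropped + 1):
        for dropped in combinations(names, k):
            active = [n for n in names if n not in dropped]
            pool = eligible(patients, active)
            results.append((dropped, len(pool), share(pool, group)))
    return sorted(results, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    for dropped, size, pct in rank_relaxations(PATIENTS, "A"):
        print(dropped or "(baseline)", size, round(pct, 3))
```

The output is only a ranked list of options. Consistent with North’s point, deciding whether any relaxation is clinically acceptable remains a human judgment.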
Avoiding blanket definitions of “inclusivity”
Even with the use of empirical inputs and tools including EHR data and AI, much of what ultimately goes into a Diversity Action Plan is individualized. “Diversity” is not a blanket benchmark that can be defined or applied to all indications uniformly; it’s much more nuanced than a one-size-fits-all approach, North says.
“You need to understand not just ‘What is an inclusive protocol overall?’ but rather, ‘What is an inclusive protocol for this disease?’” he says. “Sponsors will need to establish more epidemiological benchmarks for specific indications to define the representative population according to the need—because that definition isn’t going to be the same for every therapeutic area or indication.”
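To make the indication-specific point concrete, the short Python sketch below compares the makeup of one eligible pool against two different disease benchmarks. The benchmark shares, group labels and indication names are invented for illustration and are not epidemiological figures; `representation_gaps` is a hypothetical helper, not an established method.

```python
from collections import Counter

# Hypothetical benchmarks: each group's share of people living with the
# disease. The numbers are illustrative only, not real epidemiology.
BENCHMARKS = {
    "indication_x": {"A": 0.40, "B": 0.60},
    "indication_y": {"A": 0.15, "B": 0.85},
}


def composition(groups):
    """Map each group label to its fraction of the list."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()} if total else {}


def representation_gaps(eligible_groups, indication):
    """Observed share minus benchmark share for each benchmark group.

    Negative values mean the group is underrepresented relative to the
    (assumed) disease population for that indication.
    """
    benchmark = BENCHMARKS[indication]
    observed = composition(eligible_groups)
    return {g: observed.get(g, 0.0) - target for g, target in benchmark.items()}


if __name__ == "__main__":
    pool = ["A"] * 1 + ["B"] * 3  # same eligible pool for both indications
    for name in BENCHMARKS:
        print(name, representation_gaps(pool, name))
```

With these made-up numbers, the same pool falls short on group A for one indication and exceeds the benchmark for the other, which is why a single definition of “inclusive” does not carry across therapeutic areas.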