How to unlock the power of real-world data in clinical research
In today’s data-driven healthcare landscape, it is more important than ever for healthcare organizations to ensure the quality and integrity of their information.
By using real-world data (RWD) to generate real-world evidence (RWE), researchers have the potential to improve patient outcomes by optimizing clinical trials, accelerating the development of new, specific treatments and demonstrating a therapy’s effectiveness across different patient populations.
The U.S. Food and Drug Administration has traditionally accepted RWE for monitoring and evaluating the safety of drug products in post-market studies, and the use of RWE to support additional regulatory decisions has been gaining traction.
As RWE becomes more ingrained in oncology drug research, regulatory evaluations and commercial decisions about medical interventions, data stewardship (the management and oversight of an organization’s data assets) is paramount. However, effective stewardship of RWD can present myriad challenges, including ensuring the data is high-quality, credible and fit for purpose.
Ascertaining data quality
Data quality encompasses the processes that turn raw RWD into datasets that comply with the reference standards specific to the purposes for which those datasets will be used. A data steward must consider four important dimensions when assessing data quality: data completeness, plausibility, traceability and interoperability.
Data completeness. This is more than just confirming there is no missing information at the individual record level. It also involves data validation processes to ensure broad coverage, representativeness and accuracy of all the elements representing patient demographics, as well as clinical and outcome variables.
The process depends on the specific questions the data is intended to answer, but every process should aim to identify the limitations of the data source (and what additional data could mitigate those gaps); ensure the data available is sufficient to fully evaluate the study elements of interest; and employ mitigation measures, such as data linkage, to improve data completeness.
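As a minimal sketch of a record-level completeness check, the Python below computes the share of records that carry a usable value for each required field. The field names, sample records and the 90% threshold are illustrative assumptions, not part of any standard or of a specific vendor's process.

```python
from typing import Any

# Hypothetical required fields; a real study would derive these from its protocol.
REQUIRED_FIELDS = ["patient_id", "birth_year", "diagnosis_code", "outcome"]


def is_missing(value: Any) -> bool:
    """Treat None and blank strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def completeness_by_field(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Return the share of records with a non-missing value for each field."""
    if not records:
        return {field: 0.0 for field in fields}
    return {
        field: sum(not is_missing(r.get(field)) for r in records) / len(records)
        for field in fields
    }


# Invented sample records for illustration only.
records = [
    {"patient_id": "P1", "birth_year": 1960, "diagnosis_code": "C50.9", "outcome": "remission"},
    {"patient_id": "P2", "birth_year": None, "diagnosis_code": "C34.1", "outcome": ""},
    {"patient_id": "P3", "birth_year": 1975, "diagnosis_code": "C18.7"},
    {"patient_id": "P4", "birth_year": 1982, "diagnosis_code": "C61", "outcome": "progression"},
]

report = completeness_by_field(records, REQUIRED_FIELDS)

# Fields below an assumed 90% threshold are candidates for mitigation, such as data linkage.
gaps = [field for field, share in report.items() if share < 0.9]
```

A check like this covers only missing values at the record level. Coverage and representativeness of the source as a whole still need separate review.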
Data linkage is especially useful for filling gaps in cancer patients’ records. Electronic health records (EHRs) contain extensive patient data, including details about diagnoses, test results, treatments, genetic biomarkers and outcomes, but can still fall short of capturing a patient’s full journey.
For example, did the patient seek treatment elsewhere, and what did their treatment cost? By linking EHR data with other sources, such as health insurance claims data, researchers gain a more comprehensive understanding of that individual’s story and can close some of the gaps in data completeness.
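The linkage idea can be sketched in a few lines of Python. This example assumes both sources share an exact `patient_id` key, which is a simplification: in practice, linkage often relies on tokenized or probabilistic matching. All records, field names and amounts are invented.

```python
from collections import defaultdict

# Invented records; real linkage keys are usually tokenized or de-identified.
ehr_records = [
    {"patient_id": "P1", "diagnosis": "breast cancer", "biomarker": "HER2+"},
    {"patient_id": "P2", "diagnosis": "lung cancer", "biomarker": None},
]
claims_records = [
    {"patient_id": "P1", "provider": "Clinic A", "paid_amount": 1200.0},
    {"patient_id": "P1", "provider": "Hospital B", "paid_amount": 5300.0},
    {"patient_id": "P3", "provider": "Clinic C", "paid_amount": 410.0},
]


def link_ehr_to_claims(ehr, claims):
    """Attach each patient's claims summary to their EHR record (a left join on patient_id)."""
    claims_by_patient = defaultdict(list)
    for claim in claims:
        claims_by_patient[claim["patient_id"]].append(claim)

    linked = []
    for record in ehr:
        patient_claims = claims_by_patient.get(record["patient_id"], [])
        linked.append({
            **record,
            # Where else was the patient treated, and what did it cost?
            "providers": sorted({c["provider"] for c in patient_claims}),
            "total_paid": sum(c["paid_amount"] for c in patient_claims),
            "has_claims": bool(patient_claims),
        })
    return linked


linked = link_ehr_to_claims(ehr_records, claims_records)
```

Keeping an explicit `has_claims` flag distinguishes "no cost recorded" from "no claims source linked", which matters when completeness is later assessed.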
Plausibility. Data plausibility assesses whether data credibly reflects reality. Even with a validated method for capturing data, the resulting values can still deviate from what was expected.
With RWE, there will always be some gap between how decisions are expected to be made and how they are made in practice; treatment guidelines, for example, are not always followed. One way to approach these problems is to encourage collaboration among clinical experts, informaticists and data scientists to vet all aspects of a data element’s provenance.
When an organization’s data is high-quality and effective data stewardship measures are in place, researchers can trust that an abnormality in results is not simply a user error and is worth deeper evaluation.
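A plausibility check of this kind might look like the following Python sketch. The ranges and the date-ordering rule are placeholder assumptions; real thresholds should come from clinical experts, and a flag is a prompt for review rather than proof of an error.

```python
from datetime import date

# Illustrative limits only; real thresholds should come from clinical experts.
PLAUSIBLE_RANGES = {
    "age_years": (0, 120),
    "weight_kg": (1, 400),
}


def plausibility_flags(record: dict) -> list[str]:
    """Return human-readable flags for values that warrant expert review."""
    flags = []
    for field, (low, high) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            flags.append(f"{field}={value} outside {low}-{high}")

    # Temporal rule: treatment recorded before diagnosis is unusual and worth a look.
    diagnosed = record.get("diagnosis_date")
    treated = record.get("treatment_start_date")
    if diagnosed and treated and treated < diagnosed:
        flags.append("treatment_start_date precedes diagnosis_date")
    return flags


# Invented record for illustration.
record = {
    "age_years": 134,
    "weight_kg": 72,
    "diagnosis_date": date(2023, 5, 10),
    "treatment_start_date": date(2023, 4, 2),
}
flags = plausibility_flags(record)
```

Returning descriptive flags instead of silently dropping records keeps unusual but genuine clinical patterns available for the deeper evaluation described above.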
Traceability. This is the ability to trace a study’s results back to their original source data. When it comes to RWE, acquiring a complete understanding of the clinical subject matter often requires obtaining data from multiple sources and then aggregating that data into a single dataset.
Raw data, often acquired from EHRs or claims, must go through cleaning, transformation and linkage before it becomes analyzable. These processes are often tedious and prone to error.
To ensure traceability processes are trustworthy, good data stewards must employ lineage methodology and tools that can reliably answer key questions, such as where the data came from (for example, patient health records, clinic databases, insurance claims, EHRs or elsewhere); what criteria were used in its selection, abstraction and curation; and what standards are being applied.
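One simple way to picture lineage is to carry provenance alongside each value as it is cleaned and transformed. The Python sketch below is an illustration under assumed names (`TracedValue`, the `"ehr"` source label and the record ID are all hypothetical); dedicated lineage tools work at a much larger scale, but the questions they answer are the same.

```python
from dataclasses import dataclass, field


@dataclass
class TracedValue:
    """A value plus the record of where it came from and what was done to it."""
    value: object
    source: str              # e.g. "ehr" or "claims" (hypothetical labels)
    source_record_id: str    # identifier of the originating raw record
    steps: list[str] = field(default_factory=list)

    def apply(self, description: str, transform) -> "TracedValue":
        """Return a new TracedValue with the transform applied and logged."""
        return TracedValue(
            value=transform(self.value),
            source=self.source,
            source_record_id=self.source_record_id,
            steps=self.steps + [description],
        )


# A raw diagnosis code as it might arrive from a source system (invented).
raw = TracedValue(value=" c50.9 ", source="ehr", source_record_id="enc-001")

clean = (
    raw.apply("strip whitespace", str.strip)
       .apply("uppercase diagnosis code", str.upper)
)
# clean.steps now lists every transformation between the raw value and the analyzable one.
```

Because each step returns a new object, the raw value stays untouched and any analyzable value can be walked back to its source record.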
Interoperability. Data interoperability refers to “the ways in which data is formatted that allow diverse datasets to be merged or aggregated in meaningful ways.” Unfortunately, a lack of communication standards across platforms often leaves different databases incompatible or unable to interpret each other’s information correctly.
One way to address this issue is by using open application programming interfaces (APIs). APIs enable data to be seamlessly shared between EHRs and health information technology systems.
In the RWE space, vendors whose common data models let them augment their datasets with complementary input from external sources will have a competitive advantage over those whose data models do not effectively support interoperability.
Fit-for-purpose data
Confirming that data is fit for purpose is more than just ensuring it is of high quality. The data must also answer the specific questions pertinent to the end user’s needs.
A diligent data steward ensures data is fit for purpose by developing a set of data quality rules and by regularly auditing input data. Questions that help determine whether data is fit for its intended purpose include:
- Is the data representative of the addressable patient population for the specific area of research interest?
- Does the data have longitudinal depth, meaning that it tracks the same type of information at different points in time?
- Is the data timely and reliable?
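The questions above can be turned into simple, auditable data quality rules. The Python below is a rough sketch: the age groups, reference shares, visit-span and recency thresholds are invented placeholders that a study team would replace with values suited to its own research question.

```python
from datetime import date


def largest_share_gap(observed_counts: dict[str, int], reference_shares: dict[str, float]) -> float:
    """Representativeness: largest absolute gap between dataset and reference group shares."""
    total = sum(observed_counts.values())
    if total == 0:
        return 1.0
    return max(
        abs(observed_counts.get(group, 0) / total - share)
        for group, share in reference_shares.items()
    )


def has_longitudinal_depth(visit_dates: list[date], min_visits: int = 2, min_span_days: int = 180) -> bool:
    """Longitudinal depth: enough observations spread over a long enough period."""
    if len(visit_dates) < min_visits:
        return False
    return (max(visit_dates) - min(visit_dates)).days >= min_span_days


def is_timely(last_updated: date, as_of: date, max_age_days: int = 365) -> bool:
    """Timeliness: the data was refreshed recently enough relative to the analysis date."""
    return (as_of - last_updated).days <= max_age_days


# Invented inputs for illustration; reference shares are NOT real population figures.
observed = {"<50": 20, "50-69": 50, "70+": 30}
reference = {"<50": 0.25, "50-69": 0.45, "70+": 0.30}
gap = largest_share_gap(observed, reference)

visits = [date(2022, 1, 10), date(2022, 6, 1), date(2023, 2, 15)]
deep_enough = has_longitudinal_depth(visits)

fresh = is_timely(date(2023, 11, 1), as_of=date(2024, 3, 1))
```

Running rules like these on every data refresh gives the regular audit of input data a concrete, repeatable form.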
RWD and RWE will continue to play an evolving role as we strive to better understand clinical research and work to inform future regulatory decisions around how different medicines affect different patient populations. It is imperative that data stewards take care in the collection, management and dissemination of that information through rigorous processes that ensure data quality.
Vasu Chandrasekaran is vice president of real-world data and analytics for Ontada.