Medidata Blog
The Medidata Platform – The Next-Generation Data Architecture
This blog was authored by Robert Lyons, VP, Engineering, Data Platform at Medidata.
There’s no doubt we live at a time of accelerated technical innovation, full of incredible potential. Never before has there been so much realized—and unrealized—technological potential to address clinical trial challenges, improve processes, increase efficiencies, reduce risks, and enable better results faster. Our goal is that patients awaiting new therapies can look forward to life-changing treatments sooner rather than later.
Ironically, industry experience has shown that implementing these new technologies, which are often disparate, is far from straightforward—potentially leading to painful delays.
From a data architecture perspective, sponsors face problems with integration of data from disparate sources, streamlining of that data into a single source of truth, complex data management, data quality management, interoperability, and scalability.
The good news is the industry has worked to address these issues, making continued efforts to introduce better data standards and interoperability and working towards a unified clinical trial ecosystem.
In this complex, evolving landscape, Medidata’s unified platform has always stood apart from the crowd, addressing many of the challenges highlighted above and leading the charge for the betterment of the industry and the advancement of patient health outcomes.
One of Medidata’s key areas of ongoing focus is the continuous improvement of its award-winning, best-in-class platform. The goal is to address future challenges as well as those faced today: to build an increasingly advanced platform that remains unified, is centered on data and patients, and is more closely aligned with the patient’s journey throughout a study. This progress has been enabled by advances in precision medicine, data acquisition, and data processing technologies, and by increased patient engagement. Together, these have facilitated deeper study insights while delivering better patient experiences and health outcomes.
Building a solid foundation is critical in any endeavor, and it’s especially relevant when designing a system that needs to enable the complexities of data management, interoperability, scalability, and flexibility within a broad ecosystem that interacts with multiple solutions across the clinical trial landscape.
This ecosystem includes multiple sources (Figure 1): electronic data capture (EDC), supply data, electronic clinical outcome assessments (eCOA), imaging, real-world data (RWD), labs, electronic health/medical records (EHR/EMR), sensors, and other systems. Even data from a single type of source is not uniform: there are many different sensor devices, for example, and their information may need to be aggregated to produce a full data set.
Fig. 1. Integrating an ecosystem of data.
Add to this the exponential increase in data volumes as a consequence of the increased adoption of complex study methodologies across the industry. Also, the velocity at which this complex data is produced must be taken into account, as it needs to be processed effectively by multiple other systems that enable reviews, monitoring, and analysis across the interdependent organizations involved in clinical discovery.
Needless to say, an advanced architecture for a unified, scalable, interoperable platform capable of supporting this highly complex and dynamic environment is critical.
The Medidata Platform – Next-Generation Data Architecture
Medidata’s new platform data architecture was announced at NEXT New York 2023, ushering in a new era of advanced data integration, interoperability, scalability, quality, and management.
Building on this foundation, the Medidata Platform will simplify and unify clinical discovery and clinical data/risk management experiences, potentially cutting study conduct timelines substantially.
As an introduction to the new data architecture, here is a high-level view of the processes and touchpoints.
1. Activity-Centered Study Design and Data Acquisition (Figure 2).
Fig. 2. The Medidata platform's next-generation data architecture process – building data definitions.
To transform the study-build experience, we’ve centered the process around study activity data instead of forms.
From the outset, during the study build, we define the data to be collected using biomedical concepts (the description and construct of common data elements in a clinical trial). An example is vital signs, comprising heart rate, blood pressure, and body temperature.
A knowledge graph of known concepts or data shapes enables data to be quickly recognized, assimilated, and retrieved in context. Acquired data is then placed automatically, because the system already knows what to expect: what each data element is called, its format, where it fits in the study timeline, and how it relates to other data. This streamlined process speeds up the availability of data for review, analysis, and reuse.
Study builders can reuse these definitions across multiple data sources, including EDC forms, eCOA questionnaires, EMR acquisition, lab ingestion, and sensor connections. This reuse accelerates study builds, realizes economies of scale across the different data acquisition modes, and enhances downstream analytical applications.
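To make the idea of reusable definitions concrete, here is a minimal Python sketch. It is not Medidata's implementation: the `BiomedicalConcept` class, the concept codes, and the source field names (`VS_HR`, `pulse_bpm`, `VS_TEMP`) are all hypothetical, chosen only to show one definition being shared by two acquisition modes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiomedicalConcept:
    """A reusable definition of one data element (hypothetical shape)."""
    code: str         # illustrative identifier, not a real terminology code
    label: str
    unit: str
    value_type: type  # used to coerce raw values into a consistent type


# Each concept is declared once, at study build.
HEART_RATE = BiomedicalConcept("HR", "Heart rate", "beats/min", float)
BODY_TEMP = BiomedicalConcept("TEMP", "Body temperature", "C", float)

# Each acquisition mode maps its own field names onto the shared concepts.
# The (source, field) keys below are assumptions made for this example.
SOURCE_FIELD_MAP = {
    ("edc", "VS_HR"): HEART_RATE,
    ("sensor", "pulse_bpm"): HEART_RATE,
    ("edc", "VS_TEMP"): BODY_TEMP,
}


def resolve(source: str, field: str, raw_value) -> dict:
    """Attach the shared concept definition to a value from any source."""
    concept = SOURCE_FIELD_MAP[(source, field)]
    return {
        "concept": concept.code,
        "value": concept.value_type(raw_value),
        "unit": concept.unit,
        "source": source,
    }


edc_reading = resolve("edc", "VS_HR", "72")
sensor_reading = resolve("sensor", "pulse_bpm", 71.0)
```

In this sketch, a heart rate typed into an EDC form and one reported by a sensor resolve to the same concept, unit, and value type, so downstream code only has to understand the concept and not each source's field names.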
2. Streamlined Data Ingestion – Creating a Single Source of Truth (Figure 3).
Fig. 3. The Medidata platform's next-generation data architecture process – integration and ingestion.
The data platform enables the collection of any type of data from any source (both Medidata and non-Medidata sourced data), standardizing acquired data during the process.
To make data management and processing easier and quicker for users, data ingestion from non-Medidata sources is designed to be streamlined and self-service. This reduces effort and cost, improves the quality of data, and ultimately expands on the generated evidence for better clinical discovery.
3. Data Preparation – Aggregate, Standardize, Enrich (Figure 4).
Fig. 4. The Medidata platform's next-generation data architecture process – data preparation (aggregate, standardize, and enrich).
Good data is the foundation for everything built upon it. In reality, though, gathering complete, good-quality data is a challenge, especially when data sources are formatted differently, contain anomalies, or have duplicates or gaps. Put simply: rubbish in, rubbish out.
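As a small illustration of the kinds of problems described above, the following Python sketch flags duplicates and gaps in a list of records. It is an assumption-laden example, not part of the Medidata Platform: the `quality_report` function and the field names `patient_id`, `visit`, and `value` are hypothetical.

```python
def quality_report(rows, required=("patient_id", "visit", "value")):
    """Flag duplicate and incomplete rows in a list of dict records.

    A row counts as a duplicate if an earlier row has the same
    (patient_id, visit) pair, and as a gap if any required field
    is missing or None. Both rules are assumptions for this sketch.
    """
    seen = set()
    duplicates, gaps = [], []
    for index, row in enumerate(rows):
        missing = [name for name in required if row.get(name) is None]
        if missing:
            gaps.append((index, missing))
        key = (row.get("patient_id"), row.get("visit"))
        if key in seen:
            duplicates.append(index)
        seen.add(key)
    return {"duplicates": duplicates, "gaps": gaps}


rows = [
    {"patient_id": "P1", "visit": 1, "value": 72},
    {"patient_id": "P1", "visit": 1, "value": 72},    # repeated record
    {"patient_id": "P2", "visit": 1, "value": None},  # missing measurement
]
report = quality_report(rows)
```

Here `report` lists the index of the repeated row under `duplicates` and the index and missing field of the incomplete row under `gaps`. A real pipeline would need richer rules, such as range checks and cross-source reconciliation.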
With the new data architecture, this challenge is met by built-in data delivery streams that standardize and contextualize sources such as Rave EDC, master study data, and other key data streams, including sensors and labs. These feed into unified patient observations which are ready for further aggregation, enrichment, and transformation.
There are significant benefits to working with streaming data instead of downloaded sequential batches. Because data is available sooner, clinical research teams can work in parallel, making data review, analysis, and standardization much faster and more efficient.
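The standardize-as-you-stream idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the platform's actual delivery streams: the record layouts, the field names (`pid`, `temp_c`, `subject`, `TEMP_F`), and the `unified_observations` function are hypothetical.

```python
from typing import Iterable, Iterator


def standardize(record: dict) -> dict:
    """Map one source-specific record onto a common observation shape."""
    if record["source"] == "sensor":
        # Assumed sensor layout: already in degrees Celsius.
        return {"patient_id": record["pid"], "concept": "TEMP",
                "value": record["temp_c"], "unit": "C", "source": "sensor"}
    if record["source"] == "edc":
        # Assumed EDC layout: degrees Fahrenheit, converted to Celsius.
        celsius = round((record["TEMP_F"] - 32) * 5 / 9, 1)
        return {"patient_id": record["subject"], "concept": "TEMP",
                "value": celsius, "unit": "C", "source": "edc"}
    raise ValueError(f"unknown source: {record['source']!r}")


def unified_observations(stream: Iterable[dict]) -> Iterator[dict]:
    """Yield standardized observations one at a time, as records arrive."""
    for record in stream:
        yield standardize(record)


incoming = [
    {"source": "sensor", "pid": "P1", "temp_c": 36.8},
    {"source": "edc", "subject": "P1", "TEMP_F": 98.6},
]
observations = list(unified_observations(incoming))
```

Because `unified_observations` is a generator, each observation is available to a consumer as soon as its source record has been read. Nothing waits for a complete batch, which is the property that lets reviewers and analysts start work earlier.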
Another exciting development within the new architecture is that the self-service experience includes no-code and low-code data transformation capabilities to link and create new data sets. This is on top of our standard and integrated third-party data sets.
This semantic integration with cataloged metadata, linked standard concepts, governance, and master data management (MDM) is a key foundation for better scientific discovery.
4. Next-Generation Interoperability (Figure 5).
Fig. 5. The Medidata platform's next-generation data architecture process – interoperability.
Collaboration in our industry is not just a ‘nice to have’; it’s an inherent requirement of our ecosystem. And yet, interoperability across the many systems, solutions, and services that enable successful clinical studies has been a key issue for decades. Any platform that works in isolation is just another disparate system. Thankfully, there’s light on the horizon as the industry works to drive change. Industry initiatives such as TEFCA (the Trusted Exchange Framework and Common Agreement in the United States) aim to standardize the exchange of healthcare data, potentially transforming clinical trial data interoperability in the future.
Medidata has been one of the leading lights in enabling a harmonious ecosystem of interoperability with its unified platform and extensive partner program. Building on this, the next-generation data architecture extends the existing platform, enabling it to reach far beyond the limitations and boundaries of any single system or organization; this provides data access across multiple sources and interoperation with your own and your partners’ workflows.
The system uses a new unified application programming interface called “One API” to better integrate data using standard programming practices.
What’s even more exciting is another new access method being introduced this year called Snowflake data sharing. This enables secure and direct access to any relevant data sets in our data lakehouse without first needing to extract and move that data to your infrastructure. Data flows into our lakehouse and is automatically, quickly, and virtually available within your infrastructure. This is a game changer as it enables you to run your workflows in your environments—on top of data sets that are held in a secure and reliable platform—with as little latency as possible.
New Data Experiences
The new data architecture of the Medidata Platform also lets us deliver new data experiences on the platform. A sophisticated data quality management experience is being built on top of the data platform to ensure data quality through AI-driven, human-in-the-loop, and visual approaches to review and reconcile data and manage trial risks. This new data experience is called the Medidata Clinical Data Studio.
In Conclusion
Meeting the challenges of today’s clinical trials and powering the next generation of clinical research requires a new, dynamic, interoperable, and scalable data architecture. Medidata is building on 25 years of experience delivering innovative, industry-leading capabilities by future-proofing the Medidata Platform and delivering new data experiences.
See how Medidata is transforming clinical research by watching exclusive content from our premier event, NEXT New York.