Real World Data

Real world data (RWD) plays an increasingly important role in clinical research and health care decision making; the 2016 US 21st Century Cures Act places additional emphasis on the use of these types of data to support regulatory decision making. Generally, RWD is observational data obtained outside regulated clinical trials and generated during routine clinical practice. The US Food and Drug Administration (FDA) states that this data may come from a number of sources including electronic health records (EHRs), medical claims and billing activities, product and disease registries, patient-generated data (including in home-use settings) and data gathered from sources that can inform on health status, such as mobile devices.

Global regulatory authorities and the CDISC community have taken note. Global regulators, such as the US FDA, Japan’s Pharmaceuticals and Medical Devices Agency (PMDA), the European Medicines Agency (EMA) and China’s National Medical Products Agency (NMPA), are increasingly interested in leveraging the potential of RWD to complement randomized, controlled trials by providing insights into efficacy, safety and post-market surveillance as a means of supporting regulatory decision making across the product lifecycle. Indeed, the US FDA is accepting observational data to support efficacy determinations and the EMA is assessing the use of registry data for rare diseases. Moreover, in a recent survey, 45% of CDISC member organizations indicated they would like to receive more information from CDISC on representing RWD.

Because RWD is not collected with research as its primary purpose, using and representing these data poses significant challenges. These challenges include bias, data variability and heterogeneity, which can make analysis of RWD difficult and resource-intensive. However, the benefits of connecting RWD to CDISC standards (i.e., improvements in data sharing, cross-study analysis and meta-analysis of data for all clinical researchers) may outweigh the challenges if the efficiencies achieved expedite global regulatory reviews, contribute to the evaluation of new treatments for patients and drive next-generation discovery.

SDTM Implementation for Observational Studies v1.0   

28 February 2024

   Considerations for SDTM Implementation in Observational Studies and Real-World Data v1.0
   Considerations for SDTM Implementation in Observational Studies and Real-World Data v1.0 - Public Review Comments.xlsx


The purpose of this document is to provide guidance for handling commonly encountered issues when using SDTM for observational studies & real world data. The scope of version 1.0 is limited to SDTM and does not duplicate information found in the SDTM and SDTMIG. Rather, the focus is on concepts that require deviations from those documents or that require specialized implementation strategies.

Public Review Comments

CDISC posts Public Review comments and resolutions to ensure transparency and show implementers how comments were addressed in the standard development process.

In 2006, CDISC published the landmark paper on “Leveraging the CDISC Standards to Facilitate the use of Electronic Source Data within Clinical Trials”, which explored the use of technology and eSource data in the context of existing global regulatory requirements. Since then, CDISC has faithfully participated in HL7’s Connectathons, eSource Thought Leaders Forums, and the TransCelerate eSource Roundtable, to name a few activities.

Moving forward to 2018, CDISC engaged the community on how we create, maintain and deliver clinical data standards via the Blue Ribbon Commission. Findings from the Blue Ribbon Commission now shape CDISC’s vision and strategy.  One key recommendation from the Blue Ribbon Commission was to involve the academic community in a conversation around how CDISC standards can be effectively and efficiently deployed in RWD settings. 

CDISC will continue to collaborate with regulatory authorities, strategic partners and fellow standards development organizations on projects related to RWD.  The following information provides summaries of the projects, collaborations and initiatives related to connecting RWD to CDISC standards.

LOINC to LB Mapping File

The LOINC to LB Mapping File is intended to show examples of LOINC code mappings to CDISC variables and terminology to aid researchers’ adoption of LOINC codes in RWD settings (e.g., public health, academic research, observational studies, registries, and other settings) and for regulatory submissions.
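
As a minimal sketch of what such a mapping looks like in use, the Python lookup below returns SDTM LB variables for a LOINC code. The two rows are illustrative assumptions chosen for this example and are not taken from the published mapping file; the helper name `loinc_to_lb` is hypothetical.

```python
# Illustrative rows only: these are assumptions for the sketch,
# not entries copied from the CDISC LOINC to LB Mapping File.
LOINC_TO_LB = {
    "2345-7": {"LBTESTCD": "GLUC", "LBTEST": "Glucose", "LBSPEC": "SERUM OR PLASMA"},
    "718-7": {"LBTESTCD": "HGB", "LBTEST": "Hemoglobin", "LBSPEC": "BLOOD"},
}


def loinc_to_lb(loinc_code):
    """Return LB variables for a LOINC code, or None when the code is unmapped."""
    row = LOINC_TO_LB.get(loinc_code)
    if row is None:
        return None
    # LBLOINC carries the original LOINC code alongside the CDISC terms.
    return {"LBLOINC": loinc_code, **row}


print(loinc_to_lb("2345-7"))
```

Returning `None` for unmapped codes keeps gaps visible so they can be reviewed rather than silently defaulted.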


With the emergence of FHIR, HL7’s standard for exchanging healthcare information electronically, CDISC began exploring its use with our standards. Fast Healthcare Interoperability Resources (FHIR, pronounced "fire") is an interoperability standard that describes data formats and elements (known as "resources") and provides an application programming interface (API) for the exchange of healthcare information and electronic health records (EHR).
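
To make the API aspect concrete, the sketch below builds a FHIR search URL for `Observation` resources belonging to one patient and carrying one LOINC code. The base URL and patient id are placeholders, and the helper name `observation_search_url` is hypothetical; only the general `[base]/Observation?patient=...&code=system|code` search pattern is assumed.

```python
from urllib.parse import urlencode


def observation_search_url(base_url, patient_id, loinc_code):
    """Build a FHIR search URL for a patient's Observations with a given LOINC code."""
    params = {
        "patient": patient_id,
        # FHIR token search: code system and code separated by a pipe.
        "code": f"http://loinc.org|{loinc_code}",
    }
    return f"{base_url.rstrip('/')}/Observation?{urlencode(params)}"


# Placeholder server and patient id, for illustration only.
print(observation_search_url("https://fhir.example.org", "123", "2345-7"))
```

A real client would send this request with appropriate authorization and read the returned Bundle; that part is omitted here.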

With a grant from the FDA to pursue demonstration projects in eSource, CDISC leveraged FHIR resources and the CDISC data exchange standard, ODM, to retrieve EHR data and pre-populate research study case report forms as illustrated in Figure 2. This recently completed project broke new ground by involving a multi-center study with sites from different health systems. The work continues with the “Data FITS initiative”, which seeks to demonstrate and use FHIR to extract data from EHR systems via eSource software and translate those data into FDA-consumable semantics via CDISC’s ODM2.0 prototype.


CDISC has also collaborated with our strategic partner, PHUSE, on three pilot projects involving FHIR and CDISC standards, with each project building on the previous one. The findings of these three projects are described in the following papers:

  1. “Use of Fast Healthcare Interoperability Resources (FHIR) in the Generation of Real World Evidence (RWE)” - The purpose of this pilot was to test the use of FHIR using a synthetic EHR database to generate RWD consistent with CDISC standards using the CDISC Therapeutic Area User Guide for Diabetes. The pilot also assessed the level of harmonization between FHIR and CDASH and SDTM.

    This pilot served as an important first step in the exploration of the use of FHIR in the generation of CDISC-compliant RWD. Findings demonstrated that electronic case report form pages/fields of interest can be populated using EHR data by mapping FHIR concepts to CDASH/SDTM variables. To make FHIR to CDISC mapping more useful, developing code that performs real-time translation is recommended.

    Additionally, it was noted that while it is possible to map EHR data using the FHIR standard to CDASH-compliant outputs, there are limitations to the use of these data in a real-world, clinical-trial setting. Limitations included missing and/or incomplete data, irregular and inconsistent data collection across hospitals/clinical sites and heterogeneity in the dictionaries used in the original data collection. Moreover, unlike CDASH, FHIR does not provide metadata or question text for a given concept. This was not a concern for the variables chosen for this project, but it may be a concern for other variables.
  2. “Use of HL7 FHIR as eSource to Pre-populate CDASH Case Report Forms using a CDISC ODM API” – This project (1) developed a prototype that examined how CDISC standards can work together with FHIR to advance the state of pre-populating CRFs with EHR data, (2) identified gaps in automation using the standards, and (3) recommended extensions to existing CDISC standards to better support eSource processes.

    Extending the conclusions drawn from the previous project, this project introduced the use of open, standard APIs and automating eSource data retrieval tasks. The prototype demonstrated populating CRF fields of interest using ODMv2.0 / CDASH metadata and the FHIR and ODMv2.0 APIs. Taken together, the prototype software in these two projects could retrieve patient data from an EHR, post that data to an EDC system, and then convert the EDC data to SDTM.

    A fundamental challenge with this project is the fact that healthcare data tends to be event-based while research data is protocol-based. For example, you may need to load the EHR data into a particular visit, but there is no concept of a visit in the EHR.
  3. “Use of FHIR in Clinical Research: From Electronic Medical Records to Analysis” – This project builds upon previous papers to examine extending the use of FHIR to provide near real-time analytics directly from EHRs, and whether FHIR provides standardized data that can be used to drive the production of analytics. CDISC developed a model and automated process to extract data from EHRs using FHIR resources and APIs and used these data as input for creating ADaM analysis datasets, which were used to generate analytics/reports.

    The project illustrated how existing standards for provider and research data can be integrated to provide more timely safety analytics to medical staff. It also demonstrated the lack of harmonization between provider (FHIR) and research (CDISC) standards. For example, FHIR typically uses SNOMED CT, while CDISC uses MedDRA.
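
The event-based versus protocol-based mismatch noted for the second project can be sketched as a windowing step: an EHR event has only a date, so a protocol visit has to be inferred from it. The Python below is one possible approach under stated assumptions, not the method used in the pilots. The visit schedule in `VISIT_WINDOWS` and the helper `assign_visit` are hypothetical.

```python
from datetime import date

# Hypothetical protocol schedule: visit name -> (target study day, +/- tolerance in days).
VISIT_WINDOWS = {
    "BASELINE": (1, 2),
    "WEEK 4": (29, 3),
    "WEEK 8": (57, 3),
}


def assign_visit(event_date, reference_start):
    """Map an EHR event date to a protocol visit, or None if it falls outside every window."""
    delta = (event_date - reference_start).days
    # SDTM-style study day: the reference start date is day 1 and there is no day 0.
    study_day = delta + 1 if delta >= 0 else delta
    for visit, (target_day, tolerance) in VISIT_WINDOWS.items():
        if abs(study_day - target_day) <= tolerance:
            return visit
    return None


start = date(2024, 1, 1)
print(assign_visit(date(2024, 1, 30), start))
print(assign_visit(date(2024, 2, 15), start))
```

Events that match no window are returned as `None` so they can be handled as unscheduled data rather than forced into a visit.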


HL7 Biomedical Research and Regulation Work Group

The Biomedical Research Integrated Domain Group (BRIDG) Model, authorized by CDISC, HL7 and ISO, is a broad high-level model that seeks to connect protocol-driven research and health care. The BRIDG steering committee resides in the HL7 Biomedical Research and Regulation (BR&R) Work Group, of which CDISC is a founding member. We also provide a senior technical staff member to participate in the BR&R Work Group to help expand and support the BRIDG model.


CDISC-2-FHIR Lab IG and Associated Mapping

CDISC is engaged in a BR&R sub-team that has leveraged a CDISC-2-FHIR Implementation Guide and mapping for laboratory data developed by the TransCelerate eSource Initiative with the goal of facilitating the flow of data into submission data sets. The end-to-end mapping process includes site data storage, site data preparation/transformation, production of FHIR format files, transformation from FHIR to CDISC laboratory data standards, and consumption of data by sponsor systems. Since EHR data and FHIR use LOINC, the work CDISC is doing with LOINC aligns well. Figure 3 depicts the FHIR to CDISC Lab Flow included in the CDISC Lab Semantics in FHIR Implementation Guide.

Figure 3: CDISC LAB to FHIR Flow
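
The "transformation from FHIR to CDISC laboratory data standards" step can be sketched as follows. This is a simplified illustration, not the Implementation Guide's mapping: it reads a few standard FHIR `Observation` elements (`code.coding`, `valueQuantity`, `effectiveDateTime`) and fills a handful of SDTM LB variables. The function name `observation_to_lb` and the sample values are hypothetical, and derivation of LBTESTCD/LBTEST is omitted.

```python
def observation_to_lb(obs, studyid, usubjid):
    """Convert a FHIR Observation (as a dict) into a partial SDTM LB record."""
    # Pick the LOINC coding if the Observation carries one.
    codings = obs.get("code", {}).get("coding", [])
    loinc = next((c for c in codings if c.get("system") == "http://loinc.org"), {})
    quantity = obs.get("valueQuantity", {})
    return {
        "STUDYID": studyid,
        "DOMAIN": "LB",
        "USUBJID": usubjid,
        "LBLOINC": loinc.get("code", ""),
        # Original result is kept as text, as collected.
        "LBORRES": str(quantity["value"]) if "value" in quantity else "",
        "LBORRESU": quantity.get("unit", ""),
        "LBDTC": obs.get("effectiveDateTime", ""),
    }


# Sample resource with made-up values, for illustration only.
sample = {
    "resourceType": "Observation",
    "code": {"coding": [{"system": "http://loinc.org", "code": "2345-7"}]},
    "valueQuantity": {"value": 6.3, "unit": "mmol/L"},
    "effectiveDateTime": "2024-01-30",
}
print(observation_to_lb(sample, "STUDY01", "STUDY01-001"))
```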


CDISC and the PHUSE Working Group have been collaborating with the BR&R group on mapping the entirety of CDISC ODM and its extensions (SDM-XML, Define-XML, CTR-XML) to FHIR resources to allow the identification of new extensions to foster greater interoperability as well as identify gaps between the two standards. This assessment has been slow and complex, but will yield a true understanding of how FHIR resources can be illustrated in the overall use case.

Common Data Model Harmonization

Researchers funded by the US National Institutes of Health are obligated to share data and they are sharing data; however, it is difficult to maximize the data’s potential due to the different ways the data is collected and represented. This makes aggregation, cross-study analysis, reusability and sharing very difficult. In response, FDA initiated the Common Data Model Harmonization (CDMH) project, which is part of a larger Patient-Centered Outcomes Research Trust Fund (PCORTF) funded project. The CDMH project uses the BRIDG model as an intermediary across PCORnet, i2b2, OHDSI/OMOP and Sentinel to harmonize these models, and builds on existing resources, standards and tools, mapping them to open, consensus-based standards like CDISC SDTM and FHIR as depicted in Figure 4. The resulting common data architecture will be validated via a pharmacovigilance use case that will be run against RWD from EHRs to test the ability to run automated queries across data models and the ability to map data back to the BRIDG and SDTM models. The longer-term goal of the CDMH project is to generate open source tools to help researchers and data scientists move data across data standards.
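
The idea of using an intermediary model can be sketched in a few lines: each source model is mapped once into a shared hub representation, and each target is mapped once out of it, so no pairwise source-to-target mappings are needed. The Python below is a toy illustration for a single concept (sex). The hub dictionary is a stand-in and not the BRIDG model, and the field names and value codes shown are assumptions made for this sketch.

```python
# Source model -> hub. Field names and codes are illustrative assumptions.
TO_HUB = {
    "omop": lambda r: {"sex": {8507: "male", 8532: "female"}.get(r["gender_concept_id"])},
    "pcornet": lambda r: {"sex": {"M": "male", "F": "female"}.get(r["SEX"])},
}

# Hub -> target model. Unknown or missing values fall back to "U".
FROM_HUB = {
    "sdtm": lambda h: {"SEX": {"male": "M", "female": "F"}.get(h["sex"], "U")},
}


def translate(record, source, target):
    """Translate a record from a source model to a target model via the hub."""
    return FROM_HUB[target](TO_HUB[source](record))


print(translate({"gender_concept_id": 8507}, "omop", "sdtm"))
print(translate({"SEX": "F"}, "pcornet", "sdtm"))
```

Adding another source model in this design means writing one new entry in `TO_HUB` rather than one mapping per existing target.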


In the fall of 2019, CDISC initiated the “CDISC RWD Connect” project. The first phase of this initiative is to listen to the academic community, using Delphi methodology, to better understand the barriers to implementing CDISC standards for RWD and to get a picture of what tools and guidance may be needed to facilitate implementation of CDISC standards. From this work, CDISC will produce a white paper that describes the methodology and the findings related to current CDISC standards utilization for RWD. Building on this information and leveraging our previous work on RWD, we will create a strategy for fostering the use of CDISC standards in the academic community. The second phase of this project will be to raise funds to develop software tools and make guidance a reality.

In 2018, CDISC’s Blue Ribbon Commission, composed of global leaders from academia, the pharmaceutical industry, government agencies (including regulatory authorities), patient foundations and fellow standards development organizations, developed a list of recommendations to support CDISC’s suite of clinical data standards now and in the coming decade. One of their key recommendations was to involve the academic community in a conversation around how CDISC standards can be effectively and efficiently deployed in RWD settings.

Accordingly, CDISC initiated the “CDISC RWD Connect” project to engage the academic community. The CDISC RWD Connect team carried out a modified qualitative Delphi survey process with key stakeholders who formed the CDISC RWD Connect Expert Advisory Board (EAB).

We invite you to read the CDISC RWD Connect Report, which represents a consolidation of the results of the qualitative Delphi consultation, describing the EAB’s views and recommendations for a way forward with CDISC standards and the world of RWD.