2018 CDISC Europe Interchange

Program subject to change without notice.

25 Apr 2018

Session 1: Opening Plenary

Chair: Joerg Dillert, Chair, CDISC E3C
09:00 - 10:30

Welcome Address

Stephen Pyke, GSK, CDISC Board Chair

Keynote - Using Electronic Health Records for Real World Trials: Our Experience from the Salford Lung Study

Martin Gibson, NorthWest EHealth

Dr. Gibson is Chief Executive Officer, NorthWest EHealth and Director, NIHR Clinical Research for Greater Manchester. He is also a consultant physician specializing in diabetes and lipid disorders at Salford Royal NHS Foundation Trust. An active clinical trialist, Dr. Gibson has had a long-term interest in the use of electronic clinical data systems to improve healthcare and facilitate research.

State of CDISC Union

David Bobbitt, CDISC

CDISC Standards Update

Rhonda Facile, CDISC

Session 2: Second Opening Plenary - Regulatory Presentations

Chair: Dr. Nicole Harmon, CDISC
11:00 - 12:30

Dr. Yuki Ando, PMDA

Dr. Alison Cave, EMA

Dr. Ron Fitzmartin, FDA, Center for Drug Evaluation and Research

Dr. Lisa Lin, FDA, Center for Biologics Evaluation and Research


Session 3, Track A: Global Submission Experience

Chair: Sujit Khune, Novo Nordisk
14:00 - 15:30

Is It Possible to Make a Global CDISC Submission?

Marianne Carames, Novo Nordisk A/S

What is a global submission? Previously, sponsors submitted globally to EMA, FDA and PMDA at the same time. Today, CDISC submission packages must be prepared separately for the FDA and PMDA because the requirements differ between agencies. 

The choice of e-data standards for a submission must be based on the Data Standards Catalogue, which lists agency-approved versions of SDTM, ADaM, Define-XML and controlled terminologies. FDA and PMDA publish separate versions of the catalogue. The standards chosen must be acceptable to both regulatory agencies to avoid problems during trial planning and execution.

Sponsors are required to provide different documents at different time points describing the standards chosen for the submission package, depending on the receiving agency. FDA requires the Study Data Standardisation Plan to be submitted for the pre-IND and no later than end-of-phase 2, while PMDA requires the Appendix 8 – ‘Consultation on data format of submission of electronic study data’ to be prepared for the e-data consultations close to the submission date.

The trials accepted by the two agencies must be CDISC compliant; however, the cut-off for when compliance requirements apply also differs. FDA requires CDISC compliant data for all trials started after 16-Dec-2016, while PMDA requires CDISC compliant data for all submissions submitted after 01-Apr-2017.

Challenges of Submitting Electronic Study Data to Two Authorities: PMDA & FDA

Ina Assfalg, Boehringer Ingelheim

The U.S. Food and Drug Administration (FDA) and the Japanese Pharmaceuticals and Medical Devices Agency (PMDA) currently expect (FDA) or strongly recommend (PMDA) that sponsors submit standardized e-study data. But what are the challenges if a pharmaceutical company wants to submit its data to both PMDA and FDA? In October 2016 the PMDA started a 3.5-year transition period during which it accepts e-study data submissions before they become a requirement in 2020. The requirements defined by the two authorities are similar. However, some rules differ and need to be taken into account for the e-study data packages to be accepted by both PMDA and FDA. The hands-on experience in this presentation comes from Business & Decision Life Sciences converting similar e-study data packages for the same substance to the CDISC standard for the pharmaceutical company Boehringer Ingelheim.

Comparing the deliverable packages for PMDA and FDA, we can see that both request the annotated CRF, the Study and Analysis Data Reviewer’s Guide, SDTM and ADaM datasets in SAS xpt format, and define.xml. Apart from the 95% conformity of the deliverables, there are some differences:

•    The Pinnacle 21 rules for FDA and PMDA are slightly different, which results in differences in the final data (e.g. for laboratory units)
•    The PMDA requests laboratory SI units, whereas it must be discussed with the FDA whether SI units or US conventional units are expected in the e-study data and/or analysis
•    The FDA requests the Patient Data Reports (PDRs) in the eCTD submission, whereas the PMDA requests the PDRs only after the application (not as part of the eData package but as submission documentation after the application)
•    At least one eData consultation meeting needs to be scheduled with the PMDA to explain what will be in the deliverables package, while a Type C meeting might be requested with the FDA
•    The timeline for when the PMDA wants to see the Pinnacle 21 results and their explanation in the reviewer’s guide should be clarified, as this might be needed prior to the final submission package
•    Analysis Results Metadata (ARM) is required in the ADaM package for the PMDA but not yet for the FDA

The submission packages of the same substance have been sent to both PMDA and FDA and were accepted. Main challenges and difficulties reaching these milestones will be highlighted in the presentation.

Analysis Results Metadata for PMDA submission: Business Case Presentation

Roxane Debrus, Business & Decision Life Sciences

Since July 2015, the Japanese Pharmaceuticals and Medical Devices Agency (PMDA) recommends that “the definition documents of the ADaM datasets should preferably include Analysis Results Metadata, which shows the relationship between the analysis results and the corresponding analysis dataset and the variables used, for the analyses performed to obtain the main results of efficacy and safety and clinical study results that provide the rationales for setting of the dosage and administration”. 

Over the past years, BDLS has developed a set of macros to generate Define-XML 1.0 and 2.0 based on the study specifications. More recently, these macros have been updated to also include the generated Analysis Results Metadata (ARM) in the Define-XML 2.0. In 2017, we received a first sponsor request to implement ARM in a submission of 2 studies to the PMDA. In November 2017, these 2 studies were accepted by the Japanese authorities.

ARM provides traceability from a given analysis result to the specific ADaM data that were used as input to generate it; it also provides information about the analysis method used and the reason the analysis was performed. In this presentation, we will share the experience gained on the client project. With a practical example we will demonstrate how we integrated ARM in the Define-XML by using the information present in the SAP and the specifications. In addition, we will present the challenges encountered, the solutions implemented and the future options that we plan to develop.
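As a rough illustration of the traceability described above, the sketch below models a result display and its analysis results as plain Python data classes and then lists which ADaM variables each dataset contributes. It is not the Define-XML ARM schema and not the BDLS macros; the class names, identifiers, dataset and variable names are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisResult:
    """One analysis result and the ADaM inputs it was produced from."""
    result_id: str
    description: str
    reason: str            # why the analysis was performed
    method: str            # analysis method used, as described in the SAP
    dataset: str           # ADaM dataset used as input
    variables: list        # analysis variables within that dataset
    selection: str = ""    # record selection criteria, if any


@dataclass
class ResultDisplay:
    """A table, listing or figure with the analysis results it presents."""
    display_id: str
    title: str
    results: list = field(default_factory=list)


def inputs_by_dataset(displays):
    """Collect, per ADaM dataset, the variables referenced by any analysis result."""
    used = {}
    for display in displays:
        for result in display.results:
            used.setdefault(result.dataset, set()).update(result.variables)
    return {name: sorted(names) for name, names in used.items()}


# Illustrative content only; identifiers and wording are made up.
example_display = ResultDisplay(
    display_id="RD.T14-2-1",
    title="Change from baseline in primary efficacy endpoint at Week 24",
    results=[
        AnalysisResult(
            result_id="AR.T14-2-1.R1",
            description="Treatment difference in change from baseline",
            reason="SPECIFIED IN SAP",
            method="ANCOVA with treatment as factor and baseline as covariate",
            dataset="ADEFF",
            variables=["CHG", "BASE", "TRTP"],
            selection="PARAMCD = 'PRIMARY' and AVISIT = 'Week 24' and ITTFL = 'Y'",
        )
    ],
)

trace = inputs_by_dataset([example_display])
```

A structure like this could be filled from the SAP and the dataset specifications and then serialized into whatever metadata format the submission requires.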



Session 3, Track B: eSource

Chair: Jozef Aerts, University of Applied Sciences FH Joanneum
14:00 - 15:30

Into the Fire, CDISC & FHIR

Dave Iberson-Hurst, Assero Limited & A3 Informatics

As an industry, we have talked for years about integrating clinical research with healthcare. There have been many attempts, in large projects and small, some with rather elaborate solutions. Then along came FHIR (Fast Healthcare Interoperability Resources) from HL7 (Health Level 7). The game changed.

This presentation will report on research and development work that was designed to investigate:

•    How easy it would be to integrate Biomedical Concepts (BCs) with FHIR resources;
•    Whether an SDTM domain could be generated from data held in a database, with that data being extracted from an EHR via FHIR; and
•    Whether the data extracted from the EHR could be driven by common research artefacts such as form and study designs

Linking FHIR with Biomedical Concepts could help with integrating research with healthcare and make the integration of data much easier. If SDTM could also be generated, then we could start to automate our work and this could be viewed as the first step on a journey towards better handling and storage of clinical trial data.

This presentation will report on and detail:

•    The approach taken and the technology used within the demonstration application.
•    How to formulate the requests to the EHR, what data to ask for and what data are returned.
•    How the resulting FHIR responses can be processed and stored such that they are useful within the research domain.
•    How the stored EHR data can then be used to generate SDTM datasets.
•    What the SDTM derived data looks like and what can be done to automate its generation.
•    The issues caused by the terminologies used within research and healthcare, such as CDISC, LOINC and UCUM.
•    How mappings can be automated, with special thought being given to the pre-coordinated nature of LOINC codes.
•    How UCUM might be used to automate data conversions.
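As a rough illustration of the request and transformation steps listed above, the sketch below builds a FHIR Observation search URL and flattens a search Bundle into SDTM-like VS rows. The base URL, patient ID, the LOINC-to-VSTESTCD lookup and the sample Bundle are assumptions for illustration, not details of the demonstration application; no network call is made.

```python
import json
from urllib.parse import urlencode

# Hypothetical FHIR endpoint; a real EHR publishes its own base URL.
FHIR_BASE = "https://ehr.example.org/fhir"

# Illustrative lookup from LOINC code to SDTM VS test code (assumed, not exhaustive).
LOINC_TO_VSTESTCD = {"8867-4": "HR", "8310-5": "TEMP"}


def observation_search_url(patient_id, loinc_codes):
    """Build a FHIR search URL for a patient's Observations with the given LOINC codes."""
    params = {
        "patient": patient_id,
        # system|code tokens, comma-separated, match any of the listed codes
        "code": ",".join(f"http://loinc.org|{code}" for code in loinc_codes),
    }
    return f"{FHIR_BASE}/Observation?{urlencode(params)}"


def bundle_to_vs_rows(bundle, studyid, usubjid):
    """Flatten Observation resources in a search Bundle into SDTM-like VS records."""
    rows = []
    for entry in bundle.get("entry", []):
        obs = entry["resource"]
        if obs.get("resourceType") != "Observation":
            continue
        loinc = next(
            (c["code"] for c in obs["code"]["coding"] if c.get("system") == "http://loinc.org"),
            None,
        )
        qty = obs.get("valueQuantity")
        if loinc not in LOINC_TO_VSTESTCD or qty is None:
            continue  # unmapped or non-numeric results would need manual review
        rows.append({
            "STUDYID": studyid,
            "DOMAIN": "VS",
            "USUBJID": usubjid,
            "VSSEQ": len(rows) + 1,
            "VSTESTCD": LOINC_TO_VSTESTCD[loinc],
            "VSORRES": str(qty["value"]),
            # UCUM code as received; mapping to CDISC unit terminology is a separate step
            "VSORRESU": qty.get("code") or qty.get("unit"),
            "VSDTC": obs.get("effectiveDateTime"),
        })
    return rows


SAMPLE_BUNDLE = json.loads("""
{"resourceType": "Bundle", "type": "searchset", "entry": [
  {"resource": {"resourceType": "Observation",
    "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4"}]},
    "valueQuantity": {"value": 72, "unit": "beats/min", "code": "/min"},
    "effectiveDateTime": "2018-04-25T09:00:00Z"}}
]}
""")

url = observation_search_url("123", ["8867-4", "8310-5"])
vs_rows = bundle_to_vs_rows(SAMPLE_BUNDLE, "STUDY01", "STUDY01-123")
```

The lookup table is where the pre-coordinated nature of LOINC shows up: one LOINC code may imply test, specimen and method at once, so a real mapping would usually populate several SDTM variables per code.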

The presentation will conclude with a summary of results, next steps and potential impact.

eSource to SDTM: The Trade-offs and Pay-offs

Donald Benoot and Swapna Pothula, SGS Life Sciences

The advent of digital data, global networks and trustworthy cryptography has made data universally available. There is no longer a need to create physical copies of the data for different parties. With a proper eSource solution in place, electronically captured data can be transferred from the capture device to the eSource database and subsequently into the SDTM database, dissolving the boundary between source and CRF.

When looking at eSource systems from the traditional paradigm of paper/eCRF clinical trials, one might easily fall into the trap of seeing the eSource system simply as a digital replacement for paper source. However, eSource systems must be seen as part of a larger entity that integrates data capture, site automation, data flow facilitation, data visualization and data management. Therefore, the implementation of an eSource system will have a profound impact on the existing workflow, where we have to decide on important trade-offs to reap the resulting pay-offs. 

Implemented Clinical Data Sending Function in Open Source Type EHR/EMR

Professor Takahiro Kiuchi and Yoshiteru Chiba, UMIN

Since 1989, University hospital Medical Information Network (UMIN) has been under the National University Hospital Association and has become an important information infrastructure for academic activities of biomedical sciences in Japan. It is the largest and most versatile academic medical information network in the world.

UMIN has been supporting the management of the UMIN Internet Data and Information Center for medical research (INDICE), a researcher-led clinical research support system that has accommodated more than 5.47 million clinical cases since 2000. It also supports electronic registration of clinical case data in the CDISC ODM format. The interface that receives data electronically in this format is publicly available as the UMIN INDICE Lower level data communication protocol for CDISC ODM, so that anyone can use it. Its transmission protocol is SOAP, a standard web-service messaging protocol.
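For readers unfamiliar with the format, the sketch below assembles a minimal CDISC ODM 1.3 clinical data document of the kind such an interface might receive. All OIDs, the file identifier and the item values are made-up placeholders, and the SOAP envelope and the UMIN-specific protocol details are deliberately not shown because they are not described here.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"
ET.register_namespace("", ODM_NS)  # serialize ODM as the default namespace


def q(tag):
    """Qualify a tag name with the ODM namespace."""
    return f"{{{ODM_NS}}}{tag}"


def build_odm(study_oid, subject_key, items):
    """Return an ODM document (as text) holding one item group for one subject."""
    odm = ET.Element(q("ODM"), {
        "FileType": "Transactional",
        "FileOID": "EXAMPLE-FILE-001",            # placeholder identifier
        "CreationDateTime": "2018-04-25T14:00:00",
        "ODMVersion": "1.3.2",
    })
    clinical = ET.SubElement(odm, q("ClinicalData"), {
        "StudyOID": study_oid,
        "MetaDataVersionOID": "MDV.1",            # placeholder OID
    })
    subject = ET.SubElement(clinical, q("SubjectData"), {"SubjectKey": subject_key})
    event = ET.SubElement(subject, q("StudyEventData"), {"StudyEventOID": "SE.VISIT1"})
    form = ET.SubElement(event, q("FormData"), {"FormOID": "F.VITALS"})
    group = ET.SubElement(form, q("ItemGroupData"), {"ItemGroupOID": "IG.VS"})
    for item_oid, value in items.items():
        ET.SubElement(group, q("ItemData"), {"ItemOID": item_oid, "Value": str(value)})
    return ET.tostring(odm, encoding="unicode")


odm_text = build_odm("ST.EXAMPLE", "SUBJ-001", {"IT.SYSBP": 128, "IT.DIABP": 82})
```

An EHR-side sending function would typically fill `items` from its own records and hand the resulting text to the transport layer.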

The interface has mainly been used by large university hospitals and medical institutions with sufficient development resources, which modified the large-scale electronic medical record systems (EHR/EMR) of their own facilities. UMIN has therefore now tried to apply and deploy this interface in small medical institutions and clinics, aiming at wider use.

Recently, UMIN implemented a clinical data sending function in an open source EHR/EMR for clinics. This function makes it possible to send data to UMIN INDICE in CDISC ODM format, and is currently under research and development with a Grant-in-Aid for Scientific Research (A) from the Ministry of Education, Culture, Sports, Science and Technology-Japan (MEXT). As a result, clinical trials, such as those in patients with chronic diseases and lifestyle diseases, are expected to find more candidates in clinics than in large hospitals. Until now, conducting clinical trials in clinics in Japan has not been very efficient because electronic handling of clinical trial data is less widespread in that environment. Based on this achievement, we will promote the electronic conduct of clinical trials in clinics, and we expect efficient trials and clinical research for chronic diseases and lifestyle diseases to become feasible.

This open source electronic medical record is developed in Java, and the same program runs on both Microsoft Windows and Apple macOS. We also plan to disclose information on the middleware adopted during development and to publish the source code of the additional modifications.

We hope that the results of this trial will be widely applied from large hospitals to small and medium medical institutions worldwide.

Session 3, Track C: Newcomers Session: A CDISC Overview

Chair: Rhonda Facile, CDISC
14:00 - 15:30

Who/Where: A CDISC History

John Owen, CDISC

Why: The Purpose and Benefits of Standards

Dr. Sam Hume, CDISC

How: COPs, Tools and Processes

Amy Palmer, CDISC

Session 4, Track A: ADaM

Chair: Simon Lundberg, AstraZeneca
Ballroom III (GF)
16:00 - 18:00

Workshop: The Unveiled Secrets of ADaM

Angelo Tinazzi, Cytel and Silvia Faini, CROS-NT

After a quick review of the main ADaM principles and rules, use cases will be presented and discussed with the audience. Examples include, but are not limited to, misinterpretation of the ADaM IG, ways of achieving good data-point traceability, and making your ADaM datasets ‘analysis-ready’.

ADaM Mapping - Opportunities for Metadata Driven Automation

Elena Glathe, Bayer AG

One key task in statistical programming is the mapping of CDISC-compliant analysis data sets following ADaM guidelines. Analysis data sets need to be clearly derived and traceable from SDTM data sets and need to reflect the study-specific analysis specification in the SAP. This requirement poses some challenges for automation. At Bayer we took on this challenge and implemented a target-oriented, metadata-driven solution that handles the mapping from SDTM+ to ADaM and extensive QC checks.

Some questions addressed in this paper:
* How to derive variables in several iteration steps where variables derived in the first iteration are not (yet) available
* How to ensure traceability to SDTM
* How to ensure consistency by reusing item attributes from SDTM
* How to allow the mapping of several variables in only one step
* Automatic decode handling
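One generic way to handle the first question above (a variable that depends on other derived variables) is to order the derivations by their dependencies. The sketch below does this with a topological sort over hypothetical mapping metadata; it is not Bayer's solution, and the variable names, derivation rules and the merged SDTM+ input record are assumptions for illustration. It requires Python 3.9 or later for `graphlib`.

```python
from datetime import date
from graphlib import TopologicalSorter


def study_day(event_date, reference_date):
    """Relative day with no day 0: the reference date itself is day 1."""
    delta = (date.fromisoformat(event_date) - date.fromisoformat(reference_date)).days
    return delta + 1 if delta >= 0 else delta


# Hypothetical mapping metadata: target variable -> (input variables, derivation).
MAPPING_SPEC = {
    "ADY": (["ADT", "TRTSDT"], lambda r: study_day(r["ADT"], r["TRTSDT"])),
    "ADT": (["LBDTC"], lambda r: r["LBDTC"][:10]),
    "TRTSDT": (["EXSTDTC"], lambda r: r["EXSTDTC"][:10]),
    "AVAL": (["LBSTRESN"], lambda r: r["LBSTRESN"]),
}


def derivation_order(spec):
    """Order targets so that every derived input is computed before it is used."""
    graph = {
        target: {name for name in inputs if name in spec}  # only derived inputs are edges
        for target, (inputs, _) in spec.items()
    }
    return list(TopologicalSorter(graph).static_order())


def derive(record, spec):
    """Apply all derivations to one input record in dependency order."""
    out = dict(record)
    for target in derivation_order(spec):
        _, func = spec[target]
        out[target] = func(out)
    return out


# One SDTM+ style record (LB merged with the first exposure date); values are made up.
source = {"LBDTC": "2018-01-12T09:30", "EXSTDTC": "2018-01-10T08:00", "LBSTRESN": 5.4}
order = derivation_order(MAPPING_SPEC)
adam_row = derive(source, MAPPING_SPEC)
```

Because the inputs are listed in the metadata, the same structure can also feed traceability documentation, and a circular specification is reported by `TopologicalSorter` as a `CycleError` instead of silently producing missing values.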

ADaM 2018: What's New and What's Coming

Monika Kawohl, HMS Analytical Software

Introducing The ADaM Implementation Guide v1.2

Terek Peterson, Covance

The long-awaited CDISC ADaM Implementation Guide (IG) v1.2 will be released by the end of 2018! This poster created by the members of the CDISC ADaM IG v1.2 sub-team highlights some of the changes users can expect in the new version of the implementation guide. Significant updates included in version 1.2 of the ADaM IG:

  • New Permissible variable within the Basic Data Structure (BDS): PARQUAL – Parameter Qualifier
  • Nomenclature for Stratification variables within the Subject-Level Analysis Dataset (ADSL)
  • Recommended approach for bi-directional toxicity grades
  • Additional descriptions, clarification and refinement of text and examples

Co-authors from the CDISC ADaM IG v1.2 team include Brian Harris (MedImmune), Lead; Terek Peterson (Covance), co-Lead; and team members Alyssa Wittle (Covance), Nancy Brucken (Syneos Health), and Deb Goodfellow (PRA).

Session 4, Track B: Machine Learning

Chair: Stijn Rogiers, SAS
München (1F)
16:00 - 18:00

Machine Learning Applications for Clinical Data Scientists

Mike Collinson, Oracle

Machine Learning and Artificial Intelligence are everywhere, from banks to taxis, efficient planes and record-breaking cars. Mobile devices in clinical trials are increasing the amount of data collected, while data cleaning technology increases the quality, reliability and availability of standardized information on which to train algorithms for analysis in real world situations.

Many sectors are looking to this technology to solve problems such as component reliability, fraud detection and targeting.  How can existing Machine Learning approaches be applied to the challenges for Clinical Data Scientists?

  • Can chatbots improve data collection, patient engagement and reduce data queries?
  • Will historical transaction analysis, site query performance and quality enable us to predict problems before they arise?
  • SDTM standards are key to providing visualizations to detect anomalies across trials; how do they help us view this alongside data lakes and media streams?
  • How will on-the-go pooling into SDTM provide insight into regionalities?
  • Can we leverage predictive mapping and data healing trained on historical SDTM data to improve submission success rates?
  • How will standards assist in data collection with the EHR revolution, FHIR, E2B and ODM2?
  • Should CDISC ensure we are aligned with standards outside of clinical, beyond even preclinical and post-marketing?
  • Will clinical trial registries and standardized terminology tie any or all of this together?

How Machine Learning can be Empowered by Using Data Standards in Digital Biomarker Space

Farhan Hameed, Pfizer

Objective: Propose a structured ontological approach for incorporating data standards in machine learning and AI, in particular for use with wearable devices.

Design: Using a theoretical and practical approach to the development and implementation of data standards for digital endpoints and digital biomarkers in the clinical trial space, develop an ontologically driven, semantically interoperable approach for the utilization of data standards; utilize such ontologies to streamline data collection by aligning metadata with raw device data; create machine learning use cases; and apply these standards to solve practical problems.

Method: Data collected from disparate sources such as devices, sensors, and electronic data capture (EDC) systems requires an integrated platform for storage and retrieval for analysis. In order to automate the existing manual curation process, we implemented data standardization and conformity in existing data collection methodologies by utilizing current industry standards such as CDISC and Systematized Nomenclature of Medicine Clinical Terms (SNOMED), and ontologies including the Semantic Sensor Network Ontology (SSNO).
Also, to overcome the challenges of associating wearable device sensor data with biomedical models and patient data, one of the major goals of this ontological approach has been to align the data through the development of standardized labels and variables. Although we evaluated existing standardized interoperable ontologies, it was inevitable to add a customized layer of standardized labels, concepts and variables to complement data capture for ongoing clinical study workflows. The interoperable frameworks can be utilized for the development of metadata dictionaries, a sensor data reporting and failure analysis system, and classifiers for future clinical studies.

Results: The proposed alignment of data standards and semantic ontologies with machine learning can improve the design, testing and application of machine learning and AI. 

Conclusion: Our efforts are focused on developing and training an analytics engine through ML by querying ontologically driven metadata. We apply lessons learned from the current configuration of electronic data capture (EDC) systems, nomenclatures, study workflows and data structures to improve the use of semantic ontologies. Planned work includes enhancing existing ontologies, such as SSNO, to develop specialized ontologies for feature extraction such as gait and walk; developing context-based ontologies (CONON) for multiple therapeutic areas, unscripted daily living and activities of daily living (ADLs); and a Video & Sensor Ontology for video annotation. We would further like to improve our data collection and retrieval through the standardized use of ontology development and data standards, including labels and variables, to use a similar model for future studies, and to apply ontologically driven metadata for precise machine learning. Our efforts involve industrializing standards development and implementation across research and clinical outcome analysis.

Data Mapping Using Machine Learning

Nathan Asselstine, SAS

Standards define the targets to which source data needs to be mapped. People can interpret these targets in different ways, which can lead to inconsistencies in the resulting standardized data. Allowing different teams working on similar studies (possibly located at different off-site or offshore sites) to re-use prior knowledge gained by the team would not only save significant time mapping studies, but also increase the quality of the resulting standardized data.
This paper discusses capturing source-to-destination data mappings as metadata in centralized libraries and applying Machine Learning algorithms to streamline and predict mappings for newer studies whose metadata is similar to that of already mapped studies. This process could lead to consistent destination data mapping and can significantly reduce mapping time by re-using system-suggested mappings.

Mapping raw data to standards is one of the most challenging processes in the healthcare industry. Reusing the information collected during the mapping of previous studies, and building knowledge inference upon it, is the most important part of the mapping process. Most companies struggle to build knowledge inferences and reapply them efficiently through the data process. In addition, multiple tools or programs are sometimes required to go through the full cycle of data mapping. It is therefore difficult for the standard user to know all the different versions and tools, and to use them correctly throughout the data mapping process.

The SAS Data Mapping Tool is a web-based tool that provides a user-friendly interface for everything from mapping raw data to generating SDTM standards (including domain templates). Simple User Interface (UI) and click-away concept design provides access to all the required information on a single screen. Auto-mapping and smart-mapping features in the SAS Data Mapping Tool, which are based on knowledge inference derived from machine learning algorithms, reduce time and effort for the user. This leads to improvements in quality, efficiency and consistency.

1.    Standards / Controlled Terminologies – Provides ability to register standards such as SDTM, ADaM or company-specific standards and CTs
2.    Studies – Provides ability to register different studies and control permissions.
3.    Data –  Provides ability to capture Data for studies from different sources
4.    Mapping – Provides ability to map source to destination data
5.    Generate SAS Programs – Provides ability to generate SAS programs based on mapping metadata
6.    Libraries – Provides ability to capture mapping metadata into different libraries.

Since mapping metadata is captured in Libraries, different types of Machine Learning algorithms can be applied to learn from existing mappings, and these algorithms can be trained to help predict mappings for new source data.

However, due to the high variation in the source data, significant pre-processing of the data is required:

1.    Read and clean data
2.    NGram is applied to source data
3.    Character data is converted to Numeric format using Term-Frequency Inverse-Document-Frequency (TF-IDF) 
4.    Data is balanced using over- and under-sampling
5.    Different Machine Learning models are evaluated and optimized parameter values are identified.
6.    Models are trained and trained models are saved in binary formats.
7.    Trained Models are used to predict mapping for new source data.
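To make the n-gram and TF-IDF steps above concrete, here is a minimal pure-Python sketch that suggests a target for a new source label by comparing it with previously mapped labels. It swaps the trained classifiers described in the abstract for a simple nearest-match by cosine similarity, and it is not the SAS Data Mapping Tool's implementation; the library of mapped labels and the target names are illustrative assumptions.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Split a cleaned label into overlapping character n-grams."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def fit_idf(documents):
    """Smoothed inverse document frequency for every n-gram seen in the library."""
    n_docs = len(documents)
    doc_freq = Counter(g for doc in documents for g in set(char_ngrams(doc)))
    return {g: math.log((1 + n_docs) / (1 + df)) + 1 for g, df in doc_freq.items()}


def tfidf_vector(text, idf):
    """L2-normalized TF-IDF vector; n-grams unknown to the library are ignored."""
    counts = Counter(char_ngrams(text))
    vec = {g: c * idf[g] for g, c in counts.items() if g in idf}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {g: v / norm for g, v in vec.items()} if norm else {}


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(weight * b.get(g, 0.0) for g, weight in a.items())


# Previously mapped source labels and their targets (made-up library content).
LIBRARY = {
    "systolic blood pressure": "VS.SYSBP",
    "diastolic blood pressure": "VS.DIABP",
    "date of birth": "DM.BRTHDTC",
    "adverse event term": "AE.AETERM",
}


def suggest_mapping(source_label, library):
    """Return (similarity, target) for the closest previously mapped label."""
    idf = fit_idf(list(library))
    query = tfidf_vector(source_label, idf)
    scored = [(cosine(query, tfidf_vector(label, idf)), target)
              for label, target in library.items()]
    return max(scored)


score, target = suggest_mapping("Birth date", LIBRARY)
```

In a fuller pipeline the same vectors would be the input features for a trained model, and the similarity score could serve as a threshold below which no suggestion is made.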

Machine learning algorithms can be applied to different types of metadata captured at Dataset, Variable and Value level. [Screenshot: Multinomial Naive Bayes (MNB) model similarity vs. NGram similarity for table mapping]

CDISC Standards in the Age of Artificial Intelligence

Jozef Aerts, University of Applied Sciences FH Joanneum

We have been developing CDISC standards for over 20 years now, and are still publishing them as PDF documents or, at best, HTML. This makes them very hard to use in systems using machine learning (ML) and artificial intelligence (AI). We must also take into account that our standards are becoming more and more complex: whereas 10 years ago a 2-day SDTM training course was sufficient to almost become an SDTM expert, it now merely offers a first glimpse of the principles of SDTM. So, people implementing CDISC standards now really need the support of “intelligent” systems helping them to make the right decisions, e.g. during mapping between operational data and submission data.

The presentation will explain what needs to be done, and present what has already been done at our university, to make CDISC standards fit for the "age of artificial intelligence". 

This includes:
•    A machine-readable version of the SDTM-IG 3.2 has been generated (see submitted poster)
•    A number of RESTful web services (machine-machine communication) for use with CDISC standards and for use with other coding systems such as LOINC have been developed and made publicly available (see http://xml4pharmaserver.com/WebServices/index.html)
•    A RESTful web service for unit conversions using the UCUM notation has been developed. This service is now further developed and deployed in cooperation with the National Library of Medicine (NLM)
•    A software tool has been developed to connect CDISC controlled terminology with other controlled terminologies (e.g. SNOMED-CT) based on RESTful web services provided by the NLM and using UMLS. This makes it possible to build “networks of information and knowledge” spanning all controlled terminologies used in medicine and clinical research
•    A “protocol annotation” software has been developed (see submitted poster) that uses a large number of available RESTful web services. It allows unstructured protocols to be annotated with any medical and research codes. This annotation tool can also automatically generate a CDISC CTR-XML file for clinical trial registry submission. An automated generation of the study design in CDISC ODM format is in preparation.
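The unit-conversion web service mentioned in the list above is not reproduced here. As a local illustration of the kind of conversion involved, the sketch below converts between a few units written in UCUM notation, including the mass-to-molar case that needs the analyte's molar mass. The unit tables are a small illustrative subset and the test code `GLUC` is an assumed key for glucose (molar mass about 180.16 g/mol).

```python
# Conversion factors to a base unit, keyed by UCUM code (illustrative subset).
TO_GRAM_PER_LITRE = {"mg/dL": 0.01, "g/dL": 10.0, "g/L": 1.0}
TO_MOLE_PER_LITRE = {"mmol/L": 1e-3, "umol/L": 1e-6, "mol/L": 1.0}

# Molar mass in g/mol per analyte; a real system would look this up from the test code.
MOLAR_MASS = {"GLUC": 180.16}


def convert(value, from_unit, to_unit, testcd=None):
    """Convert a concentration between supported units, using molar mass if needed."""
    mass_from, mass_to = from_unit in TO_GRAM_PER_LITRE, to_unit in TO_GRAM_PER_LITRE
    mole_from, mole_to = from_unit in TO_MOLE_PER_LITRE, to_unit in TO_MOLE_PER_LITRE
    if not (mass_from or mole_from) or not (mass_to or mole_to):
        raise ValueError(f"Unsupported conversion: {from_unit} -> {to_unit}")
    if mass_from and mass_to:
        return value * TO_GRAM_PER_LITRE[from_unit] / TO_GRAM_PER_LITRE[to_unit]
    if mole_from and mole_to:
        return value * TO_MOLE_PER_LITRE[from_unit] / TO_MOLE_PER_LITRE[to_unit]
    if testcd not in MOLAR_MASS:
        raise ValueError("Mass/molar conversion needs a test code with a known molar mass")
    molar_mass = MOLAR_MASS[testcd]
    if mass_from:  # mass concentration -> molar concentration
        mol_per_l = value * TO_GRAM_PER_LITRE[from_unit] / molar_mass
        return mol_per_l / TO_MOLE_PER_LITRE[to_unit]
    # molar concentration -> mass concentration
    g_per_l = value * TO_MOLE_PER_LITRE[from_unit] * molar_mass
    return g_per_l / TO_GRAM_PER_LITRE[to_unit]


glucose_si = convert(90, "mg/dL", "mmol/L", testcd="GLUC")
```

The dependence on the analyte is why purely unit-based conversion is not enough for many laboratory results: the test identity has to travel with the value.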

In the near future we will develop further tools (e.g. extend the existing mapping tools) that will use these services and tools and start implementing machine learning and artificial intelligence. For example, we are convinced that by using these techniques together with metadata repositories and SHARE, in combination with machine learning, it will be possible to automate the generation of study designs, including annotated CRFs, from the protocol. Also, through reuse and ML techniques, mappings to SDTM can largely be automated.

The possibilities in the field of data review and analysis in clinical research are even more impressive, and have hardly been explored yet. In future, ML and AI will be able to help reviewers analyze submitted data much better and faster and, in combination with other available information and knowledge, come to much better risk-benefit assessments.

Session 4, Track C: Newcomers Session: Connected CDISC Standards

Chair: Peter van Reusel, CDISC
Stuttgart (1F)
16:00 - 18:00


Sujit Khune, Novo Nordisk, E3C member


Dr. Erin Muhlbradt, NCI-EVS


Silvia Faini, CROS NT, E3C member

ODM, Define and More XML

Dr. Sam Hume, CDISC

Poster Session

Dr. Jozef Aerts, University of Applied Sciences FH Joanneum
Foyer Ground Floor (GF)
08:00 - 17:00


Bluegrass Biggs, BiggsB, "A Path Forward - Lessons Learned Validating CDISC Conversions"

Djenan Ganic, intilaris LifeSciences GmbH, "CDISC Study/Protocol Design provision drives early clinical study setup"

Judith Goud, Nurocor, "CDASH IG v2.0  Implementation Considerations"

David Roulstone, Pinnacle 21, "Define.xml: what you should and should not be documenting in your define files"

Cathal Gallagher, d-Wise, "Updates From The EMA Technical Anonymization Group & Policy 0070"

Assia Bouhadouza, Sanofi, "Management of multiple results in non-extensible codelist variables"

Bob Van den Poel, Janssen Research and Development, "Implementing the CDISC SEND Data Standard at Janssen Research & Development"

Carey Smoak, S-cubed, "A Critique of the Use of the Medical Device SDTM Domains in Therapeutic Area User Guides"

Farhan Hameed, Pfizer, "An Ontologically-driven Approach to Implement Data Standards for Machine Learning for Wearable Devices"

Éanna Kiely, Syneos Health, "CDASH, ODM and Web Technologies"

Angelo Tinazzi, Cytel Inc., "Mind the gap: Pinnacle 21 Community version vs Pinnacle 21 Enterprise version"

Dr. Jozef Aerts, University of Applied Sciences FH Joanneum, "A Protocol Annotation Tool"

Dr. Jozef Aerts, University of Applied Sciences FH Joanneum, "The SDTM-IG in a machine-readable Form"

Michael Walther, Hofmann La Roche/Roche Diagnostics GmbH, "Leveraging the value of clinical data by establishing an integrated Metadata Repository"

Morten Hasselstrøm Jensen, Novo Nordisk A/S, "Please automate creation of analysis results metadata!"

26 Apr 2018

Session 5, Track A: What's New

Chair: Éanna Kiely, Syneos Health™
Ballroom III (GF)
09:00 - 10:30

ODMv2 and the CDISC Data Exchange Standards: The Big Picture

Dr. Sam Hume, CDISC

ODMv2 represents a significant step towards modernizing the CDISC Data Exchange Standards to spark more efficient methods of data exchange throughout the clinical research data lifecycle. This presentation will discuss the vision driving advances in the CDISC Data Exchange standards that include ODMv2 and beyond. The Data Exchange standards play a major role in supporting standards-based automation and the planned advances in these standards aim to reduce the effort needed to implement standards-based automation while expanding the breadth of the automation opportunities available. The presentation will cover how the Data Exchange standards work together with the CDISC content standards, and how the Data Exchange standards contribute to clinical research data interoperability. This presentation will also cover how the Data Exchange standards fit into the current CDISC standards strategy, as well as how the Data Exchange standards are positioned to play a more prominent role in future strategy discussions.

Big picture themes discussed in this presentation based on ODMv2 and future Data Exchange standards will include the answers to questions such as:
•    How can ODMv2+ support the development of a new wave of innovative software applications, both open source and commercial, which provide new levels of clinical research automation and interoperability?
•    How will ODMv2+ support data exchange services that work across media types like XML, JSON, and RDF?
•    How will ODMv2+ support the implementation and automation of end-to-end standards that enable data exchange and traceability across all phases of the clinical research data lifecycle?   
•    How will ODMv2+ support the use of real world data by alignment with FHIR, supporting eSource, and other methods for retrieving real world data?
•    How will ODMv2+ support standards-based automation and data exchange by retrieving or referencing CDISC content standards metadata in SHARE?
•    How does the ODMv2+ development and release cycle differ from that of the content standards and how does this benefit implementers?
•    How could ODMv2+ impact the community of ODM developers and users?

This presentation will also briefly highlight the current ODM community. ODM provides a semantically neutral data exchange standard that has been broadly implemented by a wide range of stakeholders including industry, academia, and government researchers, in hundreds of solutions globally. ODM is the most broadly implemented CDISC standard. This presentation will outline the types of applications currently supported by ODM, as well as the phases of the clinical research data lifecycle supported by those types of applications.

QRS (Questionnaires, Ratings and Scales) Domain Mappings and Updates

Éanna Kiely, Syneos Health™

The CDISC QRS (Questionnaires, Ratings and Scales) domains are composed of the QS (Questionnaires), FT (Functional Tests) and RS (Disease Response and Clin Classification) domains. We expect these to be published in SDTMIG 3.3.  CT (Controlled Terminology) for the domains has been published.

How does a CDISC user know which domain to map a new QRS instrument to? In this presentation we will step through the current rules from the QRS team with worked examples.

We will also review some of the recent decisions and open topics on QRS handling from the QRS team and FDA.

CDASH v2.0

Peter Van Reusel, CDISC

SDTMIG v3.3: New Domains - New Benefits

Nick De Donder, Business & Decision Life Sciences

At the beginning of 2018 CDISC will release SDTM v1.7 and SDTMIG v3.3. New variables will be introduced that will make it easier to represent the captured data or better diversify the data. Besides that, the new Implementation Guide will contain some new domains. In this presentation we want to guide you through the newly published domains, their use and advantages.

Several Therapeutic Area User Guides (TAUGs) have been published over the last years. In these TAUGs, new domains and variables have been suggested. For example, the TAUG for Asthma has been published as a provisional version. The new domains indicated in this TAUG, like Respiratory System Findings (RE) and Procedure Agents (AG), could be used but they had to be considered as custom domains. With the publication of SDTMIG v3.3 those domains have now been standardized and can be used as standard domains with fixed variable names.

In 2013 we converted an asthma trial to SDTMIG 3.1.2 A1. Now, 4 years later, a similar trial needs to be converted to SDTMIG 3.3. We prefer to keep consistency with the previously converted SDTM data, but we also have to take the new rules into account. In our first submission we added all pulmonary findings to a custom domain (PF), which was appropriate at that time. With the new release, the data needs to be captured in the published RE domain. Reference ranges that were previously captured in a supplemental qualifier can now be moved to the published standard variable REORREF (Reference Result in Original Units).

Session 5, Track B: Process Implementation and Optimization

Chair: Andrea Rauch, Boehringer-Ingelheim
München (1F)
09:00 - 10:30

Cost Benefit Analysis of Using Standards

Jasmine Kestemont, Innovion

This presentation will show the costs and benefits achieved with different levels of standardization. Assuming that it is possible to achieve 100% standardization, how much effort would still be required to set up a trial? And what is the effort to achieve 100% standardization?

We will develop the analysis using an anonymized real life model of an outsourced phase III trial in a new indication. We will quantify the cost of time, effort and quality.

The project originated from a request from a small biotech that understood early in their clinical development process that ultimately they would need to receive submission ready data from their CROs. We were initially tasked with developing a library of CDISC standards and a guide for CROs to develop EDC systems based on these standards principles. Each CRO was trained on the expectations. While the Biotech saw an immediate improvement in timeliness and quality of deliverables and communication, the next question was if more standardization – which implies more governance and maintenance, and a storage system – would be worth the investment.

While one can easily imagine that more standards will lead to higher quality, management wanted better insights and an assessment of potential return on investment. Hence the question to quantify the efforts to build a library, a metadata repository, CRO training and quality of deliverables, taking into account the Biotech’s portfolio.

The analysis shows that while there is a significant impact in setting up a trial from no standardization to some level of standardization, the time gain for setting up a trial from high level of standardization to a fully standardized trial rapidly diminishes. On the other hand the cost of Quality Control is significantly reduced in a fully standardized trial over one that is highly standardized.

This presentation aims to show a clear picture of cost benefit for implementation of standards, both for managers and members of standards organizations who may need an updated analysis to convince management in standards investment.

Harmonization of Independent Read Data with 3rd Party Vendors and Alignment with CRO

Monitha Mohan Haril Kumar, Merck KGaA

The harmonization of non-CRF data can be a challenge for any organization. This is particularly true in the case of independent read data from 3rd party scientific vendors that use a variety of systems, which can further complicate data harmonization. CDISC SDTM, however, includes a robust structure for the tumor domains, incorporating examples of criteria such as RECIST, Cheson and Hallek. Nevertheless, a lack of adequate SDTM knowledge at the 3rd party vendor could result in the misinterpretation of data and considerably more effort for both the CRO and Sponsor.

In the following presentation, we want to share our lessons learnt working with CROs and 3rd party scientific vendors in standardizing the independent read data. In our experience, we faced issues due to the misalignment between the independent read vendor and CRO with regard to the Non-CRF-Specifications versus the transferred raw data. A key element here is always having a solid Non-CRF mapping specification at the start of the study. Yet, even with mapping specifications, we underestimated the complexity caused by the independent read 3rd party vendors' systems. Usually, as a sponsor, we are not involved during the set-up of the 3rd party scientific vendor's technical systems used for tumor assessment according to different assessment criteria. We did not anticipate that some vendors have one single system for multiple tumor assessment criteria whereas others might have different platforms for each of the tumor assessment criteria.

With the first test transfer, we realized that the CRO had difficulties in delivering us SDTM structured data for the Tumor domains which was visible in the:

A.    Definition of Linking Variable Values between domains: Mapped tumor domains in SDTM had unexpected linking IDs in the TU and TR domains. The same lesion assessed under two different criteria at the same location should normally have similar LNKIDs with the same numbering. However, our vendor stated that they had different systems for different tumor criteria, so their systems assign readers independently, irrespective of targets from the same locations. This piece of information was missed at the beginning and caused extra effort and time for the CRO to have it corrected and mapped into the right SDTM structure.

B.    Representation of split and merged tumors: According to the examples provided in SDTMIG, split and merge tumor records are recorded in TU and TR domains. We also established company standards set with our CRO to have it mapped according to these examples. However, the 3rd party vendor systems were incapable of providing split lesion records in TU domain and the only place we had split and merged lesions were in TR. This was a deviation from our standards and impacted the mapping and validation process that had to be done by the CRO as well.

C.    Interpretation of SDTM structure: TR/RS LNKGRP and TR/RS GRPID were completely missing in the beginning. We had to make it clear that subject level overall responses or overall best responses will have RSLNKGRP blank and that GRPID should have visit information populated. But again, from a CRO standpoint, it was difficult to map raw data to tumor domains without grouping information.

D.    Changes in Protocol: Changes in the assessment criteria during the conduct of a study may occur, either due to the introduction of a new cohort or due to the removal of an assessment criterion. This had a substantial effect on the SDTM package delivered by the CRO, as we had both RECIST and MIRRC assessments in a single record in the Tumor domains since both had the longest diameter being measured for the same target lesion. However, it was decided that MIRRC would be deleted and another criterion would be introduced and, therefore, we requested the CRO to split these records.
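
The linking problem described in item A can be checked mechanically before the mapping is finalised. The following is a minimal sketch, assuming the TU and TR records are available as lists of dictionaries keyed by the SDTM variable names USUBJID, TULNKID and TRLNKID. The function name and the sample records are hypothetical illustrations and are not taken from the presented process.

```python
def find_unlinked_tr_records(tu_records, tr_records):
    """Return TR records whose TRLNKID has no matching TULNKID for the same subject."""
    # Collect every (subject, link id) pair that identifies a tumor in TU.
    known_links = {(r["USUBJID"], r["TULNKID"]) for r in tu_records}
    # A TR record is unlinked when its link id was never identified in TU.
    return [
        r for r in tr_records
        if (r["USUBJID"], r["TRLNKID"]) not in known_links
    ]


# Hypothetical sample data: the second TR record uses a link id unknown to TU.
tu = [
    {"USUBJID": "STUDY1-001", "TULNKID": "T01", "TULOC": "LIVER"},
    {"USUBJID": "STUDY1-001", "TULNKID": "T02", "TULOC": "LUNG"},
]
tr = [
    {"USUBJID": "STUDY1-001", "TRLNKID": "T01", "TRTESTCD": "LDIAM", "TRORRES": "23"},
    {"USUBJID": "STUDY1-001", "TRLNKID": "T03", "TRTESTCD": "LDIAM", "TRORRES": "11"},
]

orphans = find_unlinked_tr_records(tu, tr)
print([r["TRLNKID"] for r in orphans])
```

Running such a check on the first test transfer would surface inconsistent link identifiers early, before the CRO has invested effort in mapping.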

Because of our experience stated above, we launched a cross-functional initiative to address the pain points, for not only Independent read data, but also for Biomarkers, PK etc. This initiative brings members of different departments together: scientists, functional Subject Matter Experts, Data Managers and Standard Team members – to discuss topics like understanding the nature of vendor systems, mapping alignment guides, as well as, their level of SDTM knowledge upfront, as these elements have a major impact on the data and the process downstream with the CRO. To have upfront process alignment between the 3rd party vendor, CRO and the Sponsor can have a positive impact on timelines, cost and resources for all parties concerned. SDTM is the backbone of all data standardization, but adding a governance factor to have the operational needs met in place from a Sponsor’s standpoint can ensure that we receive the expected deliveries from a CRO.

A Statistics-Based Tool to Inform Risk-Based Monitoring Approaches

Silvia Faini and Lisa Comarella, CROS NT

In light of the latest guidelines for good clinical practice (ICH E6(R2) Integrated Addendum), sponsors of clinical trials should implement a “systematic, prioritized, risk-based approach” to monitoring activities. 

With the application of electronic case report forms (CRFs) and SDTM datasets, it is possible to develop tools to support risk-based monitoring (RBM). Sites can be monitored as data is collected, taking into account risk factors, tracking study progression, and proactively addressing potential critical situations.

The following presentation shows an example tool which uses a statistically controlled scoring method, in conjunction with principal component analysis (PCA), rather than subjective thresholds to identify potentially problematic sites.

While setting up this tool, it was important to identify the tasks that could be simplified. The main steps are data setup, PCA, and reporting. Provided that SDTM datasets are produced for each study, the CDISC standards supply the simplification: the input datasets are SDTM domains. They can be used efficiently within the tool and they are ready for submission purposes, since SDTM is frequently used in submissions to regulatory agencies and is required by the FDA.
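
The three steps (data setup, PCA, reporting) can be illustrated with a small sketch. It is not the presenters' tool: the site-level metrics, the use of only the first principal component, the power-iteration approximation and the flagging threshold are all assumptions chosen to keep the example short and dependent on the standard library only. It also assumes that no metric is constant across sites.

```python
import math
import statistics


def standardize(rows):
    """Z-score each column so metrics on different scales are comparable."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def first_component(z_rows, iterations=200):
    """Approximate the first principal component by power iteration."""
    n, p = len(z_rows), len(z_rows[0])
    # Sample covariance matrix of the standardized metrics.
    cov = [[sum(r[i] * r[j] for r in z_rows) / (n - 1) for j in range(p)]
           for i in range(p)]
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec


def flag_sites(site_metrics, threshold=2.0):
    """Flag sites whose first-component score lies beyond `threshold` SDs."""
    sites = list(site_metrics)
    z_rows = standardize([site_metrics[s] for s in sites])
    pc1 = first_component(z_rows)
    scores = [sum(a * b for a, b in zip(row, pc1)) for row in z_rows]
    sd = statistics.stdev(scores)
    return [s for s, sc in zip(sites, scores) if abs(sc) / sd > threshold]


# Hypothetical per-site metrics, e.g. derived from SDTM domains:
# [adverse events per subject-month, queries per CRF page, proportion missing]
metrics = {
    "S01": [0.10, 1.0, 0.02],
    "S02": [0.12, 1.1, 0.03],
    "S03": [0.09, 0.9, 0.02],
    "S04": [0.11, 1.0, 0.03],
    "S05": [0.10, 1.2, 0.02],
    "S06": [0.13, 1.0, 0.03],
    "S07": [0.11, 0.9, 0.02],
    "S08": [0.40, 3.0, 0.15],
}

print(flag_sites(metrics))
```

Because the inputs are assumed to come from SDTM domains, the data setup step would look the same from study to study; only the choice of metrics would vary.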

A CRO’s Perspective on Successful Partnering to Deliver SDTM/SEND Contributions

Helen Owen, LGC Group

Is standardisation at the submission level driving complexity at the contributing CRO level? As the number of organisations contributing to a data package increases, so does the complexity of the delivery. Here a CRO presents the challenges, successes and recommendations for other CROs and partners.

As data standardisation has become integral to the submission process, the requirements of contract research organisations (CROs) have also shifted. CDISC guidelines are now making their impact felt far beyond the organisations responsible for the preparation and submission of data. The relationship between partnering organisations has also become ever more crucial to the successful delivery of SDTM and SEND format data.

The experiences of LGC, a CRO that acts as a test facility supporting the delivery of preclinical and clinical bioanalytical data, will be described. As an organisation working primarily downstream of test sites and clinical parties, there is often limited line of sight to the resulting data submission requirements. The challenges of working with multiple partners, together with suggested solutions for providing SDTM/SEND contributions to multiple Sponsors or third party organisations, will be reviewed.

The effect of standardisation at the submission level will be examined in the context of driving complexity at the contributing CRO level. The task at LGC is compounded by multiple organisations and systems contributing source data often in a non-standardised manner, for integration into SDTM/SEND data. The opportunities taken to simplify processes to satisfy SDTM/SEND requirements will be described together with how these capabilities have been built at LGC. Associated learnings for similarly placed CROs that provide test facility services will also be shared.

The role of CDISC, PhUSE and other industry leaders, such as the European Bioanalytical Forum (EBF), in reducing complexity and supporting standardisation across the CRO and pharmaceutical industries will also be reviewed.

Session 5, Track C: Newcomers Session: TAs, SHARE, LAB & Regulatory

Chair: Dr. Nicole Harmon, CDISC
Stuttgart (1F)
09:00 - 10:30

Therapeutic Area User Guide Overview

Bess LeRoy, CDISC


Dr. Lauren Becnel, CDISC


Dr. Erin Muhlbradt, NCI-EVS

Global Regulatory Report

Amy Palmer, CDISC

Session 6, Track A: Define XML

Chair: Silvia Faini, CROS NT
11:00 - 12:30

Live Define-XML - Real Life Experiences

Katja Glass, Bayer AG

There are various ways and tools available to deal with define.xml generation as well as with the validation report output. This presentation will show how define.xml version 2.0 creation and related processes, like validation report handling, can look. Study-specific metadata can be used together with general metadata to collect all define information. This information is then transformed into the define.xml format, which is checked. As the checks are very important, we run them often and support a user-friendly process to follow up changes and to provide general guidance to study users.

Value Level Metadata (VLM) - Not a Challenge Anymore

Malathi Hari, Larix A/S

Value Level Metadata (VLM) allows a better description and representation of a vertical structure in SDTM and ADaM datasets. VLM describes the metadata of one variable based on the value of another variable. VLM is one of the most challenging topics in the Define-XML standard.

Define-XML standard suggests the following approach:
A variable is described with variable-level metadata when all of its data can be described with one set of metadata. A variable is described with value-level metadata when it is made up of multiple data elements, each requiring its own set of metadata. 

The challenge is to find the criteria that determine when to use value level metadata and which variables require it. The CDISC Define-XML Specification Version 2.0 document provides some general, though unclear, rules.
•    “Value Level Metadata should be provided when there is a need to describe differing metadata attributes for subsets of cells within a column. It is most often used on SDTM Findings domains to provide definitions for Variables such as --ORRES, --ORRESU, --STRES, --STRESU that are specific to each test code (value of --TESTCD).”
•    “It is not required for Findings domains where the results have the same characteristics in all records, such as IE domains.”
•    “In ADaM, value level metadata often describes AVAL or AVALC in BDS data structures based on values of PARAMCD.”
•    “Value Level Metadata should be applied when it provides information useful for interpreting study data. It need not be applied in all cases. As an example, the --TEST variable could either be specified by a single variable with a codelist containing all the test names, or it could have Value Level Metadata specifying exactly which test name is appropriate for each --TESTCD. Both of these approaches are valid but the Value Level Metadata approach is more complicated and may not provide any information that will benefit a consumer of the data.”
•    “It is left to the discretion of the creator when it is useful to provide Value Level Metadata and when it is not.”

The aim of this presentation is to elaborate the concept and purpose of VLM along with detailed examples and the most common situations. This presentation also explains the implementation of VLM in Define-XML 2.0, linking the “Where-clause” sheet with the Value level sheet, avoiding the most common mistakes, and the status of the PHUSE working group project on Define-XML 2.0 Completion Guidelines.
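
The idea that one result variable needs a separate set of metadata per test code can be sketched in a few lines. This is a minimal illustration, assuming the findings records are held as dictionaries of strings; the helper names, the simplified "where" text and the sample vital signs records are hypothetical and do not reproduce the Define-XML syntax or the presenter's spreadsheets.

```python
def infer_type(values):
    """Classify a list of result strings as integer, float or text."""
    def parses(value, caster):
        try:
            caster(value)
            return True
        except ValueError:
            return False

    if all(parses(v, int) for v in values):
        return "integer"
    if all(parses(v, float) for v in values):
        return "float"
    return "text"


def derive_vlm(records, testcd_var, result_var):
    """Build one value-level metadata row per test code."""
    by_test = {}
    for rec in records:
        by_test.setdefault(rec[testcd_var], []).append(rec[result_var])
    return [
        {
            # Simplified where-clause text: the condition selecting the subset.
            "where": f"{testcd_var} EQ {testcd}",
            "variable": result_var,
            "datatype": infer_type(values),
            "length": max(len(v) for v in values),
        }
        for testcd, values in sorted(by_test.items())
    ]


# Hypothetical vital signs records with results of differing characteristics.
vs = [
    {"VSTESTCD": "SYSBP", "VSORRES": "120"},
    {"VSTESTCD": "SYSBP", "VSORRES": "118"},
    {"VSTESTCD": "TEMP", "VSORRES": "36.6"},
    {"VSTESTCD": "TEMP", "VSORRES": "37"},
    {"VSTESTCD": "FRMSIZE", "VSORRES": "MEDIUM"},
    {"VSTESTCD": "FRMSIZE", "VSORRES": "SMALL"},
]

vlm = derive_vlm(vs, "VSTESTCD", "VSORRES")
for row in vlm:
    print(row)
```

If every test code produced the same data type and length, a single set of variable-level metadata would suffice, which matches the guidance quoted above for domains such as IE.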

Define-XML – What You See Isn’t Always What You Get

Will Greenway, Quanticate

The define.xml is a great way to transfer metadata between organisations but like everything it needs to be handled with care. One of the largest or most common misconceptions seems to be regarding the define.xml contents versus the presented view when opened in a browser with an applied stylesheet. The “I opened it, it looks fine and there were no validation findings, so it’s ok” mentality can miss a lot of genuine issues in the define.xml. Additionally the CDISC example stylesheets are not “THE standard”, but simply a good starting point that can and should be altered when required, or even to aid internal review.

Issues with the stylesheet can be incorrectly interpreted as issues with the underlying xml file but equally since the stylesheet does not show every attribute (e.g. Length) some genuine issues are going undetected by validators and reviewers alike. Some could be caught with increased validation against the corresponding datasets but others, like a defaulted constant FileOID, although possibly obvious to a human looking through the xml code may be trickier to implement in an automated fashion.

A clear distinction will be made between the define.xml as machine readable code and the stylesheet that transforms the content into a more human-friendly format for viewing, including some of the assumptions made during this transformation. 

Attendees will learn how to do some of the easier adjustments that may help them check or view some of the “hidden” metadata and/or show extra information, for example adding (<length>) to the end of the type so that it is visible, e.g. “integer(8)”. This is a very simple adjustment but now the length can be seen by a reviewer and questions raised. Are there genuinely integers with a width of 8? Or has this instead been incorrectly populated with the default SAS length?
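
The same idea can be tried outside the stylesheet. The sketch below is not an XSLT change; it reads the ItemDef elements directly with Python's standard XML parser and renders the data type with the length appended, so that otherwise hidden values become visible for review. The sample fragment is hypothetical and heavily reduced; only the ODM 1.3 namespace and the ItemDef attributes Name, DataType and Length are taken from the standard.

```python
import xml.etree.ElementTree as ET

ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"

# Hypothetical, heavily reduced fragment of a define.xml file.
SAMPLE = f"""
<MetaDataVersion xmlns="{ODM_NS}">
  <ItemDef OID="IT.VS.VSSEQ" Name="VSSEQ" DataType="integer" Length="8"/>
  <ItemDef OID="IT.VS.VSTESTCD" Name="VSTESTCD" DataType="text" Length="8"/>
  <ItemDef OID="IT.VS.VSDTC" Name="VSDTC" DataType="datetime"/>
</MetaDataVersion>
""".strip()


def type_with_length(item_def):
    """Render the data type as 'type(length)' when a Length attribute exists."""
    data_type = item_def.get("DataType")
    length = item_def.get("Length")
    return f"{data_type}({length})" if length else data_type


def render_types(xml_text):
    """Map each variable name to its displayed type, including the length."""
    root = ET.fromstring(xml_text)
    return {
        item.get("Name"): type_with_length(item)
        for item in root.iter(f"{{{ODM_NS}}}ItemDef")
    }


displayed = render_types(SAMPLE)
for name, shown in displayed.items():
    print(name, shown)
```

A listing like this makes it easy to spot integer variables that all carry a length of 8, which is the pattern the abstract suggests questioning.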

Further to this, more complex real-life examples will be covered including adding hyperlinks from variables to the Computational Algorithms section for Derived variables, displaying Supplemental Qualifiers (from VLM) as Keys for a dataset, conditionally hiding the “Role” column for the RELREC dataset (N/A per SDTM-IG), referencing more than one aCRF properly, and dealing with Supplemental datasets for Split domains.

This will benefit a great number of attendees: those wanting to review a define.xml and understand where an issue might lie and what they might be missing, as well as those wanting to create or adjust define.xml files properly, which can include adjusting the default stylesheet if needed.

Session 6, Track B: Utilizing CDISC

Chair: Dr. Sam Hume, CDISC
11:00 - 12:30

Clinical Data Sharing and Semantic Linking with RDF and W3C Standards

Paul Houston, CDISC

Effective data sharing relies upon data being fully shared, without limitations, for open interrogation and for aggregation with complementary data sets. By implementing a rigorous data standards environment and following approaches such as FAIR data (Findable, Accessible, Interoperable and Reusable) and semantic technologies, the medical research community can create and inter-connect a vast, high quality data web in a meaningful way to create new knowledge and medical breakthroughs.

Impact of CDISC Standard Implementations in IMI Clinical and Translational Research Data

Dr. Dorina Bratfalean, CDISC

Billions of dollars are spent annually on generating clinical and translational research data. So far, regardless of such significant levels of investment, the actual data reuse remains unpredictably low because of 1) difficulties in understanding, finding and accessing datasets themselves and 2) the significant effort required to syntactically and semantically harmonize datasets. 

The Innovative Medicines Initiative (IMI) is the largest public-private partnership within the life sciences domain in Europe and is focused on developing better and safer medicines for patients through translational research and appropriate data sharing. Assessment of data from diverse geographical regions, cultures and demographics is key for translational research in Europe, as these variables may influence health outcomes. Per Rubio et al., translational research[1] is defined as the transfer of knowledge from basic research to clinical research and then eventually to practice settings and communities. CDISC as a global, open, multidisciplinary, non-profit organization has a strong focus to develop and support global, platform-independent clinical and translational research data standards that enable information system interoperability to improve medical research and related areas of healthcare. CDISC standards are utilized extensively for global regulatory research, and a growing number of translational researchers have begun using them. CDISC’s Therapeutic Area User Guides demonstrate how to apply the standards to indication-specific data, and many of these guides overlap with IMI’s strategic research agenda disease areas. 

Here we summarize our experience in CDISC standards’ implementations over the course of 5-year cross-organizational, cross-cultural IMI collaborative programs. We will summarize our experiences with 1) datasets with healthy volunteers (HV) through projects such as the IMI SAFE-T consortium[2], BioVacSafe[3], U-BIOPRED[4] and ABIRISK[5] and 2) mapping epidemiology study participant survey forms used to capture self-reported, longitudinal data. We will describe how, in several projects, CDISC data standards including CDASH, SEND, SDTM and SDTM’s Pharmacogenomics supplement were utilized to facilitate access, transparency and interoperability of data in ways that offer unique advantages over other methods.

Within the IMI, a combination of data standards, consistent publication and training in these areas has helped achieve benefits such as faster research, greater consistency and more efficiencies. While CDISC standards have become the global language for regulated research, substantial opportunity exists to leverage these and other standards for non-regulated European research.

Where Did My Terminology Go?

Johannes Ulander, S-Cubed

Keeping control in the world of standards has always been a delicate task. It doesn’t matter if you are talking about medical dictionaries, laboratory tests or the format in which you store your data; it will always involve decision making and require robust processes for how, when and why you move to a version other than the one you are currently using.

When it comes to clinical trials, it is a further complication that you need to manage completed, ongoing and planned studies, which means that you need to handle multiple versions of the same standard at the same time.

It has become a de-facto standard when creating CDISC SDTM to call this process “mapping”, but the word hides the complexity of the great many things that it might consist of. It can be everything from re-labelling of synonyms to a program that involves many derivations.

One standard that exemplifies this complexity is CDISC SDTM Controlled Terminology, as a change in the submission value actually might require you to also change the “mapping”, which is not immediately obvious.

This presentation will show how it is possible to get better control and understand the impact of moving from one version of a standard to another and how that process can be automated.

It will detail and explain:
•    What happens with terminology in a semantic world 
•    Why versioning is important
•    Which current issues with versioning exist and which decisions need to be made when moving to linked data
•    How changes can be implemented using machine readable content to automate processes
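
How a terminology change can silently affect a mapping is easy to show once both versions are machine readable. The following is a minimal sketch, assuming each codelist version is represented as a dictionary from a concept code to its submission value and the mapping as a dictionary from collected value to submission value. The concept codes, terms and function names are hypothetical placeholders, not real controlled terminology.

```python
def diff_codelist(old, new):
    """Compare two versions of a codelist keyed by concept code."""
    # Same concept, different submission value: the mapping target must change.
    changed = {
        code: (old[code], new[code])
        for code in old.keys() & new.keys()
        if old[code] != new[code]
    }
    removed = sorted(old.keys() - new.keys())
    added = sorted(new.keys() - old.keys())
    return changed, removed, added


def affected_mappings(mapping, changed, removed, old):
    """List collected values whose mapping target is stale in the new version."""
    stale = {old_value for old_value, _ in changed.values()}
    stale.update(old[code] for code in removed)
    return sorted(src for src, target in mapping.items() if target in stale)


# Hypothetical codelist versions (placeholder codes and terms).
old_version = {"X001": "TERM A", "X002": "TERM B", "X003": "TERM C"}
new_version = {"X001": "TERM A", "X002": "TERM B REVISED", "X004": "TERM D"}

# Hypothetical mapping from collected values to submission values.
mapping = {"a": "TERM A", "b": "TERM B", "bee": "TERM B", "c": "TERM C"}

changed, removed, added = diff_codelist(old_version, new_version)
print(changed, removed, added)
print(affected_mappings(mapping, changed, removed, old_version))
```

Keying the comparison on the concept code rather than on the submission value is what allows a renamed term to be recognised as a change instead of an unrelated removal and addition.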

Session 6, Track C: Newcomers Session: CDISC in Academia

Chair: Jozef Aerts, University of Applied Sciences FH Joanneum
Stuttgart (1F)
11:00 - 12:30

Data Collection and Registry Standard Aiming for Easy-to-Use RWD

Satoshi Ueno, National Center of Neurology and Psychiatry (NCNP)

CDISC has developed clinical research data standards such as the Protocol Representation Model (PRM), Clinical Data Acquisition Standards Harmonization (CDASH), Study Data Tabulation Model (SDTM) and Analysis Data Model (ADaM) for each stage of planning, data collection, data tabulation and analysis. In addition, CDISC has been developing new Therapeutic Area Standards (TAS) and Controlled Terminology as part of a wide range of standards from planning to analysis. As registry standards, "Registry Model Common Data Elements" was proposed in the US by the Global Rare Diseases Registry Data Repository (GRDR), and "Minimum Data Set for Rare Disease Registries" was proposed in the EU by the European Union Committee of Experts on Rare Diseases (EUCERD). However, as of January 2018 CDISC has not developed registry standards.

In this presentation, Duchenne muscular dystrophy (DMD), a severe type of muscular dystrophy, is taken as a case example, and the collection items of a neuromuscular disease registry are examined using the Duchenne Muscular Dystrophy Therapeutic Area User Guide (TAUG-DMD), GRDR and EUCERD.

Using DMD as the example, we aim to examine common items, considering data collection and data conversion, for constructing a neuromuscular disease registry.

To clarify the differences between collection items and responses in consideration of the current situation in Japan, we examined them together with experts in neuromuscular disorders. First, the collection items were classified into three categories: "genetic information" on genetic variation and its details, "clinical information" on patient background, clinical evaluation and biopsy, and "patient information" on personal information and epidemiological research items. Specifically, genetic information was compared with the TAUG-DMD and the other CDISC standards based on the items collected by Remudy, which is a patient registry of neuromuscular disease in Japan. Regarding clinical information, the items gathered frequently in the natural history study were listed, and the collection items and responses were compared based on CDASH, SDTM and TAUG-DMD. For patient information, we compared the items collected by Remudy with GRDR and EUCERD and examined differences between collection items and responses.

Regarding the genetic information, there was no difference. In the clinical information, there was no major difference in the collection items. However, it was suggested that data collection method was different in the examinations such as 6-minute walk test. In the patient information, the basic collection items were common, but there were items requiring consideration as to how much information should be gathered such as “family composition” and “income”.

Development of a registry that does not consider data standards such as the CDISC standards causes problems in data handling and analysis when we conduct cross-sectional analysis. However, it is also possible to place too much confidence in data standards, which is problematic. When considering clinical data collection, it is important to have both viewpoints: clinical practice and data standards.

In clinical information, international standardization from data collection becomes possible by promoting international unification of diagnostic standards and measurement methods in addition to data standards.

However, it is expected that the data collection method will differ depending on the clinical situation of each country. It is also possible to standardize the collected data as data for analysis by converting it with SDTM in mind. In addition, clearly separating collected data and analytical data contributes to quality assurance through data traceability.

Patient information is a valuable information source for promoting medical research, including epidemiological research. To promote the utilization of easy-to-use RWD, developing and popularizing registry standards is considered to be of great value and significance.

Portal of Medical Data Models to Foster Best Practice Sharing in Clinical Research and Re-use of EHR Data

Martin Dugas, University of Münster

Electronic forms for documentation of patient data are an integral part of the workflow of physicians, both in clinical research and care. A huge amount of data is collected either through routine documentation forms for electronic health records (EHRs) or as case report forms (CRFs) for clinical trials. Data integration between routine care and clinical research ("real life data") requires compatible data models. However, different health information systems are not necessarily compatible with each other and thus information exchange of structured data is hampered. Software vendors provide a variety of individual documentation forms, which function as isolated applications. Furthermore, the vast majority of those forms are not available to the clinical research community. Because of this lack of transparency, harmonization of data models in health care is extremely cumbersome; thus the work and know-how of completed clinical trials and routine documentation in hospitals are hard to re-use.

The Portal of Medical Data Models (MDM-Portal, https://medical-data-models.org) is a metadata registry for searching, creating, analyzing, sharing and reusing medical forms. It is a registered European research information infrastructure, developed by the Institute of Medical Informatics, University of Münster in Germany with funding from the German Research Foundation (DFG). Sustainability of MDM is achieved through collaboration with the University Library of Münster. MDM contains predominantly CRFs from clinical research, but also routine documentation forms from EHRs. At present (as of 12/2017) it provides more than 15,000 system-independent forms (of which ~10,000 are from clinical trials) and more than 380,000 data elements to more than 950 registered users; it constitutes Europe's largest collection of medical forms. Among those, numerous core data sets, common data elements or data standards, code lists and value sets are provided, which can be exported in 18 formats, e.g. CDISC ODM, HL7 FHIR, OpenClinica, REDCap, Excel or SPSS. MDM provides full version control; through collaboration with ZBMED (Cologne, Germany), digital object identifiers (DOIs) for CRFs can be provided, i.e. persistent and citable versions of CRFs. Most data elements are annotated with semantic codes, in particular UMLS codes (UMLS consists of more than 100 terminologies, e.g. MedDRA is a subset of UMLS). These semantic codes are manually curated by medical experts. Semantic coding can be leveraged to compare different data models, even between different languages, e.g. CRFs from clinical trials regarding the same disease. This semantic analysis can be applied to develop common data elements (CDEs).

Transparency of data models and availability of medical forms from clinical research as well as routine care can foster best practice sharing. The MDM portal is available as information infrastructure to support this task.

Observational Trial in Academic Setting: Current Initiatives and Challenges

Dr. Kavita Gaadhe, Clinical Research Unit, Charite

Session 7: Challenges and Doing Better

Chair: Angelo Tinazzi, Cytel
Ballroom III (GF)
14:00 - 15:30

Top 5 Challenges at Novo Nordisk Complying with CDISC Standards

Sujit Khune and Anja Lundgreen, Novo Nordisk

Submission of SDTM datasets continues to be a challenge, as the SDTMIG leaves room for interpretation and technical conformance guides differ between regulatory agencies. This presentation will explore the following top five challenges at Novo Nordisk in complying with CDISC SDTM requirements.

- Event Adjudication: The FDA Technical Conformance Guide requires that investigator-reported data be clearly distinguished from adjudication data. Different types of event data in different clinical settings are to be adjudicated. Currently the SDTMIG does not offer explicit guidance on how adjudication results are to be represented in SDTM.
- SI Units: Requirements for units differ between regulatory authorities. The definition of an SI unit depends on where you search. The internal process of converting collected units into conventional units or SI units is a continued effort due to PMDA requirements.
- AP domains: Many types of clinical studies collect information on people other than the person participating in the study. This has been formalized with the publication of the Associated Persons SDTM IG. As with any new requirements there are challenges implementing data collection and SDTM standards for special types of studies, e.g. pregnancy studies.
- PP domain: Pharmacokinetic data are considered derived data by most sponsors and this is reflected in the data analysis and internal processes. The inclusion of the PP definition in the SDTMIG reverses the situation by prescribing that derived data be treated as source data. This complicates the processes and programs for SDTM.
- Race & Ethnicity: The new recommendation from FDA to allow trial participants multiple selections of ethnicity and race challenges established data collection standards and reporting. 

As a sponsor we spend a significant amount of time in our data standards teams addressing these types of issues when defining data collection and SDTM data standards. The aim of this presentation is to create awareness of how we as a sponsor would benefit from improved standardization and guidance within these areas.

Leveraging the Value of Clinical Data by Establishing an Integrated Metadata Repository

Michael Walter, F. Hoffmann-La Roche

This field report describes Roche Diagnostics’ (DIA) approach to implementing an integrated Metadata Repository solution.

The project started with an exercise to create awareness in the DIA organization of data standardization as a precondition for maximizing the value of data collected in clinical studies. To allow a smooth transition into production, a staggered approach was chosen: ‘Implementation of the Metadata Repository tool’, ‘Execution of a Data Governance Process Pilot’ and ‘Integration of the Metadata Repository into a Clinical Data Warehouse’.

Project phase 1 focused on the technical implementation of the Metadata Repository and the meta model to govern the DIA clinical metadata therein. Part of the DIA clinical metadata was the DIA Subject and Sample Data Standard, which is derived from the CDISC industry standards CDASH and SDTM. The technical implementation also comprised the data governance and the versioning concept. The presentation illustrates how DIA utilizes data standard templates to simplify the usage of clinical data standards in individual prospective studies and to enable mapping of non-standard to standard variables.

In the second project phase, a data governance pilot tested the data governance process in a real-life situation and prepared the study teams for the adoption of data standardization. The data governance pilot introduced an organization comprising Global Data Standards Core Teams, an Advisory Board, and a Subject Matter Expert extension of the core team. A selected number of prospective studies have been standardized in the course of the pilot. Based on the results of the Data Governance Pilot, a recommendation for a Divisional data governance model has been approved and rolled out.

Project phase 3 covered the technical integration of the Metadata Repository into a Clinical Data Warehouse. This allows data standards to be joined with clinical data. As a result, the Clinical Data Warehouse serves as a storage basin for metadata and data collected by study teams at Roche and at CROs.

The presentation highlights the challenges DIA faced in achieving data standardization and the first results in increasing the ratio of standardization per prospective study to 80% by the end of this year. It aims to encourage companies to invest in data standardization and to take into consideration the following lessons learnt:
1. Clinical data standardization requires both technical means and processes
2. A pilot helps to convince study teams of the value of clinical data standards
3. CDISC standards perfectly support data standardization

SDTM: It is Not all Black and White

Swapna Pothula, SGS Life Sciences

CDISC SDTM is no longer recommended but is rather a required standard specification for all clinical trial data submissions. SDTM comes with SDTMIG, a guide to help us create SDTM compliant databases.

SDTM is a vast standard in theory but a guide in practice, with gaps and gray areas left for interpretation. The greater the variety of therapeutic areas, (e)Source systems, providers and sponsors we work with, the greater the differences in how SDTM is interpreted and databases are created. The possibility of the SDTMIG being interpreted differently by different people while still being SDTM compliant can jeopardize the very concept of standardization.

This presentation will give a programmer’s perspective on the conflicts that arise while creating SDTM databases given the requirements, recommendations, expectations, limitations and gaps. Examples, along with approaches used at SGS to navigate these gaps, optimize compliance and enhance consistency, will be discussed in detail.

CDISC SDTM is no longer recommended or desired but is rather a required standard specification for all clinical trial data submissions as of October 2016 for the PMDA and December 2016 for the FDA. SDTM enhances quality, eases exchange and speeds up the overall review process. SDTM is intended to standardize the way the data is tabulated and comes with the SDTMIG, a guide to help us create CDISC SDTM compliant databases.

SDTM is a vast standard in theory but a guide in practice, with gaps and gray areas left for interpretation. The possibility of the implementation guide being interpreted differently by different people while still being SDTM compliant can jeopardize the very concept of standardization.

Not just the domains but also the variables, especially derived ones, are open to various interpretations. The greater the variety of therapeutic areas, (e)Source systems, providers and sponsors we work with, the greater the differences in how SDTM is interpreted and databases are created.

SDTM serves the purpose of standardization at its core. All the entities that are part of the clinical trial life cycle benefit from SDTM standardization to varying degrees while performing their day-to-day tasks. SDTM could serve other purposes apart from standardization, depending on the entity using the database. For example, compliance with the SDTM standard could be a major concern for a programmer who creates the databases, while a statistician might be interested in the extent to which SDTM supports the analysis, and a reviewer might prioritise traceability over standardization. This is another major factor to be considered while creating SDTM compliant databases, because the differing purposes SDTM serves for different entities can lead to different interpretations of the standards.

The task of creating SDTM compliant databases can be daunting for a programmer given the requirements, recommendations, purpose, expectations, limitations and gaps. This presentation will give a programmer’s perspective on the confusions and conflicts that arise while creating SDTM compliant databases. Real-world examples and case studies will be presented along with the approach used at SGS to navigate these gaps and fine-tune the process of SDTM mapping while optimizing compliance and enhancing consistency.

Session 8: Closing Plenary & Keynote Presentation

Moderator: Peter Van Reusel, CDISC
16:00 - 17:15

Keynote Presentation: Disruption Leads to Innovation

Chris Decker, d-Wise and CDISC Board Member

Over the last 30 years our industry has made little to no progress with the way we design, collect, transform and analyze clinical information. During that same time, technology in other industries has exploded, allowing us to use our cell phone to find the best restaurant wherever we are standing, ask Alexa to order us groceries to be delivered in 2 hours, and virtually attend a concert from our living room. In our industry we still collect way too much data on virtual paper, transform that data more times than we can count, and lose the context of the clinical information. CDISC was able to deliver the first iteration of standardizing information across our industry and we must all applaud that effort. However, the original standards were designed and developed on legacy technology with the mindset of a two-dimensional world. Clinical information is, and always has been, complex and multidimensional, and in recent years the explosion of data from a breadth of sources requires us to change. It requires us to look at the world in a different way. It requires us to take risks in a risk-averse industry and embrace new ways of capturing our clinical information to enable the next generation.

Closing Panel Discussion


Dr. Yuki Ando, PMDA

David Bobbitt, CDISC

Dr. Alison Cave, EMA

Dave Evans, Accenture

Dr. Ron Fitzmartin, FDA, Center for Drug Evaluation and Research

Dave Iberson-Hurst, Assero Limited & A3 Informatics

Closing Remarks

Joerg Dillert, Chair, CDISC E3C

Poster Session

Dr. Jozef Aerts, University of Applied Sciences FH Joanneum
Foyer Ground Floor (GF)
08:00 - 17:00


Bluegrass Biggs, BiggsB, "A Path Forward - Lessons Learned Validating CDISC Conversions"

Djenan Ganic, intilaris LifeSciences GmbH, "CDISC Study/Protocol Design provision drives early clinical study setup"

Judith Goud, Nurocor, "CDASH IG v2.0 Implementation Considerations"

David Roulstone, Pinnacle 21, "Define.xml: what you should and should not be documenting in your define files"

Cathal Gallagher, d-Wise, "Updates From The EMA Technical Anonymization Group & Policy 0070"

Assia Bouhadouza, Sanofi, "Management of multiple results in non-extensible codelist variables"

Bob Van den Poel, Janssen Research and Development, "Implementing the CDISC SEND Data Standard at Janssen Research & Development"

Carey Smoak, S-cubed, "A Critique of the Use of the Medical Device SDTM Domains in Therapeutic Area User Guides"

Farhan Hameed, Pfizer, "An Ontologically-driven Approach to Implement Data Standards for Machine Learning for Wearable Devices"

Éanna Kiely, Syneos Health, "CDASH, ODM and Web Technologies"

Angelo Tinazzi, Cytel Inc., "Mind the gap: Pinnacle 21 Community version vs Pinnacle 21 Enterprise version"

Dr. Jozef Aerts, University of Applied Sciences FH Joanneum, "A Protocol Annotation Tool"

Dr. Jozef Aerts, University of Applied Sciences FH Joanneum, "The SDTM-IG in a machine-readable Form"

Michael Walther, Hoffmann-La Roche/Roche Diagnostics GmbH, "Leveraging the value of clinical data by establishing an integrated Metadata Repository"

Morten Hasselstrøm Jensen, Novo Nordisk A/S, "Please automate creation of analysis results metadata!"