FDA Public Meeting on Transport Standards – 5 November 2012

On 5 November, the Food and Drug Administration (FDA) announced a meeting entitled “Regulatory New Drug Review: Solutions for Study Data Exchange Standards” with the purpose of soliciting input from industry, technology vendors, and other members of the public regarding the advantages and disadvantages of current and emerging open, consensus-based standards for the exchange of regulated study data. The meeting was held at the FDA’s White Oak Campus in a large room with ~15 round tables, each seating 5-8 individuals (i.e., about 100 attendees). At least a dozen FDA representatives were present, 5 in speaker/facilitator/presenter roles. What follows is a brief summary of the key points as they came across to me, with the goal of reporting them objectively.


Mary Ann Slack (FDA, CDER OPI) opened the meeting, reiterating that its purpose was to get industry input objectively, without debates. She said they really wanted a discussion and would provide an e-mail address for thoughts that people would like to send, even after the event. She cited some scenarios, to be placed on the FDA website, that convey the ‘pressing challenges’ FDA faces in receiving and reviewing data. Ms. Slack highlighted a few of the comments received in writing, including the need for high-quality standards across the spectrum of research, recommendations such as an end-of-Phase-II data standards meeting for those developing new treatments, and the need for more consistency; governance is key, she said.


Doug Warfield (FDA, eData Management Solutions, OBI) spoke next about the limitations of the SAS v5 transport format, including its 8-character maximum name length. He commented that CDISC tabulation standards make extensive use of character variables, and unless sponsors explicitly specify a maximum length, SAS XPT reserves the maximum space for every column. Since sponsors often neglect to do this, the file size of CDISC submissions is a challenge for FDA. They therefore ran a pilot with the exact same data in JMP, SAS v7, and SAS XPT; the XPT file was far larger (by ~70%). They then tested shrinking the column widths to the size actually needed (vs. the maximum size) and also converted the data to 'vanilla' XML. The PhRMA ERS companies were asked to participate in this pilot with actual studies and to resize their datasets. Across 15 studies from 14 companies, resizing the columns reduced file size by 68-69%. The remaining ~30% is still a concern, he stated.
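The arithmetic behind the pilot's savings is easy to reproduce. The sketch below (not FDA's pilot code; the adverse-event values and the 200-byte default width are illustrative assumptions) shows how a fixed-width format like SAS v5 XPT reserves the declared column width for every row, and how resizing the column to the longest actual value recovers most of the space:

```python
# Illustrative sketch: fixed-width storage reserves the declared width for
# every row, so a text column left at its 200-byte maximum wastes most of
# the file. Values and widths here are assumptions for illustration.

def xpt_column_bytes(values, declared_width):
    """Bytes a fixed-width column occupies: declared width * row count."""
    return declared_width * len(values)

def resized_width(values):
    """Smallest width that still holds every value (what sponsors were asked to do)."""
    return max(len(v) for v in values)

values = ["HEADACHE", "NAUSEA", "DIZZINESS"] * 1000  # 3,000 rows of AE terms

padded = xpt_column_bytes(values, 200)                   # width left at the maximum
trimmed = xpt_column_bytes(values, resized_width(values))  # width resized to fit

print(padded, trimmed, round(1 - trimmed / padded, 3))
```

With real study data the mix of column types is less extreme, which is consistent with the ~68-69% reduction the pilot reported rather than the near-total savings this toy column shows.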


Chuck Cooper (FDA, CDER, CSC) spoke about how reviewers need help to be more efficient. Going forward, the needs for an improved, modern review environment are:


  • Audit trail – a gap in current submissions
  • Analysis-specific trail, with information on how the analyses were created
  • Flexibility – rapidly adapt to new emerging standards
  • Data integration – data are scattered across the different domains
  • Metadata – robust metadata are needed to support the reviewers (who start by printing the define file)


Armando Oliva (FDA, CDER, CSC) described exchange standards as the shipping container and content standards as what is shipped (e.g., cars). He stated that the content will drive the shipping-container requirements. His key issue is that flat files lose meaning from three-dimensional (‘round’) data that are hierarchical; the solution, he says, necessitates a shift away from antiquated paper-based CRFs. He then said he wanted to ‘shift gears to a light-hearted example’: one company was developing a product that would take you wherever you want to go in a given region; another company had a product to take you to another region; and so on. “FDA’s never-ending request for flat maps leads to inefficiency,” he said. He then cited a sponsor who had a globe, with relational data models. He ended by noting that, unfortunately, the globe is not end-user friendly, but clinical data are not flat.


Bill Gibson of SAS was the first of the industry speakers. He stated that the SAS XPT v5 format was last changed in 1999 (although it was developed much earlier). He spoke about a new extension macro that SAS has just created to address the limitations imposed by v5 (such as the 8-character name length and 200-character text fields). He said it is backward compatible with the current v5 and would be so with the legacy data at FDA. This could be a nice short-term solution, he said, but for the long term he recommended the CDISC ODM (Operational Data Model), which aligns strongly with the existing investments of industry and FDA.


Melissa Binz and Peter Messenbrink (Novartis) spoke about terminology management, EVS, version management, and disease-area standards. They asked FDA to be flexible in working with sponsors as these progress, to announce standards reviews as they become available, and to provide metrics on how much data, and which data, reviewers use to reach their decisions. It will be key to involve industry in decisions and for FDA to clearly communicate its decisions on standards and the time frames for compliance, they stated. They want to know that the data being collected are what the reviewers want to review. They also suggested data standards meetings, e.g. at the end of Phase 2, clarifying that these should be distinct from other FDA meetings with sponsors, since the attendees would differ. One question they received was, “Why not start the standards discussion at the beginning of Phase 2 rather than at the end-of-Phase-2 meeting?”


Mathias Brochhausen and William Hogan of the University of Arkansas for Medical Sciences (UAMS) spoke about a medication ontology they had developed. They stated that RxNorm and NDF-RT are fraught with scientific mistakes and errors in logic. They proposed looking at open-source, semantically richer ontologies that specify and unify languages, e.g. OWL using description logic. Well-built semantic models help in a number of ways, including capturing implicit knowledge. Their conclusion was that FDA should support data exchange models that support biomedical ontologies.


Gary Kramer of ASTM spoke about AnIML, which ASTM developed for spectroscopy and chemistry data. AnIML meets the requirements of being extensible, verifiable, traceable, and validatable (which are also requirements of ODM).


Charlie Mead spoke on behalf of the W3C. He presented the W3C’s Semantic Web proposition, which he said makes it easier to re-use data standards than to make up your own. He recommended that both CDISC and HL7 represent their standards through the ongoing semantic-web CDISC RDF project. He stated that this should take place in stages: 1) standards as-is; 2) standards in context; 3) interoperability across standards (e.g. TermInfo, Common Structures for Shared Semantics with a Terminology Model).


He said that there is no magic here, and it is not trivial, but the standards should be put into RDF and then submitted. Putting suboptimal standards into RDF will not solve the problems, but it will expose the issues so that they can be addressed; i.e., it doesn’t solve the issues, but it enables a solution. He also supported the idea of exploring the use of RDF with ODM.
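The “expose the issues” point can be made concrete with a small, library-free sketch. RDF reduces everything to (subject, predicate, object) triples, so gaps in a standard’s metadata become queryable rather than buried in a flat file. The SDTM-flavoured names below (AETERM, AESEV, the AE domain) are illustrative assumptions, not an official CDISC RDF vocabulary:

```python
# Hedged sketch: representing standard metadata as RDF-style triples.
# The deliberately missing rdfs:label on AESEV shows how triple form
# makes a metadata gap visible and machine-checkable.

triples = {
    ("AETERM", "rdf:type", "Variable"),
    ("AETERM", "rdfs:label", "Reported Term for the Adverse Event"),
    ("AETERM", "partOfDomain", "AE"),
    ("AESEV", "rdf:type", "Variable"),
    ("AESEV", "partOfDomain", "AE"),
    # Note: no rdfs:label triple for AESEV -- an exposed issue, not a hidden one.
}

def missing_labels(triples):
    """Variables declared in the graph that lack an rdfs:label triple."""
    variables = {s for (s, p, o) in triples if p == "rdf:type" and o == "Variable"}
    labelled = {s for (s, p, o) in triples if p == "rdfs:label"}
    return sorted(variables - labelled)

print(missing_labels(triples))  # lists the variables with missing labels
```

A real effort would use an RDF toolkit and published vocabularies; the point here is only that the triple representation turns “suboptimal standard” into a list of findable defects.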


Armando Oliva spoke again, describing a transport pilot they are doing through HL7. He clarified that this is an R&D effort only, and there are no policies around this work or its use. The theory is that HL7 v3 seems a promising option for transporting CDISC content; he has pursued this work because of the increasing use of HL7 in the U.S. Federal government. He also stated that there is increasing support for the use of CDISC content standards within FDA CDER. There are 3 projects (patient narrative, study design, subject data), which, he stated, support the exchange of ‘round’ data. The third project requires the data warehouse (Janus/CTR) to be available. He commented that harmonization of standards across SDOs is important, and showed on one slide that the IGs are being developed with a VA tool, the Model Driven Health Tool (MDHT).


Dave Gemzik (Medidata Solutions) stated that he is responsible for standards and their implementation within his company, and that he uses the standards daily. Medidata is currently running ~3,000 studies with 2.5 M patients; they need to scale quickly and be flexible, hence they need open, scalable, and flexible standards. Medidata uses CDISC ODM because it is flexible, extensible, and widely adopted around the globe. HL7 expertise, by contrast, is hard to come by: they can bring someone up to speed on ODM quickly but have had no success finding programming resources with HL7 expertise. They deploy ODM via web services, enabled by RESTful APIs, and have recently created a tool that uses the CDISC Study Design Model (SDM.xml), eCRFs, and CDASH. The standard their customers prefer is ODM.
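Part of why ODM is easy to staff for is that it is ordinary hierarchical XML, so any standard XML tooling can walk it. The fragment below is a heavily simplified, illustrative ODM-like document (real ODM carries namespaces, metadata references, audit records, and signatures defined by the CDISC ODM schema); the OIDs and values are assumptions for illustration:

```python
# Hedged sketch: reading subject data out of a simplified ODM-like XML
# fragment with only the Python standard library.
import xml.etree.ElementTree as ET

odm_xml = """
<ODM FileOID="Example.1" FileType="Snapshot">
  <ClinicalData StudyOID="ST.001" MetaDataVersionOID="MDV.1">
    <SubjectData SubjectKey="SUBJ-001">
      <StudyEventData StudyEventOID="SE.VISIT1">
        <FormData FormOID="FORM.VS">
          <ItemGroupData ItemGroupOID="IG.VS">
            <ItemData ItemOID="IT.SYSBP" Value="120"/>
            <ItemData ItemOID="IT.DIABP" Value="80"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>
"""

root = ET.fromstring(odm_xml)

# Collect every collected value, keyed by its item OID.
readings = {item.get("ItemOID"): item.get("Value")
            for item in root.iter("ItemData")}
print(readings)
```

The hierarchy (ClinicalData → SubjectData → StudyEventData → FormData → ItemGroupData → ItemData) is exactly the kind of “round”, non-flat structure the FDA speakers said XPT cannot carry.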


Wayne Kubick (CDISC) spoke about CDISC ODM. It is a hierarchical model, supports end-to-end data collection and transport, has full audit-trail support, and can work with HL7 CCD to bring EHR data into research studies. Wayne brought letters signed by the CDISC Board and by numerous attendees at the CDISC Interchange the previous week, indicating support for ODM. They read as follows:


  • The SAS XPORT transport format should be replaced by a more modern data exchange standard for electronic regulatory submissions to FDA based on current prevailing XML technology.
  • The choice of transport standards for study data should capitalize on existing knowledge and investment within the global bio-pharmaceutical industry.
  • The choice of transport standards should ensure that commonly used data structures, specifically domain datasets and analysis files and their associated metadata, can be accurately exchanged, utilized and reproduced.
  • The CDISC Operational Data Model (ODM), which has been in production use for more than ten years, is an ideal choice as a new study data exchange standard for the following reasons:


  1. ODM can streamline the clinical development process by supporting metadata-driven data transport end-to-end across the entire clinical research lifecycle, with traceability from protocol through analysis.
  2. ODM is fully compliant with regulatory guidance and 21 CFR Part 11, including audit trail and electronic signatures.
  3. ODM is already widely understood and used extensively for global clinical research, and can be deployed for submissions without significant added financial burden on industry.
  4. ODM is fully compatible with current metadata submission standards, and is the basis for the CDISC define.xml standard already accepted by FDA.
  5. ODM accurately represents and easily reproduces tabular dataset structures, including those structured according to the CDISC Study Design Model, CDASH, SDTM, SEND and ADaM standards that are already widely used in industry and at the FDA.
  6. ODM is supported by NCI EVS as an exchange format for CDISC controlled terminology.
  7. ODM is already supported by major technology providers of clinical data information systems used for regulated clinical research.
  8. ODM has been successfully used in conjunction with HL7 CDA formatted data from Electronic Healthcare Record systems to support research under an HHS sponsored interoperability specification.
  9. ODM has the ability to represent more complex relationships between data events recorded per the research protocol.
  10. ODM can be easily and rapidly extended through the CDISC standards development process to address emerging new requirements as they arise.


Wayne closed by showing the FDA Federal Register announcement of an ODM pilot from ~2007, stating that ~5 sponsors were ready to participate when it was suddenly cancelled by FDA. He encouraged FDA to run an ODM pilot and not delay another 5 years before moving forward.


Fred Wood answered the specific questions in the Federal Register notice, noting that most of the biopharmaceutical industry is not aware of the HL7 standards and that adopting them would require starting over in terms of research processes. There would be a major negative financial impact and few benefits.


Diane Wold stated that ODM would be orders of magnitude easier than HL7 v3 for industry to implement. She also commented that CDISC SHARE aims to provide rich metadata based on the BRIDG model and a robust datatype standard.


The discussion following these presentations repeatedly circled back to the following themes from audience participants (some of these comments were ‘tweeted’ on 5 November):


  • What is the problem we/FDA are trying to solve?
  • We need something now.
  • ODM is ready to go.
  • We need to add in the common data model (BRIDG) and semantics (SHARE).
  • There should be side by side pilots to get metrics.
  • The industry needs to know the FDA’s requirements...tell us what you want!
  • HL7 makes my head spin.
  • We want to help the reviewers without putting too much burden on the industry.


Chuck Cooper and Ron Fitzmartin, who facilitated the audience discussion, thanked everyone for their comments and input to FDA during the meeting and in writing, and stated that they will be reviewing all of this information. They did not promise that anything would happen in the short term; this will take time to assimilate.