An initial project performed a gap analysis to assess the feasibility of using the CDISC Study Data Tabulation Model (SDTM) to represent the 43 Self-Management Item Sets (SMIS) identified by the Japanese Collaborative Committee of Clinical Informatization in Diabetes Mellitus (CCCIDM) [1], and its findings have been reported [2]. Building on this work, a second-phase project was designed to 1) develop standalone SDTM examples illustrating all 43 items, 2) develop CDISC Biomedical Concepts (BCs) [3] for the items that did not already have published BCs available through CDISC, and 3) make those examples and BCs available to the public. This paper describes the methodology and results of this second-phase project and discusses learnings that may be applied to future projects.
Methodology
CDISC data standards developers reviewed each of the 43 SMIS items to determine which SDTM domain and which permissible variables would be required to minimally and accurately represent each core concept. Gaps in existing CDISC standards, previously identified in the first project, were discussed with CDISC’s Internal Standards Modeling Team (ISMT) to determine the best approaches for representation. Items that required clinical science input were raised on a series of calls with subject matter experts from the Japanese Collaborative Committee of Clinical Informatization in Diabetes Mellitus, during which project progress was reviewed and feedback was solicited.
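As a rough illustration of what minimally representing a concept can look like, the sketch below builds Vital Signs (VS) rows for SMIS items 1 to 4. The variable names are standard SDTM VS variables, but the study identifier, subject identifier, and result values are invented for illustration, and the test codes should be checked against current CDISC Controlled Terminology rather than taken from this sketch. It is not a copy of the published examples.

```python
from dataclasses import dataclass, asdict


@dataclass
class VSRecord:
    """One row of a minimal, illustrative Vital Signs (VS) dataset."""
    STUDYID: str   # study identifier (hypothetical value below)
    DOMAIN: str    # two-letter domain code, always "VS" here
    USUBJID: str   # unique subject identifier (hypothetical value below)
    VSSEQ: int     # sequence number within the subject
    VSTESTCD: str  # short test code
    VSTEST: str    # test name
    VSORRES: str   # result as originally collected
    VSORRESU: str  # unit of the original result


def build_vs_rows(studyid, usubjid, measurements):
    """Turn (testcd, test, value, unit) tuples into sequenced VS rows."""
    rows = []
    for seq, (testcd, test, value, unit) in enumerate(measurements, start=1):
        rows.append(
            VSRecord(studyid, "VS", usubjid, seq, testcd, test, str(value), unit)
        )
    return rows


# SMIS items 1-4 with made-up results.
measurements = [
    ("HEIGHT", "Height", 170, "cm"),
    ("WEIGHT", "Weight", 65, "kg"),
    ("SYSBP", "Systolic Blood Pressure", 128, "mmHg"),
    ("DIABP", "Diastolic Blood Pressure", 82, "mmHg"),
]

vs_rows = build_vs_rows("SMIS-DEMO", "SMIS-DEMO-001", measurements)
for row in vs_rows:
    print(asdict(row))
```

Keeping each row to identifiers, the test, the result, and its unit mirrors the goal stated above: only the variables needed to represent the core concept.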
The standards development team reused modeling approaches applied to analogous concepts in previously published CDISC standards and minimized the use of non-standard variables (NSVs) where possible. When non-standard variables were required, the developers prioritized existing terms from the CDISC NSV Registry [4]. When the terminology used by the CCCIDM differed from published CDISC Controlled Terminology (CT) [5], CDISC CT was used, provided the original terms were listed as synonyms in CDISC CT. These changes were noted in the example descriptions. When no appropriate term existed in CDISC CT for an SMIS concept, a new term request was promptly submitted to the CDISC CT team via the new term request process at the National Cancer Institute Enterprise Vocabulary Services (NCI-EVS). LOINC codes were added to the laboratory concept examples where available. For medical event term-based concepts, MedDRA terminology was prioritized in the examples.
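The terminology rule described above can be sketched as a small lookup. The codelist excerpt here is hypothetical and was written for illustration only; it is not copied from CDISC CT, and the function name `resolve_term` and the action labels are assumptions, not part of any CDISC tooling.

```python
# Hypothetical codelist excerpt: CDISC-style term -> source-side synonyms.
# These entries are illustrative assumptions, not verified CT content.
CODELIST = {
    "Waist Circumference": {"Waist"},
    "Urea Nitrogen": {"UN"},
    "Urate": set(),
}


def resolve_term(source_term, codelist):
    """Decide how a source term should be handled against a codelist."""
    wanted = source_term.strip().casefold()
    for cdisc_term, synonyms in codelist.items():
        if wanted == cdisc_term.casefold():
            # Source term already matches the published term.
            return {"term": cdisc_term, "action": "use as-is"}
        if wanted in {s.casefold() for s in synonyms}:
            # Source term is a listed synonym: use the CDISC term and
            # record the change in the example description.
            return {"term": cdisc_term, "action": "substitute and note"}
    # No match at all: a new term request would be needed.
    return {"term": None, "action": "request new term"}


for term in ["Urate", "Waist", "UN", "Age at Disease Onset"]:
    print(term, "->", resolve_term(term, CODELIST))
```

The three outcomes correspond to the three cases in the text: the term already matches, the term is a listed synonym, or no appropriate term exists and a request is submitted.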
Finally, a BC was developed for each of the 14 concepts previously identified as lacking an existing CDISC BC, through the separate CDISC BC Curation team development process. Each was aligned with NCI terminology to develop a conceptual layer from which one or more SDTM dataset specializations with pre-configured metadata were produced. BCs for medical event terms were not individually developed: the current BC team philosophy is that providing one BC for all pre-specified event terms and one BC for spontaneously reported event terms (each with pre-configured metadata) will provide value to end users, who can easily supply their own specific event term during post-configuration.
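The idea of one generic event BC with pre-configured metadata, completed by the end user during post-configuration, can be sketched as follows. This is not the CDISC BC schema: the class `DatasetSpecialization`, its fields, and the two template names are hypothetical. Only the SDTM variable names (MHTERM, MHPRESP, MHOCCUR) come from the standard, and the choice of which values are preset is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class DatasetSpecialization:
    """Hypothetical template: preset metadata plus user-supplied slots."""
    domain: str
    preset: dict          # variable -> value fixed by the template
    user_supplied: tuple  # variables the end user fills in afterwards

    def configure(self, **values):
        """Return one record after checking the user-supplied values."""
        missing = [v for v in self.user_supplied if v not in values]
        if missing:
            raise ValueError(f"missing values for: {missing}")
        unexpected = [v for v in values if v not in self.user_supplied]
        if unexpected:
            raise ValueError(f"not configurable here: {unexpected}")
        return {"DOMAIN": self.domain, **self.preset, **values}


# One template for pre-specified events (asked about by name) ...
PRESPECIFIED_EVENT = DatasetSpecialization(
    domain="MH",
    preset={"MHPRESP": "Y"},
    user_supplied=("MHTERM", "MHOCCUR"),
)
# ... and one for spontaneously reported events.
SPONTANEOUS_EVENT = DatasetSpecialization(
    domain="MH",
    preset={},
    user_supplied=("MHTERM",),
)

# SMIS item 14 answered "Yes", supplied at post-configuration time.
row = PRESPECIFIED_EVENT.configure(MHTERM="Diabetic retinopathy", MHOCCUR="Y")
print(row)
```

With this shape, adding another pre-specified event term needs no new template, only a different MHTERM value.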
Results
All 43 SMIS items were developed as SDTM examples and published in the CDISC Knowledge Base on the CDISC website. Examples were grouped by SDTM domain for ease of review (e.g., all SMIS items corresponding to vital signs were placed in one Vital Signs (VS) dataset example, and specimen-based findings from laboratories were grouped into 2 Laboratory Test Results (LB) datasets). In total, 8 SDTM example datasets were created to accommodate the 43 concepts. When a concept could be represented in more than one way, each way was shown separately (e.g., SMIS item 17, “Abnormality on ECG”, was shown both as a medical history term in the MH domain and as an electrocardiogram result in the EG domain). A table of contents linking each SMIS concept to its corresponding example in the CDISC Knowledge Base, including the row number in which it appears within that example, was developed as a gateway to each concept’s illustration in CDISC SDTM, as shown in Appendix A. Row descriptions were written for those concepts that required further description to facilitate review.
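The grouping described above can be cross-checked against Appendix A with a short tally. The item-to-domain assignments below are transcribed from the appendix table; the variable names are illustrative, and the split of LB into 2 datasets is taken from the text rather than derived by the code.

```python
from collections import Counter

# SMIS item numbers per SDTM domain, transcribed from Appendix A.
DOMAIN_ITEMS = {
    "VS": [1, 2, 3, 4, 27, 33, 34, 43],
    "LB": [5, 6, 8, 9, 10, 12, 13, 16, 18, 22, 23, 24, 25, 26,
           28, 29, 32, 36, 37, 38, 39, 40, 41, 42],
    "MH": [11, 14, 15, 17, 19, 20, 21, 30],
    "EG": [17],
    "SU": [7],
    "HO": [31],
    "APMH": [35],
}

all_items = [item for items in DOMAIN_ITEMS.values() for item in items]

# Items that never appear in any domain (should be none).
missing = sorted(set(range(1, 44)) - set(all_items))

# Items shown in more than one domain.
multi = sorted(i for i, n in Counter(all_items).items() if n > 1)

# One dataset per domain, except LB, which the text says was split in 2.
LB_DATASETS = 2
dataset_count = len(DOMAIN_ITEMS) - 1 + LB_DATASETS

print("items per domain:", {d: len(i) for d, i in DOMAIN_ITEMS.items()})
print("missing items:", missing)
print("items in more than one domain:", multi)
print("example datasets:", dataset_count)
```

The tally agrees with the figures reported here: all 43 items are covered, item 17 is the only one shown in two domains, and seven domains with LB split in two give 8 example datasets.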
One new Controlled Terminology term request was submitted to represent “Age at Disease Onset” as a Findings About (FA) test name (FATEST/FATESTCD), requested as an add-on to SMIS items 11, 15, 19, and 21. This term is currently under development for the next semi-annual release of controlled terminology in September 2025.
BCs have been developed for all 14 of the previously identified gaps in CDISC BCs. These 14 BCs are now published or slated for publication in the next quarterly package release.
Discussion
The initial project set out to determine the applicability of CDISC standards to the 43 SMIS items proposed by CCCIDM. In this second phase, we have shown that CDISC standards are robust and can accommodate many concepts with little or no need for extensions to the existing SDTM model. Additionally, the majority of the items required minimal or no subject matter expert support to facilitate representation in SDTM. Minor gaps discovered in the initial project reflected new use cases for example development rather than gaps in existing approaches to implementing CDISC SDTM. In such cases, existing SDTM implementation approaches were simply applied to these concepts to create new examples. Some changes to terminology were required (some test names and units, for example) to align with CDISC Controlled Terminology “submission values”, the specific terms that should be used when creating SDTM-conformant datasets. In these cases, the terms used by CCCIDM were listed as synonyms of the submission values published in CDISC Controlled Terminology.
Development of CDISC Biomedical Concepts supports the relevance and longevity of this project work by enabling scalable automation in standards implementation. Further, as science progresses and new items are identified, new BCs can be developed and published using the same framework. By providing the details and metadata necessary to create SDTM-conformant datasets, BCs reduce variation in standards implementation across studies and investigators. CDISC aims to make BCs a part of all future standards development projects.
The collaborative approach used in this project, in which clinical SMEs develop the conceptual content and CDISC works with these experts to translate the work into data standards, is a powerful one. This same model could be applied to future collaborations with CCCIDM and with other organizations across a variety of therapeutic areas.
Acknowledgements
This project was supported by the Japanese Ministry of Health, Labour and Welfare (MHLW) Research Grant (Number 23IA1016) obtained by Professor Naoki Nakashima.
Development activities were carried out in collaboration with the Japanese Collaborative Committee of Clinical Informatization in Diabetes Mellitus, which consists of the Japan Diabetes Society, the Japanese Society of Hypertension, the Japan Atherosclerosis Society, the Japanese Society of Nephrology, the Japanese Society of Laboratory Medicine, and the Japan Association for Medical Informatics.
References
- Nakashima, et al. Recommended configuration for personal health records by standardized data item sets for diabetes mellitus and associated chronic diseases: A report from Collaborative Initiative by six Japanese Associations. J Diabetes Investigation. 2019;10:868–875.
- Linking Collaborative Committee for Clinical Informatization in Diabetes Mellitus (CCCIDM) Self-managed Item Sets (SMIS) to the CDISC Study Data Tabulation Model (SDTM) V3.4. Accessed May 07, 2025. https://www.cdisc.org/news/linking-collaborative-committee-clinical-informatization-diabetes-mellitus-cccidm-self-managed
- CDISC Biomedical Concepts. Accessed May 07, 2025. https://www.corg/cdisc-biomedical-concepts
- Non-Standard Variables Registry. Accessed May 09, 2025. https://www.cdisc.org/standards/terminology/non-standard-variables
- CDISC Controlled Terminology. Accessed May 07, 2025. https://www.cdisc.org/standards/terminology
Appendix A – Table of Contents of SDTM examples developed
Item # | Japanese Collaborative Committee SMIS Item | Units / Expression | SDTM Domain | Example Link | Row Number |
---|---|---|---|---|---|
1 | Height | cm | VS | | 1 |
2 | Weight | kg | VS | | 2 |
3 | Systolic blood pressure | mmHg | VS | | 3 |
4 | Diastolic blood pressure | mmHg | VS | | 4 |
5 | LDL-cholesterol | mg/dL | LB | | 1 |
6 | HDL-cholesterol | mg/dL | LB | | 2 |
7 | Smoking (including new types of tobacco) | Current/Former/Never | SU | | ALL |
8 | Serum creatinine | mg/dL | LB | | 5 |
9 | Urine protein | –, ±, +, 2 +, 3 + or over | LB | | 15 |
10 | Blood glucose | mg/dL | LB | | 8 |
11 | Age or date diagnosed as diabetes mellitus | under 10y.o, 10's, 20's, ..., 70's, 80y.o. or over, Not yet, unknown | MH | | 5 |
12 | HbA1c | % | LB | | 20 |
13 | ALT | U/L | LB | | 10 |
14 | Diabetic retinopathy | Yes/No/Unknown | MH | | 6 |
15 | Age or date diagnosed as hypertension | under 10y.o, 10's, 20's, ..., 70's, 80y.o. or over, No, unknown or YYYY-MM-DD | MH | | 1 |
16 | Serum potassium | mmol/L | LB | | 7 |
17 | Abnormality on ECG | Yes/No/Unknown | MH and EG | | |
18 | Triglyceride | mg/dL | LB | | 4 |
19 | Age or date diagnosed as dyslipidemia | under 10y.o, 10's, 20's, ..., 70's, 80y.o. or over, No, unknown | MH | | 2 |
20 | History of coronary disease | Yes (contrast), Yes (other), No, Unknown | MH | | 3 |
21 | Age or date diagnosed as CKD | under 10y.o, 10's, 20's, ..., 70's, 80y.o. or over, No, unknown | MH | | 8 |
22 | Serum albumin | g/dL | LB | | 9 |
23 | Blood in Urine | –, ±, +, 2 +, 3 + or over; Macrohematuria | LB | | 17 (macro), 18 (occult) |
24 | Total cholesterol | mg/dL | LB | | 3 |
25 | Urine albumin/creatinine | mg/gCre | LB | | 19 |
26 | AST | U/L | LB | | 11 |
27 | Waist | cm | VS | | 5 |
28 | Urine glucose | –, ±, +, 2 + or over | LB | | 16 |
29 | γ-GTP | U/L | LB | | 21 |
30 | Diabetic neuropathy | Yes/No/Unknown | MH | | 7 |
31 | Regular dental visit (≥ once/year) | Yes/No/Unknown | HO | | ALL |
32 | Urate | mg/dL | LB | | 22 |
33 | Systolic BP at home | mmHg | VS | | 6 |
34 | Diastolic BP at home | mmHg | VS | | 7 |
35 | Family history of renal failure | Yes/No/Unknown | APMH | | ALL |
36 | Urine protein/creatinine | g/gCre | LB | | 23 |
37 | Urine protein/Day | g/Day | LB | | 14 |
38 | Serum total protein | g/dL | LB | | 12 |
39 | UN | mg/dL | LB | | 6 |
40 | Hemoglobin | g/dL | LB | | 13 |
41 | Cystatin C | mg/L | LB | | 24 |
42 | Self-monitoring blood glucose | mmol/L | LB | | ALL |
43 | Weight at home | kg | VS | | 8 |
