SDTMIG v3.3

CDISC

Study Data Tabulation Model Implementation Guide: Human Clinical Trials

Version 3.3 (Final)


Notes to Readers

This is the implementation guide for human clinical trials corresponding to version 1.7 of the CDISC Study Data Tabulation Model.

Revision History

Date          Version
2018-11-20    3.3 Final
2013-11-26    3.2 Final
2012-07-16    3.1.3 Final
2008-11-12    3.1.2 Final
2005-08-26    3.1.1 Final
2004-07-14    3.1

© 2018 Clinical Data Interchange Standards Consortium, Inc. All rights reserved.

1 Introduction

1.1 Purpose

This document comprises the CDISC Version 3.3 (v3.3) Study Data Tabulation Model Implementation Guide for Human Clinical Trials (SDTMIG), which has been prepared by the Submissions Data Standards (SDS) team of the Clinical Data Interchange Standards Consortium (CDISC). Like its predecessors, v3.3 is intended to guide the organization, structure, and format of standard clinical trial tabulation datasets submitted to a regulatory authority. Version 3.3 supersedes all prior versions of the SDTMIG.

The SDTMIG should be used in close concert with version 1.7 of the CDISC Study Data Tabulation Model (SDTM, available at http://www.cdisc.org/sdtm), which describes the general conceptual model for representing clinical study data that is submitted to regulatory authorities. The SDTM should be read before the SDTMIG. Version 3.3 provides specific domain models, assumptions, business rules, and examples for preparing standard tabulation datasets that are based on the SDTM.

This document is intended for companies and individuals involved in the collection, preparation, and analysis of clinical data that will be submitted to regulatory authorities.

1.2 Organization of this Document

This document is organized into the following sections:

  • Section 1, Introduction, provides an overall introduction to the v3.3 models and describes changes from prior versions.
  • Section 2, Fundamentals of the SDTM, recaps the basic concepts of the SDTM, and describes how this implementation guide should be used in concert with the SDTM.
  • Section 3, Submitting Data in Standard Format, explains how to describe metadata for regulatory submissions, and how to assess conformance with the standards.
  • Section 4, Assumptions for Domain Models, describes basic concepts, business rules, and assumptions that should be taken into consideration before applying the domain models.
  • Section 5, Models for Special Purpose Domains, describes special purpose domains, including Demographics, Comments, Subject Visits, and Subject Elements.
  • Section 6, Domain Models Based on the General Observation Classes, provides specific metadata models based on the three general observation classes, along with assumptions and example data.
  • Section 7, Trial Design Model Datasets, describes domains for trial-level data, with assumptions and examples.
  • Section 8, Representing Relationships and Data, describes how to represent relationships between separate domains, datasets, and/or records, and provides information to help sponsors determine where data belong in the SDTM.
  • Section 9, Study References, provides structures for representing study-specific terminology used in subject data.
  • Appendices provide additional background material and describe other supplemental material relevant to implementation.

1.3 Relationship to Prior CDISC Documents

This document, together with the SDTM, represents the most recent version of the CDISC Submission Data Domain Models. Since all updates are intended to be backward compatible, the term "v3.x" is used to refer to Version 3.3 and all subsequent versions. The most significant changes since the prior version, v3.2, include:

A detailed list of changes between versions is provided in Appendix E, Revision History.

Version 3.1 was the first fully implementation-ready version of the CDISC Submission Data Standards that was directly referenced by the FDA for use in human clinical studies involving drug products. However, future improvements and enhancements will continue to be made as sponsors gain more experience submitting data in this format. Therefore, CDISC will be preparing regular updates to the implementation guide to provide corrections, clarifications, additional domain models, examples, business rules, and conventions for using the standard domain models. CDISC will produce further documentation for controlled terminology as separate publications, so sponsors are encouraged to check the CDISC website (http://www.cdisc.org/terminology) frequently for additional information. See Section 4.3, Coding and Controlled Terminology Assumptions, for the most up-to-date information on applying Controlled Terminology.

1.4 How to Read this Implementation Guide

This SDTM Implementation Guide (SDTMIG) is best read online, so the reader can benefit from the many hyperlinks included to both internal and external references. The following guidelines may be helpful in reading this document:

  1. First, read the SDTM to gain a general understanding of SDTM concepts.
  2. Next, read Sections 1-3 of this document to review the key concepts for preparing domains and submitting data to regulatory authorities. Refer to Appendix B, Glossary and Abbreviations, as necessary.
  3. Read Section 4, Assumptions for Domain Models.
  4. Review Section 5, Models for Special Purpose Domains, and Section 6, Domain Models Based on the General Observation Classes, in detail, referring back to Section 4, Assumptions for Domain Models, as directed. See the implementation examples for each domain to gain an understanding of how to apply the domain models for specific types of data.
  5. Read Section 7, Trial Design Model Datasets, to understand the fundamentals of the Trial Design Model and consider how to apply the concepts for typical protocols.
  6. Review Section 8, Representing Relationships and Data, to learn advanced concepts of how to express relationships between datasets, records, and additional variables not specifically defined in the models.
  7. Review Section 9, Study References, to learn occasions when it is necessary to establish study-specific references that will be used in accordance with subject data.
  8. Finally, review the Appendices as appropriate. Appendix C, Controlled Terminology, in particular, describes how CDISC Terminology is centrally managed by the CDISC Controlled Terminology Team. Efforts are made at publication time to ensure all SDTMIG domain/dataset specification tables and/or examples reflect the latest CDISC Terminology; users, however, should refer to https://www.cancer.gov/research/resources/terminology/cdisc as the authoritative source of controlled terminology, as CDISC controlled terminology is updated on a quarterly basis.

This implementation guide covers most data collected in human clinical trials, but separate implementation guides provide information about certain data, and should be consulted when needed.

  • The SDTM Implementation Guide for Associated Persons (SDTMIG-AP) provides structures for representing data collected about persons who are not study subjects.
  • The SDTM Implementation Guide for Medical Devices (SDTMIG-MD) provides structures for data about devices.
  • The SDTM Implementation Guide for Pharmacogenomics/Genetics (SDTMIG-PGx) provides structures for pharmacogenetic/genomic data and for data about biospecimens.

1.4.1 How to Read a Domain Specification

A domain specification table includes rows for all required and expected variables for a domain and for a set of permissible variables. The permissible variables do not include all the variables that are allowed for the domain; they are a set of variables that the SDS team considered likely to be included. The columns of the table are:

  • Variable Name
    • For variables that do not include a domain prefix, this name is taken directly from the SDTM.
    • For variables that do include the domain prefix, this name is taken from the SDTM, but with the "--" placeholder in the SDTM variable name replaced by the domain prefix.
  • Variable Label: A longer name for the variable.
    • This may be the same as the label in the SDTM, or it may be customized for the domain.
    • If a sponsor includes in a dataset an allowable variable not in the domain specification, they will create an appropriate label.
  • Type: One of the two SAS datatypes, "Num" or "Char". These values are taken directly from the SDTM.
  • Controlled Terms, Codelist, or Format
    • Controlled Terms are represented as hyperlinked text. The domain code in the row for the DOMAIN variable is the most common kind of controlled term represented in domain specifications.
    • Codelist
      • An asterisk * indicates that the variable may be subject to controlled terminology.
        • The controlled terminology might be of a type that would inherently be sponsor defined.
        • The controlled terminology might be of a type that could be standardized, but has not yet been developed.
        • The controlled terminology might be terminology that would be specified in value-level metadata.
      • A hyperlinked codelist name in parentheses indicates that the variable is subject to the CDISC controlled terminology in the named codelist.
      • The name of an external code system (e.g., MedDRA, ISO 3166 Alpha-3) may be listed in plain text.
    • Format: "ISO8601" in plain text indicates that the variable values should be formatted in conformance with that standard.
  • Role: This is taken directly from the SDTM. Note that if a variable is either a Variable Qualifier or a Synonym Qualifier, the SDTM includes the qualified variable, but SDTMIG domain specifications do not.
  • CDISC Notes: The notes may include any of the following:
    • A description of what the variable means.
    • Information about how this variable relates to another variable.
    • Rules for when or how the variable should be populated, or how the contents should be formatted.
    • Examples of values that might appear in the variable. Such examples are only examples, and although they may be CDISC controlled terminology values, their presence in a CDISC note should not be construed as definitive. For authoritative information on CDISC controlled terminology, consult https://www.cancer.gov/research/resources/terminology/cdisc.
  • Core: Contains one of the three values "Req", "Exp", or "Perm", which are explained further in Section 4.1.5, SDTM Core Designations.
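
As an informal illustration of the columns described above, a specification row can be modelled as a small record. This is only a sketch: the class name `SpecRow`, the helper `expand_name`, and the field names are assumptions made for this example and are not defined by the SDTM or the SDTMIG. The AETERM row is shown for illustration only; the published domain specification remains the authoritative source.

```python
from dataclasses import dataclass

VALID_TYPES = {"Num", "Char"}         # the two SAS datatypes named in the Type column
VALID_CORE = {"Req", "Exp", "Perm"}   # the three Core designations


def expand_name(sdtm_name: str, domain: str) -> str:
    """Replace a leading "--" placeholder with the two-character domain prefix."""
    if sdtm_name.startswith("--"):
        return domain + sdtm_name[2:]
    return sdtm_name


@dataclass(frozen=True)
class SpecRow:
    """One row of a domain specification table (hypothetical structure)."""
    name: str
    label: str
    type: str
    controlled_terms: str  # codelist name, "*", "ISO 8601", or "" when none applies
    role: str
    core: str

    def __post_init__(self) -> None:
        # Reject values outside the sets described for the Type and Core columns.
        if self.type not in VALID_TYPES:
            raise ValueError(f"Type must be one of {sorted(VALID_TYPES)}")
        if self.core not in VALID_CORE:
            raise ValueError(f"Core must be one of {sorted(VALID_CORE)}")


# Illustrative row: the SDTM variable --TERM as it appears in the AE domain.
row = SpecRow(
    name=expand_name("--TERM", "AE"),
    label="Reported Term for the Adverse Event",
    type="Char",
    controlled_terms="",
    role="Topic",
    core="Req",
)
```

Variables without a domain prefix, such as USUBJID, pass through `expand_name` unchanged.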

2 Fundamentals of the SDTM

2.1 Observations and Variables

The SDTMIG for Human Clinical Trials is based on the SDTM's general framework for organizing clinical trials information that is to be submitted to regulatory authorities. The SDTM is built around the concept of observations collected about subjects who participated in a clinical study. Each observation can be described by a series of variables, corresponding to a row in a dataset. Each variable can be classified according to its Role. A Role determines the type of information conveyed by the variable about each distinct observation and how it can be used. Variables can be classified into five major roles:

  • Identifier variables, such as those that identify the study, subject, domain, and sequence number of the record
  • Topic variables, which specify the focus of the observation (such as the name of a lab test)
  • Timing variables, which describe the timing of the observation (such as start date and end date)
  • Qualifier variables, which include additional illustrative text or numeric values that describe the results or additional traits of the observation (such as units or descriptive adjectives)
  • Rule variables, which express an algorithm or executable method to define start, end, and branching or looping conditions in the Trial Design model

The set of Qualifier variables can be further categorized into five sub-classes:

  • Grouping Qualifiers are used to group together a collection of observations within the same domain. Examples include --CAT and --SCAT.
  • Result Qualifiers describe the specific results associated with the topic variable in a Findings dataset. They answer the question raised by the topic variable. Result Qualifiers are --ORRES, --STRESC, and --STRESN.
  • Synonym Qualifiers specify an alternative name for a particular variable in an observation. Examples include --MODIFY and --DECOD, which are equivalent terms for a --TRT or --TERM topic variable, and --TEST and --LOINC, which are equivalent terms for a --TESTCD.
  • Record Qualifiers define additional attributes of the observation record as a whole (rather than describing a particular variable within a record). Examples include --REASND, AESLIFE, and all other SAE flag variables in the AE domain; AGE, SEX, and RACE in the DM domain; and --BLFL, --POS, --LOC, --SPEC and --NAM in a Findings domain.
  • Variable Qualifiers are used to further modify or describe a specific variable within an observation and are only meaningful in the context of the variable they qualify. Examples include --ORRESU, --ORNRHI, and --ORNRLO, all of which are Variable Qualifiers of --ORRES; and --DOSU, which is a Variable Qualifier of --DOSE.

For example, in the observation, "Subject 101 had mild nausea starting on Study Day 6," the Topic variable value is the term for the adverse event, "NAUSEA". The Identifier variable is the subject identifier, "101". The Timing variable is the study day of the start of the event, which captures the information, "starting on Study Day 6", while an example of a Record Qualifier is the severity, the value for which is "MILD". Additional Timing and Qualifier variables could be included to provide the necessary detail to adequately describe an observation.
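
The nausea example above can be sketched as a single record whose variables are tagged with their roles. The variable names USUBJID, AETERM, AESTDY, and AESEV are assumptions chosen for this illustration (the paragraph above does not name them), and the helper `by_role` is hypothetical.

```python
# One observation: "Subject 101 had mild nausea starting on Study Day 6".
# Each entry maps a variable name to (role, value).
observation = {
    "USUBJID": ("Identifier", "101"),
    "AETERM": ("Topic", "NAUSEA"),
    "AESTDY": ("Timing", 6),
    "AESEV": ("Record Qualifier", "MILD"),
}


def by_role(obs, role):
    """Return the variables of an observation that carry the given role."""
    return {name: value for name, (r, value) in obs.items() if r == role}


topic = by_role(observation, "Topic")
timing = by_role(observation, "Timing")
```

Additional Timing or Qualifier variables would be added as further entries in the same record.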

2.2 Datasets and Domains

Observations about study subjects are normally collected for all subjects in a series of domains. A domain is defined as a collection of logically related observations with a common topic. The logic of the relationship may pertain to the scientific subject matter of the data or to its role in the trial. Each domain is represented by a single dataset.

Each domain dataset is distinguished by a unique, two-character code that should be used consistently throughout the submission. This code, which is stored in the SDTM variable named DOMAIN, is used in four ways: as the dataset name; as the value of the DOMAIN variable in that dataset; as a prefix for most variable names in that dataset; and as a value in the RDOMAIN variable in relationship tables (see Section 8, Representing Relationships and Data).
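
A minimal consistency check for the domain code might look like the sketch below. It assumes that a dataset is held in memory as a list of dictionaries and that the dataset name is the domain code in lower case; both are assumptions of this example, and the function names are hypothetical.

```python
def check_domain_code(dataset_name, records):
    """Report rows whose DOMAIN value does not match the dataset name."""
    code = dataset_name.upper()
    problems = []
    if len(code) != 2:
        problems.append(f"domain code {code!r} is not two characters")
    for i, rec in enumerate(records, start=1):
        if rec.get("DOMAIN") != code:
            problems.append(
                f"row {i}: DOMAIN is {rec.get('DOMAIN')!r}, expected {code!r}"
            )
    return problems


def unknown_rdomains(relationship_rows, submitted_codes):
    """Return RDOMAIN values that do not name a submitted domain."""
    submitted = set(submitted_codes)
    return sorted(
        {row["RDOMAIN"] for row in relationship_rows} - submitted
    )


ae_rows = [
    {"DOMAIN": "AE", "AESEQ": 1},
    {"DOMAIN": "CM", "AESEQ": 2},  # deliberately inconsistent row
]
issues = check_domain_code("ae", ae_rows)
```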

All datasets are structured as flat files with rows representing observations and columns representing variables. Each dataset is described by metadata definitions that provide information about the variables used in the dataset. The metadata are described in a data definition document, a Define-XML document, that is submitted with the data to regulatory authorities. The Define-XML standard, available at https://www.cdisc.org/standards/transport/define-xml, specifies metadata attributes to describe SDTM data.

Data stored in SDTM datasets include both raw (as originally collected) and derived values (e.g., converted into standard units, or computed on the basis of multiple values, such as an average). The SDTM lists only the name, label, and type, with a set of brief CDISC guidelines that provide a general description for each variable.

The domain dataset models included in Section 5, Models for Special Purpose Domains, and Section 6, Domain Models Based on the General Observation Classes, of this document provide additional information about Controlled Terms or Format, notes on proper usage, and examples. See Section 1.4.1, How to Read a Domain Specification.

2.3 The General Observation Classes

Most subject-level observations collected during the study should be represented according to one of the three SDTM general observation classes: Interventions, Events, or Findings. The lists of variables allowed to be used in each of these can be found in the SDTM.

  • The Interventions class captures investigational, therapeutic, and other treatments that are administered to the subject (with some actual or expected physiological effect) either as specified by the study protocol (e.g., exposure to study drug), coincident with the study assessment period (e.g., concomitant medications), or self-administered by the subject (such as use of alcohol, tobacco, or caffeine).
  • The Events class captures planned protocol milestones such as randomization and study completion, and occurrences, conditions, or incidents independent of planned study evaluations occurring during the trial (e.g., adverse events) or prior to the trial (e.g., medical history).
  • The Findings class captures the observations resulting from planned evaluations to address specific tests or questions such as laboratory tests, ECG testing, and questions listed on questionnaires.

In most cases, the choice of observation class appropriate to a specific collection of data can be easily determined according to the descriptions provided above. The majority of data, which typically consists of measurements or responses to questions, usually at specific visits or time points, will fit the Findings general observation class. Additional guidance on choosing the appropriate general observation class is provided in Section 8.6.1, Guidelines for Determining the General Observation Class.
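
For orientation, the assignment of some commonly used domains to the three classes can be written as a simple lookup. The mapping below covers only domains mentioned in this guide's text and is an illustrative subset, not a complete list; the names `OBSERVATION_CLASS` and `domains_in_class` are hypothetical.

```python
# Illustrative subset: domain code -> general observation class.
OBSERVATION_CLASS = {
    "EX": "Interventions",  # exposure to study drug
    "CM": "Interventions",  # concomitant medications
    "AE": "Events",         # adverse events
    "MH": "Events",         # medical history
    "DS": "Events",         # disposition (protocol milestones)
    "LB": "Findings",       # laboratory tests
    "VS": "Findings",       # vital signs
    "EG": "Findings",       # ECG testing
    "QS": "Findings",       # questionnaires
}


def domains_in_class(observation_class):
    """List the domain codes in the lookup that belong to one class."""
    return sorted(
        code for code, cls in OBSERVATION_CLASS.items()
        if cls == observation_class
    )
```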

General assumptions for use with all domain models and custom domains based on the general observation classes are described in Section 4, Assumptions for Domain Models; specific assumptions for individual domains are included with the domain models.

2.4 Datasets Other Than General Observation Class Domains

The SDTM includes four types of datasets other than those based on the general observation classes:

  • Domain datasets, which include subject-level data that do not conform to one of the three general observation classes. These include Demographics (DM), Comments (CO), Subject Elements (SE), and Subject Visits (SV) [1], and are described in Section 5, Models for Special Purpose Domains.
  • Trial Design Model (TDM) datasets, which represent information about the study design but do not contain subject data. These include datasets such as Trial Arms (TA) and Trial Elements (TE) and are described in Section 7, Trial Design Model Datasets.
  • Relationship datasets, such as the RELREC and SUPP-- datasets. These are described in Section 8, Representing Relationships and Data.
  • Study Reference datasets, which include Device Identifiers (DI), Non-host Organism Identifiers (OI), and Pharmacogenomic/Genetic Biomarker Identifiers (PB). These provide structures for representing study-specific terminology used in subject data. These are described in Section 9, Study References.

[1] SE and SV were included as part of the Trial Design Model in SDTMIG v3.1.1, but were moved in SDTMIG v3.1.2.

2.5 The SDTM Standard Domain Models

A sponsor should only submit domain datasets that were actually collected (or directly derived from the collected data) for a given study. Decisions on what data to collect should be based on the scientific objectives of the study, rather than the SDTM. Note that any data collected that will be submitted in an analysis dataset must also appear in a tabulation dataset.

The collected data for a given study may use standard domains from this and other SDTM Implementation Guides as well as additional custom domains based on the three general observation classes. A list of standard domains is provided in Section 3.2.1, Dataset-Level Metadata. Final domains will be published only in an SDTM Implementation Guide (the SDTMIG for human clinical trials or another implementation guide, such as the SDTMIG for Medical Devices). Therapeutic area standards projects and other projects may develop proposals for additional domains. Draft versions of these domains may be made available in the CDISC wiki in the SDTM Draft Domains (https://wiki.cdisc.org/x/s4Iv) area.

Starting with SDTMIG v3.3:

  • A new domain has version 1.0.
  • An existing domain that has changed since the last published version of the SDTMIG is up-versioned.
  • An existing domain that has not changed since the last published version of the SDTMIG is not up-versioned.

What constitutes a change for the purposes of deciding a domain version will be developed further, but for SDTMIG v3.3, a domain was assigned a version of v3.3 if there was a change to the specification and/or the assumptions from the domain as it appeared in SDTMIG v3.2.

These general rules apply when determining which variables to include in a domain:

  • The Identifier variables, STUDYID, USUBJID, DOMAIN, and --SEQ are required in all domains based on the general observation classes. Other Identifiers may be added as needed.
  • Any Timing variables are permissible for use in any submission dataset based on a general observation class except where restricted by specific domain assumptions.
  • Any additional Qualifier variables from the same general observation class may be added to a domain model except where restricted by specific domain assumptions.
  • Sponsors may not add any variables other than those described in the preceding three bullets. The addition of non-standard variables will compromise the FDA's ability to populate the data repository and to use standard tools. The SDTM allows for the inclusion of a sponsor's non-SDTM variables using the Supplemental Qualifiers special purpose dataset structure, described in Section 8.4, Relating Non-Standard Variables Values to a Parent Domain. As the SDTM continues to evolve over time, certain additional standard variables may be added to the general observation classes.
  • Standard variables must not be renamed or modified for novel usage. Their metadata should not be changed.
  • A Permissible variable should be used in an SDTM dataset wherever appropriate.  
    • If a study includes a data item that would be represented in a Permissible variable, then that variable must be included in the SDTM dataset, even if null. Indicate no data were available for that variable in the Define-XML document.
    • If a study did not include a data item that would be represented in a Permissible variable, then that variable should not be included in the SDTM dataset and should not be declared in the Define-XML document.
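
Two of the rules above lend themselves to a simple check: the required Identifier variables, and the inclusion of Permissible variables only when the study collected the corresponding data item. The sketch below is illustrative; the function names and the way collected items are represented (a set of variable names) are assumptions of this example.

```python
# Identifier variables required in all general-observation-class domains.
REQUIRED_IDENTIFIERS = ("STUDYID", "USUBJID", "DOMAIN", "--SEQ")


def missing_identifiers(domain, columns):
    """Return required Identifier variables absent from a dataset's columns."""
    needed = [name.replace("--", domain) for name in REQUIRED_IDENTIFIERS]
    return [name for name in needed if name not in columns]


def permissible_to_include(permissible, collected):
    """Keep a Permissible variable only if the study collected that data item.

    A kept variable stays in the dataset even when every value is null.
    """
    return [name for name in permissible if name in collected]


missing = missing_identifiers("AE", ["STUDYID", "USUBJID", "DOMAIN", "AETERM"])
kept = permissible_to_include(["AESEV", "AETOXGR"], {"AESEV"})
```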

2.6 Creating a New Domain

This section describes the overall process for creating a custom domain, which must be based on one of the three SDTM general observation classes. The number of domains submitted should be based on the specific requirements of the study. Follow the process below to create a custom domain:

  1. Confirm that none of the existing published domains will fit the need. A custom domain may only be created if the data are different in nature and do not fit into an existing published domain.
    • Establish a domain of a common topic (i.e., where the nature of the data is the same), rather than by a specific method of collection (e.g., electrocardiogram, EG). Group and separate data within the domain using --CAT, --SCAT, --METHOD, --SPEC, --LOC, etc. as appropriate. Examples of different topics are: microbiology, tumor measurements, pathology/histology, vital signs, and physical exam results.
    • Do not create separate domains based on time; rather, represent both prior and current observations in a domain (e.g., CM for all non-study medications). Note that AE and MH are an exception to this best practice because of regulatory reporting needs.
    • How collected data are used (e.g., to support analyses and/or efficacy endpoints) must not result in the creation of a custom domain. For example, if blood pressure measurements are endpoints in a hypertension study, they must still be represented in the VS (Vital Signs) domain, as opposed to a custom "efficacy" domain. Similarly, if liver function test results are of special interest, they must still be represented in the LB (Laboratory Tests) domain.
    • Data that were collected on separate CRF modules or pages may fit into an existing domain (such as separate questionnaires into the QS domain, or prior and concomitant medications in the CM domain).
    • If it is necessary to represent relationships between data that are hierarchical in nature (e.g., a parent record must be observed before child records), then establish a domain pair (e.g., MB/MS, PC/PP). Note that domain pairs have been modeled for microbiology data (MB/MS domains) and PK data (PC/PP domains) to enable dataset-level relationships to be described using RELREC. The domain pair uses DOMAIN as an Identifier to distinguish parent records (e.g., MB) from child records (e.g., MS) and enables a dataset-level relationship to be described in RELREC. Without using DOMAIN to facilitate description of the data relationships, RELREC, as currently defined, could not be used without introducing a variable that would group data like DOMAIN.
  2. Check the SDTM Draft Domains area of the CDISC wiki (https://wiki.cdisc.org/x/s4Iv) for proposed domains developed since the last published version of the SDTMIG. These proposed domains may be used as custom domains in a submission.
  3. Look for an existing, relevant domain model to serve as a prototype. If no existing model seems appropriate, choose the general observation class (Interventions, Events, or Findings) that best fits the data by considering the topic of the observation. The general approach for selecting variables for a custom domain is as follows (also see Figure 2.6, Creating a New Domain, below).
    1. Select and include the required identifier variables (e.g., STUDYID, DOMAIN, USUBJID, --SEQ) and any permissible Identifier variables from the SDTM.
    2. Include the topic variable from the identified general observation class (e.g., --TESTCD for Findings) in the SDTM.
    3. Select and include the relevant qualifier variables from the identified general observation class in the SDTM. Variables belonging to other general observation classes must not be added.
    4. Select and include the applicable timing variables in the SDTM.
    5. Determine the domain code, one that is not a domain code in the CDISC Controlled Terminology codelist "SDTM Domain Abbreviations" available at http://www.cancer.gov/research/resources/terminology/cdisc. If it is desired to have this domain code as part of CDISC controlled terminology, then submit a request to https://ncitermform.nci.nih.gov/ncitermform/?version=cdisc. The sponsor-selected, two-character domain code should be used consistently throughout the submission.
    6. Apply the two-character domain code to the appropriate variables in the domain. Replace all variable prefixes (shown in the models as two hyphens "--") with the domain code.
    7. Set the order of variables consistent with the order defined in the SDTM for the general observation class.
    8. Adjust the labels of the variables only as appropriate to properly convey the meaning in the context of the data being submitted in the newly created domain. Use title case for all labels (title case means to capitalize the first letter of every word except for articles, prepositions, and conjunctions).
    9. Ensure that appropriate standard variables are being properly applied by comparing their use in the custom domain to their use in standard domains.

    10. Describe the dataset within the Define-XML document. See Section 3.2, Using the CDISC Domain Models in Regulatory Submissions — Dataset Metadata.

    11. Place any non-standard (SDTM) variables in a Supplemental Qualifier dataset. Mechanisms for representing additional non-standard qualifier variables not described in the general observation classes and for defining relationships between separate datasets or records are described in Section 8.4, Relating Non-Standard Variables Values to a Parent Domain.
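
The variable-naming, ordering, and labeling steps of the process above can be sketched as follows. The domain code "XY", the abbreviated `FINDINGS_ORDER` list, and the short `SMALL_WORDS` set are assumptions made for this illustration; the authoritative variable order comes from the SDTM, and the small-word list is not exhaustive. The sketch also assumes that the first word of a label is always capitalized.

```python
# Abbreviated, illustrative ordering for a Findings-class domain.
FINDINGS_ORDER = [
    "STUDYID", "DOMAIN", "USUBJID", "--SEQ",
    "--TESTCD", "--TEST", "--ORRES", "--ORRESU", "--DTC",
]

# Non-exhaustive set of articles, prepositions, and conjunctions.
SMALL_WORDS = {"a", "an", "the", "and", "or", "but", "nor", "for",
               "of", "in", "on", "at", "to", "by", "with", "from"}


def build_custom_domain(domain_code, selected, class_order):
    """Order the selected variables as in the class and apply the domain code."""
    if len(domain_code) != 2:
        raise ValueError("domain code must be two characters")
    position = {name: i for i, name in enumerate(class_order)}
    unknown = [name for name in selected if name not in position]
    if unknown:
        # Variables from other general observation classes must not be added.
        raise ValueError(f"not in the chosen class: {unknown}")
    ordered = sorted(selected, key=lambda name: position[name])
    return [
        domain_code + name[2:] if name.startswith("--") else name
        for name in ordered
    ]


def title_case(label):
    """Capitalize each word except small words that are not first."""
    out = []
    for i, word in enumerate(label.split()):
        if i > 0 and word.lower() in SMALL_WORDS:
            out.append(word.lower())
        else:
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)


columns = build_custom_domain(
    "XY",
    ["--ORRES", "USUBJID", "--TESTCD", "STUDYID", "DOMAIN", "--SEQ"],
    FINDINGS_ORDER,
)
```

Non-standard variables are deliberately rejected here rather than renamed, since they belong in a Supplemental Qualifier dataset.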

Figure 2.6: Creating a New Domain

2.7 SDTM Variables Not Allowed in SDTMIG

This section identifies those SDTM variables that either 1) should not be used in SDTM-compliant data tabulations of clinical trials data or 2) have not yet been evaluated for use in human clinical trials.

The following SDTM variables, defined for use in non-clinical studies (SEND), must NEVER be used in the submission of SDTM-based data for human clinical trials:

  • --USCHFL (Interventions, Events, Findings)
  • --DTHREL (Findings)
  • --EXCLFL (Findings)
  • --REASEX (Findings)
  • --IMPLBL (Findings)
  • FETUSID (Identifiers)
  • --DETECT (Timing Variables)
  • --NOMDY (Timing Variables)
  • --NOMLBL (Timing Variables)

The following variables can be used for non-clinical studies (SEND) but must NEVER be used in the Demographics domain for human clinical trials, where all subjects are human. See Section 9.2, Non-host Organism Identifiers (OI), for information about representing taxonomic information for non-host organisms such as bacteria and viruses.

  • SPECIES (Demographics)
  • STRAIN (Demographics)
  • SBSTRAIN (Demographics)

The following variables have not been evaluated for use in human clinical trials and must therefore be used with extreme caution:

  • --METHOD (Interventions)
  • --ANTREG (Findings)
  • --CHRON (Findings)
  • --DISTR (Findings)
  • SETCD (Demographics)

    The use of SETCD additionally requires the use of the Trial Sets domain.

The following identifier variable can be used for non-clinical studies (SEND), and may be used in human clinical trials when appropriate:

  • POOLID

    The use of POOLID additionally requires the use of the Pool Definition dataset.

Other variables defined in the SDTM are allowed for use as defined in this SDTMIG except when explicitly stated. Custom domains, created following the guidance in Section 2.6, Creating a New Domain, may utilize any appropriate Qualifier variables from the selected general observation class.
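
The lists in this section can be turned into a simple screening check. The sketch below flags only the variables that must never be used; it does not flag the variables that call for caution, nor POOLID. The function name and the representation of a dataset as a list of column names are assumptions of this example.

```python
# SDTM variables defined for non-clinical studies (SEND) that must never be used.
NEVER_USE = [
    "--USCHFL", "--DTHREL", "--EXCLFL", "--REASEX", "--IMPLBL",
    "FETUSID", "--DETECT", "--NOMDY", "--NOMLBL",
]

# Variables that must never appear in Demographics for human clinical trials.
NEVER_IN_DM = ["SPECIES", "STRAIN", "SBSTRAIN"]


def disallowed(domain, columns):
    """Return the columns of a dataset that this section prohibits."""
    banned = {name.replace("--", domain) for name in NEVER_USE}
    if domain == "DM":
        banned |= set(NEVER_IN_DM)
    return sorted(col for col in columns if col in banned)


flagged = disallowed("LB", ["LBTESTCD", "LBEXCLFL", "FETUSID"])
```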

3 Submitting Data in Standard Format

3.1 Standard Metadata for Dataset Contents and Attributes

The SDTMIG provides standard descriptions of some of the most commonly used data domains, with metadata attributes. These include descriptive metadata attributes that should be included in a Define-XML document. In addition, the CDISC domain models include two shaded columns that are not sent to the FDA. These columns assist sponsors in preparing their datasets:

  • "CDISC Notes" is for notes to the sponsor regarding the relevant use of each variable.
  • "Core" indicates how a variable is classified (see Section 4.1.5, SDTM Core Designations).

The domain models in Section 6, Domain Models Based on the General Observation Classes illustrate how to apply the SDTM when creating a specific domain dataset. In particular, these models illustrate the selection of a subset of the variables offered in one of the general observation classes, along with applicable timing variables. The models also show how a standard variable from a general observation class should be adjusted to meet the specific content needs of a particular domain, including making the label more meaningful, specifying controlled terminology, and creating domain-specific notes and examples. Thus the domain models not only demonstrate how to apply the model for the most common domains, but also give insight on how to apply general model concepts to other domains not yet defined by CDISC.

3.2 Using the CDISC Domain Models in Regulatory Submissions — Dataset Metadata

The Define-XML document that accompanies a submission should also describe each dataset that is included in the submission and describe the natural key structure of each dataset. While most studies will include DM and a set of safety domains based on the three general observation classes (typically including EX, CM, AE, DS, MH, LB, and VS), the actual choice of which data to submit will depend on the protocol and the needs of the regulatory reviewer. Dataset definition metadata should include the dataset filenames, descriptions, locations, structures, class, purpose, and keys, as shown in Section 3.2.1, Dataset-Level Metadata. In addition, comments can also be provided where needed.

In the event that no records are present in a dataset (e.g., a small PK study where no subjects took concomitant medications), the empty dataset should not be submitted and should not be described in the Define-XML document. The annotated CRF will show the data that would have been submitted had data been received; it need not be re-annotated to indicate that no records exist.

3.2.1 Dataset-Level Metadata

Note that the key variables shown in this table are examples only. A sponsor's actual key structure may be different.

Separate Supplemental Qualifier datasets of the form supp--.xpt are required. See Section 8.4, Relating Non-Standard Variables Values to a Parent Domain.

3.2.1.1 Primary Keys

The table in Section 3.2.1, Dataset-Level Metadata, shows examples of what a sponsor might submit as variables that comprise the primary key for SDTM datasets. Since the purpose of this column is to aid reviewers in understanding the structure of a dataset, sponsors should list all of the natural keys (see definition below) for the dataset. These keys should define uniqueness for records within a dataset, and may define a record sort order. The identified keys for each dataset should be consistent with the description of the dataset structure as described in the Define-XML document. For all the general-observation-class domains (and for some special purpose domains), the --SEQ variable was created so that a unique record could be identified consistently across all of these domains via its use, along with STUDYID, USUBJID, and DOMAIN. In most domains, --SEQ will be a surrogate key (see definition below) for a set of variables that comprise the natural key. In certain instances, a Supplemental Qualifier (SUPP--) variable might also contribute to the natural key of a record for a particular domain. See Section 4.1.9, Assigning Natural Keys in the Metadata, for how this should be represented, and for additional information on keys.

A natural key is a set of data (one or more columns of an entity) that uniquely identifies that entity and distinguishes it from any other row in the table. The advantage of natural keys is that they exist already; one does not need to introduce a new, "unnatural" value to the data schema. One of the difficulties in choosing a natural key is that just about any natural key one can think of has the potential to change. Because they have business meaning, natural keys are effectively coupled to the business, and they may need to be reworked when business requirements change. An example of such a change in clinical trials data would be the addition of a position or location that becomes a key in a new study, but that wasn't collected in previous studies.

A surrogate key is a single-part, artificially established identifier for a record. Surrogate key assignment is a special case of derived data, one where a portion of the primary key is derived. A surrogate key is immune to changes in business needs. In addition, the key depends on only one field, so it's compact. A common way of deriving surrogate key values is to assign integer values sequentially. The --SEQ variable in the SDTM datasets is an example of a surrogate key for most datasets; in some instances, however, --SEQ might be a part of a natural key as a replacement for what might have been a key (e.g., a repeat sequence number) in the sponsor's database.
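The distinction between the two kinds of key can be sketched in a few lines of Python. This is an illustration only, not part of the standard: the function names, the in-memory record layout (a list of dictionaries), and the sample vital signs rows are all assumptions made for the example. It shows a natural key that stops being unique once position is collected, and a --SEQ surrogate assigned as sequential integers within USUBJID.

```python
from collections import Counter, defaultdict

def duplicate_keys(records, key_vars):
    """Return natural-key tuples that occur on more than one record."""
    counts = Counter(tuple(rec.get(v) for v in key_vars) for rec in records)
    return [key for key, n in counts.items() if n > 1]

def assign_seq(records, seq_var):
    """Assign 1, 2, 3, ... within each USUBJID, in input order (surrogate key)."""
    next_seq = defaultdict(int)
    for rec in records:
        next_seq[rec["USUBJID"]] += 1
        rec[seq_var] = next_seq[rec["USUBJID"]]
    return records

# Hypothetical records: position (VSPOS) was collected in this study.
vs = [
    {"STUDYID": "ABC", "USUBJID": "ABC-001", "VISITNUM": 1, "VSTESTCD": "SYSBP", "VSPOS": "SITTING"},
    {"STUDYID": "ABC", "USUBJID": "ABC-001", "VISITNUM": 1, "VSTESTCD": "SYSBP", "VSPOS": "STANDING"},
    {"STUDYID": "ABC", "USUBJID": "ABC-002", "VISITNUM": 1, "VSTESTCD": "SYSBP", "VSPOS": "SITTING"},
]

narrow_key = ["STUDYID", "USUBJID", "VISITNUM", "VSTESTCD"]
wide_key = narrow_key + ["VSPOS"]

print(duplicate_keys(vs, narrow_key))  # the narrow key no longer identifies a row
print(duplicate_keys(vs, wide_key))    # adding position restores uniqueness
assign_seq(vs, "VSSEQ")
print([rec["VSSEQ"] for rec in vs])
```

The surrogate value stays valid whichever natural key is chosen, which is why it is unaffected by the kind of change in business needs described above.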

3.2.1.2 CDISC Submission Value-Level Metadata

In general, the SDTMIG v3.x Findings data models are closely related to normalized, relational data models in a vertical structure of one record per observation. Since the v3.x data structures are fixed, sometimes information that might have appeared as columns in a more horizontal (denormalized) structure in presentations and reports will instead be represented as rows in an SDTM Findings structure. Because many different types of observations are all presented in the same structure, there is a need to provide additional metadata to describe the expected properties that differentiate, for example, hematology lab results from serum chemistry lab results in terms of data type, standard units, and other attributes.

For example, the Vital Signs data domain could contain subject records related to diastolic and systolic blood pressure, height, weight, and body mass index (BMI). These data are all submitted in the normalized SDTM Findings structure of one row per vital signs measurement. This means that there could be five records per subject (one for each test or measurement) for a single visit or time point, with the parameter names stored in the Test Code/Name variables, and the parameter values stored in result variables. Since the unique Test Code/Names could have different attributes (i.e., different origins, roles, and definitions) there would be a need to provide value-level metadata for this information.

The value-level metadata should be provided as a separate section of the Define-XML document. For details on the CDISC Define-XML standard, see https://www.cdisc.org/standards/transport/define-xml.
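As a rough illustration of why value-level metadata is needed, the following Python sketch holds five vital signs records in the vertical Findings structure and looks up attributes by test code. The record values, the attribute names in `value_level`, and the helper `to_horizontal` are assumptions invented for this example; the authoritative representation is the Define-XML document itself.

```python
# Hypothetical vital signs records: one row per measurement (vertical structure).
vs = [
    {"USUBJID": "ABC-001", "VISITNUM": 1, "VSTESTCD": "DIABP", "VSORRES": "80", "VSORRESU": "mmHg"},
    {"USUBJID": "ABC-001", "VISITNUM": 1, "VSTESTCD": "SYSBP", "VSORRES": "120", "VSORRESU": "mmHg"},
    {"USUBJID": "ABC-001", "VISITNUM": 1, "VSTESTCD": "HEIGHT", "VSORRES": "175", "VSORRESU": "cm"},
    {"USUBJID": "ABC-001", "VISITNUM": 1, "VSTESTCD": "WEIGHT", "VSORRES": "70", "VSORRESU": "kg"},
    {"USUBJID": "ABC-001", "VISITNUM": 1, "VSTESTCD": "BMI", "VSORRES": "22.9", "VSORRESU": "kg/m2"},
]

# Illustrative value-level attributes keyed by test code (assumed names and values).
value_level = {
    "DIABP": {"datatype": "integer", "origin": "CRF"},
    "SYSBP": {"datatype": "integer", "origin": "CRF"},
    "HEIGHT": {"datatype": "integer", "origin": "CRF"},
    "WEIGHT": {"datatype": "integer", "origin": "CRF"},
    "BMI": {"datatype": "float", "origin": "Derived"},
}

def to_horizontal(rows):
    """Pivot vertical records into one entry per subject and visit."""
    wide = {}
    for row in rows:
        key = (row["USUBJID"], row["VISITNUM"])
        wide.setdefault(key, {})[row["VSTESTCD"]] = row["VSORRES"]
    return wide

wide = to_horizontal(vs)
print(wide)
for row in vs:
    print(row["VSTESTCD"], value_level[row["VSTESTCD"]])
```

In the horizontal view each parameter would be a column with its own variable-level attributes; in the vertical view the single VSORRES column carries all five, so the attributes have to be stated per test code.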

3.2.2 Conformance

Conformance with the SDTMIG Domain Models is minimally indicated by:

  • Following the complete metadata structure for data domains
  • Following SDTMIG domain models wherever applicable
  • Using SDTM-specified standard domain names and prefixes where applicable
  • Using SDTM-specified standard variable names
  • Using SDTM-specified data types for all variables
  • Following SDTM-specified controlled terminology and format guidelines for variables, when provided
  • Including all collected and relevant derived data in one of the standard domains, special purpose datasets, or general-observation-class structures
  • Including all Required and Expected variables as columns in standard domains, and ensuring that all Required variables are populated
  • Ensuring that each record in a dataset includes the appropriate Identifier and Timing variables, as well as a Topic variable
  • Conforming to all business rules described in the CDISC Notes column and general and domain-specific assumptions

4 Assumptions for Domain Models

4.1 General Domain Assumptions

4.1.1 Review Study Data Tabulation and Implementation Guide

Review the Study Data Tabulation Model as well as this complete Implementation Guide before attempting to use any of the individual domain models.

4.1.2 Relationship to Analysis Datasets

Specific guidance on preparing analysis datasets can be found in the CDISC Analysis Data Model (ADaM) Implementation Guide and other ADaM documents, available at http://www.cdisc.org/adam.

4.1.3 Additional Timing Variables

Additional Timing variables can be added as needed to a standard domain model based on the three general observation classes, except for the cases specified in Assumption 4.4.8, Date and Time Reported in a Domain Based on Findings. Timing variables can be added to special purpose domains only where specified in the SDTMIG domain model assumptions. Timing variables cannot be added to SUPPQUAL datasets or to RELREC (described in Section 8, Representing Relationships and Data).

4.1.3.1 EPOCH Variable Guidance

When EPOCH is included in a Findings class domain, it should be based on the --DTC variable, since this is the date/time of the test or, for tests performed on specimens, the date/time of specimen collection. For observations in Interventions or Events class domains, EPOCH should be based on the --STDTC variable, since this is the start of the Intervention or Event. A possible, though unlikely, exception would be a finding based on an interval specimen collection that started in one epoch but ended in another. --ENDTC might be a more appropriate basis for EPOCH in such a case.

Sponsors should not impute EPOCH values, but should, where possible, assign EPOCH values on the basis of CRF instructions and structure, even if EPOCH was not directly collected and date/time data was not collected with sufficient precision to permit assignment of an observation to an EPOCH on the basis of date/time data alone. If it is not possible to determine the EPOCH of an observation, then EPOCH should be null. Methods for assigning EPOCH values can be described in the Define-XML document.

Since EPOCH is a study-design construct, it is not applicable to Interventions or Events that started before the subject's participation in the study, nor to Findings performed before their participation in the study. For such records, EPOCH should be null. Note that a subject's participation in a study includes screening, which generally occurs before the reference start date, RFSTDTC, in the DM domain.
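The date-based part of this guidance can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed algorithm: the function names, the `subject_epochs` structure (a list of epoch name, start date, and exclusive end date per subject), and the treatment of boundary dates are all hypothetical. It covers only assignment from date/time data; assignment from CRF instructions and structure is outside its scope.

```python
# Which variable EPOCH is based on, by general observation class.
BASIS_SUFFIX = {"FINDINGS": "DTC", "INTERVENTIONS": "STDTC", "EVENTS": "STDTC"}

def epoch_basis_variable(domain, obs_class):
    """Return, for example, LBDTC for Findings or AESTDTC for Events."""
    return domain + BASIS_SUFFIX[obs_class]

def assign_epoch(record, domain, obs_class, subject_epochs):
    """Return the EPOCH containing the basis date, or None (null).

    subject_epochs is an assumed structure: (EPOCH, start, end) tuples with
    ISO 8601 dates, where end is treated as exclusive.
    """
    value = record.get(epoch_basis_variable(domain, obs_class))
    if not value or len(value) < 10:
        return None  # missing or partial date: not resolvable from dates alone
    day = value[:10]  # ISO 8601 dates compare correctly as strings
    for epoch, start, end in subject_epochs:
        if start <= day < end:
            return epoch
    return None  # outside the subject's participation, so EPOCH stays null

epochs = [
    ("SCREENING", "2003-04-15", "2003-04-29"),
    ("TREATMENT", "2003-04-29", "2003-10-14"),
]
print(assign_epoch({"LBDTC": "2003-04-20"}, "LB", "FINDINGS", epochs))
print(assign_epoch({"AESTDTC": "2003-01-10"}, "AE", "EVENTS", epochs))
print(assign_epoch({"LBDTC": "2003-05"}, "LB", "FINDINGS", epochs))
```

Screening is included in the example epochs because participation begins before the reference start date; an event that started in January, before screening, is left null.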

4.1.4 Order of the Variables

The order of variables in the Define-XML document must reflect the order of variables in the dataset. The order of variables in the CDISC domain models has been chosen to facilitate the review of the models and application of the models. Variables for the three general observation classes must be ordered with Identifiers first, followed by the Topic, Qualifier, and Timing variables. Within each role, variables must be ordered as shown in SDTM Tables 2.2.1, 2.2.2, 2.2.3, 2.2.3.1, 2.2.4, and 2.2.5.
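The ordering rule can be expressed as a two-level sort: first by role, then by position within the role. The sketch below is illustrative only. The simplified role names and the `model_position` values are assumptions; in practice the within-role positions would be taken from the SDTM tables listed above.

```python
ROLE_ORDER = {"Identifier": 0, "Topic": 1, "Qualifier": 2, "Timing": 3}

def order_variables(variables, model_position):
    """Sort (name, role) pairs by role, then by position within the role."""
    ranked = sorted(variables, key=lambda v: (ROLE_ORDER[v[1]], model_position[v[0]]))
    return [name for name, _role in ranked]

# Hypothetical subset of a vital signs dataset, deliberately out of order.
variables = [
    ("VSDTC", "Timing"), ("VSORRES", "Qualifier"), ("USUBJID", "Identifier"),
    ("VSTESTCD", "Topic"), ("STUDYID", "Identifier"), ("VSSEQ", "Identifier"),
    ("DOMAIN", "Identifier"), ("VISITNUM", "Timing"),
]
# Assumed positions within each role, standing in for the SDTM tables.
model_position = {
    "STUDYID": 1, "DOMAIN": 2, "USUBJID": 3, "VSSEQ": 4,
    "VSTESTCD": 1, "VSORRES": 1, "VISITNUM": 1, "VSDTC": 2,
}

dataset_order = order_variables(variables, model_position)
print(dataset_order)
```

The same list can then be compared with the variable order in the Define-XML document, since the two must match.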

4.1.5 SDTM Core Designations

Three categories are specified in the "Core" column in the domain models:

  • A Required variable is any variable that is basic to the identification of a data record (i.e., essential key variables and a topic variable) or is necessary to make the record meaningful. Required variables must always be included in the dataset and cannot be null for any record.
  • An Expected variable is any variable necessary to make a record useful in the context of a specific domain. Expected variables may contain some null values, but in most cases will not contain null values for every record. When the study does not include the data item for an expected variable, however, a null column must still be included in the dataset, and a comment must be included in the Define-XML document to state that the study does not include the data item.
  • A Permissible variable should be used in an SDTM dataset wherever appropriate. Although domain specification tables list only some of the Identifier, Timing, and general observation class variables listed in the SDTM, all are permissible unless specifically restricted in this implementation guide (see Section 2.7, SDTM Variables Not Allowed in SDTMIG) or by specific domain assumptions.
    • Domain assumptions that say a Permissible variable is "generally not used" do not prohibit use of the variable.
    • If a study includes a data item that would be represented in a Permissible variable, then that variable must be included in the SDTM dataset, even if null. Indicate no data were available for that variable in the Define-XML document.
    • If a study did not include a data item that would be represented in a Permissible variable, then that variable should not be included in the SDTM dataset and should not be declared in the Define-XML document.
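The rules for Required and Expected variables lend themselves to a simple automated check. The following Python sketch is one possible approach, not a conformance tool: the function name, the `core` mapping, and the sample Adverse Events rows are assumptions for illustration, and actual Core designations should be taken from the relevant domain model.

```python
def check_core(columns, rows, core):
    """Report Required/Expected variables that are absent, and null Required values.

    core maps variable name to "Req", "Exp", or "Perm".
    """
    issues = []
    for var, designation in core.items():
        if designation in ("Req", "Exp") and var not in columns:
            issues.append(f"{var}: {designation} variable missing from dataset")
        elif designation == "Req" and any(row.get(var) in (None, "") for row in rows):
            issues.append(f"{var}: Req variable has null values")
    return issues

# Illustrative designations for a few AE variables (verify against the domain model).
core = {
    "STUDYID": "Req", "DOMAIN": "Req", "USUBJID": "Req", "AESEQ": "Req",
    "AETERM": "Req", "AEDECOD": "Req", "AESTDTC": "Exp", "AESEV": "Perm",
}
columns = ["STUDYID", "DOMAIN", "USUBJID", "AESEQ", "AETERM", "AEDECOD"]
rows = [
    {"STUDYID": "ABC", "DOMAIN": "AE", "USUBJID": "ABC-001", "AESEQ": 1,
     "AETERM": "HEADACHE", "AEDECOD": "Headache"},
    {"STUDYID": "ABC", "DOMAIN": "AE", "USUBJID": "ABC-001", "AESEQ": 2,
     "AETERM": "", "AEDECOD": "Nausea"},
]

issues = check_core(columns, rows, core)
for issue in issues:
    print(issue)
```

A Permissible variable that is absent is not reported, because whether it belongs in the dataset depends on whether the study included the data item, which cannot be determined from the dataset alone.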

4.1.6 Additional Guidance on Dataset Naming

SDTM datasets are normally named to be consistent with the domain code; for example, the Demographics dataset (DM) is named dm.xpt. (See the SDTM Domain Abbreviation codelist, C66734, in CDISC Controlled Terminology (https://www.cancer.gov/research/resources/terminology/cdisc) for standard domain codes.) Exceptions to this rule are described in Section 4.1.7, Splitting Domains, for general-observation-class datasets and in Section 8, Representing Relationships and Data, for the RELREC and SUPP-- datasets.

In some cases, sponsors may need to define new custom domains and may be concerned that CDISC domain codes defined in the future will conflict with those they choose to use. To eliminate any risk of a sponsor using a name that CDISC later determines to have a different meaning, domain codes beginning with the letters X, Y, or Z have been reserved for the creation of custom domains. Any letter or number may be used in the second position. Note the use of codes beginning with X, Y, or Z is optional, and not required for custom domains.

4.1.7 Splitting Domains

Sponsors may choose to split a domain of topically related information into physically separate datasets.

  • A domain based on a general observation class may be split according to values in --CAT. When a domain is split on --CAT, --CAT must not be null.
  • The Findings About (FA) domain (Section 6.4.4, Findings About) may alternatively be split based on the domain of the value in --OBJ. For example, FACM would store Findings About CM records. See Section 6.4.2, Naming Findings About Domains, for more details.

The following rules must be adhered to when splitting a domain into separate datasets to ensure they can be appended back into one domain dataset:

  1. The value of DOMAIN must be consistent across the separate datasets as it would have been if they had not been split (e.g., QS, FA).
  2. All variables that require a domain prefix (e.g., --TESTCD, --LOC) must use the value of DOMAIN as the prefix value (e.g., QS, FA).
  3. --SEQ must be unique within USUBJID for all records across all the split datasets. If there are 1000 records for a USUBJID across the separate datasets, all 1000 records need unique values for --SEQ.
  4. When relationship datasets (e.g., SUPPxx, FAxx, CO, RELREC) relate back to split parent domains, IDVAR would generally be --SEQ. When IDVAR is a value other than --SEQ (e.g., --GRPID, --REFID, --SPID), care should be used to ensure that the parent records across the split datasets have unique values for the variable specified in IDVAR, so that related children records do not accidentally join back to incorrect parent records.
  5. Permissible variables included in one split dataset need not be included in all split datasets.
  6. For domains with two-letter domain codes (i.e., other than SUPP and RELREC), split dataset names can be up to four characters in length. For example, if splitting by --CAT, then dataset names would be the domain name plus up to two additional characters (e.g., QS36 for SF-36). If splitting Findings About by parent domain, then the dataset name would be the domain code, "FA", plus the two-character domain code for parent domain code (e.g., "FACM"). The four-character dataset-name limitation allows the use of a Supplemental Qualifier dataset associated with the split dataset.
  7. Supplemental Qualifier datasets for split domains would also be split. The nomenclature would include the additional one-to-two characters used to identify the split dataset (e.g., SUPPQS36, SUPPFACM). The value of RDOMAIN in the SUPP-- datasets would be the two-character domain code (e.g., QS, FA).
  8. In RELREC, if a dataset-level relationship is defined for a split Findings About domain, then RDOMAIN may contain the four-character dataset name, rather than the domain name "FA", as shown in the following example:

    relrec.xpt

    Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | RELTYPE | RELID
    1 | ABC | CM |  | CMSPID |  | ONE | 1
    2 | ABC | FACM |  | FASPID |  | MANY | 1
  9. See the SDTM Implementation Guide for Associated Persons for the naming of split associated persons datasets.
  10. See the SDTM Define-XML specification for details regarding metadata representation when a domain is split into different datasets. Additional examples can be referenced in the Metadata Submission Guidelines (MSG) for SDTMIG.

Note that submission of split SDTM domains may be subject to additional dataset splitting conventions as defined by regulators via technical specifications and/or as negotiated with regulatory reviewers.
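Several of the rules above (1, 3, 6, and 7) can be checked mechanically. The Python sketch below is a hedged illustration: the function names and the in-memory layout (a dictionary of dataset name to record list) are assumptions, and the remaining rules are not covered.

```python
def check_split(domain, datasets):
    """Check rules 1, 3, and 6 for a split domain.

    datasets maps each split dataset name (e.g., "QSCG") to its records.
    """
    issues = []
    seq_var = domain + "SEQ"
    seen = set()  # (USUBJID, --SEQ) pairs across all split datasets
    for name, records in datasets.items():
        if len(name) > 4 or not name.startswith(domain):
            issues.append(f"{name}: dataset name must start with {domain} and be at most 4 characters")
        for rec in records:
            if rec["DOMAIN"] != domain:
                issues.append(f"{name}: DOMAIN is {rec['DOMAIN']}, expected {domain}")
            key = (rec["USUBJID"], rec[seq_var])
            if key in seen:
                issues.append(f"{name}: duplicate {seq_var} {rec[seq_var]} for {rec['USUBJID']}")
            seen.add(key)
    return issues

def supp_dataset_name(split_name):
    """Rule 7: SUPP plus the split dataset name (e.g., SUPPQSCG)."""
    return "SUPP" + split_name

good = {
    "QSCG": [{"DOMAIN": "QS", "USUBJID": "CDISC01.100008", "QSSEQ": 1},
             {"DOMAIN": "QS", "USUBJID": "CDISC01.100008", "QSSEQ": 2}],
    "QSCS": [{"DOMAIN": "QS", "USUBJID": "CDISC01.100008", "QSSEQ": 3}],
    "QSMM": [{"DOMAIN": "QS", "USUBJID": "CDISC01.100008", "QSSEQ": 81}],
}
bad = {
    "QSCG": [{"DOMAIN": "QS", "USUBJID": "CDISC01.100008", "QSSEQ": 1}],
    "QSMMSE": [{"DOMAIN": "QS", "USUBJID": "CDISC01.100008", "QSSEQ": 1}],
}
print(check_split("QS", good))
print(check_split("QS", bad))
print(supp_dataset_name("QSCG"))
```

Tracking --SEQ in a single set across all split datasets is what makes the check equivalent to testing the appended domain.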

4.1.7.1 Example of Splitting Questionnaires

This example shows the QS domain data split into three datasets: Clinical Global Impression (QSCG), Cornell Scale for Depression in Dementia (QSCS) and Mini Mental State Examination (QSMM). Each dataset represents a subset of the QS domain data and has only one value of QSCAT.

QS Domains

Dataset for Clinical Global Impressions

qscg.xpt

Row | STUDYID | DOMAIN | USUBJID | QSSEQ | QSSPID | QSTESTCD | QSTEST | QSCAT | QSORRES | QSSTRESC | QSSTRESN | QSBLFL | VISITNUM | VISIT | VISITDY | QSDTC | QSDY
1 | CDISC01 | QS | CDISC01.100008 | 1 | CGI-CGI-I | CGIGLOB | Global Improvement | Clinical Global Impressions | No change | 4 | 4 |  | 3 | WEEK 2 | 15 | 2003-05-13 | 15
2 | CDISC01 | QS | CDISC01.100008 | 2 | CGI-CGI-I | CGIGLOB | Global Improvement | Clinical Global Impressions | Much Improved | 2 | 2 |  | 10 | WEEK 24 | 169 | 2003-10-13 | 168
3 | CDISC01 | QS | CDISC01.100014 | 1 | CGI-CGI-I | CGIGLOB | Global Improvement | Clinical Global Impressions | Minimally Improved | 3 | 3 |  | 3 | WEEK 2 | 15 | 2003-10-31 | 17
4 | CDISC01 | QS | CDISC01.100014 | 2 | CGI-CGI-I | CGIGLOB | Global Improvement | Clinical Global Impressions | Minimally Improved | 3 | 3 |  | 10 | WEEK 24 | 169 | 2004-03-30 | 168

Dataset for Cornell Scale for Depression in Dementia

qscs.xpt

Row | STUDYID | DOMAIN | USUBJID | QSSEQ | QSSPID | QSTESTCD | QSTEST | QSCAT | QSORRES | QSSTRESC | QSSTRESN | QSBLFL | VISITNUM | VISIT | VISITDY | QSDTC | QSDY
1 | CDISC01 | QS | CDISC01.100008 | 3 | CSDD-01 | CSDD01 | Anxiety | Cornell Scale for Depression in Dementia | Severe | 2 | 2 |  | 1 | SCREEN | -13 | 2003-04-15 | -14
2 | CDISC01 | QS | CDISC01.100008 | 23 | CSDD-01 | CSDD01 | Anxiety | Cornell Scale for Depression in Dementia | Severe | 2 | 2 | Y | 2 | BASELINE | 1 | 2003-04-29 | 1
3 | CDISC01 | QS | CDISC01.100014 | 3 | CSDD-01 | CSDD01 | Anxiety | Cornell Scale for Depression in Dementia | Severe | 2 | 2 |  | 1 | SCREEN | -13 | 2003-10-06 | -9
4 | CDISC01 | QS | CDISC01.100014 | 28 | CSDD-06 | CSDD06 | Retardation | Cornell Scale for Depression in Dementia | Mild | 1 | 1 | Y | 2 | BASELINE | 1 | 2003-10-15 | 1

Dataset for Mini Mental State Examination

qsmm.xpt

Row | STUDYID | DOMAIN | USUBJID | QSSEQ | QSSPID | QSTESTCD | QSTEST | QSCAT | QSORRES | QSSTRESC | QSSTRESN | QSBLFL | VISITNUM | VISIT | VISITDY | QSDTC | QSDY
1 | CDISC01 | QS | CDISC01.100008 | 81 | MMSE-A.1 | MMSEA1 | Orientation Time Score | Mini Mental State Examination | 4 | 4 | 4 |  | 1 | SCREEN | -13 | 2003-04-15 | -14
2 | CDISC01 | QS | CDISC01.100008 | 88 | MMSE-A.1 | MMSEA1 | Orientation Time Score | Mini Mental State Examination | 3 | 3 | 3 | Y | 2 | BASELINE | 1 | 2003-04-29 | 1
3 | CDISC01 | QS | CDISC01.100014 | 81 | MMSE-A.1 | MMSEA1 | Orientation Time score | Mini Mental State Examination | 2 | 2 | 2 |  | 1 | SCREEN | -13 | 2003-10-06 | -9
4 | CDISC01 | QS | CDISC01.100014 | 88 | MMSE-A.1 | MMSEA1 | Orientation Time score | Mini Mental State Examination | 2 | 2 | 2 | Y | 2 | BASELINE | 1 | 2003-10-15 | 1

SUPPQS Domains

Supplemental Qualifiers for QSCG

suppqscg.xpt

Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL
1 | CDISC01 | QS | CDISC01.100008 | QSCAT | Clinical Global Impressions | QSLANG | Questionnaire Language | GERMAN | CRF |
2 | CDISC01 | QS | CDISC01.100014 | QSCAT | Clinical Global Impressions | QSLANG | Questionnaire Language | FRENCH | CRF |

Supplemental Qualifiers for QSCS

suppqscs.xpt

Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL
1 | CDISC01 | QS | CDISC01.100008 | QSCAT | Cornell Scale for Depression in Dementia | QSLANG | Questionnaire Language | GERMAN | CRF |
2 | CDISC01 | QS | CDISC01.100014 | QSCAT | Cornell Scale for Depression in Dementia | QSLANG | Questionnaire Language | FRENCH | CRF |

Supplemental Qualifiers for QSMM

suppqsmm.xpt

Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL
1 | CDISC01 | QS | CDISC01.100008 | QSCAT | Mini Mental State Examination | QSLANG | Questionnaire Language | GERMAN | CRF |
2 | CDISC01 | QS | CDISC01.100014 | QSCAT | Mini Mental State Examination | QSLANG | Questionnaire Language | FRENCH | CRF |
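To show how the supplemental qualifier records in this example relate back to the parent records, the sketch below joins suppqscg.xpt onto a reduced copy of qscg.xpt using IDVAR and IDVARVAL. The function name and the in-memory record layout are assumptions for illustration, only a few QS columns are carried, and the case of a null IDVAR is not handled.

```python
def attach_supp(parent_rows, supp_rows):
    """Add each supplemental qualifier as a column on the matching parent rows."""
    merged = [dict(row) for row in parent_rows]
    for supp in supp_rows:
        for row in merged:
            same_subject = (
                (row["STUDYID"], row["DOMAIN"], row["USUBJID"])
                == (supp["STUDYID"], supp["RDOMAIN"], supp["USUBJID"])
            )
            # IDVARVAL is character, so compare against the parent value as text.
            if same_subject and str(row.get(supp["IDVAR"])) == supp["IDVARVAL"]:
                row[supp["QNAM"]] = supp["QVAL"]
    return merged

cgi = "Clinical Global Impressions"
qscg = [
    {"STUDYID": "CDISC01", "DOMAIN": "QS", "USUBJID": "CDISC01.100008", "QSSEQ": 1, "QSCAT": cgi},
    {"STUDYID": "CDISC01", "DOMAIN": "QS", "USUBJID": "CDISC01.100008", "QSSEQ": 2, "QSCAT": cgi},
    {"STUDYID": "CDISC01", "DOMAIN": "QS", "USUBJID": "CDISC01.100014", "QSSEQ": 1, "QSCAT": cgi},
    {"STUDYID": "CDISC01", "DOMAIN": "QS", "USUBJID": "CDISC01.100014", "QSSEQ": 2, "QSCAT": cgi},
]
suppqscg = [
    {"STUDYID": "CDISC01", "RDOMAIN": "QS", "USUBJID": "CDISC01.100008",
     "IDVAR": "QSCAT", "IDVARVAL": cgi, "QNAM": "QSLANG", "QVAL": "GERMAN"},
    {"STUDYID": "CDISC01", "RDOMAIN": "QS", "USUBJID": "CDISC01.100014",
     "IDVAR": "QSCAT", "IDVARVAL": cgi, "QNAM": "QSLANG", "QVAL": "FRENCH"},
]

joined = attach_supp(qscg, suppqscg)
print([(row["USUBJID"], row["QSSEQ"], row["QSLANG"]) for row in joined])
```

Because IDVAR is QSCAT here rather than QSSEQ, one supplemental record applies to every parent record for that subject in that category. RDOMAIN matches the DOMAIN value "QS", not the split dataset name.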

4.1.8 Origin Metadata

4.1.8.1 Origin Metadata for Variables

The origin element in the Define-XML document file is used to indicate where the data originated. Its purpose is to unambiguously communicate to the reviewer the origin of the data source. For example, data could be on the CRF (and thus should be traceable to an annotated CRF), derived (and thus traceable to some derivation algorithm), or assigned by some subjective process (and thus traceable to some external evaluator). The Define-XML specification is the definitive source of allowable origin values. Additional guidance and supporting examples can be referenced using the Metadata Submission Guidelines (MSG) for SDTMIG.

4.1.8.2 Origin Metadata for Records

Sponsors are cautioned to recognize that a derived origin means that all values for that variable were derived, and that collected on the CRF applies to all values as well. In some cases, both collected and derived values may be reported in the same field. For example, some records in a Findings dataset such as QS contain values collected from the CRF; other records may contain derived values, such as a total score. When both derived and collected values are reported in a variable, the origin is to be described using value-level metadata.

4.1.9 Assigning Natural Keys in the Metadata

Section 3.2, Using the CDISC Domain Models in Regulatory Submissions — Dataset Metadata, indicates that a sponsor should include in the metadata the variables that contribute to the natural key for a domain. In a case where a dataset includes a mix of records with different natural keys, the natural key that provides the most granularity is the one that should be provided. The following examples are illustrations of how to do this, and include a case where a Supplemental Qualifier variable is referenced because it forms part of the natural key.

Musculoskeletal System Findings (MK) domain example:

Sponsor A chooses the following natural key for the MK domain:

STUDYID, USUBJID, VISITNUM, MKTESTCD

Sponsor B collects data in such a way that the location (MKLOC and MKLAT) and method (MKMETHOD) variables need to be included in the natural key to identify a unique row. Sponsor B then defines the following natural key for the MK domain.

STUDYID, USUBJID, VISITNUM, MKTESTCD, MKLOC, MKLAT, MKMETHOD

In certain instances a Supplemental Qualifier variable (i.e., a QNAM value, see Section 8.4, Relating Non-Standard Variables Values to a Parent Domain) might also contribute to the natural key of a record, and therefore needs to be referenced as part of the natural key for a domain. The important concept here is that a domain is not limited by physical structure. A domain may comprise more than one physical dataset, for example the main domain dataset and its associated Supplemental Qualifiers dataset. Supplemental Qualifier variables should be referenced in the natural key by using a two-part name. The word QNAM must be used as the first part of the name to indicate that the contributing variable exists in a domain-specific SUPP-- dataset. The second part is the value of QNAM, which ultimately becomes a column reference when the SUPPQUAL records are joined to the main domain dataset (e.g., QNAM.XVAR when the SUPP-- record has a QNAM of "XVAR").

Continuing with the MK domain example above:

Sponsor B might have collected data that used different imaging methods, using imaging devices with different makes and models, and using different hand positions. The sponsor considers the make and model information and hand position to be essential data that contributes to the uniqueness of the test result, and so includes a device identifier (SPDEVID) in the data and creates a Supplemental Qualifier variable for hand position (QNAM = "MKHNDPOS"). The natural key is then defined as follows:

STUDYID, USUBJID, SPDEVID, VISITNUM, MKTESTCD, MKLOC, MKLAT, MKMETHOD, QNAM.MKHNDPOS

Here the notation "QNAM.MKHNDPOS" means the Supplemental Qualifier whose QNAM is "MKHNDPOS".

This approach becomes very useful in a Findings domain when --TESTCD values are "generic" and rely on other variables to completely describe the test. The use of generic test codes helps to create distinct lists of manageable controlled terminology for --TESTCD. In studies where multiple repetitive tests or measurements are being made, for example in a rheumatoid arthritis study where repetitive measurements of bone erosion in the hands and wrists might be made using both X-ray and MRI equipment, the generic MKTEST "Sharp/Genant Bone Erosion Score" would be used in combination with other variables to fully identify the result.

Taking just the phalanges, a sponsor might want to express the following in a test in order to make it unique:

  • Left or Right hand
  • Phalangeal joint position (which finger, which joint)
  • Rotation of the hand
  • Method of measurement (X-ray or MRI)
  • Machine make and model

When CDISC controlled terminology for a test is not available, and a sponsor creates --TEST and --TESTCD values, trying to encapsulate all information about a test within a unique value of a --TESTCD is not a recommended approach for the following reasons:

  • It results in the creation of a potentially large number of test codes.
  • The eight-character values of --TESTCD become less intuitively meaningful.
  • Multiple test codes are essentially representing the same test or measurement simply to accommodate attributes of a test within the --TESTCD value itself (e.g., to represent a body location at which a measurement was taken).

As a result, the preferred approach would be to use a generic (or simple) test code that requires associated qualifier variables to fully express the test detail. This approach was used in creating the CDISC controlled terminology that would be used in the above example:

The MKTESTCD value "SGBESCR" is a "generic" test code, and additional information about the test is provided by separate qualifier variables. The variables that completely specify a test may include domain variables and supplemental qualifier variables. Expressing the natural key becomes very important in this situation in order to communicate the variables that contribute to the uniqueness of a test.

The following variables would be used to fully describe the test. The natural key for this domain includes both parent dataset variables and a supplemental qualifier variable, which together contribute to the natural key of each row and describe the uniqueness of the test.

SPDEVID | MKTESTCD | MKTEST | MKLOC | MKLAT | MKMETHOD | QNAM.MKHNDPOS
ACME3000 | SGBESCR | Sharp/Genant Bone Erosion Score | METACARPOPHALANGEAL JOINT 1 | LEFT | X-RAY | PALM UP
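The two-part QNAM notation can be resolved programmatically once the supplemental qualifiers have been joined to the parent records. The following Python sketch is illustrative only: the function names are hypothetical, the rows are assumed to be already joined, and the second hand position value ("PALM DOWN") is invented to show two results that differ only in the supplemental qualifier.

```python
def resolve_key(natural_key):
    """Map 'QNAM.XVAR' to the column 'XVAR' that it becomes after the join."""
    return [v.split(".", 1)[1] if v.startswith("QNAM.") else v for v in natural_key]

def is_unique(rows, natural_key):
    """True when no two rows share the same natural-key values."""
    columns = resolve_key(natural_key)
    keys = [tuple(row.get(col) for col in columns) for row in rows]
    return len(keys) == len(set(keys))

base = {
    "STUDYID": "ABC", "USUBJID": "ABC-001", "SPDEVID": "ACME3000", "VISITNUM": 1,
    "MKTESTCD": "SGBESCR", "MKLOC": "METACARPOPHALANGEAL JOINT 1",
    "MKLAT": "LEFT", "MKMETHOD": "X-RAY",
}
# Hypothetical joined rows that differ only in hand position.
mk = [dict(base, MKHNDPOS="PALM UP"), dict(base, MKHNDPOS="PALM DOWN")]

key_without = ["STUDYID", "USUBJID", "SPDEVID", "VISITNUM",
               "MKTESTCD", "MKLOC", "MKLAT", "MKMETHOD"]
key_with = key_without + ["QNAM.MKHNDPOS"]

print(resolve_key(key_with))
print(is_unique(mk, key_without))
print(is_unique(mk, key_with))
```

With a generic test code, the records are distinguishable only when the qualifier variables, including the supplemental one, are part of the declared key.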

4.2 General Variable Assumptions

4.2.1 Variable-Naming Conventions

SDTM variables are named according to a set of conventions, using fragment names (listed in Appendix D, CDISC Variable-Naming Fragments). Variables with names ending in "CD" are "short" versions of associated variables that do not include the "CD" suffix (e.g., --TESTCD is the short version of --TEST).

Values of --TESTCD must be limited to eight characters and cannot start with a number, nor can they contain characters other than letters, numbers, or underscores. This is to avoid possible incompatibility with SAS v5 Transport files. This limitation will be in effect until the use of other formats (such as Dataset-XML) becomes acceptable to regulatory authorities.

QNAM serves the same purpose as --TESTCD within supplemental qualifier datasets, and so values of QNAM are subject to the same restrictions as values of --TESTCD.

Values of other "CD" variables are not subject to the same restrictions as --TESTCD.

  • ETCD (the companion to ELEMENT) and TSPARMCD (the companion to TSPARM) are limited to eight characters and do not have the character restrictions that apply to --TESTCD. These values should be short for ease of use in programming, but it is not expected that they will need to serve as variable names.
  • ARMCD is limited to 20 characters and does not have the character restrictions that apply to --TESTCD. The maximum length of ARMCD is longer than for other "short" variables to accommodate the kind of values that are likely to be needed for crossover trials. For example, if ARMCD values for a seven-period crossover were constructed using two-character abbreviations for each treatment and separating hyphens, the length of ARMCD values would be 20. This same rule applies to the ACTARMCD variable also.
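The --TESTCD restrictions and the ARMCD length arithmetic can be sketched in Python as follows. The function names and the two-character treatment abbreviations are assumptions for illustration. The pattern encodes only what is stated above: at most eight characters, no leading digit, and only letters, numbers, or underscores.

```python
import re

# One non-digit first character, then up to seven more allowed characters.
TESTCD_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,7}")

def is_valid_testcd(value):
    """Apply the --TESTCD (and QNAM) restrictions to a candidate value."""
    return TESTCD_PATTERN.fullmatch(value) is not None

def crossover_armcd(treatments):
    """Build an ARMCD from per-period abbreviations separated by hyphens."""
    return "-".join(treatments)

for candidate in ["SGBESCR", "MKHNDPOS", "1ABC", "TOOLONGCD", "SYS-BP"]:
    print(candidate, is_valid_testcd(candidate))

# Seven periods x two characters + six hyphens = 20 characters.
armcd = crossover_armcd(["AA", "BB", "CC", "DD", "EE", "FF", "GG"])
print(armcd, len(armcd))
```

The hyphens that make the crossover ARMCD readable would fail the --TESTCD pattern, which is consistent with ARMCD not being subject to the same character restrictions.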

Variable descriptive names (labels), up to 40 characters, should be provided as data variable labels for all variables, including Supplemental Qualifier variables.

Use of variable names (other than domain prefixes), formats, decodes, terminology, and data types for the same type of data (even for custom domains and Supplemental Qualifiers) should be consistent within and across studies within a submission.

4.2.2 Two-Character Domain Identifier

In order to minimize the risk of difficulty when merging/joining domains for reporting purposes, the two-character Domain Identifier is used as a prefix in most variable names.

Variables in domain specification tables (see Section 5, Models for Special Purpose Domains, Section 6, Domain Models Based on the General Observation Classes, Section 7, Trial Design Model Datasets, Section 8, Representing Relationships and Data, and Section 9, Study References) already specify the complete variable names. When adding variables from the SDTM to standard domains or creating custom domains based on the General Observation Classes, sponsors must replace the -- (two hyphens) prefix in the SDTM tables of General Observation Class, Timing, and Identifier variables with the two-character Domain Identifier (DOMAIN) value for that domain/dataset. The two-character domain code is limited to A-Z for the first character, and A-Z, 0-9 for the second character. No other characters are allowed. This is for compatibility with SAS version 5 Transport files and with file naming for the Electronic Common Technical Document (eCTD).

The following variables are exceptions to the philosophy that all variable names are prefixed with the Domain Identifier:

  • Required Identifiers (STUDYID, DOMAIN, USUBJID)
  • Commonly used grouping and merge Keys (e.g., VISIT, VISITNUM, VISITDY)
  • All Demographics domain (DM) variables other than DMDTC and DMDY
  • All variables in RELREC and SUPPQUAL, and some variables in Comments and Trial Design datasets.

Required Identifiers are not prefixed because they are usually used as keys when merging/joining observations. The --SEQ and the optional Identifiers --GRPID and --REFID are prefixed because they may be used as keys when relating observations across domains.
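Prefix substitution and the domain code restriction can be illustrated with a short Python sketch. The function names are hypothetical, and the custom domain code "XA" is used only as an example of a code beginning with one of the reserved letters X, Y, or Z.

```python
import re

# First character A-Z; second character A-Z or 0-9.
DOMAIN_CODE = re.compile(r"[A-Z][A-Z0-9]")

def apply_prefix(domain, template):
    """Replace a leading "--" with the domain code; leave other names unchanged."""
    if not DOMAIN_CODE.fullmatch(domain):
        raise ValueError(f"invalid domain code: {domain!r}")
    if template.startswith("--"):
        return domain + template[2:]
    return template  # e.g., STUDYID, USUBJID, VISITNUM are not prefixed

def is_reserved_for_custom(domain):
    """Codes starting with X, Y, or Z are reserved for custom domains."""
    return domain[0] in "XYZ"

for template in ["STUDYID", "USUBJID", "--SEQ", "--TESTCD", "--GRPID", "VISITNUM"]:
    print(template, "->", apply_prefix("XA", template))
print(is_reserved_for_custom("XA"), is_reserved_for_custom("LB"))
```

Unprefixed names pass through unchanged, which mirrors the exceptions listed above for Required Identifiers and common merge keys.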

4.2.3 Use of "Subject" and USUBJID

"Subject" is used to generically refer to both "patients" and "healthy volunteers" in order to be consistent with the recommendation in FDA guidance. The term "Subject" should be used consistently in all labels and Define-XML document comments. To identify a subject uniquely across all studies for all applications or submissions involving the product, a unique identifier (USUBJID) should be assigned and included in all datasets.

The unique subject identifier (USUBJID) is required in all datasets containing subject-level data. USUBJID values must be unique for each trial participant (subject) across all trials in the submission. This means that no two (or more) subjects, across all trials in the submission, may have the same USUBJID. Additionally, the same person who participates in multiple clinical trials (when this is known) must be assigned the same USUBJID value in all trials.

The dm.xpt sample rows below illustrate a single subject who participates in two studies, first in ACME01 and later in ACME14. Note that this is only one example of the possible values for USUBJID. CDISC does not recommend any specific format for the values of USUBJID, only that the values need to be unique for all subjects in the submission, and across multiple submissions for the same compound. Many sponsors concatenate values for the Study, Site, and Subject into USUBJID, but this is not a requirement. It is acceptable to use any format for USUBJID, as long as the values are unique across all subjects per FDA guidance.

Study ACME01 dm.xpt

dm.xpt

Row | STUDYID | DOMAIN | USUBJID | SUBJID | SITEID | INVNAM
1 | ACME01 | DM | ACME01-05-001 | 001 | 05 | John Doe

Study ACME14 dm.xpt

dm.xpt

Row | STUDYID | DOMAIN | USUBJID | SUBJID | SITEID | INVNAM
1 | ACME14 | DM | ACME01-05-001 | 017 | 14 | Mary Smith
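A pooled check of USUBJID across the Demographics datasets of several studies might look like the sketch below. The function name and the returned structure are assumptions for illustration. The check can detect a USUBJID repeated within one study, and it can list USUBJID values that appear in more than one study, but confirming that those values belong to the same person requires information outside the data.

```python
from collections import defaultdict

def usubjid_report(dm_rows):
    """Summarize USUBJID usage across pooled DM records.

    Returns (repeated_within_study, multi_study), where multi_study maps each
    USUBJID found in more than one study to the sorted list of those studies.
    """
    studies = defaultdict(list)
    for row in dm_rows:
        studies[row["USUBJID"]].append(row["STUDYID"])
    repeated_within_study = sorted(
        usubjid for usubjid, ids in studies.items() if len(ids) != len(set(ids))
    )
    multi_study = {
        usubjid: sorted(set(ids)) for usubjid, ids in studies.items() if len(set(ids)) > 1
    }
    return repeated_within_study, multi_study

# The two sample rows above, pooled.
dm = [
    {"STUDYID": "ACME01", "USUBJID": "ACME01-05-001", "SUBJID": "001", "SITEID": "05"},
    {"STUDYID": "ACME14", "USUBJID": "ACME01-05-001", "SUBJID": "017", "SITEID": "14"},
]

repeated, multi = usubjid_report(dm)
print(repeated)
print(multi)
```

In the sample rows, SUBJID and SITEID differ between the two studies while USUBJID is carried over unchanged, which is the intended outcome for a person known to have participated in both.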

4.2.4 Text Case in Submitted Data

It is recommended that text data be submitted in upper case text. Exceptions may include long text data (such as comment text) and values of --TEST in Findings datasets (which may be more readable in title case if used as labels in transposed views). Values from CDISC controlled terminology or external code systems (e.g., MedDRA) or response values for QRS instruments specified by the instrument documentation should be in the case specified by those sources, which may be mixed case. The case used in the text data must match the case used in the Controlled Terminology provided in the Define-XML document.

4.2.5 Convention for Missing Values

Missing values for individual data items should be represented by nulls. Conventions for representing observations not done, using the SDTM --STAT and --REASND variables, are addressed in Section 4.5.1.2, Tests Not Done and the individual domain models.

4.2.6 Grouping Variables and Categorization

Grouping variables are Identifier and Qualifier variables, such as --CAT (Category) and --SCAT (Subcategory), that group records in the SDTM domains/datasets and can be assigned by sponsors to categorize topic-variable values. For example, a lab record with LBTEST = "SODIUM" might have LBCAT = "CHEMISTRY" and LBSCAT = "ELECTROLYTES". Values for --CAT and --SCAT should not be redundant with the domain name or dictionary classification provided by --DECOD and --BODSYS.

1. Hierarchy of Grouping Variables

STUDYID
  DOMAIN
    --CAT
      --SCAT
        USUBJID
          --GRPID
          --LNKID
          --LNKGRP

2. How Grouping Variables Group Data

A. For the subject

  1. All records with the same USUBJID value are a group of records that describe that subject.

B. Across subjects (records with different USUBJID values)

  1. All records with the same STUDYID value are a group of records that describe that study.
  2. All records with the same DOMAIN value are a group of records that describe that domain.
  3. --CAT (Category) and --SCAT (Sub-category) values further subset groups within the domain. Generally, --CAT/--SCAT values have meaning within a particular domain. However, it is possible to use the same values for --CAT/--SCAT in related domains (e.g., MH and AE). When values are used across domains, the meanings should be the same. Examples of where --CAT/--SCAT may have meaning across domains/datasets:
    1. Cases where different domains in the same general observation class contain similar conceptual information. Adverse Events (AE), Medical History (MH), and Clinical Events (CE), for example, are conceptually the same data, the only differences being when the event started relative to the study start and whether the event is considered a regulatory reportable adverse event in the study. Neurotoxicities collected in Oncology trials both as separate Medical History CRFs (MH domain) and Adverse Event CRFs (AE domain) could both identify/collect "Paresthesia of the left Arm". In both domains, the --CAT variable could have the value of NEUROTOXICITY.
    2. Cases where multiple datasets are necessary to capture data about the same topic. As an example, perhaps the existence and start and stop date of "Paresthesia of the left Arm" is reported as an Adverse Event (AE domain), but the severity of the event is captured at multiple visits and recorded as Findings About (FA dataset). In both cases the --CAT variable could have a value of NEUROTOXICITY.
    3. Cases where multiple domains are necessary to capture data that was collected together and have an implicit relationship, perhaps identified in the Related Records (RELREC) special purpose dataset.

      Stress Test data collection, for example, may capture the following:

      1. Information about the occurrence, start, stop, and duration of the test (in the PR domain)
      2. Vital Signs recorded during the stress test (VS domain)
      3. Treatments (e.g., oxygen) administered during the stress test (in an Interventions domain)

      In such cases, the data collected during the stress tests recorded in three separate domains may all have --CAT/--SCAT values (STRESS TEST) that identify that this data was collected during the stress test.

C. Within subjects (records with the same USUBJID values)

  1. --GRPID values further group (subset) records within USUBJID. All records in the same domain with the same --GRPID value are a group of records within USUBJID. Unlike --CAT and --SCAT, --GRPID values are not intended to have any meaning across subjects and are usually assigned during or after data collection.

D. Although --SPID and --REFID are Identifier variables, they may sometimes be used as grouping variables and may also have meaning across domains.

E. --LNKID and --LNKGRP express values that are used to link records in separate domains. As such, these variables are often used in IDVAR in a RELREC relationship when there is a dataset-to-dataset relationship.

  1. --LNKID is a grouping identifier used to identify a record in one domain that is related to records in another domain, often forming a one-to-many relationship.
  2. --LNKGRP is a grouping identifier used to identify a group of records in one domain that is related to a record in another domain, often forming a many-to-one relationship.

3. Differences between Grouping Variables

The primary distinctions between --CAT/--SCAT and --GRPID are:

  • --CAT/--SCAT are known (identified) about the data before it is collected.
  • --CAT/--SCAT values group data across subjects.
  • --CAT/--SCAT may have some controlled terminology.
  • --GRPID is usually assigned during or after data collection at the discretion of the sponsor.
  • --GRPID groups data only within a subject.
  • --GRPID values are sponsor-defined, and will not be subject to controlled terminology.

Therefore, data that would be the same across subjects is usually more appropriate in --CAT/--SCAT, and data that would vary across subjects is usually more appropriate in --GRPID. For example, a Concomitant Medication administered as part of a known combination therapy for all subjects ("Mayo Clinic Regimen", for example) would more appropriately use --CAT/--SCAT to identify the medication as part of that regimen. Groups of medications recorded on an SAE form as treatments for the SAE would more appropriately use --GRPID because the groupings are likely to differ across subjects.

In domains based on the Findings general observation class, the --RESCAT variable can be used to categorize results after the fact. --CAT and --SCAT by contrast, are generally pre-defined by the sponsor or used by the investigator at the point of collection, not after assessing the value of Findings results.
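The difference in scope between --CAT/--SCAT and --GRPID can be illustrated with a small sketch. The records, category values, and function names below are hypothetical; the point is only that --CAT groups are formed without regard to subject, while --GRPID groups are always keyed by USUBJID as well.

```python
from collections import defaultdict


def group_across_subjects(records, domain_prefix):
    """Group records by --CAT, regardless of which subject they belong to."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.get(domain_prefix + "CAT", "")].append(rec)
    return dict(groups)


def group_within_subject(records, domain_prefix):
    """Group records by (USUBJID, --GRPID).

    A --GRPID value has no meaning across subjects, so USUBJID is part of the key.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (rec["USUBJID"], rec.get(domain_prefix + "GRPID", ""))
        groups[key].append(rec)
    return dict(groups)


# Hypothetical CM records: the same regimen category for everyone,
# but --GRPID "1" refers to a different group in each subject.
cm_records = [
    {"USUBJID": "S1", "CMTRT": "DRUG A", "CMCAT": "REGIMEN X", "CMGRPID": "1"},
    {"USUBJID": "S1", "CMTRT": "DRUG B", "CMCAT": "REGIMEN X", "CMGRPID": "1"},
    {"USUBJID": "S2", "CMTRT": "DRUG A", "CMCAT": "REGIMEN X", "CMGRPID": "1"},
]

by_cat = group_across_subjects(cm_records, "CM")
by_grpid = group_within_subject(cm_records, "CM")
print(len(by_cat["REGIMEN X"]), len(by_grpid[("S1", "1")]), len(by_grpid[("S2", "1")]))
```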

4.2.7 Submitting Free Text from the CRF

Sponsors often collect free text data on a CRF to supplement a standard field. This often occurs as part of a list of choices accompanied by "Other, specify." The manner in which these data are submitted will vary based on their role. The handling of verbatim text values for the --OBJ variable is discussed in Section 6.4.3, Variables Unique to Findings About.

4.2.7.1 "Specify" Values for Non-Result Qualifier Variables

When free-text information is collected to supplement a standard non-result Qualifier field, the free-text value should be placed in the SUPP-- dataset described in Section 8.4, Relating Non-Standard Variables Values to a Parent Domain. When applicable, controlled terminology should be used for SUPP-- field names (QNAM) and their associated labels (QLABEL) (see Section 8.4, Relating Non-Standard Variables Values to a Parent Domain and Appendix C2, Supplemental Qualifiers Name Codes).

For example, when a description of "Other Medically Important Serious Adverse Event" category is collected on a CRF, the free text description should be stored in the SUPPAE dataset.

  • AESMIE = "Y"
  • SUPPAE QNAM = "AESOSP", QLABEL = "Other Medically Important SAE", QVAL = "HIGH RISK FOR ADDITIONAL THROMBOSIS"

Another example is a CRF that collects reason for dose adjustment with additional free-text description:

Reason for Dose Adjustment (EXADJ) | Describe
  • Adverse Event | ________________
  • Insufficient Response | ________________
  • Non-medical Reason | ________________

The free text description should be stored in the SUPPEX dataset.

  • EXADJ = "NONMEDICAL REASON"
  • SUPPEX QNAM = "EXADJDSC", QLABEL = "Reason For Dose Adjustment Description", QVAL = "PATIENT MISUNDERSTOOD INSTRUCTIONS"

    Note that QNAM references the "parent" variable name with the addition of "DSC". Likewise, the label is a modification of the parent variable label.

When the CRF includes a list of values for a qualifier field that includes "Other" and the "Other" is supplemented with a "Specify" free text field, then the manner in which the free text "Specify" value is submitted will vary based on the sponsor's coding practice and analysis requirements.

For example, consider a CRF that collects the indication for an analgesic concomitant medication (CMINDC) using a list of pre-specified values and an "Other, specify" field:

Indication for analgesic
  • Other, specify: ________________

An investigator has selected "OTHER" and specified "Broken arm". Several options are available for submission of this data:

1) If the sponsor wishes to maintain controlled terminology for the CMINDC field and limit the terminology to the five pre-specified choices, then the free text is placed in SUPPCM.

CMINDC
OTHER

QNAM | QLABEL | QVAL
CMINDOTH | Other Indication | BROKEN ARM

2) If the sponsor wishes to maintain controlled terminology for CMINDC but will expand the terminology based on values seen in the specify field, then the value of CMINDC will reflect the sponsor's coding decision and SUPPCM could be used to store the verbatim text.

CMINDC
FRACTURE

QNAM | QLABEL | QVAL
CMINDOTH | Other Indication | BROKEN ARM

Note that the sponsor might choose a different value for CMINDC (e.g., "BONE FRACTURE") depending on the sponsor's coding practice and analysis requirements.

3) If the sponsor does not require that controlled terminology be maintained and wishes for all responses to be stored in a single variable, then CMINDC will be used and SUPPCM is not required.

CMINDC
BROKEN ARM
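The three options above can be sketched as a single decision function. This is an illustration only: the function name and the `option` numbering are hypothetical, the upper-casing follows the text case recommendation in Section 4.2.4, and the QNAM and QLABEL values are those used in the examples above.

```python
def submit_other_specify(specified_text, option, coded_value=None):
    """Return (CMINDC value, SUPPCM record or None) for an "Other, specify" response.

    option 1: keep the pre-specified list; CMINDC = "OTHER", verbatim text in SUPPCM.
    option 2: expand the terminology; CMINDC = sponsor-coded value, verbatim text in SUPPCM.
    option 3: no controlled terminology; the verbatim text goes directly in CMINDC.
    """
    verbatim = specified_text.upper()
    supp = {"QNAM": "CMINDOTH", "QLABEL": "Other Indication", "QVAL": verbatim}
    if option == 1:
        return "OTHER", supp
    if option == 2:
        if coded_value is None:
            raise ValueError("option 2 needs the sponsor's coded value")
        return coded_value, supp
    if option == 3:
        return verbatim, None
    raise ValueError(f"unknown option: {option}")


print(submit_other_specify("Broken arm", 1))
print(submit_other_specify("Broken arm", 2, coded_value="FRACTURE"))
print(submit_other_specify("Broken arm", 3))
```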

4.2.7.2 "Specify" Values for Result Qualifier Variables

When the CRF includes a list of values for a result field that includes "Other" and the "Other" is supplemented with a "Specify" free text field, then the manner in which the free text "Specify" value is submitted will vary based on the sponsor's coding practice and analysis requirements.

For example, consider a CRF where the sponsor requests the subject's eye color:

Eye Color
  • Other, specify: ________________

An investigator has selected "OTHER" and specified "BLUEISH GRAY". As in the above discussion for non-result Qualifier values, the sponsor has several options for submission:

1) If the sponsor wishes to maintain controlled terminology in the standard result field and limit the terminology to the five pre-specified choices, then the free text is placed in --ORRES and the controlled terminology in --STRESC.

SCTEST | SCORRES | SCSTRESC
Eye Color | BLUEISH GRAY | OTHER

2) If the sponsor wishes to maintain controlled terminology in the standard result field, but will expand the terminology based on values seen in the specify field, then the free text is placed in --ORRES and the value of --STRESC will reflect the sponsor's coding decision.

SCTEST | SCORRES | SCSTRESC
Eye Color | BLUEISH GRAY | GRAY

3) If the sponsor does not require that controlled terminology be maintained, the verbatim value will be copied to --STRESC.

SCTEST | SCORRES | SCSTRESC
Eye Color | BLUEISH GRAY | BLUEISH GRAY
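For result fields, the verbatim text always goes in --ORRES and only the value of --STRESC changes between the options. A minimal sketch for the eye color example, with a hypothetical function name and option numbering:

```python
def result_other_specify(test, specified_text, option, coded_value=None):
    """Build an SC row for an "Other, specify" result under options 1 to 3 above."""
    orres = specified_text.upper()
    if option == 1:
        stresc = "OTHER"          # terminology limited to the pre-specified list
    elif option == 2:
        if coded_value is None:
            raise ValueError("option 2 needs the sponsor's coded value")
        stresc = coded_value      # terminology expanded by the sponsor
    elif option == 3:
        stresc = orres            # verbatim value copied to --STRESC
    else:
        raise ValueError(f"unknown option: {option}")
    return {"SCTEST": test, "SCORRES": orres, "SCSTRESC": stresc}


for opt, coded in ((1, None), (2, "GRAY"), (3, None)):
    print(result_other_specify("Eye Color", "Blueish gray", opt, coded))
```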

4.2.7.3 "Specify" Values for Topic Variables

Interventions: If a list of specific treatments is provided along with "Other, Specify", --TRT should be populated with the name of the treatment found in the specified text. If the sponsor wishes to distinguish between the pre-specified list of treatments and those recorded under "Other, Specify," the --PRESP variable could be used. For example:

Indicate which of the following concomitant medications
was used to treat the subject's headaches:
  • Other, specify: ________________

If ibuprofen and diclofenac were reported, the CM dataset would include the following:

CMTRT | CMPRESP
IBUPROFEN | Y
DICLOFENAC |

Events: "Other, Specify" for Events may be handled similarly to Interventions. --TERM should be populated with the description of the event found in the specified text and --PRESP could be used to distinguish between prespecified and free text responses.

Findings: "Other, Specify" for tests may be handled similarly to Interventions. --TESTCD and --TEST should be populated with the code and description of the test found in the specified text. If specific tests are not prespecified on the CRF and the investigator has the option of writing in tests, then the name of the test would have to be coded to ensure that all --TESTCD and --TEST values are consistent with the test controlled terminology.

For example, a lab CRF collected values for Hemoglobin, Hematocrit and "Other, specify". The value the investigator wrote for "Other, specify" was "Prothrombin time" with an associated result and units. The sponsor would submit the controlled terminology for this test, i.e., LBTESTCD would be "PT" and LBTEST would be "Prothrombin Time", rather than the verbatim term, "Prothrombin time" supplied by the investigator.

4.2.8 Multiple Values for a Variable

4.2.8.1 Multiple Values for an Intervention or Event Topic Variable

If multiple values are reported for a topic variable (i.e., --TRT in an Interventions general-observation-class dataset or --TERM in an Events general-observation-class dataset), it is expected that the sponsor will split the values into multiple records or otherwise resolve the multiplicity per the sponsor's standard data management procedures. For example, if an adverse event term of "Headache and Nausea" or a concomitant medication of "Tylenol and Benadryl" is reported, sponsors will often split the original report into separate records and/or query the site for clarification. By the time of submission, the datasets should be in conformance with the record structures described in the SDTMIG. Note that the Disposition dataset (DS) is an exception to the general rule of splitting multiple topic values into separate records. For DS, one record for each disposition or protocol milestone is permitted according to the domain structure. For cases of multiple reasons for discontinuation see Section 6.2.3, Disposition, Assumption 5 for additional information.

4.2.8.2 Multiple Values for a Findings Result Variable

If multiple result values (--ORRES) are reported for a test in a Findings class dataset, multiple records should be submitted for that --TESTCD.

For example,

  • EGTESTCD = "SPRTARRY", EGTEST = "Supraventricular Tachyarrhythmias", EGORRES = "ATRIAL FIBRILLATION"
  • EGTESTCD = "SPRTARRY", EGTEST = "Supraventricular Tachyarrhythmias", EGORRES = "ATRIAL FLUTTER"

When a finding can have multiple results, the key structure for the findings dataset must be adequate to distinguish between the multiple results. See Section 4.1.9 Assigning Natural Keys in the Metadata.

4.2.8.3 Multiple Values for a Non-Result Qualifier Variable

The SDTM permits one value for each Qualifier variable per record. If multiple values exist (e.g., due to a "Check all that apply" instruction on a CRF), then the value for the Qualifier variable should be "MULTIPLE" and SUPP-- should be used to store the individual responses. It is recommended that the SUPP-- QNAM value reference the corresponding standard domain variable with an appended number or letter. In some cases, the standard variable name will be shortened to meet the 8-character variable name requirement, or it may be clearer to append a meaningful character string as shown in the second AE example below, where the first three characters of the drug name are appended. Likewise the QLABEL value should be similar to the standard label. The values stored in QVAL should be consistent with the controlled terminology associated with the standard variable. See Section 8.4, Relating Non-Standard Variables Values to a Parent Domain for additional guidance on maintaining appropriately unique QNAM values.

The following example includes selected variables from the ae.xpt and suppae.xpt datasets for a rash whose locations are the face, neck, and chest.

AE Dataset

AETERM | AELOC
RASH | MULTIPLE

SUPPAE Dataset

QNAM | QLABEL | QVAL
AELOC1 | Location of the Reaction 1 | FACE
AELOC2 | Location of the Reaction 2 | NECK
AELOC3 | Location of the Reaction 3 | CHEST
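The numbered-QNAM convention shown in the rash example can be sketched as follows. The function name is hypothetical, and the sketch assumes the simple case in which a number is appended to the standard variable name; it raises an error rather than guessing how to shorten a stem that would exceed the 8-character limit.

```python
def split_multiple(variable, label, values):
    """Return (parent value, SUPP-- records) for a qualifier with several responses.

    A single response stays in the parent variable; several responses give
    "MULTIPLE" in the parent and one numbered SUPP-- record per response.
    """
    if not values:
        return "", []             # nothing collected: parent variable is null
    if len(values) == 1:
        return values[0], []
    supp = []
    for i, value in enumerate(values, start=1):
        qnam = f"{variable}{i}"
        if len(qnam) > 8:
            raise ValueError(f"QNAM {qnam!r} exceeds 8 characters; shorten the stem")
        supp.append({"QNAM": qnam, "QLABEL": f"{label} {i}", "QVAL": value})
    return "MULTIPLE", supp


aeloc, suppae = split_multiple("AELOC", "Location of the Reaction", ["FACE", "NECK", "CHEST"])
print(aeloc)
for record in suppae:
    print(record)
```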

In some cases, values for QNAM and QLABEL more specific than those above may be needed.

For example, a sponsor might conduct a study with two study drugs (e.g., open-label study of Abcicin + Xyzamin), and may require the investigator assess causality and describe action taken for each drug for the rash:

AE Dataset

AETERM | AEREL | AEACN
RASH | MULTIPLE | MULTIPLE

SUPPAE Dataset

QNAM | QLABEL | QVAL
AERELABC | Causality of Abcicin | POSSIBLY RELATED
AERELXYZ | Causality of Xyzamin | UNLIKELY RELATED
AEACNABC | Action Taken with Abcicin | DOSE REDUCED
AEACNXYZ | Action Taken with Xyzamin | DOSE NOT CHANGED

In each of the above examples, the use of SUPPAE should be documented in the Define-XML document and the annotated CRF. The controlled terminology used should be documented as part of value-level metadata.

If the sponsor has clearly documented that one response is of primary interest (e.g., in the CRF, protocol, or analysis plan), the standard domain variable may be populated with the primary response and SUPP-- may be used to store the secondary response(s).

For example, if Abcicin is designated as the primary study drug in the example above:

AE Dataset

AETERM | AEREL | AEACN
RASH | POSSIBLY RELATED | DOSE REDUCED

SUPPAE Dataset

QNAM | QLABEL | QVAL
AERELX | Causality of Xyzamin | UNLIKELY RELATED
AEACNX | Action Taken with Xyzamin | DOSE NOT CHANGED

Note that in the latter case, the label for standard variables AEREL and AEACN will have no indication that they pertain to Abcicin. This association must be clearly documented in the metadata and annotated CRF.

4.2.9 Variable Lengths

Very large transport files have become an issue for FDA to process. One of the main contributors to the large file sizes has been sponsors using the maximum length of 200 for character variables. To help rectify this situation:

  • The maximum SAS Version 5 character variable length of 200 characters should not be used unless necessary.
  • Sponsors should consider the nature of the data and apply reasonable, appropriate lengths to variables. For example:
    • The length of flags will always be 1.
    • --TESTCD and IDVAR will never be more than 8, so the length can always be set to 8.
    • The length for variables that use controlled terminology can be set to the length of the longest term.
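One way to arrive at reasonable lengths for variables without controlled terminology is to measure the data. The sketch below is an assumption-laden illustration: the function name is hypothetical, rows are plain dictionaries, and values are assumed to use a single-byte encoding so that character count equals stored length.

```python
def minimal_lengths(rows, cap=200):
    """Return the longest observed length per character variable (minimum 1).

    Raises ValueError if any value exceeds `cap`, the SAS Version 5
    character variable limit of 200.
    """
    lengths = {}
    for row in rows:
        for name, value in row.items():
            if isinstance(value, str):
                lengths[name] = max(lengths.get(name, 1), len(value))
    too_long = sorted(name for name, length in lengths.items() if length > cap)
    if too_long:
        raise ValueError(f"values longer than {cap} characters in: {too_long}")
    return lengths


# Hypothetical LB rows used only to exercise the function.
lb_rows = [
    {"LBTESTCD": "SODIUM", "LBCAT": "CHEMISTRY", "LBBLFL": "Y"},
    {"LBTESTCD": "K", "LBCAT": "CHEMISTRY", "LBBLFL": ""},
]
print(minimal_lengths(lb_rows))
```

For variables that use controlled terminology, the length of the longest term in the codelist is the safer choice, since the observed data may not contain that term.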

4.3 Coding and Controlled Terminology Assumptions

Examples provided in the column "CDISC Notes" are only examples and not intended to imply controlled terminology. Check current controlled terminology at this link: http://www.cancer.gov/cancertopics/cancerlibrary/terminologyresources/cdisc.

4.3.1 Types of Controlled Terminology

As of SDTMIG v3.3, controlled terminology is represented in one of the following ways:

  • A single asterisk, "*", when CDISC controlled terminology is not available at the current time, but the SDS Team expects that sponsors may have their own controlled terminology and/or the CDISC Controlled Terminology Team may develop controlled terminology in the future.
  • The single applicable value for the variable DOMAIN, e.g., "PR".
  • The name of a CDISC codelist, represented as a hyperlink in parentheses, e.g., "(NY)".
  • A short reference to an external terminology, such as "MedDRA" or "ISO 3166 Alpha-3".

In addition, the "Controlled Terms, Codelist or Format" column has been used to indicate variables that use an ISO 8601 format.

4.3.2 Controlled Terminology Text Case

Terms from controlled terminology should be in the case that appears in the source codelist or code system (e.g., CDISC codelist or external code system such as MedDRA). See Section 4.2.4, Text Case in Submitted Data.

4.3.3 Controlled Terminology Values

The controlled terminology or a reference to the controlled terminology should be included in the Define-XML document file wherever applicable. All values in the permissible value set for the study should be included, whether they are represented in the submitted data or not. Note that a null value should not be included in the permissible value set. A null value is implied for any list of controlled terms unless the variable is "Required" (see Section 4.1.5, SDTM Core Designations).

When a domain or dataset specification includes a codelist for a variable, not every value in that codelist may have been part of planned data collection; only values that were part of planned data collection should be included in the Define-XML document. For example, --PRESP variables are associated with the NY codelist, but only the value "Y" is allowed in --PRESP variables. Future versions of the Define-XML Specification are expected to include information on representing subsets of controlled terminology.

4.3.4 Use of Controlled Terminology and Arbitrary Number Codes

Controlled terminology or human-readable text should be used instead of arbitrary number codes in order to reduce ambiguity for submission reviewers. For example, CMDECOD would contain human-readable dictionary text rather than a numeric code. Numeric code values may be submitted as Supplemental Qualifiers if necessary.

4.3.5 Storing Controlled Terminology for Synonym Qualifier Variables

  • For events such as AEs and Medical History, populate --DECOD with the dictionary's preferred term and populate --BODSYS with the preferred body system name. If a dictionary is multi-axial, the value in --BODSYS should represent the system organ class (SOC) used for the sponsor's analysis and summary tables, which may not necessarily be the primary SOC. Populate --SOC with the dictionary-derived primary SOC. In cases where the primary SOC was used for analysis, --BODSYS and --SOC are the same.
  • If the MedDRA dictionary was used to code events, the intermediate levels in the MedDRA hierarchy should also be represented in the dataset. A pair of variables has been defined for each of the levels of the hierarchy other than SOC and PT: one to represent the text description and the other to represent the code value associated with it. For example, --LLT should be used to represent the Lowest Level Term text description and --LLTCD should be used to represent the Lowest Level Term code value.
  • For concomitant medications, populate CMDECOD with the drug's generic name and populate CMCLAS with the drug class used for the sponsor's analysis and summary tables. If coding to multiple classes, follow Section 4.2.8.1, Multiple Values for an Intervention or Event Topic Variable, or omit CMCLAS.
  • For concomitant medications, supplemental qualifiers may be used to represent additional coding dictionary information, e.g., a drug's ATC codes from the WHO Drug dictionary (see Section 8.4, Relating Non-Standard Variables Values to a Parent Domain for more information).

The sponsor is expected to provide the dictionary name and version used to map the terms by utilizing the Define-XML external codelist attributes.

4.3.6 Storing Topic Variables for General Domain Models

The topic variable for the Interventions and Events general-observation-class models is often stored as verbatim text. For an Events domain, the topic variable is --TERM. For an Interventions domain, the topic variable is --TRT. For a Findings domain, the topic variable, --TESTCD, should use Controlled Terminology (e.g., "SYSBP" for Systolic Blood Pressure). If CDISC standard controlled terminology exists, it should be used; otherwise, sponsors should define their own controlled list of terms. If the verbatim topic variable in an Interventions or Event domain is modified to facilitate coding, the modified text is stored in --MODIFY. In most cases (other than PE), the dictionary-coded text is derived into --DECOD. Since the PEORRES variable is modified instead of the topic variable for PE, the dictionary-derived text would be placed in PESTRESC. The variables used in each of the defined domains are:

Domain | Original Verbatim | Modified Verbatim | Standardized Value
AE | AETERM | AEMODIFY | AEDECOD
DS | DSTERM | | DSDECOD
CM | CMTRT | CMMODIFY | CMDECOD
MH | MHTERM | MHMODIFY | MHDECOD
PE | PEORRES | PEMODIFY | PESTRESC

4.3.7 Use of "Yes" and "No" Values

Variables where the response is "Yes" or "No" ("Y" or "N") should normally be populated for both "Y" and "N" responses. This eliminates confusion regarding whether a blank response indicates "N" or is a missing value. However, some variables are collected or derived in a manner that allows only one response, such as when a single check box indicates "Yes". In situations such as these, where it is unambiguous to populate only the response of interest, it is permissible to populate only one value ("Y" or "N") and leave the alternate value blank. An example of when it would be acceptable to use only a value of "Y" would be for Last Observation Before Exposure Flag (--LOBXFL) variables, where "N" is not necessary to indicate that a value is not the last observation before exposure.

Note: Permissible values for variables with controlled terms of "Y" or "N" may be extended to include "U" or "NA" if it is the sponsor's practice to explicitly collect or derive values indicating "Unknown" or "Not Applicable" for that variable.

4.4 Actual and Relative Time Assumptions

Timing variables (SDTM Table 2.2.5) are an essential component of all SDTM subject-level domain datasets. In general, all domains based on the three general observation classes should have at least one Timing variable. In the Events or Interventions general observation class, this could be the start date of the event or intervention. In the Findings observation class, where data are usually collected at multiple visits, at least one Timing variable must be used.

The SDTMIG requires dates and times of day to be stored according to the international standard ISO 8601 (http://www.iso.org). ISO 8601 provides a text-based representation of dates and/or times, intervals of time, and durations of time.

4.4.1 Formats for Date/Time Variables

An SDTM DTC variable may include data that is represented in ISO 8601 format as a complete date/time, a partial date/time, or an incomplete date/time.

The SDTMIG template uses ISO 8601 for calendar dates and times of day, which are expressed as follows:

  • YYYY-MM-DDThh:mm:ss(.n+)?(((+|-)hh:mm)|Z)?

where:

  • [YYYY] = four-digit year
  • [MM] = two-digit representation of the month (01-12, 01=January, etc.)
  • [DD] = two-digit day of the month (01 through 31)
  • [T] = (time designator) indicates time information follows
  • [hh] = two digits of hour (00 through 23) (am/pm is NOT allowed)
  • [mm] = two digits of minute (00 through 59)
  • [ss] = two digits of second (00 through 59)
    The last two components, indicated in the format pattern with a question mark, are optional:
  • [(.n+)?] = optional fractions of seconds
  • [(((+|-)hh:mm)|Z)?] = optional time zone

Other characters defined for use within the ISO 8601 standard are:

  • [-] (hyphen): to separate the time Elements "year" from "month" and "month" from "day" and to represent missing date components.
  • [:] (colon): to separate the time Elements "hour" from "minute" and "minute" from "second"
  • [/] (solidus): to separate components in the representation of date/time intervals
  • [P] (duration designator): precedes the components that represent the duration

    Spaces are not allowed in any ISO 8601 representations

Key aspects of the ISO 8601 standard are as follows:

  • ISO 8601 represents dates as a text string using the notation YYYY-MM-DD.
  • ISO 8601 represents times as a text string using the notation hh:mm:ss(.n+)?(((+|-)hh:mm)|Z)?.
  • The SDTM and SDTMIG require use of the ISO 8601 Extended format, which requires hyphen delimiters for date components and colon delimiters for time components. The ISO 8601 basic format, which does not require delimiters, should not be used in SDTM datasets.
  • When a date is stored with a time in the same variable (as a date/time), the date is written in front of the time and the time is preceded with "T" using the notation YYYY-MM-DDThh:mm:ss (e.g., 2001-12-26T00:00:01).

Implementation of the ISO 8601 standard means that date/time variables are character/text data types. The SDTM fragment employed for date/time character variables is DTC.
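The complete date/time pattern given above translates almost directly into a regular expression. The following is a sketch for complete values in the extended format only; the names are hypothetical, and it checks component ranges but not calendar validity (for example, it would accept 30 February). Partial and incomplete values need a different pattern.

```python
import re

# YYYY-MM-DDThh:mm:ss(.n+)?(((+|-)hh:mm)|Z)?
ISO8601_COMPLETE = re.compile(
    r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])"  # YYYY-MM-DD
    r"T([01]\d|2[0-3]):[0-5]\d:[0-5]\d"             # Thh:mm:ss, hours 00 through 23
    r"(\.\d+)?"                                     # optional fractions of seconds
    r"(([+-]([01]\d|2[0-3]):[0-5]\d)|Z)?"           # optional time zone
)


def is_complete_datetime(value):
    """Return True if value is a complete ISO 8601 extended-format date/time."""
    return ISO8601_COMPLETE.fullmatch(value) is not None


for candidate in ("2001-12-26T00:00:01", "2003-12-15T13:14:17.123",
                  "20031215T131417", "2003-12-15 13:14:17"):
    print(candidate, is_complete_datetime(candidate))
```

The last two candidates fail because the basic format (no delimiters) and embedded spaces are not permitted.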

4.4.2 Date/Time Precision

The concept of representing date/time precision is handled through use of the ISO 8601 standard. According to ISO 8601, precision (also referred to by ISO 8601 as "completeness" or "representations with reduced accuracy") can be inferred from the presence or absence of components in the date and/or time values. Missing components are represented by right truncation or a hyphen (for intermediate components that are missing). If the date and time values are completely missing, the SDTM date field should be null. Every component except year is represented as two digits. Years are represented as four digits; for all other components, one-digit numbers are always padded with a leading zero.

The table below provides examples of ISO 8601 representations of complete and truncated date/time values using ISO 8601 "appropriate right truncations" of incomplete date/time representations. Note that if no time component is represented, the [T] time designator (in addition to the missing time) must be omitted in ISO 8601 representation.


Row | Date and Time as Originally Recorded | Precision | ISO 8601 Date/Time
1 | December 15, 2003 13:14:17.123 | Date/time, including fractional seconds | 2003-12-15T13:14:17.123
2 | December 15, 2003 13:14:17 | Date/time to the nearest second | 2003-12-15T13:14:17
3 | December 15, 2003 13:14 | Unknown seconds | 2003-12-15T13:14
4 | December 15, 2003 13 | Unknown minutes and seconds | 2003-12-15T13
5 | December 15, 2003 | Unknown time | 2003-12-15
6 | December, 2003 | Unknown day and time | 2003-12
7 | 2003 | Unknown month, day, and time | 2003
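Because precision is implied by which components are present, it can be recovered from a right-truncated value by pattern matching. The sketch below uses hypothetical names, handles only right truncation (no hyphen placeholders), and ignores time zone suffixes.

```python
import re

# Each optional group can only appear if the component before it is present.
_TRUNCATED = re.compile(
    r"(?P<year>\d{4})"
    r"(?:-(?P<month>\d{2})"
    r"(?:-(?P<day>\d{2})"
    r"(?:T(?P<hour>\d{2})"
    r"(?::(?P<minute>\d{2})"
    r"(?::(?P<second>\d{2})"
    r"(?P<fraction>\.\d+)?"
    r")?)?)?)?)?"
)

_COMPONENTS = ("year", "month", "day", "hour", "minute", "second", "fraction")


def precision(dtc):
    """Return the name of the last component present in a right-truncated value."""
    match = _TRUNCATED.fullmatch(dtc)
    if match is None:
        raise ValueError(f"not a right-truncated ISO 8601 value: {dtc!r}")
    last = None
    for name in _COMPONENTS:
        if match.group(name) is not None:
            last = name
    return last


for value in ("2003-12-15T13:14:17.123", "2003-12-15T13:14", "2003-12-15", "2003"):
    print(value, precision(value))
```

A value such as "2003-12-15T" is rejected, consistent with the rule that the time designator must be omitted when no time component is represented.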

This date and date/time model also provides for imprecise or estimated dates, such as those commonly seen in Medical History. To represent these intervals while applying the ISO 8601 standard, it is recommended that the sponsor concatenate the date/time values (using the most complete representation of the date/time known) that describe the beginning and the end of the interval of uncertainty and separate them with a solidus as shown in the table below:


Row | Interval of Uncertainty | ISO 8601 Date/Time
1 | Between 10:00 and 10:30 on the morning of December 15, 2003 | 2003-12-15T10:00/2003-12-15T10:30
2 | Between the first of this year (2003) until "now" (February 15, 2003) | 2003-01-01/2003-02-15
3 | Between the first and the tenth of December, 2003 | 2003-12-01/2003-12-10
4 | Sometime in the first half of 2003 | 2003-01-01/2003-06-30
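Building such an interval is a matter of joining the two bounds with a solidus. A minimal sketch, with a hypothetical function name, assuming both bounds are already valid ISO 8601 values expressed to the same precision (so that plain text comparison orders them correctly):

```python
def uncertainty_interval(earliest, latest):
    """Join the bounds of an interval of uncertainty with a solidus."""
    for bound in (earliest, latest):
        if not bound or " " in bound or "/" in bound:
            raise ValueError(f"invalid interval bound: {bound!r}")
    if latest < earliest:
        raise ValueError("the end of the interval precedes its beginning")
    return f"{earliest}/{latest}"


print(uncertainty_interval("2003-12-15T10:00", "2003-12-15T10:30"))
print(uncertainty_interval("2003-01-01", "2003-06-30"))
```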

Other uncertainty intervals may be represented by the omission of components of the date when these components are unknown or missing. As mentioned above, ISO 8601 represents missing intermediate components through the use of a hyphen where the missing component would normally be represented. This may be used in addition to "appropriate right truncations" for incomplete date/time representations. When components are omitted, the expected delimiters must still be kept in place and only a single hyphen is to be used to indicate an omitted component. Examples of this method of omitted component representation are shown in the table below:

Row | Date and Time as Originally Recorded | Level of Uncertainty | ISO 8601 Date/Time
1 | December 15, 2003 13:15:17 | Date/time to the nearest second | 2003-12-15T13:15:17
2 | December 15, 2003 ??:15 | Unknown hour with known minutes | 2003-12-15T-:15
3 | December 15, 2003 13:??:17 | Unknown minutes with known date, hours, and seconds | 2003-12-15T13:-:17
4 | The 15th of some month in 2003, time not collected | Unknown month and time with known year and day | 2003---15
5 | December 15, but can't remember the year, time not collected | Unknown year with known month and day | --12-15
6 | 7:15 of some unknown date | Unknown date with known hour and minute | -----T07:15

Note that Row 6 above, where a time is reported with no date information, represents a very unusual situation. Since most data is collected as part of a visit, when only a time appears on a CRF, it is expected that the date of the visit would usually be used as the date of collection.
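The two rules (right truncation for trailing unknowns, a single hyphen for intermediate unknowns) can be combined in one small routine. The sketch below uses a hypothetical function name, takes whole-number components with None for unknown, and does not handle fractional seconds or time zones.

```python
def partial_dtc(year=None, month=None, day=None, hour=None, minute=None, second=None):
    """Build an ISO 8601 date/time text value from possibly unknown components."""
    parts = [year, month, day, hour, minute, second]
    known = [i for i, part in enumerate(parts) if part is not None]
    if not known:
        return ""                      # completely missing: the field is null
    last = known[-1]                   # everything after this is right-truncated
    widths = [4, 2, 2, 2, 2, 2]
    text = ["-" if part is None else f"{part:0{width}d}"
            for part, width in zip(parts, widths)]
    date = "-".join(text[: min(last, 2) + 1])
    if last < 3:
        return date                    # no time component, so no "T"
    return date + "T" + ":".join(text[3 : last + 1])


print(partial_dtc(2003, 12, 15, None, 15))        # unknown hour
print(partial_dtc(2003, None, 15))                # unknown month
print(partial_dtc(None, 12, 15))                  # unknown year
print(partial_dtc(hour=7, minute=15))             # unknown date
```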

Using a character-based data type to implement the ISO 8601 date/time standard will ensure that the date/time information will be machine and human readable without the need for further manipulation, and will be platform and software independent.

4.4.3 Intervals of Time and Use of Duration for --DUR Variables

4.4.3.1 Intervals of Time and Use of Duration

As defined by ISO 8601, an interval of time is the part of a time axis, limited by two time "instants" such as the times represented in SDTM by the variables --STDTC and --ENDTC. These variables represent the two instants that bound an interval of time, while the duration is the quantity of time that is equal to the difference between these time points.

ISO 8601 allows an interval to be represented in multiple ways. One representation, shown below, uses two dates in the format:

YYYY-MM-DDThh:mm:ss/YYYY-MM-DDThh:mm:ss

While the above would represent the interval (by providing the start date/time and end date/time to bound the interval of time), it does not provide the value of the duration (the quantity of time).

Duration is frequently used during a review; however, the duration timing variable (--DUR) should generally be used in a domain if it was collected in lieu of a start date/time (--STDTC) and end date/time (--ENDTC). If both --STDTC and --ENDTC are collected, durations can be calculated by the difference in these two values, and need not be in the submission dataset.

Both duration and duration units can be provided in the single --DUR variable, in accordance with the ISO 8601 standard. The values provided in --DUR should follow one of the following ISO 8601 duration formats:

PnYnMnDTnHnMnS

- or -

PnW

where:

  • [P] (duration designator): precedes the alphanumeric text string that represents the duration. Note that the use of the character P is based on the historical use of the term "period" for duration.
  • [n] represents a positive number or zero
  • [W] is used as the week designator, preceding a data element that represents the number of calendar weeks within the calendar year (e.g., P6W represents 6 weeks of calendar time).

The letter "P" must precede other values in the ISO 8601 representation of duration. The "n" preceding each letter represents the number of Years, Months, Days, Hours, Minutes, Seconds, or the number of Weeks. As with the date/time format, "T" is used to separate the date components from time components.

Note that weeks cannot be mixed with any other date/time components such as days or months in duration expressions.

As is the case with the date/time representation in --DTC, --STDTC, or --ENDTC, only the components of duration that are known or collected need to be represented. Also, as is the case with the date/time representation, if no time component is represented, the [T] time designator (in addition to the missing time) must be omitted in ISO 8601 representation.

ISO 8601 also allows that the "lowest-order components" of duration being represented may be represented in decimal format. This may be useful if data are collected in formats such as "one and one-half years", "two and one-half weeks", "one-half a week" or "one quarter of an hour" and the sponsor wishes to represent this "precision" (or lack of precision) in ISO 8601 representation. Remember that this is ONLY allowed in the lowest-order (right-most) component in any duration representation.

The table below provides some examples of ISO-8601-compliant representations of durations:

| Duration as Originally Recorded | ISO 8601 Duration |
|---|---|
| 2 Years | P2Y |
| 10 weeks | P10W |
| 3 Months 14 days | P3M14D |
| 3 Days | P3D |
| 6 Months 17 Days 3 Hours | P6M17DT3H |
| 14 Days 7 Hours 57 Minutes | P14DT7H57M |
| 42 Minutes 18 Seconds | PT42M18S |
| One-half hour | PT0.5H |
| 5 Days 12¼ Hours | P5DT12.25H |
| 4 ½ Weeks | P4.5W |

Note that a leading zero is required with decimal values less than one.
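
The duration rules in this section can be checked mechanically. The Python sketch below is an illustration only, not a conformance rule defined by the SDTMIG; the function name `is_valid_duration` is hypothetical. It checks the two permitted formats, rejects weeks mixed with other components, requires a leading zero on decimals, allows a decimal only in the lowest-order component, and rejects a [T] designator with no time component.

```python
import re

# A non-negative number; the leading digit makes ".5" invalid and "0.5" valid.
_NUM = r"\d+(?:\.\d+)?"

# Either PnW, or PnYnMnDTnHnMnS with every component optional.
_DURATION = re.compile(
    rf"P(?:(?P<W>{_NUM})W"
    rf"|(?:(?P<Y>{_NUM})Y)?(?:(?P<MO>{_NUM})M)?(?:(?P<D>{_NUM})D)?"
    rf"(?:T(?:(?P<H>{_NUM})H)?(?:(?P<MI>{_NUM})M)?(?:(?P<S>{_NUM})S)?)?)"
)


def is_valid_duration(text: str) -> bool:
    """Return True if text follows the duration rules described above."""
    match = _DURATION.fullmatch(text)
    if match is None:
        return False
    # Groups come back in order from highest-order to lowest-order component.
    present = [value for value in match.groups() if value is not None]
    if not present:
        return False  # "P" or "PT" alone carries no duration
    if text.endswith("T"):
        return False  # T must be omitted when no time component follows
    # A decimal fraction is allowed only in the right-most component.
    return all("." not in value for value in present[:-1])


examples = ["P2Y", "P10W", "P3M14D", "P6M17DT3H", "PT0.5H", "P5DT12.25H", "P4.5W"]
all_valid = all(is_valid_duration(value) for value in examples)
```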

4.4.3.2 Interval with Uncertainty

When an interval of time is an amount of time (duration) following an event whose start date/time is recorded (with some level of precision, i.e., when one knows the start date/time and the duration following the start date/time), the correct ISO 8601 usage to represent this interval is as follows:

YYYY-MM-DDThh:mm:ss/PnYnMnDTnHnMnS

where the start date/time is represented before the solidus [/], the "Pn…" following the solidus represents a "duration", and the entire representation is known as an "interval". Note that this is the recommended representation of elapsed time, given a start date/time and the duration elapsed.

When an interval of time is an amount of time (duration) measured prior to an event whose start date/time is recorded (with some level of precision, i.e., where one knows the end date/time and the duration preceding that end date/time), the syntax is:

PnYnMnDTnHnMnS/YYYY-MM-DDThh:mm:ss

where the duration, "Pn…", is represented before the solidus [/], the end date/time is represented following the solidus, and the entire representation is known as an "interval".

4.4.4 Use of the "Study Day" Variables

The permissible Study Day variables (--DY, --STDY, and --ENDY) describe the relative day of the observation starting with the reference date as Day 1. They are determined by comparing the date portion of the respective date/time variables (--DTC, --STDTC, and --ENDTC) to the date portion of the Subject Reference Start Date (RFSTDTC from the Demographics domain).

The Subject Reference Start Date (RFSTDTC) is designated as Study Day 1. The Study Day value is incremented by 1 for each date following RFSTDTC. Dates prior to RFSTDTC are decreased by 1, with the date preceding RFSTDTC designated as Study Day -1 (there is no Study Day 0). This algorithm for determining Study Day is consistent with how people typically describe sequential days relative to a fixed reference point, but creates problems if used for mathematical calculations because it does not allow for a Day 0. As such, Study Day is not suited for use in subsequent numerical computations, such as calculating duration. The raw date values should be used rather than Study Day in those calculations.

All Study Day values are integers. Thus, to calculate Study Day:

--DY = (date portion of --DTC) - (date portion of RFSTDTC) + 1 if --DTC is on or after RFSTDTC 
--DY = (date portion of --DTC) - (date portion of RFSTDTC) if --DTC precedes RFSTDTC

This algorithm should be used across all domains.
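
The two formulas above can be expressed as a short Python sketch. It is an illustration only; the function name `study_day` is hypothetical, and returning no value when either date portion is incomplete is an assumption of this sketch rather than a rule stated in this section.

```python
from datetime import date
from typing import Optional


def study_day(dtc: str, rfstdtc: str) -> Optional[int]:
    """Compute --DY from the date portions of --DTC and RFSTDTC.

    Returns None when either value lacks a complete YYYY-MM-DD date
    portion (an assumption of this sketch).
    """
    if not dtc or not rfstdtc:
        return None
    try:
        observed = date.fromisoformat(dtc[:10])
        reference = date.fromisoformat(rfstdtc[:10])
    except ValueError:
        return None
    difference = (observed - reference).days
    # On or after RFSTDTC: add 1, so the reference date itself is Day 1.
    # Before RFSTDTC: no adjustment, so the preceding date is Day -1.
    return difference + 1 if difference >= 0 else difference


day_one = study_day("2006-08-01T08:30", "2006-08-01")      # reference date
day_before = study_day("2006-07-31", "2006-08-01T08:00")   # day preceding it
```

Because the result skips 0, differences between two Study Day values that straddle the reference date do not equal the number of elapsed days, which is why the raw dates are preferred for calculations.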

4.4.5 Clinical Encounters and Visits

All domains based on the three general observation classes should have at least one timing variable. For domains in the Events or Interventions observation classes, and for domains in the Findings observation class, for which data are collected only once during the study, the most appropriate timing variable may be a date (e.g., --DTC, --STDTC) or some other timing variable. For studies that are designed with a prospectively defined schedule of visit-based activities, domains for data that are to be collected more than once per subject (e.g., Labs, ECG, Vital Signs) are expected to include VISITNUM as a timing variable.

Clinical encounters are described by the CDISC Visit variables. For planned visits, values of VISIT, VISITNUM, and VISITDY must be those defined in the Trial Visits (TV) dataset (Section 7.3.1, Trial Visits). For planned visits:

  • Values of VISITNUM are used for sorting and should, wherever possible, match the planned chronological order of visits. Occasionally, a protocol will define a planned visit whose timing is unpredictable (e.g., one planned in response to an adverse event, a threshold test value, or a disease event), and completely chronological values of VISITNUM may not be possible in such a case.
  • There should be a one-to-one relationship between values of VISIT and VISITNUM.
  • For visits that may last more than one calendar day, VISITDY should be the planned day of the start of the visit.

Sponsor practices for populating visit variables for unplanned visits may vary across sponsors.

  • VISITNUM should generally be populated, even for unplanned visits, as it is expected in many Findings domains, as described above. The easiest method of populating VISITNUM for unplanned visits is to assign the same value (e.g., 99) to all unplanned visits, but this method provides no differentiation between the unplanned visits and does not provide chronological sorting. Methods that provide a one-to-one relationship between visits and values of VISITNUM, that are consistent across domains, and that assign VISITNUM values that sort chronologically require more work and must be applied after all of a subject's unplanned visits are known.
  • VISIT may be left null or may be populated with a generic value (e.g., "Unscheduled") for all unplanned visits, or individual values may be assigned to different unplanned visits.
  • VISITDY must not be populated for unplanned visits, since VISITDY is, by definition, the planned study day of visit, and since the actual study day of an unplanned visit belongs in a --DY variable.

The following table shows an example of how the visit identifiers might be used for lab data:

| USUBJID | VISIT | VISITNUM | VISITDY | LBDY |
|---|---|---|---|---|
| 001 | Week 1 | 2 | 7 | 7 |
| 001 | Week 2 | 3 | 14 | 13 |
| 001 | Week 2 Unscheduled | 3.1 | | 17 |
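
One possible way to produce chronologically sorting VISITNUM values for unplanned visits, such as the value 3.1 in the table above, is sketched below in Python. The convention used here (the preceding planned VISITNUM plus 0.1 for each subsequent unplanned visit) is only one sponsor-style convention assumed for illustration; it is not prescribed by the SDTMIG. The function name `assign_unplanned_visitnum` and the dictionary keys are hypothetical, and visit dates are assumed to be complete ISO 8601 dates.

```python
from decimal import Decimal


def assign_unplanned_visitnum(visits):
    """Fill in VISITNUM for unplanned visits of one subject.

    visits: list of dicts with keys "date" (complete ISO 8601 date) and
    "visitnum" (a number for planned visits, None for unplanned visits).
    Must be run after all of the subject's unplanned visits are known.
    """
    result = []
    last_planned = Decimal(0)
    offset = 0
    for visit in sorted(visits, key=lambda v: v["date"]):
        if visit["visitnum"] is not None:
            last_planned = Decimal(str(visit["visitnum"]))
            offset = 0
            result.append({**visit, "visitnum": last_planned})
        else:
            offset += 1
            if offset > 9:
                # Assumed convention leaves room for nine unplanned visits.
                raise ValueError("too many unplanned visits after one planned visit")
            result.append({**visit, "visitnum": last_planned + Decimal(offset) / 10})
    return result


subject_visits = [
    {"visit": "Week 1", "date": "2006-08-07", "visitnum": 2},
    {"visit": "Week 2 Unscheduled", "date": "2006-08-17", "visitnum": None},
    {"visit": "Week 2", "date": "2006-08-13", "visitnum": 3},
]
numbered = assign_unplanned_visitnum(subject_visits)
```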

4.4.6 Representing Additional Study Days

The SDTM allows study days to be represented relative to the RFSTDTC reference start date variable in the DM dataset, using the --DY variables, as described above in Section 4.4.4, Use of the "Study Day" Variables. The calculation of additional study days within subdivisions of time in a clinical trial may be based on one or more sponsor-defined reference dates not represented by RFSTDTC. In such cases, the sponsor may define Supplemental Qualifier variables and the Define-XML document should reflect the reference dates used to calculate such study days. If the sponsor wishes to define "day within element" or "day within epoch", the reference date/time will be an element start date/time in the Subject Elements (SE) dataset (Section 5.3, Subject Elements).

4.4.7 Use of Relative Timing Variables

--STRF and --ENRF

The variables --STRF and --ENRF represent the timing of an observation relative to the sponsor-defined Study Reference Period, when information such as "BEFORE", "PRIOR", "ONGOING", or "CONTINUING" is collected in lieu of a date and this collected information is in relation to the sponsor-defined Study Reference Period. The sponsor-defined Study Reference Period is the continuous period of time defined by the discrete starting point, RFSTDTC, and the discrete ending point, RFENDTC, for each subject in the Demographics dataset.

--STRF is used to identify the start of an observation relative to the sponsor-defined Study Reference Period.

--ENRF is used to identify the end of an observation relative to the sponsor-defined Study Reference Period.

Allowable values for --STRF are "BEFORE", "DURING", "DURING/AFTER", "AFTER", and "U" (for unknown). Although "COINCIDENT" and "ONGOING" are in the STENRF codelist, they describe timing relative to a point in time rather than an interval of time, so are not appropriate for use with --STRF variables. It would be unusual for an event or intervention to be recorded as starting "AFTER" the Study Reference Period, but could be possible, depending on how the Study Reference Period is defined in a particular study.

Allowable values for --ENRF are "BEFORE", "DURING", "DURING/AFTER", "AFTER" and "U" (for unknown). If --ENRF is used, then --ENRF = "AFTER" means that the event did not end before or during the Study Reference Period. Although "COINCIDENT" and "ONGOING" are in the STENRF codelist, they describe timing relative to a point in time rather than an interval of time, so are not appropriate for use with --ENRF variables.

As an example, a CRF checkbox that identifies concomitant medication use that began prior to the Study Reference Period would translate into CMSTRF = "BEFORE", if selected. Note that in this example, the information collected is with respect to the start of the concomitant medication use only, and therefore the collected data corresponds to variable CMSTRF, not CMENRF. Note also that the information collected is relative to the Study Reference Period, which meets the definition of CMSTRF.

Some sponsors may wish to derive --STRF and --ENRF for analysis or reporting purposes even when dates are collected. Sponsors are cautioned that doing so in conjunction with directly collecting or mapping data such as "BEFORE", "PRIOR", "ONGOING", etc., to --STRF and --ENRF will blur the distinction between collected and derived values within the domain. Sponsors wishing to do such derivations are instead encouraged to use analysis datasets for this derived data.

In general, sponsors are cautioned that representing information using variables --STRF and --ENRF may not be as precise as other methods, particularly because information is often collected relative to a point in time or to a period of time other than the one defined as the Study Reference Period. SDTMIG v3.1.2 attempted to address these limitations by the addition of four new relative timing variables, which are described in the following paragraph. Sponsors should use the set of variables that allows for accurate representation of the collected data. In many cases, this will mean using these new relative timing variables in place of --STRF and --ENRF.

--STRTPT, --STTPT, --ENRTPT, and --ENTPT

While the variables --STRF and --ENRF are useful in the case when relative timing assessments are made coincident with the start and end of the Study Reference Period, these may not be suitable for expressing relative timing assessments such as "Prior" or "Ongoing" that are collected at other times of the study. As a result, four new timing variables were added in v3.1.2 to express a similar concept at any point in time. The variables --STRTPT and --ENRTPT contain values similar to --STRF and --ENRF, but may be anchored with any timing description or date/time value expressed in the respective --STTPT and --ENTPT variables, and are not limited to the Study Reference Period. Unlike the variables --STRF and --ENRF, which for all domains are defined relative to one Study Reference Period, the timing variables --STRTPT, --STTPT, --ENRTPT, and --ENTPT are defined by each sponsor for each study. Allowable values for --STRTPT and --ENRTPT are as follows:

If the reference time point corresponds to the date of collection or assessment:

  • Start values: An observation can start BEFORE that time point, can start COINCIDENT with that time point, or it is unknown (U) when it started.
  • End values: An observation can end BEFORE that time point, can end COINCIDENT with that time point, can be known that it didn't end but was ONGOING, or it is unknown (U) when it ended or if it was ongoing.
  • AFTER is not a valid value in this case because it would represent an event after the date of collection.

If the reference time point is prior to the date of collection or assessment:

  • Start values: An observation can start BEFORE the reference point, can start COINCIDENT with the reference point, can start AFTER the reference point, or it may not be known (U) when it started.
  • End values: An observation can end BEFORE the reference point, can end COINCIDENT with the reference point, can end AFTER the reference point, can be known that it didn't end but was ONGOING, or it is unknown (U) when it ended or if it was ongoing.

Although "DURING" and "DURING/AFTER" are in the STENRF codelist, they describe timing relative to an interval of time rather than a point in time, so are not allowable for use with --STRTPT and --ENRTPT variables.
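
The allowable values listed above depend on two things: whether the variable describes a start or an end, and whether the reference time point is the date of collection or an earlier point. The Python sketch below encodes that lookup. It is an illustration only; the names `allowed_rtpt_values` and `is_allowed_rtpt` are hypothetical.

```python
# Allowable values as listed in the text above, keyed by
# (variable describes an end?, reference point is the collection date?).
_ALLOWED = {
    (False, True): frozenset({"BEFORE", "COINCIDENT", "U"}),
    (True, True): frozenset({"BEFORE", "COINCIDENT", "ONGOING", "U"}),
    (False, False): frozenset({"BEFORE", "COINCIDENT", "AFTER", "U"}),
    (True, False): frozenset({"BEFORE", "COINCIDENT", "AFTER", "ONGOING", "U"}),
}


def allowed_rtpt_values(is_end: bool, reference_is_collection_date: bool) -> frozenset:
    """Return the allowable values for --STRTPT (is_end=False) or --ENRTPT."""
    return _ALLOWED[(is_end, reference_is_collection_date)]


def is_allowed_rtpt(value: str, is_end: bool, reference_is_collection_date: bool) -> bool:
    """Check one --STRTPT or --ENRTPT value against the lists above."""
    return value in allowed_rtpt_values(is_end, reference_is_collection_date)


# "ONGOING" is meaningful for an end, never for a start.
ongoing_end_ok = is_allowed_rtpt("ONGOING", is_end=True, reference_is_collection_date=True)
ongoing_start_ok = is_allowed_rtpt("ONGOING", is_end=False, reference_is_collection_date=True)
```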

Examples of --STRTPT, --STTPT, --ENRTPT, and --ENTPT

Example: Medical History

Assumptions:

  • CRF contains "Year Started" and check box for "Active"
  • "Date of Assessment" is collected

Example when "Active" is checked:

  • MHDTC = date of assessment value, e.g., "2006-11-02"
  • MHSTDTC = year of condition start, e.g., "2002"
  • MHENRTPT = "ONGOING"
  • MHENTPT = date of assessment value, e.g., "2006-11-02"

Figure 4.4.7: Example of --ENRTPT and --ENTPT for Medical History

Example: Prior and Concomitant Medications

Assumptions:

  • CRF includes collection of "Start Date" and "Stop Date", and check boxes for
    • "Prior" if start date was before the screening visit and was unknown or uncollected
    • "Continuing" if medication had not stopped as of the final study visit, so no end date was collected

Example when both "Prior" and "Continuing" are checked:

  • CMSTDTC is null
  • CMENDTC is null
  • CMSTRTPT = "BEFORE"
  • CMSTTPT is screening date, e.g., "2006-10-21"
  • CMENRTPT = "ONGOING"
  • CMENTPT is final study visit date, e.g., "2006-11-02"
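
The mapping in this example can be sketched in Python as follows. The function name `map_cm_timing` and its parameter names are hypothetical, and the sketch covers only the CRF design assumed in this example (check boxes used when the corresponding date was not collected).

```python
def map_cm_timing(start_date, stop_date, prior_checked, continuing_checked,
                  screening_date, final_visit_date):
    """Map the CRF fields of this example to CM timing variables.

    Dates are ISO 8601 strings or None; null variables are returned as "".
    """
    record = {
        "CMSTDTC": start_date or "",
        "CMENDTC": stop_date or "",
        "CMSTRTPT": "", "CMSTTPT": "",
        "CMENRTPT": "", "CMENTPT": "",
    }
    if prior_checked and not start_date:
        # Start is known only relative to the screening visit.
        record["CMSTRTPT"] = "BEFORE"
        record["CMSTTPT"] = screening_date
    if continuing_checked and not stop_date:
        # Medication had not stopped as of the final study visit.
        record["CMENRTPT"] = "ONGOING"
        record["CMENTPT"] = final_visit_date
    return record


both_checked = map_cm_timing(None, None, True, True, "2006-10-21", "2006-11-02")
```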

Example: Adverse Events

Assumptions:

  • CRF contains "Start Date", "Stop Date"
  • Collection of "Outcome" includes check boxes for "Continuing" and "Unknown", to be used, if necessary, at the end of the subject's participation in the trial
  • No assessment date or visit information was collected

Example when "Unknown" is checked:

  • AESTDTC is start date, e.g., "2006-10-01"
  • AEENDTC is null
  • AEENRTPT = "U"
  • AEENTPT is final subject contact date, e.g., "2006-11-02"

4.4.8 Date and Time Reported in a Domain Based on Findings

When the date/time of collection is reported in any domain, the date/time should go into the --DTC field (e.g., EGDTC for Date/Time of ECG). For any domain based on the Findings general observation class, such as lab tests which are based on a specimen, the collection date is likely to be tied to when the specimen or source of the finding was captured, not necessarily when the data were recorded. In order to ensure that the critical timing information is always represented in the same variable, the --DTC variable is used to represent the time of specimen collection. For example, in the LB domain the LBDTC variable would be used for all single-point blood collections or spot urine collections. For timed lab collections (e.g., 24-hour urine collections) the LBDTC variable would be used for the start date/time of the collection and LBENDTC for the end date/time of the collection. This approach will allow the single-point and interval collections to use the same date/time variables consistently across all datasets for the Findings general observation class. The table below illustrates the proper use of these variables. Note that --STDTC is not used for collection dates over an interval in the Findings general observation class and is therefore blank in the following table.

| Collection Type | --DTC | --STDTC | --ENDTC |
|---|---|---|---|
| Single-Point Collection | X | | |
| Interval Collection | X | | X |

4.4.9 Use of Dates as Result Variables

Dates are generally used only as timing variables to describe the timing of an event, intervention, or collection activity, but there may be occasions when it may be preferable to model a date as a result (--ORRES) in a Findings dataset. Note that using a date as a result to a Findings question is unusual and atypical, and should be approached with caution. This situation, however, may occasionally occur when a) a group of questions (each of which has a date response) is asked and analyzed together; or b) the Event(s) and Intervention(s) in question are not medically significant (often the case when included in questionnaires). Consider the following cases:

  • Calculated due date
  • Date of last day on the job
  • Date of high school graduation

One approach to modeling these data would be to place the text of the question in --TEST and the response to the question, a date represented in ISO 8601 format, in --ORRES and --STRESC, as long as these date results do not contain the dates of medically significant events or interventions.

Again, use extreme caution when storing dates as the results of Findings. Remember, in most cases, these dates should be timing variables associated with a record in an Intervention or Events dataset.

4.4.10 Representing Time Points

Time points can be represented using the time point variables, --TPT, --TPTNUM, --ELTM, and the time point anchors, --TPTREF (text description) and --RFTDTC (the date/time). Note that time-point data will usually have an associated --DTC value. The interrelationship of these variables is shown in Figure 4.4.10 below.

Figure 4.4.10: Relationships among Time Point Variables

Values for these variables for Vital Signs measurements taken at 30, 60, and 90 minutes after dosing would look like the following.

| VSTPTNUM | VSTPT | VSELTM | VSTPTREF | VSRFTDTC | VSDTC |
|---|---|---|---|---|---|
| 1 | 30 MIN | PT30M | DOSE ADMINISTRATION | 2006-08-01T08:00 | 2006-08-01T08:30 |
| 2 | 60 MIN | PT1H | DOSE ADMINISTRATION | 2006-08-01T08:00 | 2006-08-01T09:01 |
| 3 | 90 MIN | PT1H30M | DOSE ADMINISTRATION | 2006-08-01T08:00 | 2006-08-01T09:32 |

Note that VSELTM is the planned elapsed time, not the actual elapsed time. The actual elapsed time could be derived in an analysis dataset, if desired, as VSDTC-VSRFTDTC.
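
The derivation of actual elapsed time mentioned above might look like the following Python sketch, intended for an analysis dataset rather than SDTM. The function name `actual_elapsed_minutes` is hypothetical, and the sketch assumes both values are complete to the minute, as in the Vital Signs example.

```python
from datetime import datetime

_FORMAT = "%Y-%m-%dT%H:%M"  # assumes date/times complete to the minute


def actual_elapsed_minutes(dtc: str, rftdtc: str) -> int:
    """Return the actual elapsed time (--DTC minus --RFTDTC) in minutes.

    The result is negative when the observation precedes the reference.
    """
    elapsed = datetime.strptime(dtc, _FORMAT) - datetime.strptime(rftdtc, _FORMAT)
    return int(elapsed.total_seconds() // 60)


# Planned 60 MIN time point (VSELTM = PT1H), actually taken at 09:01
actual_60 = actual_elapsed_minutes("2006-08-01T09:01", "2006-08-01T08:00")
```

For the second row of the Vital Signs example this gives 61 minutes, one minute more than the planned PT1H.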

Values for these variables for Urine Collections taken pre-dose, and from 0-12 hours and 12-24 hours after dosing would look like the following.

| LBTPTNUM | LBTPT | LBELTM | LBTPTREF | LBRFTDTC | LBDTC |
|---|---|---|---|---|---|
| 1 | 15 MIN PRE-DOSE | -PT15M | DOSE ADMINISTRATION | 2006-08-01T08:00 | 2006-08-01T07:45 |
| 2 | 0-12 HOURS | PT12H | DOSE ADMINISTRATION | 2006-08-01T08:00 | 2006-08-01T20:35 |
| 3 | 12-24 HOURS | PT24H | DOSE ADMINISTRATION | 2006-08-01T08:00 | 2006-08-02T08:40 |

Note that the value in LBELTM represents the end of the specimen collection interval.

When time points are used, --TPTNUM is expected. Time points may or may not have an associated --TPTREF. Sometimes, --TPTNUM may be used as a key for multiple values collected for the same test within a visit; as such, there is no dependence upon an anchor such as --TPTREF, but there will be a dependency upon the VISITNUM. In such cases, VISITNUM will be required to confer uniqueness to values of --TPTNUM.

If the protocol describes the scheduling of a dose using a reference intervention or assessment, then --TPTREF should be populated, even if it does not contribute to uniqueness. The fact that time points are related to a reference time point, and what that reference time point is, are important for interpreting the data collected at the time point.

Not all time points will require all three variables to provide uniqueness. In fact, in some cases a time point may be uniquely identified without the use of VISIT, or without the use of --TPTREF, or without the use of either one. For instance:

  • A trial might have time points only within one visit, so that the contribution of VISITNUM to uniqueness is trivial. (VISITNUM would be populated, but would not contribute to uniqueness.)
  • A trial might have time points that do not relate to any visit, such as time points relative to a dose of drug self-administered by the subject at home. (Visit variables would not be included, but --TPTREF and other time point variables would be populated.)
  • A trial may have only one reference time point per visit, and all reference time points may be similar, so that only one value of --TPTREF (e.g., "DOSE") is needed. (--TPTREF would be populated, but would not contribute to uniqueness.)
  • A trial may have time points not related to a reference time point. For instance, --TPTNUM values could be used to distinguish first, second, and third repeats of a measurement scheduled without any relationship to dosing. (--TPTREF and --ELTM would not be included.) In this case, where the protocol calls for repeated measurements but does not specify timing of the measurements, the --REPNUM variable could be used instead of time point variables.

For trials with many time points, the requirement to provide uniqueness using only VISITNUM, --TPTREF, and --TPTNUM may lead to a scheme where multiple natural keys are combined into the values of one of these variables.

For instance, in a crossover trial with multiple doses on multiple days within each period, either of the following options could be used. VISITNUM might be used to designate period, --TPTREF might be used to designate the day and the dose, and --TPTNUM might be used to designate the timing relative to the reference time point. Alternatively, VISITNUM might be used to designate period and day within period, --TPTREF might be used to designate the dose within the day, and --TPTNUM might be used to designate the timing relative to the reference time point.

Option 1

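| VISIT | VISITNUM | --TPT | --TPTNUM | --TPTREF |
|---|---|---|---|---|
| PERIOD 1 | 3 | PRE-DOSE | 1 | DAY 1, AM DOSE |
| PERIOD 1 | 3 | 1H | 2 | DAY 1, AM DOSE |
| PERIOD 1 | 3 | 4H | 3 | DAY 1, AM DOSE |
| PERIOD 1 | 3 | PRE-DOSE | 1 | DAY 1, PM DOSE |
| PERIOD 1 | 3 | 1H | 2 | DAY 1, PM DOSE |
| PERIOD 1 | 3 | 4H | 3 | DAY 1, PM DOSE |
| PERIOD 1 | 3 | PRE-DOSE | 1 | DAY 5, AM DOSE |
| PERIOD 1 | 3 | 1H | 2 | DAY 5, AM DOSE |
| PERIOD 1 | 3 | 4H | 3 | DAY 5, AM DOSE |
| PERIOD 1 | 3 | PRE-DOSE | 1 | DAY 5, PM DOSE |
| PERIOD 1 | 3 | 1H | 2 | DAY 5, PM DOSE |
| PERIOD 1 | 3 | 4H | 3 | DAY 5, PM DOSE |
| PERIOD 2 | 4 | PRE-DOSE | 1 | DAY 1, AM DOSE |
| PERIOD 2 | 4 | 1H | 2 | DAY 1, AM DOSE |
| PERIOD 2 | 4 | 4H | 3 | DAY 1, AM DOSE |
| PERIOD 2 | 4 | PRE-DOSE | 1 | DAY 1, PM DOSE |
| PERIOD 2 | 4 | 1H | 2 | DAY 1, PM DOSE |
| PERIOD 2 | 4 | 4H | 3 | DAY 1, PM DOSE |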

Option 2

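| VISIT | VISITNUM | --TPT | --TPTNUM | --TPTREF |
|---|---|---|---|---|
| PERIOD 1, DAY 1 | 3 | PRE-DOSE | 1 | AM DOSE |
| PERIOD 1, DAY 1 | 3 | 1H | 2 | AM DOSE |
| PERIOD 1, DAY 1 | 3 | 4H | 3 | AM DOSE |
| PERIOD 1, DAY 1 | 3 | PRE-DOSE | 1 | PM DOSE |
| PERIOD 1, DAY 1 | 3 | 1H | 2 | PM DOSE |
| PERIOD 1, DAY 1 | 3 | 4H | 3 | PM DOSE |
| PERIOD 1, DAY 5 | 4 | PRE-DOSE | 1 | AM DOSE |
| PERIOD 1, DAY 5 | 4 | 1H | 2 | AM DOSE |
| PERIOD 1, DAY 5 | 4 | 4H | 3 | AM DOSE |
| PERIOD 1, DAY 5 | 4 | PRE-DOSE | 1 | PM DOSE |
| PERIOD 1, DAY 5 | 4 | 1H | 2 | PM DOSE |
| PERIOD 1, DAY 5 | 4 | 4H | 3 | PM DOSE |
| PERIOD 2, DAY 1 | 5 | PRE-DOSE | 1 | AM DOSE |
| PERIOD 2, DAY 1 | 5 | 1H | 2 | AM DOSE |
| PERIOD 2, DAY 1 | 5 | 4H | 3 | AM DOSE |
| PERIOD 2, DAY 1 | 5 | PRE-DOSE | 1 | PM DOSE |
| PERIOD 2, DAY 1 | 5 | 1H | 2 | PM DOSE |
| PERIOD 2, DAY 1 | 5 | 4H | 3 | PM DOSE |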

Within the context that defines uniqueness for a time point, which may include domain, visit, and reference time point, there must be a one-to-one relationship between values of --TPT and --TPTNUM. In other words, if domain, visit, and reference time point uniquely identify subject data, then if two subjects have records with the same values of DOMAIN, VISITNUM, --TPTREF, and --TPTNUM, then these records may not have different time point descriptions in --TPT.
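
A consistency check for this relationship could be sketched in Python as below. It is an illustration only; the function name `tpt_violations` is hypothetical, records are assumed to be dictionaries keyed by variable name, and the context is assumed to be DOMAIN, VISITNUM, and --TPTREF, which may differ for a given study.

```python
from collections import defaultdict


def tpt_violations(records, prefix):
    """Find breaks in the one-to-one relationship of --TPT and --TPTNUM.

    records: iterable of dicts keyed by variable name (all subjects pooled).
    prefix: two-character domain prefix, e.g. "VS".
    """
    tpt, tptnum, tptref = f"{prefix}TPT", f"{prefix}TPTNUM", f"{prefix}TPTREF"
    descriptions = defaultdict(set)  # (context, TPTNUM) -> TPT values seen
    numbers = defaultdict(set)       # (context, TPT) -> TPTNUM values seen
    for rec in records:
        context = (rec.get("DOMAIN"), rec.get("VISITNUM"), rec.get(tptref))
        descriptions[context + (rec.get(tptnum),)].add(rec.get(tpt))
        numbers[context + (rec.get(tpt),)].add(rec.get(tptnum))

    problems = []
    for key, values in descriptions.items():
        if len(values) > 1:
            problems.append((tptnum, key, sorted(values, key=str)))
    for key, values in numbers.items():
        if len(values) > 1:
            problems.append((tpt, key, sorted(values, key=str)))
    return problems


vs_records = [
    {"DOMAIN": "VS", "USUBJID": "001", "VISITNUM": 3, "VSTPTREF": "DOSE",
     "VSTPTNUM": 2, "VSTPT": "1H"},
    {"DOMAIN": "VS", "USUBJID": "002", "VISITNUM": 3, "VSTPTREF": "DOSE",
     "VSTPTNUM": 2, "VSTPT": "60 MIN"},
]
found = tpt_violations(vs_records, "VS")
```

In this usage example the two subjects share DOMAIN, VISITNUM, VSTPTREF, and VSTPTNUM but carry different VSTPT descriptions, so one violation is reported.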

Within the context that defines uniqueness for a time point, there is likely to be a one-to-one relationship between most values of --TPT and --ELTM. However, since --ELTM can only be populated with ISO 8601 periods of time (as described in Section 4.4.3, Intervals of Time and Use of Duration for --DUR Variables), --ELTM may not be populated for all time points. For example, --ELTM is likely to be null for time points described by text such as "pre-dose" or "before breakfast". When --ELTM is populated, if two subjects have records with the same values of DOMAIN, VISITNUM, --TPTREF, and --TPTNUM, then these records may not have different values in --ELTM.

When the protocol describes a time point with text such as "4-6 hours after dose" or "12 hours +/- 2 hours after dose" the sponsor may choose whether and how to populate --ELTM. For example, a time point described as "4-6 hours after dose" might be associated with an --ELTM value of PT4H. A time point described as "12 hours +/- 2 hours after dose" might be associated with an --ELTM value of PT12H. Conventions for populating --ELTM should be consistent (the examples just given would probably not both be used in the same trial). It would be good practice to indicate the range of intended timings by some convention in the values used to populate --TPT.

Sponsors may, of course, use more stringent requirements for populating --TPTNUM, --TPT, and --ELTM. For instance, a sponsor could decide that all time points with a particular --ELTM value would have the same values of --TPTNUM, and --TPT, across all visits, reference time points, and domains.

4.4.11 Disease Milestones and Disease Milestone Timing Variables

A "disease milestone" is an event or activity that can be anticipated in the course of a disease, but whose timing is not controlled by the study schedule. A disease milestone may be something that occurred pre-study, but which represents a time at which data would have been collected, such as diagnosis of the disease under study. A disease milestone may also be something which is anticipated to occur during a study and which, if it occurs, triggers the collection of related data outside the regular schedule of visits, such as an adverse event of interest. The types of Disease Milestones for a study are defined in the study-level Trial Disease Milestones (TM) dataset (Section 7.3.3, Trial Disease Milestones). The times at which disease milestones occurred for a particular subject are summarized in the special purpose Subject Disease Milestones (SM) domain (Section 5.4, Subject Disease Milestones), a domain similar in structure to the Subject Visits (SV) and Subject Elements (SE) domains.

Not all studies will have disease milestones. If a study does not have disease milestones, the TM and SM domains will not be present and the disease milestones timing variables may not be included in other domains.

Disease Milestone Naming

Instances of disease milestones are given names at a subject level. The name of a disease milestone is composed of a character string that depends on the disease milestone type (MIDSTYPE in TM and SM) and, if the type of disease milestone is one that may occur multiple times, a chronological sequence number for this disease milestone among other instances of the same type for the subject. The character string used in the name of a disease milestone is usually a short form of the disease milestone type. For example, if the type of disease milestone was "EPISODE OF DISEASE UNDER STUDY", the values of MIDS for instances of this type of event could include "EPISODE1", "EPISODE2", etc., or "EPISODE01", "EPISODE02", etc. The association between the longer text in MIDSTYPE and the shorter text in MIDS can be seen in SM, which includes both variables.
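
The naming scheme can be illustrated with a short Python sketch. The function name `name_disease_milestones` and its parameters are hypothetical, the short form (e.g., "EPISODE") is chosen by the sponsor, and the milestone start dates are assumed to be complete ISO 8601 dates so that sorting them as text gives chronological order.

```python
def name_disease_milestones(short_name, start_dtcs, may_repeat=True, pad=0):
    """Return (start date, MIDS name) pairs for one subject and one type.

    short_name: sponsor-chosen short form of MIDSTYPE.
    pad: minimum number of digits in the sequence number (2 gives "01").
    """
    ordered = sorted(start_dtcs)
    if not may_repeat:
        if len(ordered) > 1:
            raise ValueError("this milestone type may occur only once per subject")
        # No sequence number when the type cannot repeat.
        return [(dtc, short_name) for dtc in ordered]
    return [(dtc, f"{short_name}{str(number).zfill(pad)}")
            for number, dtc in enumerate(ordered, start=1)]


episodes = name_disease_milestones("EPISODE", ["2006-05-01", "2005-11-12"])
padded = name_disease_milestones("EPISODE", ["2006-05-01", "2005-11-12"], pad=2)
```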

Disease Milestones Name (MIDS)

If something that has been defined as a disease milestone for a particular study occurred for a particular subject, it is represented as usual, in the appropriate findings, intervention, or events class record. In addition, this record will include the MIDS timing variable, populated with the name of the disease milestone. The timing of a disease milestone is also represented in the special purpose SM domain.

The record that represents a disease milestone does not include values for the timing variables RELMIDS and MIDSDTC, which are used to represent the timing of other observations relative to a disease milestone. The usual timing variables in the record for a disease milestone (e.g., --DTC, --STDTC, --ENDTC) provide the needed timing for this observation and for the timing information represented in the SM domain.

Timing Relative to a Disease Milestone (MIDS, RELMIDS, MIDSDTC)

For an observation triggered by the occurrence of a disease milestone, the relationship of the observation to the disease milestone can be represented using the disease milestones timing variables MIDS, RELMIDS, and MIDSDTC to describe the timing of the observation.

  • MIDS is populated with the name of a disease milestone for this subject. MIDS is the "anchor" for describing the timing of the observation relative to the disease milestone. In this sense, its function is similar to --TPTREF for time points.
  • RELMIDS is usually populated with a textual description of the temporal relationship between the observation and the disease milestone named in MIDS. Controlled vocabulary has not yet been developed for RELMIDS, but is likely to include terms such as "IMMEDIATELY BEFORE", "AT START OF", "DURING", "AT END OF", and "SHORTLY AFTER". It is similar to --ELTM, except that --ELTM is represented as an ISO 8601 duration.
  • MIDSDTC is populated with the date/time of the disease milestone. This is the --DTC for a finding, or the --STDTC for an event or intervention, and is the date recorded in SMSTDTC in the SM domain. Its function is similar to --RFTDTC for time points.

In some cases, data collected in conjunction with a disease milestone does not include the collection of a separate date for the related observation. This is particularly common for pre-study disease milestones, but may occur with on-study disease milestones as well. In such cases, MIDSDTC provides a related date/time in records that would not otherwise contain any date. In records that do contain date/time(s) of the observation, MIDSDTC allows easy comparison of the date(s) of the observation to the (start) date of the disease milestone. In such cases, it functions much like the reference time point date/time (--RFTDTC) in observations at time points.

When a disease milestone is an event or intervention, some data triggered by the disease milestone may be modeled as Findings About the disease milestone (i.e., FAOBJ is the disease milestone). In such cases, RELMIDS should be used to describe the temporal relationship between the Disease Milestone and the subject of the question being asked in the finding, rather than as describing when the question was asked.

  • When the subject of the question is the disease milestone itself, RELMIDS may be populated with a value such as "ENTIRE EVENT" or "ENTIRE TREATMENT."
  • When the subject of the question is a question about the occurrence of some activity or event related to the disease milestone, RELMIDS acts like an evaluation interval, describing the period of time over which the question is focused.
    • For questions about a possible cause of an event or about the indication for a treatment, RELMIDS would have a value such as "WEEK PRIOR" or "IMMEDIATELY BEFORE", or even just "BEFORE".
    • RELMIDS would be "DURING" for questions about things that may have occurred while an event or intervention disease milestone was in progress.
    • For sequelae of a disease milestone, RELMIDS would have a value such as "AT DISCHARGE" or "WEEK AFTER" or simply "AFTER".

Use of Disease Milestone Timing Variables with other Timing Variables

The disease milestone timing variables provide timing relative to an activity or event that has been identified, for the particular study, as a disease milestone. Their use does not preclude the use of variables that collect actual date/times or timing relative to the study schedule.

  • The use of actual date/times is unaffected. The Disease Milestone Timing variables may provide timing information in cases where actual date/times are unavailable, particularly for pre-study disease milestones. When the question text for an observation references a disease milestone, but a separate date for the observation is not collected, the disease milestone timing variables should be populated but the actual date/s should not be imputed by populating them with the date of the disease milestone. Examples of such questions: Disease stage at initial diagnosis of disease under study; Treatment for most recent disease episode.
  • Study-day variables should be populated wherever complete actual date/times are populated. This includes negative study days for pre-study observations.
  • The timing variables EPOCH and TAETORD (Planned Order of Element within Arm) may be populated for on-study observations associated with disease milestones. However, pre-study disease milestones (those that occur before study participation begins, when informed consent is obtained) do not, by definition, have an associated EPOCH or TAETORD.
  • Visit variables are expected in many findings domains, but findings triggered by the occurrence of a study milestone may not occur at a scheduled visit.
    • Findings associated with pre-study disease milestones are often collected at a screening visit, although the test was not performed at that visit.
    • For findings associated with on-study disease milestones but not conducted at a scheduled visit, practices for populating VISITNUM as for an unscheduled visit should be followed.
  • The use of time-point variables with disease milestone variables may occur in cases where a disease milestone triggers treatment, and time points relative to treatment are part of the study schedule. For instance, a migraine trial may call for assessments of symptom severity at prescribed times after treatment of the migraine. If the migraine episodes were treated as disease milestones, then the disease milestone timing variables might be populated in the exposure and symptom-severity records. If the study planned to treat multiple migraine episodes, the MIDS variable would provide a convenient way to determine the episode with which data were associated.
  • An evaluation interval variable (--EVLINT or --EVLTXT) could be used in conjunction with disease milestone variables. For instance, patient-reported outcome instruments might be administered at the time of a disease milestone, and the questions in the instrument might include an evaluation interval.
  • The timing variables for start and end of an event or intervention relative to the study reference period (--STRF and --ENRF) or relative to a reference time point (--STRTPT and --STTPT, --ENRTPT and --ENTPT) could be used in conjunction with disease milestone variables. For example, a concomitant medication could be collected in association with a disease milestone, so that the disease milestone timing variables were populated, but relative timing variables could be used for the start or end of the concomitant medication.
  • The timing variables for start and end of a planned assessment interval might be populated for an assessment triggered by a disease milestone, if applicable. For example, the occurrence of a particular event might trigger both a treatment and Holter monitoring for 24 hours after the treatment.

Linking and Disease Milestones

When disease milestones have been defined for a study, the MIDS variable serves to link observations associated with a disease milestone in a way similar to the way that VISITNUM links observations collected at a visit. If disease milestones were not defined for the study, it would be possible to link records associated with a disease milestone using RELREC, but the use of disease milestones has certain advantages:

  • RELREC indicates that there is a relationship between records or datasets, but not the nature of the relationship. Records with the same MIDS value are related to the same disease milestone.
  • When disease milestones are defined, it is not necessary to create RELREC records to establish relationships between observations associated with a disease milestone.
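
The linking role of MIDS described above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary-based record layout, and the milestone names "HYPO1" and "HYPO2" are assumptions made for the sketch, not part of the SDTM.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Dict[str, str]

def group_by_milestone(records: List[Record]) -> Dict[Tuple[str, str], List[Record]]:
    """Group records from any domains by (USUBJID, MIDS).

    Records with a null or missing MIDS are not tied to a milestone and are skipped.
    """
    groups = defaultdict(list)
    for record in records:
        mids = record.get("MIDS")
        if mids:
            groups[(record["USUBJID"], mids)].append(record)
    return dict(groups)

# Illustrative records only; the values are invented for this sketch.
records = [
    {"DOMAIN": "EX", "USUBJID": "99-123", "MIDS": "HYPO1"},
    {"DOMAIN": "FA", "USUBJID": "99-123", "MIDS": "HYPO1"},
    {"DOMAIN": "LB", "USUBJID": "99-123", "MIDS": "HYPO2"},
    {"DOMAIN": "VS", "USUBJID": "99-123", "MIDS": ""},
]
linked = group_by_milestone(records)
```

No RELREC records are needed for this grouping; the shared MIDS value alone identifies the related observations.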

4.5 Other Assumptions

4.5.1 Original and Standardized Results of Findings and Tests Not Done

4.5.1.1 Original and Standardized Results

The --ORRES variable contains the result of the measurement or finding as originally received or collected. --ORRES is an expected variable and should always be populated, with two exceptions:

  • When --STAT = "NOT DONE" since there is no result for such a record
  • When --DRVFL = "Y" since the distinction between an original result and a standard result is not applicable for records for which --DRVFL = "Y".

Note that records for which --DRVFL = "Y" may combine data collected at more than one visit. In such a case the sponsor must define the value for VISITNUM, addressing the correct temporal sequence. If a new record is derived for a dataset, and the source is not eDT, then that new record should be flagged as derived.

For example, in ECG data, if a corrected QT interval value derived in-house by the sponsor were represented in an SDTM record, then EGDRVFL would be "Y". If a corrected QT interval value was received from a vendor or was produced by the ECG machine, the derived flag would be null.

When --ORRES is populated, --STRESC must also be populated, regardless of whether the data values are character or numeric. The variable, --STRESC, is populated either by the conversion of values in --ORRES to values with standard units, or by the assignment of the value of --ORRES (as in the PE Domain, where --STRESC could contain a dictionary-derived term). A further step is necessary when --STRESC contains numeric values. These are converted to numeric type and written to --STRESN. Because --STRESC may contain a mixture of numeric and character values, --STRESN may contain null values, as shown in the flowchart below.

--ORRES (all original values) → --STRESC (derive or copy all results) → --STRESN (numeric results only)
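
The flow from --ORRES to --STRESC to --STRESN can be sketched as follows. This is a minimal illustration under stated assumptions: the conversion table, the function names, and the rounding to six decimal places are hypothetical, since standard units and conversion factors are defined by the sponsor.

```python
from typing import Optional, Tuple

# Hypothetical sponsor-defined conversions: (original unit, standard unit) -> factor.
CONVERSIONS = {
    ("mg/dL", "mg/L"): 10.0,
    ("msec", "sec"): 0.001,
}

def to_number(text: str) -> Optional[float]:
    """Return text as a float, or None when float() cannot parse it."""
    try:
        return float(text)
    except ValueError:
        return None

def standardize(orres: str, orresu: str, stresu: str) -> Tuple[str, Optional[float]]:
    """Derive (--STRESC, --STRESN) from --ORRES.

    Character results are copied to --STRESC and leave --STRESN null (None).
    Numeric results are copied when the unit is already standard, and
    converted otherwise.
    """
    value = to_number(orres)
    if value is None:
        return orres, None                      # character result: copy only
    if orresu == stresu:
        return orres, value                     # already in the standard unit
    value = round(value * CONVERSIONS[(orresu, stresu)], 6)
    return format(value, "g"), value            # converted result
```

For instance, `standardize("MODERATE", "", "")` copies the character result and leaves the numeric result null, while `standardize("221", "msec", "sec")` converts the value before populating both standardized variables.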

When the original measurement or finding is a selection from a defined codelist, in general, the --ORRES and --STRESC variables contain results in decoded format, that is, the textual interpretation of whichever code was selected from the codelist. In some cases where the code values in the codelist are statistically meaningful standardized values or scores, which are defined by sponsors or by valid methodologies such as SF36 questionnaires, the --ORRES variables will contain the decoded format, whereas, the --STRESC variables as well as the --STRESN variables will contain the standardized values or scores.

Occasionally, data that are intended to be numeric are collected with characters attached that cause the character-to-numeric conversion to fail. For example, numeric cell counts in the source data may be specified with a greater than (>) or less than (<) sign attached (e.g., >10,000 or <1). In these cases, the value with the greater than (>) or less than (<) sign attached should be moved to the --STRESC variable, and --STRESN should be null. The rules for modifying the value for analysis purposes should be defined in the analysis plan, and a numeric value should only be imputed in the ADaM datasets. If the value in --STRESC is expressed in different units from the value in --ORRES, the greater than (>) or less than (<) sign should be maintained. An example is included in Section 4.5.1.3, Examples of Original and Standard Units and Test Not Done, Example 1, Rows 11 and 12.
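
The handling of results collected as inequalities might be sketched as below. The function name, the regular expression, and the default of two decimal places are assumptions for illustration, and the bilirubin factor of 17.1 (mg/dL to umol/L) in the usage note is an approximate illustrative value. No numeric value is imputed: --STRESN is always returned as null.

```python
import re
from typing import Optional, Tuple

# A comparison sign followed by a number that may contain thousands separators.
_INEQUALITY = re.compile(r"^\s*([<>]=?)\s*([\d,]*\.?\d+)\s*$")

def standardize_inequality(orres: str, factor: float = 1.0,
                           decimals: int = 2) -> Tuple[str, Optional[float]]:
    """Return (--STRESC, --STRESN) for a result such as "<0.1" or ">10,000".

    The sign is retained in --STRESC; --STRESN is always None (null).
    """
    match = _INEQUALITY.match(orres)
    if match is None:
        raise ValueError(f"not an inequality result: {orres!r}")
    sign, number = match.groups()
    if factor == 1.0:
        return sign + number, None              # same unit: copy as collected
    converted = round(float(number.replace(",", "")) * factor, decimals)
    return f"{sign}{converted:.{decimals}f}", None
```

With these assumptions, `standardize_inequality("<0.1", factor=17.1)` retains the sign while converting the numeric part, and `standardize_inequality("<4,000")` copies the collected value unchanged.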

4.5.1.2 Tests Not Done

If the data on the CRF is missing and "Yes/No" or "Done/Not Done" was not explicitly captured, a record should not be created to indicate that the data was not collected.

When an entire examination (laboratory draw, ECG, vital signs, or physical examination), or a group of tests (hematology or urinalysis), or an individual test (glucose, PR interval, blood pressure, or hearing) is not done, and this information is explicitly captured with a "Yes/No" or "Done/Not Done" question, this information should be represented in the dataset. The reason for the missing information may or may not have been collected.

A sponsor has two options:

  1. Submit individual records for each test not done.
  2. Submit one record for a group of tests that were not done.

The example below illustrates the single-record approach for representing a group of tests not done.

If a single record is used to represent a group of tests that were not done:

  • --TESTCD should be --ALL
  • --TEST should be <Domain description>
  • --CAT should be <Name of group of tests>
  • --ORRES should be null
  • --STAT should be "NOT DONE"
  • --REASND, if collected, might be "Specimen lost"

For example, if urinalysis tests were not done, then:

  • LBTESTCD would be "LBALL".
  • LBTEST would be "Laboratory Test Results".
  • LBCAT would be "URINALYSIS".
  • LBORRES would be null.
  • LBSTAT would be "NOT DONE".
  • LBREASND, if collected, might be "Subject could not void".
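
The single-record convention above could be generated with a small helper such as the following sketch. The function name and the dictionary representation of a record are assumptions for illustration; only the variable values follow the rules listed above.

```python
from typing import Dict, Optional

def not_done_group_record(domain: str, domain_description: str,
                          category: Optional[str] = None,
                          reason: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Build one record representing a group of tests that were not done."""
    return {
        f"{domain}TESTCD": f"{domain}ALL",
        f"{domain}TEST": domain_description,
        f"{domain}CAT": category,       # name of the group of tests, if any
        f"{domain}ORRES": None,         # no result for a NOT DONE record
        f"{domain}STAT": "NOT DONE",
        f"{domain}REASND": reason,      # populated only if collected
    }

record = not_done_group_record("LB", "Laboratory Test Results",
                               category="URINALYSIS",
                               reason="Subject could not void")
```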

4.5.1.3 Examples of Original and Standard Units and Test Not Done

The following examples are meant to illustrate the use of Findings results variables, and are not meant as comprehensive domain examples. Certain required and expected variables are omitted, for example USUBJID, and the samples may represent data for more than one subject.

Example 1

Row 1: A numeric value was converted to the standard unit.
Row 2: A numeric value was copied; the original unit was the standard unit so conversion was not needed.
Rows 3-4: A character result was copied from LBORRES to LBSTRESC. Since this is not a numeric result, LBSTRESN is null.
Row 5: A character result was converted to a standardized format.
Row 6: A result of "BLQ" was collected and copied to LBSTRESC. Note that the sponsor populated both LBORRESU and LBSTRESU with standard units, but these could have been left null.
Row 7: A result was derived from multiple results, so LBDRVFL = "Y". Note that the original collected data are not shown in this example.
Row 8: A result for LBTESTCD = "HCT" is missing for visit 2, as indicated by LBSTAT = "NOT DONE"; neither LBORRES nor LBSTRESC is populated.
Row 9: Tests in the category "HEMATOLOGY" were not done at visit 3, as indicated by LBTESTCD = "LBALL" and LBSTAT = "NOT DONE".
Row 10: None of the tests in the LB domain were done at visit 4, as indicated by LBTESTCD = "LBALL", a null LBCAT value, and LBSTAT = "NOT DONE".
Row 11: Shows a result collected as an inequality. The unit collected was the standard unit, so the result required no conversion and was copied to LBSTRESC.
Row 12: Shows a result collected as an inequality. In LBSTRESC, the numeric part of LBORRES has been converted to the standard unit, and the less than (<) sign has been retained. LBSTRESN is not populated.

lb.xpt

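| Row | LBTESTCD | LBCAT | LBORRES | LBORRESU | LBSTRESC | LBSTRESN | LBSTRESU | LBSTAT | LBDRVFL | VISITNUM | LBDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | GLUC | CHEMISTRY | 6.0 | mg/dL | 60.0 | 60.0 | mg/L | | | 1 | 2016-02-01 |
| 2 | ALT | CHEMISTRY | 12.1 | mg/L | 12.1 | 12.1 | mg/L | | | 1 | 2016-02-01 |
| 3 | BACT | URINALYSIS | MODERATE | | MODERATE | | | | | 1 | 2016-02-01 |
| 4 | RBC | URINALYSIS | TRACE | | TRACE | | | | | 1 | 2016-02-01 |
| 5 | WBC | URINALYSIS | ++ | | 2+ | | | | | 1 | 2016-02-01 |
| 6 | KETONES | CHEMISTRY | BLQ | mg/L | BLQ | | mg/L | | | 1 | 2016-02-01 |
| 7 | MCHC | HEMATOLOGY | | | 33.8 | 33.8 | g/dL | | Y | 3 | 2016-02-15 |
| 8 | HCT | HEMATOLOGY | | | | | | NOT DONE | | 2 | 2016-02-08 |
| 9 | LBALL | HEMATOLOGY | | | | | | NOT DONE | | 3 | 2016-02-29 |
| 10 | LBALL | | | | | | | NOT DONE | | 4 | 2016-02-22 |
| 11 | WBC | HEMATOLOGY | <4,000 | 10^6/L | <4,000 | | 10^6/L | | | 6 | 2016-02-07 |
| 12 | BILI | CHEMISTRY | <0.1 | mg/dL | <1.71 | | umol/L | | | 6 | 2016-02-07 |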

Example 2

Row 1: A numeric result was collected in standard units. Since no conversion was necessary, the result was copied into EGSTRESC and EGSTRESN.
Rows 2-3: Numeric results were converted to standard units.
Row 4: Character values were copied to EGSTRESC. EGSTRESN is null.
Row 5: The overall interpretation of the ECG is represented as a separate test.
Row 6: The result for EGTESTCD = "PRAG" was missing at visit 2, as indicated by EGSTAT = "NOT DONE"; neither EGORRES nor EGSTRESC is populated.
Row 7: At visit 3, there were no ECG results, as indicated by EGTESTCD = "EGALL" and EGSTAT = "NOT DONE".

eg.xpt

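| Row | EGTESTCD | EGTEST | EGORRES | EGORRESU | EGSTRESC | EGSTRESN | EGSTRESU | EGSTAT | VISITNUM | EGDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | QRSAG | QRS Duration, Aggregate | 0.362 | sec | 0.362 | 0.362 | sec | | 1 | 2015-03-07 |
| 2 | QTAG | QT Interval, Aggregate | 221 | msec | 0.221 | 0.221 | sec | | 1 | 2015-03-07 |
| 3 | QTCBAG | QTcB Interval, Aggregate | 412 | msec | 0.412 | 0.412 | sec | | 1 | 2015-03-07 |
| 4 | SPRTARRY | Supraventricular Arrhythmias | ATRIAL FLUTTER | | ATRIAL FLUTTER | | | | 1 | 2015-03-07 |
| 5 | INTP | Interpretation | ABNORMAL | | ABNORMAL | | | | 1 | 2015-03-07 |
| 6 | PRAG | PR Interval, Aggregate | | | | | | NOT DONE | 2 | 2015-03-14 |
| 7 | EGALL | ECG Test Results | | | | | | NOT DONE | 3 | 2015-03-21 |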

Example 3

Rows 1-2: Numeric values were converted to standard units.
Row 3: A result for VSTESTCD = "HR" is missing, as indicated by VSSTAT = "NOT DONE"; neither VSORRES nor VSSTRESC is populated.
Rows 4-5: Two measurements for VSTESTCD = "SYSBP" were done at visit 1.
Row 6: A third measurement for VSTESTCD = "SYSBP" at visit 1 was a derived record, as indicated by VSDRVFL = "Y".
Row 7: At visit 2, there were no Vital Signs results, as indicated by VSTESTCD = "VSALL" and VSSTAT = "NOT DONE".

vs.xpt

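| Row | VSTESTCD | VSORRES | VSORRESU | VSSTRESC | VSSTRESN | VSSTRESU | VSSTAT | VSDRVFL | VISITNUM | VSDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | HEIGHT | 60 | in | 152 | 152 | cm | | | 1 | 2016-07-18 |
| 2 | WEIGHT | 110 | LB | 50 | 50 | kg | | | 1 | 2016-07-18 |
| 3 | HR | | | | | | NOT DONE | | 1 | 2016-07-18 |
| 4 | SYSBP | 96 | mmHg | 96 | 96 | mmHg | | | 1 | 2016-07-18 |
| 5 | SYSBP | 100 | mmHg | 100 | 100 | mmHg | | | 1 | 2016-07-18 |
| 6 | SYSBP | | | 98 | 98 | mmHg | | Y | 1 | 2016-07-18 |
| 7 | VSALL | | | | | | NOT DONE | | 2 | 2016-07-25 |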

4.5.2 Linking of Multiple Observations

See Section 8, Representing Relationships and Data, for guidance on expressing relationships among multiple observations.

4.5.3 Text Strings That Exceed the Maximum Length for General-Observation-Class Domain Variables

4.5.3.1 Test Name (--TEST) Greater than 40 Characters

Sponsors may have test descriptions (--TEST) longer than 40 characters in their operational database. Since the --TEST variable is meant to serve as a label for a --TESTCD when a Findings dataset is transposed to a more horizontal format, the length of --TEST is limited to 40 characters (except as noted below) to conform to the limitations of the SAS v5 Transport format currently used for submission datasets. Therefore, sponsors have the choice to either insert the first 40 characters or a text string abbreviated to 40 characters in --TEST. Sponsors should include the full description for these variables in the study metadata in one of two ways:

  • If the annotated CRF contains the full text, provide a reference to the annotated CRF page containing the full test description in the Define-XML document origin definition for --TEST.
  • If the annotated CRF does not specify the full text, then the full text should be documented in the Define-XML document or the Clinical Study Data Reviewer's Guide.

The convention above should also be applied to the Qualifier Value Label (QLABEL) in Supplemental Qualifiers (SUPP--) datasets. IETEST values in IE and TI are exceptions to the above 40-character rule and are limited to 200 characters, since they are not expected to be transformed to column labels. Values of IETEST that exceed 200 characters should be described in study metadata as per the convention above. For further details see IE Assumption 3, and TI Assumption 5.

4.5.3.2 Text Strings Greater than 200 Characters in Other Variables

Some sponsors may collect data values longer than 200 characters for some variables. Because of the current requirement for the SAS v5 Transport file format, it is not possible to store such long text strings using only one variable. Therefore, the SDTMIG has defined conventions for storing long text strings using multiple variables.

For general-observation-class variables and supplemental qualifiers (i.e., non-standard variables), the conventions are as follows:

  • The first 200 characters of text should be stored in the parent domain variable and each additional 200 characters of text should be stored in a record in the SUPP-- dataset (see Section 8.4, Relating Non-Standard Variables Values to a Parent Domain).
    • When splitting a text string into several records, the text should be split between words to improve readability.
    • When the text longer than 200 characters is for a supplemental qualifier, the first QNAM should describe the non-standard variable without any numeric suffix.
    • The value for QNAMs for additional text (>200 characters) should contain a sequential variable name, which is formed by appending a one-digit integer, beginning with 1, to the original domain variable name.
    • The value for QLABEL should be the original domain variable label.
      • A digit or other suffix is not appended to the label because the long text string represents a single value for a variable. The physical representation imposed by the SAS v5 Transport file format does not change the concept described by the label.
      • This differs conceptually from the case of multiple values for a non-result qualifier variable, where the values are stored individually in SUPP--. In that case, both the QNAM and QLABEL must be uniquely named (see Section 4.2.8.3, Multiple Values for a Non-Result Qualifier Variable) because they represent multiple values for a single variable.
      • In cases where the standard domain variable name is already 8 characters in length, sponsors should replace the last character with a digit when creating values for QNAM. As an example, for Other Action Taken in Adverse Events (AEACNOTH), values for QNAM for the SUPPAE records would have the values AEACNOT1, AEACNOT2, and so on.
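
The splitting and naming conventions above can be sketched as follows for a standard domain variable. This is a hedged illustration: the function names are hypothetical, Python's standard `textwrap` module is simply one convenient way to split between words, and the sketch handles only one-digit suffixes. Note that `textwrap.wrap` drops the whitespace at each split point and will break a single word only if that word alone exceeds 200 characters.

```python
import textwrap
from typing import List, Tuple

MAX_LEN = 200  # maximum length of one character value in this convention

def qnam_for_chunk(variable: str, n: int) -> str:
    """QNAM for the n-th block of additional text (n = 1..9)."""
    if not 1 <= n <= 9:
        raise ValueError("this sketch supports one-digit suffixes only")
    # An 8-character variable name has its last character replaced by the digit.
    base = variable[:7] if len(variable) >= 8 else variable
    return f"{base}{n}"

def split_long_text(variable: str, text: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the parent-domain value and a list of (QNAM, QVAL) pairs."""
    chunks = textwrap.wrap(text, width=MAX_LEN) or [""]
    overflow = [(qnam_for_chunk(variable, i), chunk)
                for i, chunk in enumerate(chunks[1:], start=1)]
    return chunks[0], overflow

# Illustrative 499-character string made of short words.
long_text = " ".join(["word"] * 100)
parent_value, supp_values = split_long_text("MHTERM", long_text)
```

QLABEL is not generated here because, per the convention above, it is simply the original domain variable label for every overflow record.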

Example: MHTERM with 500 characters

mh.xpt

| Row | STUDYID | DOMAIN | USUBJID | MHSEQ | MHTERM |
| --- | --- | --- | --- | --- | --- |
| 1 | 12345 | MH | 99-123 | 6 | 1st ~200 chars of text, split between words |

suppmh.xpt

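| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 12345 | MH | 99-123 | MHSEQ | 6 | MHTERM1 | Reported Term for the Medical History | 2nd ~200 chars of text, split between words | CRF | |
| 2 | 12345 | MH | 99-123 | MHSEQ | 6 | MHTERM2 | Reported Term for the Medical History | last 100 or more chars of text | CRF | |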

Example: AEACNOTH with >200 characters

In this example, the text entered for AEACNOTH was longer than 200 characters, but required only one supplemental qualifier for the text that extended beyond what could be represented in the standard variable.

ae.xpt

| Row | STUDYID | DOMAIN | USUBJID | AESEQ | AETERM | AEACNOTH |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 12345 | AE | 99-123 | 4 | HEART FAILURE | 1st ~200 characters of text, split between words |

suppae.xpt

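| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 12345 | AE | 99-123 | AESEQ | 4 | AEACNOT1 | Other Action Taken | remaining characters of text | CRF | |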

Example

pr.xpt

| Row | STUDYID | DOMAIN | USUBJID | PRSEQ | PRTRT |
| --- | --- | --- | --- | --- | --- |
| 1 | 12345 | PR | 99-123 | 4 | KIDNEY TRANSPLANT |

In this example, the text of the supplemental qualifier PRREAS was longer than 200 characters, but required only one additional supplemental qualifier to represent the remaining text.

supppr.xpt

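| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 12345 | PR | 99-123 | PRSEQ | 1 | PRREAS | Reason | 1st ~200 characters of text, split between words | CRF |
| 2 | 12345 | PR | 99-123 | PRSEQ | 1 | PRREAS1 | Reason | remaining characters of text | CRF |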

The following domains have specialized conventions for representing values longer than 200 characters:

The following table summarizes the conventions and notes the specializations.

Text Strings >200 Char Conventions

| General Observation Class & Supplemental Qualifier Variables | CO.COVAL | TS.TSVAL | TI.IETEST and IE.IETEST |
| --- | --- | --- | --- |
| The first 200 characters of text should be stored in the variable and each additional 200 characters of text should be stored as a record in the SUPP-- dataset. | The first 200 characters of text should be stored in COVAL and each additional 200 characters of text should be stored in COVAL1 to COVALn. | The first 200 characters of text should be stored in TSVAL and each additional 200 characters of text should be stored in TSVAL1 to TSVALn. | If the inclusion/exclusion criteria text is >200 characters, put meaningful text in IETEST and describe the full text in the study metadata. |
| When splitting a text string into several records, the text should be split between words to improve readability. | When splitting a text string into several records, the text should be split between words to improve readability. | When splitting a text string into several records, the text should be split between words to improve readability. | Not applicable. |
| The value for QLABEL should be the original domain variable label. | The variable labels for COVAL1 to COVALn should be "Comment". | The variable labels for TSVAL1 to TSVALn should be "Parameter Value". | Not applicable. |

4.5.4 Evaluators in the Interventions and Events Observation Classes

Because observations may originate from more than one source (e.g., an Investigator or Independent Assessor), the observations recorded in the Findings class include the --EVAL qualifier. For the Interventions and Events observation classes, which do not include the --EVAL variable, all data are assumed to be attributed to the principal investigator. The QEVAL variable can be used to describe the evaluator for any data item in a SUPP-- dataset (Section 8.4.1, Supplemental Qualifiers – SUPP-- Datasets), but is not required when the data are objective. For observations that have primary and secondary evaluations of specific qualifier variables, sponsors should put data from the primary evaluation into the standard domain dataset and data from the secondary evaluation into the Supplemental Qualifier datasets (SUPP--). Within each SUPP-- record, the value for QNAM should be formed by appending a "1" to the corresponding standard domain variable name. In cases where the standard domain variable name is already eight characters in length, sponsors should replace the last character with a "1" (incremented for each additional attribution).

This example illustrates a case where an adjudication committee evaluated an adverse event. The evaluations of the adverse event by the primary investigator were represented in the standard AE dataset. The evaluations of the adjudication committee were represented in SUPPAE. See Section 8.4, Relating Non-Standard Variables Values to a Parent Domain. Note that the QNAM for the "Relationship to Non-Study Treatment" supplemental qualifier is AERELNS1, rather than AERELNST1, since AERELNST is already eight characters in length.

suppae.xpt

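| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 12345 | AE | 99-123 | AESEQ | 3 | AESEV1 | Severity/Intensity | MILD | CRF | ADJUDICATION COMMITTEE |
| 2 | 12345 | AE | 99-123 | AESEQ | 3 | AEREL1 | Causality | POSSIBLY RELATED | CRF | ADJUDICATION COMMITTEE |
| 3 | 12345 | AE | 99-123 | AESEQ | 3 | AERELNS1 | Relationship to Non-Study Treatment | Possibly related to aspirin use | CRF | ADJUDICATION COMMITTEE |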

4.5.5 Clinical Significance for Findings Observation Class Data

For assessments of clinical significance when the overall interpretation is a record in the domain, use a Supplemental Qualifier (SUPP--) record (with QNAM = "--CLSIG") linked to the record that contains the overall interpretation or a particular result. An example would be a QNAM value of "EGCLSIG" in SUPPEG with a value of "Y", indicating that an ECG result of "ATRIAL FIBRILLATION" was clinically significant.

Separate from clinical significance are results of "NORMAL" or "ABNORMAL", or lab values that are out of normal range. Examples of the latter include the following:

  • An ECG test with EGTESTCD = "INTP", which addresses the ECG as a whole, should have a result of "NORMAL" or "ABNORMAL". A record for EGTESTCD = "INTP" may also have a record in SUPPEG indicating whether the result is clinically significant.
  • A record for a vital signs measurement (e.g., systolic blood pressure) or a lab test (e.g., hematocrit) that contains a measurement may have a normal range and a normal range indicator. It could also have a SUPP-- record indicating whether the result was clinically significant.

4.5.6 Supplemental Reason Variables

The SDTM general observation classes include the --REASND variable to submit the reason a response is not present (a result in a findings domain or an --OCCUR value in an events or interventions domain). However, sponsors sometimes collect the reason that something was done. For the interventions general observation class, --INDC is available to represent the medical condition for which the intervention was given, and --ADJ is available to represent the reason for a dose adjustment. If the sponsor collects a reason for performing a test represented in a findings domain or an activity represented in an events domain, or a reason for an intervention other than a medical indication, the reason can be represented in the SUPP-- dataset (as described in Section 8.4.1, Supplemental Qualifiers – SUPP-- Datasets) using the supplemental qualifier with QNAM of "--REAS" listed in Appendix C2, Supplemental Qualifiers Name Codes. If multiple reasons are reported, refer to Section 4.2.8.3, Multiple Values for a Non-Result Qualifier Variable.

For example, if the sponsor collected the reason that an extra lab test was done, a SUPPLB record might be populated as follows. Note that the sponsor used a label that was made more specific to the LB domain, rather than the label "Reason" in the appendix.

supplb.xpt

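| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 12345 | LB | 99-123 | LBSEQ | 3 | LBREAS | Reason Test was Performed | ORIGINAL SAMPLE LOST | CRF |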

4.5.7 Presence or Absence of Pre-Specified Interventions and Events

Interventions (e.g., concomitant medications) and Events (e.g., medical history) can generally be collected in two different ways, by recording either verbatim free text or the responses to a pre-specified list of treatments or terms. Since the method of solicitation for information on treatments and terms may affect the frequency at which they are reported, whether they were pre-specified may be of interest to reviewers. The --PRESP variable is used to indicate whether a specific intervention (--TRT) or event (--TERM) was solicited. The --PRESP variable has controlled terminology of "Y" (for "Yes") or a null value. It is a permissible variable, and should only be used when the topic variable values come from a pre-specified list. Questions such as "Did the subject have any concomitant medications?" or "Did the subject have any medical history?" should not have records in an SDTM domain because 1) these are not valid values for the respective topic variables of CMTRT and MHTERM, and 2) records whose sole purpose is to indicate whether or not a subject had records are not meaningful.

The --OCCUR variable is used to indicate whether a pre-specified intervention or event occurred or did not occur. It has controlled terminology of "Y" and "N" (for "Yes" and "No"). It is a permissible variable and may be omitted from the dataset if no topic-variable values were pre-specified.

If a study collects both pre-specified interventions and events as well as free-text events and interventions, the value of --OCCUR should be "Y" or "N" for all pre-specified interventions and events, and null for those reported as free-text.

The --STAT and --REASND variables can be used to provide information about pre-specified interventions and events for which there is no response (e.g., investigator forgot to ask). As in Findings, --STAT has controlled terminology of "NOT DONE".

| Situation | Value of --PRESP | Value of --OCCUR | Value of --STAT |
| --- | --- | --- | --- |
| Spontaneously reported event occurred | | | |
| Pre-specified event occurred | Y | Y | |
| Pre-specified event did not occur | Y | N | |
| Pre-specified event has no response | Y | | NOT DONE |
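
The four situations in the table above can be expressed as a simple consistency check. This sketch is illustrative only: the function name and message wording are hypothetical, and it tests records against the combinations shown in the table rather than against any formal conformance rule.

```python
from typing import List, Optional

def check_prespecified(presp: Optional[str], occur: Optional[str],
                       stat: Optional[str]) -> List[str]:
    """Return messages for value combinations not shown in the table."""
    problems = []
    if presp not in (None, "Y"):
        problems.append('--PRESP should be "Y" or null')
    if occur not in (None, "Y", "N"):
        problems.append('--OCCUR should be "Y", "N", or null')
    if stat not in (None, "NOT DONE"):
        problems.append('--STAT should be "NOT DONE" or null')
    if presp is None and (occur is not None or stat is not None):
        problems.append("spontaneously reported: --OCCUR and --STAT should be null")
    if presp == "Y" and (occur is None) == (stat is None):
        problems.append("pre-specified: populate exactly one of --OCCUR and --STAT")
    return problems
```

An empty list means the record matches one of the four rows; for example, `check_prespecified("Y", None, "NOT DONE")` corresponds to a pre-specified event with no response.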

Refer to the standard domains in the Events and Interventions General Observation Classes for additional assumptions and examples.

4.5.8 Accounting for Long-Term Follow-up

Studies often include long-term follow-up assessments to monitor a subject's condition. Use cases include studies in terminally ill populations that periodically assess survival and studies involving chronic disease that include follow up to assess relapse. Long-term follow-up is often conducted via telephone calls rather than clinic visits. Regardless of the method of contact, the information should be stored in the appropriate topic-based domain.

Overall study conclusion in the Disposition (DS) domain occurs once all contact with the subject ceases. If a study has a clinical treatment phase followed by a long-term follow-up phase, these two segments of the study can be represented as separate epochs within the overall study, each with its own epoch disposition record.

The recommended SDTM approach to storing these data can be described by an example.

Assume an oncology study encompasses two months of clinical treatment and assessments followed by once-monthly telephone contacts. The contacts continue until the subject dies. During the telephone contact, the investigator collects information on the subject's survival status and medication use. The answers to certain questions may trigger other data collection. For example, if the subject's survival status is "dead", then this indicates that the subject has ceased participation in the study, so a study discontinuation record would need to be created. In SDTM, the data related to these follow-up telephone contacts should be stored as follows:

  1. Concomitant medications reported during the contact should be stored in the Concomitant Medications (CM) domain.
  2. The subject's survival status should be stored in the Subject Status (SS) domain.
  3. The disposition of the subject at the time of the final follow-up contact should be stored in Disposition (DS). Note that overall study conclusion is the point where any contact with the subject ceases, which in this example is also the conclusion of long-term follow-up. The disposition of the subject at the conclusion of the two-month clinical treatment phase would be stored in DS as the conclusion to that epoch. Long-term follow-up would be represented as a separate epoch. Therefore, in this example the subject could have three disposition records in DS, with both the follow-up epoch disposition and the overall study conclusion disposition being collected at the final telephone contact. Refer to the Disposition (DS) domain (Section 6.2.3, Disposition) for detailed assumptions and examples.
  4. If the subject's survival status is "dead", the Demographics (DM) variables DTHDTC and DTHFL must be appropriately populated.
  5. The long-term follow-up phase would be represented in Trial Arms (TA), Trial Elements (TE), and Trial Visits (TV).
  6. The contacts would be recorded in Subject Visits (SV) and Subject Elements (SE) consistent with the way they are represented in TV and TE.

4.5.9 Baseline Values

The new variable --LOBXFL has been introduced in this release to address the need for a consistent definition of a value that can serve as a reference with which to compare post-treatment values. This generic definition approximates the concept of baseline and can be used to calculate post-treatment changes. In domains where --BLFL was expected, its core value has been changed from expected to permissible, and the new variable --LOBXFL, with a core value of expected, has been added to contain the consistent definition. In domains where --BLFL was permissible, the new variable --LOBXFL was added with a core value of permissible.

The table below shows a set of similar flag variables and their usage across SDTM and ADaM:

| Variable | Structure Where It Is Defined | Requirement in That Structure | Definition | Intended Use |
| --- | --- | --- | --- | --- |
| --LOBXFL | SDTM Findings | Expected or Permissible | Last non-missing value prior to RFXSTDTC (Operationally derived) | Consistent pre-treatment reference value baseline for use across all studies and sponsors. |
| ABLFL | ADaM BDS | Conditionally Required | Flags the record that is the source of the baseline value for a given parameter specified in the Statistical Analysis Plan (May differ both across and within studies and datasets) | Baseline for ADaM analysis as specified in the Statistical Analysis Plan |
| --BLFL | SDTM Findings | Permissible (formerly expected in some domains) | A baseline defined by the sponsor (Could be derived in the same manner as --LOBXFL or ABLFL, but is not required to be) | Any sponsor-defined baseline use |

As shown above, each variable serves a specific need. The SDTM variable --LOBXFL (and/or --BLFL, if used) can be copied to ADaM for traceability and transparency, but only the ADaM variable ABLFL would be used to signify baseline for analysis. The content of --LOBXFL and ABLFL will be exactly the same when the Statistical Analysis Plan specifies that the baseline used for analysis is the last non-missing value prior to RFXSTDTC.
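
The operational definition of --LOBXFL ("last non-missing value prior to RFXSTDTC") can be sketched in Python as follows. This is only an illustration under stated assumptions, not a normative algorithm: the function name, the dictionary-based record layout, and the use of Vital Signs variable names as defaults are all hypothetical. ISO 8601 values are compared as plain strings, which is only safe when every value has the same precision and no time zone offset, and a record collected on the same date as RFXSTDTC is not treated as "prior" here.

```python
def flag_last_obs_before_exposure(records, rfxstdtc, dtc="VSDTC",
                                  test="VSTESTCD", result="VSORRES",
                                  flag="VSLOBXFL"):
    """Set `flag` to "Y" on the last non-missing result before RFXSTDTC, per test."""
    latest = {}  # test code -> latest qualifying record seen so far
    for rec in records:
        rec[flag] = ""  # null unless this record turns out to be the reference value
        if not rfxstdtc or not rec.get(dtc) or rec.get(result) in (None, ""):
            continue  # untreated subject, missing date, or missing result
        if rec[dtc] < rfxstdtc:  # strictly prior to first study exposure
            best = latest.get(rec[test])
            if best is None or rec[dtc] > best[dtc]:
                latest[rec[test]] = rec
    for rec in latest.values():
        rec[flag] = "Y"
    return records
```

Whether ABLFL would flag the same record depends entirely on the baseline definition in the Statistical Analysis Plan, so a sketch like this applies to the SDTM flag only.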

5 Models for Special Purpose Domains

Special Purpose Domains is an SDTM category in its own right. Special Purpose Domains provide specific, standardized structures to represent additional important information that does not fit any of the General Observation Classes.

| Domain Code | Domain Description |
| --- | --- |
| CO | Comments: A special purpose domain that contains comments that may be collected alongside other data. |
| DM | Demographics: A special purpose domain that includes a set of essential standard variables that describe each subject in a clinical study. It is the parent domain for all other observations for human clinical subjects. |
| SE | Subject Elements: A special purpose domain that contains the actual order of elements followed by the subject, together with the start date/time and end date/time for each element. |
| SM | Subject Disease Milestones: A special purpose domain that is designed to record the timing, for each subject, of disease milestones that have been defined in the Trial Disease Milestones (TM) domain. |
| SV | Subject Visits: A special purpose domain that contains the actual start and end date/time for each visit of each individual subject. |

5.1 Comments

CO – Description/Overview

A special purpose domain that contains comments that may be collected alongside other data.

CO – Specification

co.xpt, Comments — Special Purpose, Version 3.3. One record per comment per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

CO – Assumptions

  1. The Comments special purpose domain provides a solution for submitting free-text comments related to data in one or more SDTM domains (as described in Section 8.5, Relating Comments to a Parent Domain) or collected on a separate CRF page dedicated to comments. Comments are generally not responses to specific questions; instead, comments usually consist of voluntary, free-text or unsolicited observations.
  2. The CO dataset accommodates three sources of comments:
    1. Those unrelated to a specific domain or parent record(s), in which case the values of the variables RDOMAIN, IDVAR, and IDVARVAL are null. CODTC should be populated if captured. See example, Row 1.
    2. Those related to a domain but not to specific parent record(s), in which case the value of the variable RDOMAIN is set to the DOMAIN code of the parent domain and the variables IDVAR and IDVARVAL are null. CODTC should be populated if captured. See example, Row 2.
    3. Those related to a specific parent record or group of parent records, in which case the value of the variable RDOMAIN is set to the DOMAIN code of the parent record(s) and the variables IDVAR and IDVARVAL are populated with the key variable name and value of the parent record(s). Assumptions for populating IDVAR and IDVARVAL are further described in Section 8.5, Relating Comments to a Parent Domain. CODTC should be null because the timing of the parent record(s) is inherited by the comment record. See example, Rows 3-5.
  3. When the comment text is longer than 200 characters, the first 200 characters of the comment will be in COVAL, the next 200 in COVAL1, and additional text stored as needed to COVALn. See example, Rows 3-4.
    Additional information about how to relate comments to parent SDTM records is provided in Section 8.5, Relating Comments to a Parent Domain.
  4. The variable COREF may be null unless it is used to identify the source of the comment. See example, Rows 1 and 5.
  5. Any Identifier variables and Timing variables may be added to the CO domain, but the following qualifiers would generally not be used in CO: --GRPID, --REFID, --SPID, VISIT, VISITNUM, VISITDY, TAETORD, --TPT, --TPTNUM, --ELTM, --TPTREF, --RFTDTC.
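
The splitting rule in assumption 3 can be illustrated with a short Python sketch. The function name is hypothetical, and the sketch cuts at a fixed character count because that is how the assumption is worded; a sponsor might prefer to break between words, which is not shown here.

```python
def split_comment(text, width=200):
    """Split comment text into COVAL, COVAL1, ..., COVALn pieces of at most `width` characters."""
    chunks = [text[i:i + width] for i in range(0, len(text), width)] or [""]
    names = ["COVAL"] + [f"COVAL{n}" for n in range(1, len(chunks))]
    return dict(zip(names, chunks))


# A 450-character comment yields COVAL (200), COVAL1 (200), and COVAL2 (50).
pieces = split_comment("x" * 450)
print({name: len(value) for name, value in pieces.items()})
```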

CO – Examples

Example

Row 1: Shows a comment collected on a separate comments page. Since it was unrelated to any specific domain or record, RDOMAIN, IDVAR, and IDVARVAL are null.
Row 2: Shows a comment that was collected on the bottom of the PE page for Visit 7, without any indication of specific records it applied to. Since the comment related to a specific domain, RDOMAIN is populated. Since it was related to a specific visit, COREF is "VISIT 7". However, since it does not relate to a specific record, IDVAR and IDVARVAL are null.
Row 3: Shows a comment related to a single AE record having its AESEQ = 7.
Row 4: Shows a comment related to multiple EX records with EXGRPID = "COMBO1".
Row 5: Shows a comment related to multiple VS records with VSGRPID = "VS2".
Row 6: Shows one option for representing a comment collected on a visit-specific comments page not associated with a particular domain. In this case, the comment is linked to the Subject Visit record in SV (RDOMAIN = "SV") and IDVAR and IDVARVAL are populated to link the comment to the particular visit.
Row 7: Shows a second option for representing a comment associated only with a visit. In this case, COREF is used to show that the comment is related to the particular visit.
Row 8: Shows a third option for representing a comment associated only with a visit. In this case, the VISITNUM variable was populated to indicate that the comment was associated with a particular visit.

co.xpt

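| Row | STUDYID | DOMAIN | RDOMAIN | USUBJID | COSEQ | IDVAR | IDVARVAL | COREF | COVAL | COVAL1 | COVAL2 | COEVAL | VISITNUM | CODTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1234 | CO | | AB-99 | 1 | | | | Comment text | | | PRINCIPAL INVESTIGATOR | | 2003-11-08 |
| 2 | 1234 | CO | PE | AB-99 | 2 | | | VISIT 7 | Comment text | | | PRINCIPAL INVESTIGATOR | | 2004-01-14 |
| 3 | 1234 | CO | AE | AB-99 | 3 | AESEQ | 7 | PAGE 650 | First 200 characters | Next 200 characters | Remaining text | PRINCIPAL INVESTIGATOR | | |
| 4 | 1234 | CO | EX | AB-99 | 4 | EXGRPID | COMBO1 | PAGE 320-355 | First 200 characters | Remaining text | | PRINCIPAL INVESTIGATOR | | |
| 5 | 1234 | CO | VS | AB-99 | 5 | VSGRPID | VS2 | | Comment text | | | PRINCIPAL INVESTIGATOR | | |
| 6 | 1234 | CO | SV | AB-99 | 6 | VISITNUM | 4 | | Comment Text | | | PRINCIPAL INVESTIGATOR | | |
| 7 | 1234 | CO | | AB-99 | 7 | | | VISIT 4 | Comment Text | | | PRINCIPAL INVESTIGATOR | | |
| 8 | 1234 | CO | | AB-99 | 8 | | | | Comment Text | | | PRINCIPAL INVESTIGATOR | 4 | |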
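
The three sources of comments described in CO assumption 2 can be told apart from RDOMAIN, IDVAR, and IDVARVAL alone. The Python sketch below is one hypothetical way to classify a CO record held as a dictionary; the function name and the returned labels are illustrative and are not part of the standard.

```python
def comment_source(rec):
    """Classify a CO record as 'unrelated', 'domain', or 'record' per CO assumption 2."""
    rdomain = rec.get("RDOMAIN") or ""
    idvar = rec.get("IDVAR") or ""
    idvarval = rec.get("IDVARVAL") or ""
    if bool(idvar) != bool(idvarval):
        raise ValueError("IDVAR and IDVARVAL must be populated together")
    if idvar and not rdomain:
        raise ValueError("RDOMAIN is needed when IDVAR and IDVARVAL are populated")
    if not rdomain:
        return "unrelated"  # no parent domain or record (example Row 1)
    if not idvar:
        return "domain"     # parent domain only (example Row 2)
    return "record"         # specific parent record(s) (example Rows 3-5)


print(comment_source({"RDOMAIN": "AE", "IDVAR": "AESEQ", "IDVARVAL": "7"}))
```

For a record classified as "record", CODTC would be expected to be null, because the comment inherits the timing of its parent record(s).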

5.2 Demographics

DM – Description/Overview

A special purpose domain that includes a set of essential standard variables that describe each subject in a clinical study. It is the parent domain for all other observations for human clinical subjects.

DM – Specification

dm.xpt, Demographics — Special Purpose, Version 3.3. One record per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

DM – Assumptions

  1. Investigator and site identification: Companies use different methods to distinguish sites and investigators. CDISC assumes that SITEID will always be present, with INVID and INVNAM used as necessary. This should be done consistently and the meaning of the variable made clear in the Define-XML document.
  2. Every subject in a study must have a subject identifier (SUBJID). In some cases a subject may participate in more than one study. To identify a subject uniquely across all studies for all applications or submissions involving the product, a unique identifier (USUBJID) must be included in all datasets. Subjects occasionally change sites during the course of a clinical trial. The sponsor must decide how to populate variables such as USUBJID, SUBJID and SITEID based on their operational and analysis needs, but only one DM record should be submitted for the subject. The Supplemental Qualifiers dataset may be used if appropriate to provide additional information.
  3. Concerns for subject privacy suggest caution regarding the collection of variables like BRTHDTC. This variable is included in the Demographics model in the event that a sponsor intends to submit it; however, sponsors should follow regulatory guidelines and guidance as appropriate.
  4. With the exception of trials that use multi-stage processes to assign subjects to Arms described below, ARM and ACTARM must be populated with ARM values from the Trial Arms (TA) dataset and ARMCD and ACTARMCD must be populated with ARMCD values from the TA dataset or be null. The ARM and ARMCD values in the TA dataset have a one-to-one relationship, and that one-to-one relationship must be preserved in the values used to populate ARM and ARMCD in DM, and to populate the values of ACTARM and ACTARMCD in DM.
    1. Rules for the Arm-Related Variables:
      1. If ARMCD is null, then ARM must be null and ARMNRS must be populated with the reason ARMCD is null.
      2. If ACTARMCD is null, then ACTARM must be null and ARMNRS must be populated with the reason ACTARMCD is null. Both ARMCD and ACTARMCD will be null for subjects who were not assigned to treatment. The same reason will provide the reason that both are null.
      3. ARMNRS may not be populated if both ARMCD and ACTARMCD are populated. ARMCD and ACTARMCD will be populated if the subject was assigned to an arm and received treatment consistent with one of the arms in the TA dataset. If ARMCD and ACTARMCD are not the same, that is sufficient to explain the situation; ARMNRS should not be populated.
      4. If ARMNRS is populated with "UNPLANNED TREATMENT", ACTARMUD should be populated with a description of the unplanned treatment received.
    2. Multi-stage Assignment to Treatment: Some trials use a multi-stage process for assigning a subject to an Arm. Example Trial 3 in Section 7.2.1, Trial Arms, illustrates such a trial. In such a case, best practice is to create ARMCD values which are composed of codes representing the results of the multiple stages of the treatment assignment process. If a subject is partially assigned, then truncated codes representing the stages completed can be used in ARMCD, and similar truncated codes can be used in ACTARMCD. The descriptions used to populate ARM and ACTARM should be similarly truncated, and the one-to-one relationship between these truncated codes and descriptions should be maintained for all affected subjects in the trial. Example 7 in Section 5.2, Demographics, provides an example of this situation, and Example 2 in Section 5.3, Subject Elements, shows another example. Note that this use of values not in the TA dataset is allowable only for trials with multi-stage assignment to Arms and to subjects in those trials who do not complete all stages of the assignment.
    3. Examples Illustrating the Arm-Related Variables
      1. Example 1 in Section 5.2, Demographics, shows how a subject who was a screen failure and was never treated would be handled.
      2. The Subject Elements (SE) dataset records the series of elements a subject passed through in the course of a trial, and these determine the value of ACTARMCD. The following examples include sample data for both datasets to illustrate this relationship.
        1. Example 6 in Section 5.2, Demographics, shows how subjects who started the trial but were never assigned to an Arm would be handled.
        2. Example 1 in Section 5.3, Subject Elements, shows representation of a situation for a subject who received a treatment that was not the one to which they were assigned.
        3. Example 2 in Section 5.3, Subject Elements, shows representation of a situation in which a subject received a set of treatments different from that for any of the planned Arms.
  5. Study population flags should not be included in SDTM data. The standard Supplemental Qualifiers included in previous versions of the SDTMIG (COMPLT, FULLSET, ITT, PPROT, and SAFETY) should not be used. Note that the ADaM subject-level analysis dataset (ADSL) specifies standard variable names for the most common populations and requires the inclusion of these flags when necessary for analysis; consult the ADaM Implementation Guide for more information about these variables.
  6. Submission of multiple race responses should be represented in the Demographics domain and Supplemental Qualifiers (SUPPDM) dataset as described in Section 4.2.8.3, Multiple Values for a Non-Result Qualifier Variable. If multiple races are collected, then the value of RACE should be "MULTIPLE" and the additional information will be included in the Supplemental Qualifiers dataset. Controlled terminology for RACE should be used in both DM and SUPPDM so that consistent values are available for summaries regardless of whether the data are found in a column or row. If multiple races were collected and one was designated as primary, RACE in DM should be the primary race and additional races should be reported in SUPPDM. When additional free-text information is reported about a subject's RACE using "Other, Specify", sponsors should refer to Section 4.2.7.1, "Specify" Values for Non-Result Qualifier Variables. If the race was collected via an "Other, Specify" field and the sponsor chooses not to map the value as described in the current FDA guidance (see CDISC Notes for RACE in the domain specification), then the value of RACE should be "OTHER". If a subject refuses to provide race information, the value of RACE could be "UNKNOWN". See DM Example 3, DM Example 4, and DM Example 5.
  7. RFSTDTC, RFENDTC, RFXSTDTC, RFXENDTC, RFICDTC, RFPENDTC, and BRTHDTC represent date/time values, but they are considered to have a Record Qualifier role in DM. They are not considered to be Timing Variables because they are not intended for use in the general observation classes.
  8. Additional Permissible Identifier, Qualifier, and Timing Variables:
    1. Only the following Timing variables are permissible and may be added as appropriate: VISITNUM, VISIT, VISITDY. The Record Qualifier DMXFN (External File Name), which is adopted from the Findings general observation class, is the only additional Qualifier variable that may be added; it may be used to refer to an external file, such as a patient narrative.
    2. The order of these additional variables within the domain should follow the rules as described in Section 4.1.4, Order of the Variables and the order described in Section 4.2, General Variable Assumptions.
  9. As described in Section 4.4.4, Use of the "Study Day" Variables, RFSTDTC is used to calculate study day variables. RFSTDTC is usually defined as the date/time when a subject was first exposed to study drug. This definition applies for most interventional studies, when the start of treatment is the natural and preferred starting point for study day variables and thus the logical value for RFSTDTC. In such studies, when data are submitted for subjects who are ineligible for treatment (e.g., screen failures with ARMNRS = "SCREEN FAILURE"), subjects who were enrolled but not assigned to an arm (e.g., ARMNRS = "NOT ASSIGNED"), or subjects who were randomized but not treated (e.g., ARMNRS = "ASSIGNED, NOT TREATED"), RFSTDTC will be null. For studies with designs that include a substantial portion of subjects who are not expected to be treated, a different protocol milestone may be chosen as the starting point for study day variables. Some examples include non-interventional or observational studies, studies with a no-treatment Arm, or studies where there is a delay between randomization and treatment.
  10. The DM domain contains several pairs of reference period variables: RFSTDTC and RFENDTC, RFXSTDTC and RFXENDTC, and RFICDTC and RFPENDTC. There are three sets of reference variables to accommodate distinct reference period definitions, and there are instances when the values of the variables may be exactly the same, particularly the first two pairs of variables in the preceding list.
    1. RFSTDTC and RFENDTC: This pair of variables is sponsor defined, but usually represents the date/time of first and last study exposure. However, there are certain study designs where the start of the reference period is defined differently, such as studies that have a washout period before randomization or have a medical procedure, such as a biopsy, required during screening. In these cases, RFSTDTC may be the enrollment date, which is prior to first dose. Since study day values are calculated using RFSTDTC, in this case study days would not be based on the date of first dose.  
    2. RFXSTDTC and RFXENDTC: This pair of variables defines a consistent reference period for all interventional studies and is not open to customization. RFXSTDTC and RFXENDTC always represent the date/time of first and last study exposure. The study reference period often duplicates the reference period defined in RFSTDTC and RFENDTC, but not always. Therefore, this pair of variables is important as they guarantee that a reviewer will always be able to reference the first and last study exposure reference period. RFXSTDTC should be the same as SESTDTC for the first treatment Element described in the SE dataset. RFXENDTC may often be the same as the SEENDTC for the last treatment Element described in the SE dataset.
    3. RFICDTC and RFPENDTC: The definitions of this pair of variables are consistent in every study in which they are used. They represent the entire period of a subject's involvement in a study, from providing informed consent through to the last participation event or activity. There may be times when this period coincides with other reference periods, but that is unusual. An example in which this period would coincide with the study reference period, RFSTDTC to RFENDTC, might be an observational trial where no study intervention is administered. RFICDTC should correspond to the date of the informed consent protocol milestone in DS, if that protocol milestone is documented in DS. In the event that there are multiple informed consents, this will be the date of the first one. RFPENDTC will be the last date of participation for a subject for data included in a submission. This should be the last date of any record for the subject in the database at the time it is locked for submission. As such, it may not be the last date of participation in the study if the submission includes interim data.
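
The rules for the Arm-related variables in assumption 4 lend themselves to a simple consistency check. The following Python sketch is an illustration only: the function name and message wording are hypothetical, a DM record is assumed to be a dictionary keyed by variable name, and the check does not verify that the values actually exist in the Trial Arms dataset.

```python
def check_arm_variables(dm):
    """Return a list of violations of the Arm-related rules for one DM record."""
    def val(name):
        return (dm.get(name) or "").strip()

    problems = []
    # Rules 1 and 2: a null code requires a null description and a reason in ARMNRS.
    for code, desc in (("ARMCD", "ARM"), ("ACTARMCD", "ACTARM")):
        if not val(code):
            if val(desc):
                problems.append(f"{desc} must be null when {code} is null")
            if not val("ARMNRS"):
                problems.append(f"ARMNRS must give the reason {code} is null")
    # Rule 3: ARMNRS is not populated when both codes are populated.
    if val("ARMCD") and val("ACTARMCD") and val("ARMNRS"):
        problems.append("ARMNRS must be null when ARMCD and ACTARMCD are populated")
    # Rule 4: an unplanned treatment needs a description in ACTARMUD.
    if val("ARMNRS") == "UNPLANNED TREATMENT" and not val("ACTARMUD"):
        problems.append("ACTARMUD should describe the unplanned treatment")
    return problems


print(check_arm_variables({"ARMCD": "A", "ARM": "Drug A", "ARMNRS": "ASSIGNED, NOT TREATED"}))
```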

DM – Examples

Example

dm.xpt

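| Row | STUDYID | DOMAIN | USUBJID | SUBJID | RFSTDTC | RFENDTC | RFXSTDTC | RFXENDTC | RFICDTC | RFPENDTC | SITEID | INVNAM | BRTHDTC | AGE | AGEU | SEX | RACE | ETHNIC | ARMCD | ARM | ACTARMCD | ACTARM | ARMNRS | ACTARMUD | COUNTRY |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | DM | ABC12301001 | 01001 | 2006-01-12 | 2006-03-10 | 2006-01-12 | 2006-03-10 | 2006-01-03 | 2006-04-01 | 01 | JOHNSON, M | 1948-12-13 | 57 | YEARS | M | WHITE | HISPANIC OR LATINO | A | Drug A | A | Drug A | | | USA |
| 2 | ABC123 | DM | ABC12301002 | 01002 | 2006-01-15 | 2006-02-28 | 2006-01-15 | 2006-02-28 | 2006-01-04 | 2006-03-26 | 01 | JOHNSON, M | 1955-03-22 | 50 | YEARS | M | WHITE | NOT HISPANIC OR LATINO | P | Placebo | P | Placebo | | | USA |
| 3 | ABC123 | DM | ABC12301003 | 01003 | 2006-01-16 | 2006-03-19 | 2006-01-16 | 2006-03-19 | 2006-01-02 | 2006-03-19 | 01 | JOHNSON, M | 1938-01-19 | 68 | YEARS | F | BLACK OR AFRICAN AMERICAN | NOT HISPANIC OR LATINO | P | Placebo | P | Placebo | | | USA |
| 4 | ABC123 | DM | ABC12301004 | 01004 | | | | | 2006-01-07 | 2006-01-08 | 01 | JOHNSON, M | 1941-07-02 | | | M | ASIAN | NOT HISPANIC OR LATINO | | | | | SCREEN FAILURE | | USA |
| 5 | ABC123 | DM | ABC12302001 | 02001 | 2006-02-02 | 2006-03-31 | 2006-02-02 | 2006-03-31 | 2006-01-15 | 2006-04-12 | 02 | GONZALEZ, E | 1950-06-23 | 55 | YEARS | F | AMERICAN INDIAN OR ALASKA NATIVE | NOT HISPANIC OR LATINO | P | Placebo | P | Placebo | | | USA |
| 6 | ABC123 | DM | ABC12302002 | 02002 | 2006-02-03 | 2006-04-05 | 2006-02-03 | 2006-04-05 | 2006-01-10 | 2006-04-25 | 02 | GONZALEZ, E | 1956-05-05 | 49 | YEARS | F | NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER | NOT HISPANIC OR LATINO | A | Drug A | A | Drug A | | | USA |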

Example

Sample CRF:

Ethnicity: Check one
  • Hispanic or Latino
  • Not Hispanic or Latino
Race: Check one
  • American Indian or Alaska Native
  • Asian
  • Black or African American
  • Native Hawaiian or Other Pacific Islander
  • White

Row 1: Shows data for a subject who was "NOT HISPANIC OR LATINO" and was "ASIAN".
Row 2: Shows data for a subject who was "HISPANIC OR LATINO" and "WHITE".

dm.xpt

| Row | STUDYID | DOMAIN | USUBJID | RACE | ETHNIC |
| --- | --- | --- | --- | --- | --- |
| 1 | ABC | DM | 001 | ASIAN | NOT HISPANIC OR LATINO |
| 2 | ABC | DM | 002 | WHITE | HISPANIC OR LATINO |

Example

In this example, the subject is permitted to check all applicable races. Sample CRF:

Race: Check all that apply
  • American Indian or Alaska Native
  • Asian
  • Black or African American
  • Native Hawaiian or Other Pacific Islander
  • White
  • Other, Specify: ____________________

Row 1: Subject "001" checked "Other, Specify" and entered a specify value of "Brazilian". "Brazilian" is represented in a supplemental qualifier.
Row 2: Subject "002" checked three races, one of which was "Other, Specify". The RACE variable is populated with "MULTIPLE" and the individual races are represented in supplemental qualifiers.
Row 3: Shows the record for a subject who refused to provide information about race.
Row 4: Shows the record for a subject who checked just one race, "ASIAN".

dm.xpt

| Row | STUDYID | DOMAIN | USUBJID | RACE |
| --- | --- | --- | --- | --- |
| 1 | ABC | DM | 001 | OTHER |
| 2 | ABC | DM | 002 | MULTIPLE |
| 3 | ABC | DM | 003 | |
| 4 | ABC | DM | 004 | ASIAN |

Row 1: The other race specified by subject "001" was represented using the supplemental qualifier RACEOTH.
Rows 2-4: The three selections made by subject "002" were represented using supplemental qualifiers RACE1, RACE2, and RACE3. The third race checked was "Other, Specify", so the value of RACE3 is "OTHER".
Row 5: The other race specified by subject "002" was represented using the supplemental qualifier RACEOTH, in the same manner as for subject "001".

suppdm.xpt

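| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC | DM | 001 | | | RACEOTH | Race, Other | BRAZILIAN | CRF | |
| 2 | ABC | DM | 002 | | | RACE1 | Race 1 | BLACK OR AFRICAN AMERICAN | CRF | |
| 3 | ABC | DM | 002 | | | RACE2 | Race 2 | AMERICAN INDIAN OR ALASKA NATIVE | CRF | |
| 4 | ABC | DM | 002 | | | RACE3 | Race 3 | OTHER | CRF | |
| 5 | ABC | DM | 002 | | | RACEOTH | Race, Other | ABORIGINE | CRF | |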
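
The way the example above populates RACE and the supplemental qualifiers from a "check all that apply" CRF can be sketched in Python. The function name and its inputs are hypothetical, and the sketch mirrors this example only: a subject with no selection is given a null RACE, as in Row 3 of the DM example, although a sponsor could use "UNKNOWN" instead, and the case of a designated primary race is not handled.

```python
def derive_race(checked, other_text=""):
    """Return (RACE, {QNAM: QVAL}) from the CRF boxes checked, in the order collected."""
    values = [choice.upper() for choice in checked]
    supp = {}
    if len(values) > 1:
        race = "MULTIPLE"
        for n, value in enumerate(values, start=1):
            supp[f"RACE{n}"] = value  # RACE1, RACE2, ... hold the individual selections
    elif values:
        race = values[0]
    else:
        race = ""  # no race information provided
    if "OTHER" in values and other_text:
        supp["RACEOTH"] = other_text.upper()  # free text from "Other, Specify"
    return race, supp


print(derive_race(["Other"], "Brazilian"))
```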

Example

In this example, the sponsor has chosen to map some of the predefined races to other races, specifically Japanese and Non-Japanese to Asian. Note: Sponsors may choose not to map race data, in which case the previous examples should be followed.

Sample CRF:

Race: Check One
  • American Indian or Alaska Native
  • Asian
  • Japanese
  • Non-Japanese
  • Black or African American
  • Native Hawaiian or Other Pacific Islander
  • White

Row 1: Shows the record for a subject who checked "Non-Japanese", which was mapped by the sponsor to the RACE value "ASIAN".
Row 2: Shows the record for a subject who checked "Japanese", which was mapped by the sponsor to the RACE value "ASIAN".

dm.xpt

| Row | STUDYID | DOMAIN | USUBJID | RACE |
| --- | --- | --- | --- | --- |
| 1 | ABC | DM | 001 | ASIAN |
| 2 | ABC | DM | 002 | ASIAN |

The values captured on the CRF, "Non-Japanese" and "Japanese", were represented using the supplemental qualifier "RACEOR".

suppdm.xpt

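| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC | DM | 001 | | | RACEOR | Original Race | NON-JAPANESE | CRF | |
| 2 | ABC | DM | 002 | | | RACEOR | Original Race | JAPANESE | CRF | |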

Example

In this example, the sponsor has chosen to map the values entered into the "Other, Specify" field to one of the preprinted races.

Note: Sponsors may choose not to map race data, in which case the first two examples should be followed.

Sample CRF:

Race: Check One
  • American Indian or Alaska Native
  • Asian
  • Black or African American
  • Native Hawaiian or Other Pacific Islander
  • White
  • Other, Specify: _____________________

Row 1: Shows the record for a subject who checked "Other, Specify" and entered "Japanese". Their race was mapped to "ASIAN" by the sponsor.
Row 2: Shows the record for a subject who checked "Other, Specify" and entered "Swedish". Their race was mapped to "WHITE" by the sponsor.

dm.xpt

| Row | STUDYID | DOMAIN | USUBJID | RACE |
| --- | --- | --- | --- | --- |
| 1 | ABC | DM | 001 | ASIAN |
| 2 | ABC | DM | 002 | WHITE |

The text entered in the "Other, Specify" line of the CRF was represented using the Supplemental qualifier RACEOR.

suppdm.xpt

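| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC | DM | 001 | | | RACEOR | Original Race | JAPANESE | CRF | |
| 2 | ABC | DM | 002 | | | RACEOR | Original Race | SWEDISH | CRF | |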
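
When a sponsor chooses to map "Other, Specify" text to a preprinted race, as in this example, the mapping and the retention of the collected text in RACEOR might be sketched as follows. The mapping table and function name are hypothetical; the two entries simply restate the sponsor decisions shown in the example, and unmapped text falls back to RACE = "OTHER" with the text in RACEOTH, as in the earlier "Other, Specify" example.

```python
# Hypothetical sponsor-defined mapping; only the two values from the example are shown.
SPONSOR_RACE_MAP = {"JAPANESE": "ASIAN", "SWEDISH": "WHITE"}


def map_other_race(specify_text, mapping=SPONSOR_RACE_MAP):
    """Return (RACE, {QNAM: QVAL}) for text entered in an "Other, Specify" field."""
    original = specify_text.strip().upper()
    if original in mapping:
        # Mapped: RACE holds the standard term, RACEOR keeps what was collected.
        return mapping[original], {"RACEOR": original}
    # Not mapped: RACE is "OTHER" and the text goes to RACEOTH.
    return "OTHER", {"RACEOTH": original}


print(map_other_race("Swedish"))
```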

Example

The following example illustrates values of ARMCD for subjects in Example Trial 1, described in Section 7.2.1, Trial Arms. This study included two elements, Screen and Run-In, before subjects were randomized to treatment. For this study, the sponsor submitted data on all subjects, including screen-failure subjects.

This example Demography dataset does not include all the DM required and expected variables, only those that illustrate the variables that represent arm information.

Row 1: Subject "001" was randomized to Arm "Drug A". As shown in the SE dataset, this subject completed the "Drug A" element, so their actual arm was also "Drug A".
Row 2: Subject "002" was randomized to Arm "Drug B". As shown in the SE dataset, their actual arm was consistent with their randomization.
Row 3: Subject "003" was a screen failure, so they were not assigned to an arm or treated. The arm and actual arm variables are null, and ARMNRS = "SCREEN FAILURE".
Row 4: Subject "004" withdrew during the Run-in Element. Like Subject "003", they were not assigned to an arm or treated. However, they were not considered a screen failure, and ARMNRS = "NOT ASSIGNED".
Row 5: Subject "005" was randomized but dropped out before being treated. Thus the actual arm variables are not populated and ARMNRS = "ASSIGNED, NOT TREATED".

dm.xpt

| Row | STUDYID | DOMAIN | USUBJID | ARMCD | ARM | ACTARMCD | ACTARM | ARMNRS | ACTARMUD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC | DM | 001 | A | Drug A | A | Drug A | | |
| 2 | ABC | DM | 002 | B | Drug B | B | Drug B | | |
| 3 | ABC | DM | 003 | | | | | SCREEN FAILURE | |
| 4 | ABC | DM | 004 | | | | | NOT ASSIGNED | |
| 5 | ABC | DM | 005 | A | Drug A | | | ASSIGNED, NOT TREATED | |

Rows 1-3: Subject "001" completed all the Elements for Arm A.
Rows 4-6: Subject "002" completed all the Elements for Arm B.
Row 7: Subject "003" was a screen failure, who participated only in the "Screen" element.
Rows 8-9: Subject "004" withdrew during the "Run-in" Element, before they could be randomized.
Rows 10-11: Subject "005" withdrew after they were randomized, but did not start treatment.

se.xpt

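| Row | STUDYID | DOMAIN | USUBJID | SESEQ | ETCD | ELEMENT | SESTDTC | SEENDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC | SE | 001 | 1 | SCRN | Screen | 2006-06-01 | 2006-06-07 |
| 2 | ABC | SE | 001 | 2 | RI | Run-In | 2006-06-07 | 2006-06-21 |
| 3 | ABC | SE | 001 | 3 | A | Drug A | 2006-06-21 | 2006-07-05 |
| 4 | ABC | SE | 002 | 1 | SCRN | Screen | 2006-05-03 | 2006-05-10 |
| 5 | ABC | SE | 002 | 2 | RI | Run-In | 2006-05-10 | 2006-05-24 |
| 6 | ABC | SE | 002 | 3 | B | Drug B | 2006-05-24 | 2006-06-07 |
| 7 | ABC | SE | 003 | 1 | SCRN | Screen | 2006-06-27 | 2006-06-30 |
| 8 | ABC | SE | 004 | 1 | SCRN | Screen | 2006-05-14 | 2006-05-21 |
| 9 | ABC | SE | 004 | 2 | RI | Run-In | 2006-05-21 | 2006-05-26 |
| 10 | ABC | SE | 005 | 1 | SCRN | Screen | 2006-05-14 | 2006-05-21 |
| 11 | ABC | SE | 005 | 2 | RI | Run-In | 2006-05-21 | 2006-05-26 |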
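
For a simple parallel design such as this one, the relationship between the planned assignment, the Elements recorded in SE, and the Arm-related variables in DM can be sketched in Python. This is an illustration under several assumptions that hold for the example data but not in general: the treatment Element codes ("A", "B") are the same as the ARMCD values, the planned assignment and screen-failure status are supplied by the caller (the function and its parameters are hypothetical), and unplanned treatments are not handled.

```python
ARMS = {"A": "Drug A", "B": "Drug B"}  # ARMCD -> ARM, as in the example data


def arm_variables(assigned_armcd, etcds, screen_failure=False):
    """Derive ARMCD, ARM, ACTARMCD, ACTARM, and ARMNRS for one subject.

    assigned_armcd: planned Arm code, or "" if the subject was never assigned.
    etcds: ETCD values from the subject's SE records, in chronological order.
    """
    armcd = assigned_armcd or ""
    treated = [etcd for etcd in etcds if etcd in ARMS]  # treatment Elements only
    if not armcd:
        actarmcd = ""
        armnrs = "SCREEN FAILURE" if screen_failure else "NOT ASSIGNED"
    elif not treated:
        actarmcd = ""
        armnrs = "ASSIGNED, NOT TREATED"
    else:
        actarmcd = treated[0]
        armnrs = ""
    return {"ARMCD": armcd, "ARM": ARMS.get(armcd, ""),
            "ACTARMCD": actarmcd, "ACTARM": ARMS.get(actarmcd, ""),
            "ARMNRS": armnrs}


print(arm_variables("A", ["SCRN", "RI"]))  # corresponds to subject "005" above
```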

Example

The following example illustrates values of ARMCD for subjects in Example Trial 3, described in Section 7.2.1, Trial Arms.

Row 1: Subject "001" was randomized to Drug A. At the end of the Double Blind Treatment Epoch, they were assigned to Open Label A. Thus their ARMCD is "AA". They received the treatment to which they were assigned, so ACTARMCD is also "AA".
Row 2: Subject "002" was randomized to Drug A. They were lost to follow-up during the Double Blind Treatment Epoch, so never reached the Open Label Epoch, when they would have been assigned to either the Open Drug A or the Rescue Element. Their ARMCD is "A". This case illustrates the exception to the rule that ARMCD, ARM, ACTARMCD, and ACTARM must be populated with values from the TA dataset.
Row 3: Subject "003" was randomized to Drug A, but received Drug B. At the end of the Double Blind Treatment Epoch, they were assigned to Rescue Treatment. ARMCD shows the result of their assignments, "AR", while ACTARMCD shows their actual treatment, "BR".

dm.xpt

| Row | STUDYID | DOMAIN | USUBJID | ARMCD | ARM | ACTARMCD | ACTARM | ARMNRS | ACTARMUD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | DEF | DM | 001 | AA | A-OPEN A | AA | A-OPEN A | | |
| 2 | DEF | DM | 002 | A | A | A | A | | |
| 3 | DEF | DM | 003 | AR | A-RESCUE | BR | B-RESCUE | | |

Rows 1-3: Show that the subject passed through all three Elements for the AA Arm.
Rows 4-5: Show the two Elements ("Screen" and "Treatment A") the subject passed through.
Rows 6-8: Show that the subject passed through the three Elements associated with the "B-Rescue" Arm.

se.xpt

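| Row | STUDYID | DOMAIN | USUBJID | SESEQ | ETCD | ELEMENT | SESTDTC | SEENDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | DEF | SE | 001 | 1 | SCRN | Screen | 2006-01-07 | 2006-01-12 |
| 2 | DEF | SE | 001 | 2 | DBA | Treatment A | 2006-01-12 | 2006-04-10 |
| 3 | DEF | SE | 001 | 3 | OA | Open Drug A | 2006-04-10 | 2006-07-05 |
| 4 | DEF | SE | 002 | 1 | SCRN | Screen | 2006-02-03 | 2006-02-10 |
| 5 | DEF | SE | 002 | 2 | DBA | Treatment A | 2006-02-10 | 2006-03-24 |
| 6 | DEF | SE | 003 | 1 | SCRN | Screen | 2006-02-22 | 2006-03-01 |
| 7 | DEF | SE | 003 | 2 | DBB | Treatment B | 2006-03-01 | 2006-06-27 |
| 8 | DEF | SE | 003 | 3 | RSC | Rescue | 2006-06-27 | 2006-09-24 |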
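
The composition of ARMCD values for a multi-stage assignment, including the truncated code used when a subject does not reach a later stage, can be sketched as below. The function name and the per-stage codes passed to it are illustrative assumptions based on the values in this example ("A" or "B" for the double-blind stage, "A" or "R" for the open-label stage).

```python
def compose_arm_code(stage_codes):
    """Concatenate the codes of the assignment stages a subject completed.

    stage_codes lists one code per stage in order; a falsy entry (None or "")
    marks a stage that was not reached, and truncates the result there.
    """
    completed = []
    for code in stage_codes:
        if not code:
            break  # later stages were never reached
        completed.append(code)
    return "".join(completed)


print(compose_arm_code(["A", "A"]))   # subject "001": both stages completed
print(compose_arm_code(["A", None]))  # subject "002": truncated after the first stage
```

The same function could be applied to the treatments actually received to build ACTARMCD, as for subject "003", whose planned stages were "A" then "R" but whose actual treatments were "B" then "R".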

5.3 Subject Elements

SE – Description/Overview

A special purpose domain that contains the actual order of elements followed by the subject, together with the start date/time and end date/time for each element.

The Subject Elements dataset consolidates information about the timing of each subject's progress through the Epochs and Elements of the trial. For Elements that involve study treatments, the identification of which Element the subject passed through (e.g., Drug X vs. placebo) is likely to derive from data in the Exposure domain or another Interventions domain. The dates of a subject's transition from one Element to the next will be taken from the Interventions domain(s) and from other relevant domains, according to the definitions (TESTRL values) in the Trial Elements (TE) dataset (Section 7.2.2, Trial Elements).

The Subject Elements dataset is particularly useful for studies with multiple treatment periods, such as crossover studies. The Subject Elements dataset contains the date/times at which a subject moved from one Element to another, so when the Trial Arms (TA; Section 7.2.1, Trial Arms), Trial Elements (TE; Section 7.2.2, Trial Elements), and Subject Elements datasets are included in a submission, reviewers can relate all the observations made about a subject to that subject's progression through the trial.

  • Comparison of the --DTC of a finding observation to the Element transition dates (values of SESTDTC and SEENDTC) tells which Element the subject was in at the time of the finding. Similarly, one can determine the Element during which an event or intervention started or ended.
  • "Day within Element" or "Day within Epoch" can be derived. Such variables relate an observation to the start of an Element or Epoch in the same way that study day (--DY) variables relate it to the reference start date (RFSTDTC) for the study as a whole. See Section 4.4.4, Use of the "Study Day" Variables.
  • Having knowledge of Subject Element start and end dates can be helpful in the determination of baseline values.
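
The first two uses listed above can be sketched in Python. This is a minimal illustration with hypothetical function names, and it assumes complete ISO 8601 dates without times. Because consecutive Elements in date-only data can share a boundary date, the sketch treats SEENDTC as exclusive, so an observation on a transition date is placed in the later Element; that choice is an assumption, not a rule of the standard. The day calculation follows the study day convention of Section 4.4.4, in which the reference date is day 1 and there is no day 0.

```python
from datetime import date


def element_at(se_records, dtc):
    """Return the SE record in effect on the date `dtc`, or None if there is none."""
    day = date.fromisoformat(dtc)
    for rec in sorted(se_records, key=lambda r: r["SESTDTC"]):
        start = date.fromisoformat(rec["SESTDTC"])
        end = date.fromisoformat(rec["SEENDTC"]) if rec.get("SEENDTC") else None
        if start <= day and (end is None or day < end):
            return rec
    return None


def relative_day(dtc, ref_dtc):
    """Day of `dtc` relative to `ref_dtc`, computed like a study day (no day 0)."""
    delta = (date.fromisoformat(dtc) - date.fromisoformat(ref_dtc)).days
    return delta + 1 if delta >= 0 else delta


se = [
    {"ETCD": "SCRN", "SESTDTC": "2006-06-01", "SEENDTC": "2006-06-07"},
    {"ETCD": "RI", "SESTDTC": "2006-06-07", "SEENDTC": "2006-06-21"},
    {"ETCD": "A", "SESTDTC": "2006-06-21", "SEENDTC": "2006-07-05"},
]
element = element_at(se, "2006-06-10")
print(element["ETCD"], relative_day("2006-06-10", element["SESTDTC"]))
```

Passing an Element's SESTDTC as the reference date gives "Day within Element"; passing the start date of the first Element in an Epoch would give "Day within Epoch".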

SE – Specification

se.xpt, Subject Elements — Special Purpose, Version 3.3. One record per actual Element per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

SE – Assumptions

Submission of the Subject Elements dataset is strongly recommended, as it provides information needed by reviewers to place observations in context within the study. The Trial Elements and Trial Arms datasets should also be submitted, as they define the design and the terms referenced by the Subject Elements dataset.

The Subject Elements domain allows the submission of data on the timing of the trial Elements a subject actually passed through in their participation in the trial. Read Section 7.2.2, Trial Elements, on the Trial Elements (TE) dataset and Section 7.2.1, Trial Arms, on the Trial Arms (TA) dataset, as these datasets define a trial's planned Elements and describe the planned sequences of Elements for the Arms of the trial.

  1. For any particular subject, the dates in the subject Elements table are the dates when the transition events identified in the Trial Elements table occurred. Judgment may be needed to match actual events in a subject's experience with the definitions of transition events (the events that mark the starts of new Elements) in the Trial Elements table, since actual events may vary from the plan. For instance, in a single-dose PK study, the transition events might correspond to study drug doses of 5 and 10 mg. If a subject actually received a dose of 7 mg when they were scheduled to receive 5 mg, a decision will have to be made on how to represent this in the SE domain.
  2. If the date/time of a transition Element was not collected directly, the method used to infer the Element start date/time should be explained in the Comments column of the Define-XML document.
  3. Judgment will also have to be used in deciding how to represent a subject's experience if an Element does not proceed or end as planned. For instance, the plan might identify a trial Element that is to start with the first of a series of 5 daily doses and end after 1 week, when the subject transitions to the next treatment Element. If the subject actually started the next treatment Epoch (see Section 7.1, Introduction to Trial Design Model Datasets and Section 7.1.2, Definitions of Trial Design Concepts) after 4 weeks, the sponsor would have to decide whether to represent this as an abnormally long Element, or as a normal Element plus an unplanned non-treatment Element.
  4. If the sponsor decides that the subject's experience for a particular period of time cannot be represented with one of the planned Elements, then that period of time should be represented as an unplanned Element. The value of ETCD for an unplanned Element is "UNPLAN" and SEUPDES should be populated with a description of the unplanned Element.
  5. The values of SESTDTC provide the chronological order of the actual subject Elements. SESEQ should be assigned to be consistent with the chronological order. Note that the requirement that SESEQ be consistent with chronological order is more stringent than in most other domains, where --SEQ values need only be unique within subject.
  6. When TAETORD is included in the SE domain, it represents the planned order of an Element in an Arm. This should not be confused with the actual order of the Elements, which will be represented by their chronological order and SESEQ. TAETORD will not be populated for subject Elements that are not planned for the Arm to which the subject was assigned. Thus, TAETORD will not be populated for any Element with an ETCD value of "UNPLAN". TAETORD also will not be populated if a subject passed through an Element that, although defined in the TE dataset, was out of place for the Arm to which the subject was assigned. For example, if a subject in a parallel study of Drug A vs. Drug B was assigned to receive Drug A, but received Drug B instead, then TAETORD would be left blank for the SE record for their Drug B Element. If a subject was assigned to receive the sequence of Elements A, B, C, D, and instead received A, D, B, C, then the sponsor would have to decide for which of these subject Element records TAETORD should be populated. The rationale for this decision should be documented in the Comments column of the Define-XML document.
  7. For subjects who follow the planned sequence of Elements for the Arm to which they were assigned, the values of EPOCH in the SE domain will match those associated with the Elements for the subject's Arm in the Trial Arms dataset. The sponsor will have to decide what value, if any, of EPOCH to assign SE records for unplanned Elements and in other cases where the subject's actual Elements deviate from the plan. The sponsor's methods for such decisions should be documented in the Define-XML document, in the row for EPOCH in the SE dataset table.
  8. Since there are, by definition, no gaps between Elements, the value of SEENDTC for one Element will always be the same as the value of SESTDTC for the next Element.
  9. Note that SESTDTC is required, although --STDTC is not required in any other subject-level dataset. The purpose of the dataset is to record the Elements a subject actually passed through. We assume that if it is known that a subject passed through a particular Element, then there must be some information on when it started, even if that information is imprecise. Thus, SESTDTC may not be null, although some records may not have all the components (e.g., year, month, day, hour, minute) of the date/time value collected.
  10. The following Identifier variables are permissible and may be added as appropriate: --GRPID, --REFID, --SPID.
  11. Care should be taken in adding additional Timing variables:
    1. The purpose of --DTC and --DY in other domains with start and end dates (Event and Intervention Domains) is to record the date and study day on which data was collected. The starts and ends of Elements are generally "derived" in the sense that they are a secondary use of data collected elsewhere; it is not generally useful to know when those date/times were recorded.
    2. --DUR could be added only if the duration of an element was collected, not derived.
    3. It would be inappropriate to add the variables that support time points (--TPT, --TPTNUM, --ELTM, --TPTREF, and --RFTDTC), since the topic of this dataset is Elements.
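
Several of the assumptions above lend themselves to automated checks. The following is a minimal sketch, assuming one subject's SE records are held as a list of dictionaries keyed by variable name; the function name and the wording of the messages are hypothetical, and comparing ISO 8601 values as text is a simplification that assumes values of comparable precision.

```python
def check_subject_elements(records):
    """Return a list of issue descriptions for one subject's SE records."""
    issues = []
    ordered = sorted(records, key=lambda r: r["SESEQ"])
    for rec in ordered:
        seq = rec["SESEQ"]
        # Assumption 9: SESTDTC may not be null.
        if not rec.get("SESTDTC"):
            issues.append(f"SESEQ {seq}: SESTDTC may not be null")
        # Assumptions 4 and 6: unplanned Elements carry SEUPDES but no TAETORD.
        if rec["ETCD"] == "UNPLAN":
            if rec.get("TAETORD") is not None:
                issues.append(f"SESEQ {seq}: TAETORD populated for UNPLAN")
            if not rec.get("SEUPDES"):
                issues.append(f"SESEQ {seq}: SEUPDES missing for UNPLAN")
    for prev, nxt in zip(ordered, ordered[1:]):
        start, next_start = prev.get("SESTDTC"), nxt.get("SESTDTC")
        # Assumption 5: SESEQ is consistent with chronological order.
        if start and next_start and start > next_start:
            issues.append(f"SESEQ {nxt['SESEQ']}: starts before SESEQ {prev['SESEQ']}")
        # Assumption 8: no gaps between consecutive Elements.
        if prev.get("SEENDTC") != next_start:
            issues.append(f"SESEQ {prev['SESEQ']}: SEENDTC differs from next SESTDTC")
    return issues

# Illustrative records for one subject, including an unplanned Element.
subject_456 = [
    {"SESEQ": 1, "ETCD": "SCRN", "SESTDTC": "2006-05-01", "SEENDTC": "2006-05-03", "TAETORD": 1},
    {"SESEQ": 2, "ETCD": "DBA", "SESTDTC": "2006-05-03", "SEENDTC": "2006-05-31", "TAETORD": 2},
    {"SESEQ": 3, "ETCD": "UNPLAN", "SESTDTC": "2006-05-31", "SEENDTC": "2006-06-13",
     "SEUPDES": "Drug B dispensed in error", "TAETORD": None},
    {"SESEQ": 4, "ETCD": "RSC", "SESTDTC": "2006-06-13", "SEENDTC": "2006-07-30", "TAETORD": 3},
]

print(check_subject_elements(subject_456))  # []
```

Checks of this kind flag candidate problems only; whether a flagged record is actually wrong remains a matter of sponsor judgment, as the assumptions note.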

SE – Examples

STUDYID and DOMAIN, which are required in the SE and DM domains, have not been included in the following examples, to improve readability.

Example

This example shows data for two subjects for a crossover trial with four Epochs.

Row 1: The record for the SCREEN Element for subject "789". Note that only the date of the start of the "SCREEN" Element was collected, while for the end of the Element, which corresponds to the start of IV dosing, both date and time were collected.
Row 2: The record for the IV Element for subject "789". The IV Element started with the start of IV dosing and ended with the start of oral dosing, and full date/times were collected for both.
Row 3: The record for the ORAL Element for subject "789". Only the date, and not the time, of the start of follow-up was collected.
Row 4: The FOLLOWUP Element for subject "789" started and ended on the same day. Presumably, the Element had a positive duration, but no times were collected.
Rows 5-8: Subject "790" was treated incorrectly, as shown by the fact that the values of SESEQ and TAETORD do not match. This subject entered the "IV" Element before the "ORAL" Element, but the planned order of Elements for this subject was "ORAL", then "IV". The sponsor has assigned EPOCH values for this subject according to the actual order of Elements, rather than the planned order. The planned order of Elements is indicated by the subject's ARMCD, shown in the DM dataset.
Rows 9-10: Subject "791" was screened, randomized to the IV-ORAL arm, and received the IV treatment, but did not return to the unit for the second treatment epoch or follow-up.

se.xpt

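| Row | USUBJID | SESEQ | ETCD | SESTDTC | SEENDTC | SEUPDES | TAETORD | EPOCH |
|---|---|---|---|---|---|---|---|---|
| 1 | 789 | 1 | SCREEN | 2006-06-01 | 2006-06-03T10:32 | | 1 | SCREENING |
| 2 | 789 | 2 | IV | 2006-06-03T10:32 | 2006-06-10T09:47 | | 2 | TREATMENT 1 |
| 3 | 789 | 3 | ORAL | 2006-06-10T09:47 | 2006-06-17 | | 3 | TREATMENT 2 |
| 4 | 789 | 4 | FOLLOWUP | 2006-06-17 | 2006-06-17 | | 4 | FOLLOW-UP |
| 5 | 790 | 1 | SCREEN | 2006-06-01 | 2006-06-03T10:14 | | 1 | SCREENING |
| 6 | 790 | 2 | IV | 2006-06-03T10:14 | 2006-06-10T10:32 | | 3 | TREATMENT 1 |
| 7 | 790 | 3 | ORAL | 2006-06-10T10:32 | 2006-06-17 | | 2 | TREATMENT 2 |
| 8 | 790 | 4 | FOLLOWUP | 2006-06-17 | 2006-06-17 | | 4 | FOLLOW-UP |
| 9 | 791 | 1 | SCREEN | 2006-06-01 | 2006-06-03T10:17 | | 1 | SCREENING |
| 10 | 791 | 2 | IV | 2006-06-03T10:17 | 2006-06-07 | | 2 | TREATMENT 1 |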
Row 1: Subject "789" was assigned to the "IV-ORAL" arm and was treated accordingly.
Row 2: Subject "790" was assigned to the "ORAL-IV" arm, but their actual treatment was "IV" then "ORAL".
Row 3: Subject "791" was assigned to the "IV-ORAL" arm. Although they received only the first of the two planned treatment elements, they were following their assigned treatment when they withdrew early, so the actual arm variables are populated with the values for the arm to which they were assigned.

dm.xpt

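| Row | USUBJID | SUBJID | RFSTDTC | RFENDTC | SITEID | INVNAM | BIRTHDTC | AGE | AGEU | SEX | RACE | ETHNIC | ARMCD | ARM | ACTARMCD | ACTARM | ARMNRS | ACTARMUD | COUNTRY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 789 | 001 | 2006-06-03 | 2006-06-17 | 01 | SMITH, J | 1948-12-13 | 57 | YEARS | M | WHITE | HISPANIC OR LATINO | IO | IV-ORAL | IO | IV-ORAL | | | USA |
| 2 | 790 | 002 | 2006-06-03 | 2006-06-17 | 01 | SMITH, J | 1955-03-22 | 51 | YEARS | M | WHITE | NOT HISPANIC OR LATINO | OI | ORAL-IV | IO | IV-ORAL | | | USA |
| 3 | 791 | 003 | 2006-06-03 | 2006-06-07 | 01 | SMITH, J | 1956-07-17 | 49 | YEARS | M | WHITE | NOT HISPANIC OR LATINO | IO | IV-ORAL | IO | IV-ORAL | | | USA |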

Example

The data below represent two subjects enrolled in a trial in which assignment to an arm occurs in two stages.

See Example Trial 3 as described in Section 7.2.1, Trial Arms. In this trial, subjects were randomized at the beginning of the blinded treatment epoch, then assigned to treatment for the open treatment epoch according to their response to treatment in the blinded treatment epoch. See Demographics domain DM Example 6 for other examples of ARM and ARMCD values for this trial.

In this trial, start of dosing was recorded as dates without times, so SESTDTC values include only dates. Epochs could not be assigned to observations that occurred on epoch transition dates on the basis of the SE dataset alone, so the sponsor's algorithms for dealing with this ambiguity were documented in the Define-XML document.

Rows 1-2: Show data for a subject who completed only two Elements of the trial.
Rows 3-6: Show data for a subject who completed the trial, but received the wrong drug for the last 2 weeks of the double-blind treatment period. This has been represented by treating the period when the subject received the wrong drug as an unplanned Element. Note that TAETORD, which represents the planned order of Elements within an Arm, has not been populated for this unplanned Element. Even though this Element was unplanned, the sponsor assigned a value of BLINDED TREATMENT to EPOCH.

se.xpt

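| Row | USUBJID | SESEQ | ETCD | SESTDTC | SEENDTC | SEUPDES | TAETORD | EPOCH |
|---|---|---|---|---|---|---|---|---|
| 1 | 123 | 1 | SCRN | 2006-06-01 | 2006-06-03 | | 1 | SCREENING |
| 2 | 123 | 2 | DBA | 2006-06-03 | 2006-06-10 | | 2 | BLINDED TREATMENT |
| 3 | 456 | 1 | SCRN | 2006-05-01 | 2006-05-03 | | 1 | SCREENING |
| 4 | 456 | 2 | DBA | 2006-05-03 | 2006-05-31 | | 2 | BLINDED TREATMENT |
| 5 | 456 | 3 | UNPLAN | 2006-05-31 | 2006-06-13 | Drug B dispensed in error | | BLINDED TREATMENT |
| 6 | 456 | 4 | RSC | 2006-06-13 | 2006-07-30 | | 3 | OPEN LABEL TREATMENT |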
Row 1: Shows the record for a subject who was randomized to blinded treatment A, but withdrew from the trial before the open treatment epoch and did not have a second treatment assignment. They were thus incompletely assigned to an arm. The code used to represent this incomplete assignment, "A", is not in the Trial Arms table for this trial design, but is the first part of the codes for the two arms to which they could have been assigned ("AR" or "AO").
Row 2: Shows the record for a subject who was randomized to blinded treatment A, but was erroneously treated with B for part of the blinded treatment epoch. ARM and ARMCD for this subject reflect their planned treatment and are not affected by the fact that their treatment deviated from plan. Their assignment to Rescue treatment for the open treatment epoch proceeded as planned. The sponsor decided that the subject's treatment, which consisted partly of Drug A and partly of Drug B, did not match any planned arm, so ACTARMCD and ACTARM were left null. ARMNRS was populated with "UNPLANNED TREATMENT" and the way in which this treatment was unplanned was described in ACTARMUD.

dm.xpt

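| Row | USUBJID | SUBJID | RFSTDTC | RFENDTC | SITEID | INVNAM | BIRTHDTC | AGE | AGEU | SEX | RACE | ETHNIC | ARMCD | ARM | ACTARMCD | ACTARM | ARMNRS | ACTARMUD | COUNTRY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 123 | 012 | 2006-06-03 | 2006-06-10 | 01 | JONES, D | 1943-12-08 | 62 | YEARS | M | ASIAN | HISPANIC OR LATINO | A | A | A | A | | | USA |
| 2 | 456 | 103 | 2006-05-03 | 2006-07-30 | 01 | JONES, D | 1950-05-15 | 55 | YEARS | F | WHITE | NOT HISPANIC OR LATINO | AR | A-Rescue | | | UNPLANNED TREATMENT | Drug B dispensed for part of Drug A element | USA |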

5.4 Subject Disease Milestones

SM – Description/Overview

A special purpose domain that is designed to record the timing, for each subject, of disease milestones that have been defined in the Trial Disease Milestones (TM) domain.

SM – Specification

sm.xpt, Subject Disease Milestones — Special Purpose, Version 1.0. One record per Disease Milestone per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

SM – Assumptions

  1. Disease Milestones are observations or activities whose timings are of interest in the study. The types of Disease Milestones are defined at the study level in the Trial Disease Milestones (TM) dataset. The purpose of the Subject Disease Milestones dataset is to provide a summary timeline of the milestones for a particular subject.
  2. The name of the Disease Milestone is recorded in MIDS.
    1. For Disease Milestones that can occur only once (TMRPT = "N") the value of MIDS may be the value in MIDSTYPE or may be an abbreviated version.
    2. For types of Disease Milestones that can occur multiple times, MIDS will usually be an abbreviated version of MIDSTYPE and will always end with a sequence number. Sequence numbers should start with one and indicate the chronological order of the instances of this type of Disease Milestone.
  3. The timing variables SMSTDTC and SMENDTC hold start and end date/times of data collected for the Disease Milestone(s) for each subject. SMSTDY and SMENDY represent the corresponding Study Day variables.
    1. The start date/time of the Disease Milestone is the critical date/time, and must be populated. If the Disease Milestone is an event, then the meaning of "start date" for the event may need to be defined.
    2. The start study day will not be populated if the start date/time includes only a year or only a year and month.
    3. The end date/time for the Disease Milestone is less important than the start date/time. It will not be populated if the Disease Milestone is a finding without an end date/time or if it is an event or intervention for which an end date/time has not yet occurred or was not collected.
    4. The end study day will not be populated if the end date/time includes only a year or only a year and month.
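
The MIDS naming rule and the treatment of partial dates described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the reference start date is passed in as a complete ISO 8601 date, and sorting date/times as text assumes zero-padded values of comparable precision.

```python
from datetime import date

def study_day(dtc, rfstdtc):
    """Return the study day for `dtc`, or None when `dtc` lacks a full date."""
    date_part = dtc.split("T")[0]
    if len(date_part) != 10:  # only a year, or only a year and month
        return None
    delta = (date.fromisoformat(date_part) - date.fromisoformat(rfstdtc[:10])).days
    return delta + 1 if delta >= 0 else delta  # there is no study day 0

def assign_mids(start_dtcs, short_name, repeats):
    """Map each milestone start date/time to a MIDS value.

    For a milestone type that can occur only once, MIDS is the short name.
    For a repeating type, MIDS ends with a sequence number that starts at
    one and follows chronological order.
    """
    ordered = sorted(start_dtcs)
    if not repeats:
        return {dtc: short_name for dtc in ordered[:1]}
    return {dtc: f"{short_name}{n}" for n, dtc in enumerate(ordered, start=1)}

# Hypothetical reference start date, chosen only for illustration.
print(study_day("2005-10", "2013-08-08"))           # None
print(study_day("2013-08-08T09:00", "2013-08-08"))  # 1
print(assign_mids(["2013-09-24T08:48", "2013-09-01T11:00"], "HYPO", True))
```

The last call maps the earlier event to "HYPO1" and the later event to "HYPO2", regardless of the order in which the events are supplied.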

SM – Examples

Example

In this study, the Disease Milestones of interest were initial diagnosis and hypoglycemic events, as shown in Section 7.3.3, Trial Disease Milestones, Example 1.

Row 1: Shows that this subject's initial diagnosis of diabetes occurred in October of 2005. Since this is a partial date, SMSTDY is not populated. No end date/time was recorded for this Milestone.
Rows 2-3: Show that this subject had two hypoglycemic events. In this case, only start date/times have been collected. Since these date/times include full dates, SMSTDY has been populated in each case.
Row 4: Shows that this subject's initial diagnosis of diabetes occurred on May 15, 2010. Since a full date was collected, the study day of this Milestone was populated. Since diagnosis was pre-study, the study day of the Disease Milestone is negative. No hypoglycemic events were recorded for this subject.

sm.xpt

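| Row | STUDYID | DOMAIN | USUBJID | SMSEQ | MIDS | MIDSTYPE | SMSTDTC | SMENDTC | SMSTDY | SMENDY |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | XYZ | SM | 001 | 1 | DIAG | DIAGNOSIS | 2005-10 | | | |
| 2 | XYZ | SM | 001 | 2 | HYPO1 | HYPOGLYCEMIC EVENT | 2013-09-01T11:00 | | 25 | |
| 3 | XYZ | SM | 001 | 3 | HYPO2 | HYPOGLYCEMIC EVENT | 2013-09-24T8:48 | | 50 | |
| 4 | XYZ | SM | 002 | 1 | DIAG | DIAGNOSIS | 2010-05-15 | | -1046 | |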

Information in SM is taken from records in other domains. In this study, diagnosis was represented in the MH domain, and hypoglycemic events were represented in the CE domain.

The MH records for diabetes with MHEVDTYP = "DIAGNOSIS" are the records which represent the disease milestones for the defined MIDSTYPE of "DIAGNOSIS", so these records include the MIDS variable with the value "DIAG". Since these are records for disease milestones, rather than associated records, the variables RELMIDS and MIDSDTC are not needed.

mh.xpt

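| Row | STUDYID | DOMAIN | USUBJID | MHSEQ | MHTERM | MHDECOD | MHEVDTYP | MHPRESP | MHOCCUR | MHDTC | MHSTDTC | MHENDTC | MHDY | MIDS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | XYZ | MH | 001 | 1 | TYPE 2 DIABETES | Type 2 diabetes mellitus | DIAGNOSIS | Y | Y | 2013-08-06 | 2005-10 | | 1 | DIAG |
| 2 | XYZ | MH | 002 | 1 | TYPE 2 DIABETES | Type 2 diabetes mellitus | DIAGNOSIS | Y | Y | 2013-08-06 | 2010-05-15 | | 1 | DIAG |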

In this study, information about hypoglycemic events was collected in a separate CRF module, and CE records recorded in this module were represented with CECAT = "HYPOGLYCEMIC EVENT". Each CE record for a hypoglycemic event is a disease milestone, and records for a subject have distinct values of MIDS.

ce.xpt

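| Row | STUDYID | DOMAIN | USUBJID | CESEQ | CETERM | CEDECOD | CECAT | CEPRESP | CEOCCUR | CESTDTC | CEENDTC | MIDS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | XYZ | CE | 001 | 1 | HYPOGLYCEMIC EVENT | Hypoglycaemia | HYPOGLYCEMIC EVENT | Y | Y | 2013-09-01T11:00 | 2013-09-01T2:30 | HYPO1 |
| 2 | XYZ | CE | 001 | 2 | HYPOGLYCEMIC EVENT | Hypoglycaemia | HYPOGLYCEMIC EVENT | Y | Y | 2013-09-24T8:48 | 2013-09-24T10:00 | HYPO2 |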

5.5 Subject Visits

SV – Description/Overview

A special purpose domain that contains the actual start and end date/time for each visit of each individual subject.

The Subject Visits domain consolidates information about the timing of subject visits that is otherwise spread over domains that include the visit variables (VISITNUM and possibly VISIT and/or VISITDY). Unless the beginning and end of each visit is collected, populating the Subject Visits dataset will involve derivations. In a simple case, where, for each subject visit, exactly one date appears in every such domain, the Subject Visits dataset can be created easily by populating both SVSTDTC and SVENDTC with the single date for a visit. When there are multiple dates and/or date/times for a visit for a particular subject, the derivation of values for SVSTDTC and SVENDTC may be more complex. The method for deriving these values should be consistent with the visit definitions in the Trial Visits (TV) dataset (Section 7.3.1, Trial Visits). For some studies, a visit may be defined to correspond with a clinic visit that occurs within one day, while for other studies, a visit may reflect data collection over a multi-day period.
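
The simple derivation described above can be sketched as follows. This is one possible approach, not a prescribed algorithm: the function name is hypothetical, the rule of taking the earliest and latest full date recorded for a visit is an assumption, and a sponsor's actual method should follow the visit definitions in the TV dataset.

```python
from collections import defaultdict

def derive_sv(dated_records):
    """Derive SVSTDTC and SVENDTC per subject visit.

    `dated_records` is an iterable of (USUBJID, VISITNUM, date/time string)
    tuples gathered from the domains that include the visit variables.
    Assumption: the visit spans the earliest to the latest full date found;
    partial dates are ignored.
    """
    dates = defaultdict(list)
    for usubjid, visitnum, dtc in dated_records:
        if dtc and len(dtc) >= 10:
            dates[(usubjid, visitnum)].append(dtc[:10])  # keep the date part
    return [
        {"USUBJID": u, "VISITNUM": v, "SVSTDTC": min(d), "SVENDTC": max(d)}
        for (u, v), d in sorted(dates.items())
    ]

# Illustrative input: a multi-day screening visit and a single-day visit.
records = [
    ("101", 1, "2006-01-15"),
    ("101", 1, "2006-01-20T09:00"),
    ("101", 2, "2006-01-21"),
]
for row in derive_sv(records):
    print(row)
```

When every domain carries the same single date for a visit, the earliest and latest dates coincide, which reproduces the simple case in which SVSTDTC and SVENDTC hold the same value.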

The Subject Visits dataset provides reviewers with a summary of a subject's visits. Comparison of an individual subject's SV dataset with the TV dataset, which describes the planned visits for the trial, quickly identifies missed visits and "extra" visits. Comparison of the values of SVSTDY and SVENDY to VISIT and/or VISITDY can often highlight departures from the planned timing of visits.
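
The comparison of SV with TV can be sketched as follows. This is a minimal illustration: the function name is hypothetical, the TV rows are invented for the example (including a planned visit 5 that the subject never attended), and matching on VISITNUM alone is an assumption.

```python
def compare_visits(tv, sv):
    """Compare planned visits (TV) with one subject's actual visits (SV).

    Returns (missed, extra, shifted): planned VISITNUMs with no SV record,
    SV VISITNUMs that are not planned, and the difference between actual
    (SVSTDY) and planned (VISITDY) start day for visits present in both.
    """
    planned = {row["VISITNUM"]: row.get("VISITDY") for row in tv}
    actual = {row["VISITNUM"]: row.get("SVSTDY") for row in sv}
    missed = sorted(set(planned) - set(actual))
    extra = sorted(set(actual) - set(planned))
    shifted = {
        v: actual[v] - planned[v]
        for v in planned.keys() & actual.keys()
        if planned[v] is not None and actual[v] is not None
        and actual[v] != planned[v]
    }
    return missed, extra, shifted

# Hypothetical planned visits.
tv = [
    {"VISITNUM": 1, "VISITDY": -7}, {"VISITNUM": 2, "VISITDY": 1},
    {"VISITNUM": 3, "VISITDY": 8}, {"VISITNUM": 4, "VISITDY": 15},
    {"VISITNUM": 5, "VISITDY": 22}, {"VISITNUM": 8, "VISITDY": 71},
]
# Illustrative actual visits for one subject.
sv = [
    {"VISITNUM": 1, "SVSTDY": -6}, {"VISITNUM": 2, "SVSTDY": 1},
    {"VISITNUM": 3, "SVSTDY": 7}, {"VISITNUM": 4, "SVSTDY": 15},
    {"VISITNUM": 4.1, "SVSTDY": 18}, {"VISITNUM": 8, "SVSTDY": 26},
]

missed, extra, shifted = compare_visits(tv, sv)
print(missed)  # [5]
print(extra)   # [4.1]
```

In this illustration `shifted` reports the screening visit starting one day late, visit 3 one day early, and visit 8 45 days early.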

SV – Specification

sv.xpt, Subject Visits — Special Purpose, Version 3.2. One record per subject per actual visit, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

SV – Assumptions

  1. The Subject Visits domain allows the submission of data on the timing of the trial visits a subject actually passed through in their participation in the trial. Read Section 7.3.1, Trial Visits, on the Trial Visits (TV) dataset, as this dataset defines the planned visits for the trial.
  2. The identification of an actual visit with a planned visit sometimes calls for judgment. In general, data collection forms are prepared for particular visits, and the fact that data was collected on a form labeled with a planned visit is sufficient to make the association. Occasionally, the association will not be so clear, and the sponsor will need to make decisions about how to label actual visits. The sponsor's rules for making such decisions should be documented in the Define-XML document.
  3. Records for unplanned visits should be included in the SV dataset. For unplanned visits, SVUPDES should be populated with a description of the reason for the unplanned visit. Some judgment may be required to determine what constitutes an unplanned visit. When data are collected outside a planned visit, that act of collecting data may or may not be described as a "visit." The encounter should generally be treated as a visit if data from the encounter are included in any domain for which VISITNUM is included, since a record with a missing value for VISITNUM is generally less useful than a record with VISITNUM populated. If the occasion is considered a visit, its date/times must be included in the SV table and a value of VISITNUM must be assigned. See Section 4.4.5, Clinical Encounters and Visits for information on the population of visit variables for unplanned visits.
  4. VISITDY is the Planned Study Day of a visit. It should not be populated for unplanned visits.
  5. If SVSTDY is included, it is the actual study day corresponding to SVSTDTC. In studies for which VISITDY has been populated, it may be desirable to populate SVSTDY, as this will facilitate the comparison of planned (VISITDY) and actual (SVSTDY) study days for the start of a visit.
  6. If SVENDY is included, it is the actual study day corresponding to SVENDTC.
  7. For many studies, all visits are assumed to occur within one calendar day, and only one date is collected for the Visit. In such a case, the values for SVENDTC duplicate values in SVSTDTC. However, if the data for a visit is actually collected over several physical visits and/or over several days, then SVSTDTC and SVENDTC should reflect this fact. Note that it is fairly common for screening data to be collected over several days, but for the data to be treated as belonging to a single planned screening visit, even in studies for which all other visits are single-day visits.
  8. Differentiating between planned and unplanned visits may be challenging if unplanned assessments (e.g., repeat labs) are performed during the time period of a planned visit.
  9. Algorithms for populating SVSTDTC and SVENDTC from the dates of assessments performed at a visit may be particularly challenging for screening visits since baseline values collected at a screening visit are sometimes historical data from tests performed before the subject started screening for the trial.
  10. The following Identifier variables are permissible and may be added as appropriate: --SEQ, --GRPID, --REFID, and --SPID.
  11. Care should be taken in adding additional Timing variables:
    1. If TAETORD and/or EPOCH are added, then the values must be those at the start of the visit.
    2. The purpose of --DTC and --DY in other domains with start and end dates (Event and Intervention Domains) is to record the date on which data was collected. It seems unnecessary to record the date on which the start and end of a visit were recorded.
    3. --DUR could be added if the duration of a visit was collected.
    4. It would be inappropriate to add the variables that support time points (--TPT, --TPTNUM, --ELTM, --TPTREF, and --RFTDTC), since the topic of this dataset is visits.
    5. --STRF and --ENRF could be used to say whether a visit started and ended before, during, or after the study reference period, although this seems unnecessary.
    6. --STRTPT, --STTPT, --ENRTPT, and --ENTPT could be used to say that a visit started or ended before or after particular dates, although this seems unnecessary.

SV – Examples

Example

The data below represents the visits for a single subject.

Row 1: Data for the screening visit was gathered over the course of six days.
Row 2: The visit called "DAY 1" started and ended as planned, on Day 1.
Row 3: The visit scheduled for Day 8 occurred one day early, on Day 7.
Row 4: The visit called "WEEK 2" started and ended as planned, on Day 15.
Row 5: Shows an unscheduled visit. SVUPDES provides the information that this visit dealt with evaluation of an adverse event. Since this visit was not planned, VISITDY was not populated. The sponsor chose not to populate VISIT. VISITNUM was populated, probably because the data collected at this encounter is in a Findings domain such as EG, LB, or VS, in which VISIT is treated as an important timing variable.
Row 6: This subject had their last visit, a follow-up visit on study day 26, eight days after the unscheduled visit, but well before the scheduled visit day of 71.

sv.xpt

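| Row | STUDYID | DOMAIN | USUBJID | VISITNUM | VISIT | VISITDY | SVSTDTC | SVENDTC | SVSTDY | SVENDY | SVUPDES |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 123456 | SV | 101 | 1 | SCREEN | -7 | 2006-01-15 | 2006-01-20 | -6 | -1 | |
| 2 | 123456 | SV | 101 | 2 | DAY 1 | 1 | 2006-01-21 | 2006-01-21 | 1 | 1 | |
| 3 | 123456 | SV | 101 | 3 | WEEK 1 | 8 | 2006-01-27 | 2006-01-27 | 7 | 7 | |
| 4 | 123456 | SV | 101 | 4 | WEEK 2 | 15 | 2006-02-04 | 2006-02-04 | 15 | 15 | |
| 5 | 123456 | SV | 101 | 4.1 | | | 2006-02-07 | 2006-02-07 | 18 | 18 | Evaluation of AE |
| 6 | 123456 | SV | 101 | 8 | FOLLOW-UP | 71 | 2006-02-15 | 2006-02-15 | 26 | 26 | |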

6 Domain Models Based on the General Observation Classes

6.1 Models for Interventions Domains

Most subject-level observations collected during the study should be represented according to one of the three SDTM general observation classes. This is the list of domains corresponding to the Interventions class.

| Domain Code | Domain Description |
|---|---|
| AG | Procedure Agents: An interventions domain that contains the agents administered to the subject as part of a procedure or assessment, as opposed to drugs, medications and therapies administered with therapeutic intent. |
| CM | Concomitant and Prior Medications: An interventions domain that contains concomitant and prior medications used by the subject, such as those given on an as needed basis or condition-appropriate medications. |
| EC and EX | Exposure Domains. Exposure (EX): An interventions domain that contains the details of a subject's exposure to protocol-specified study treatment. Study treatment may be any intervention that is prospectively defined as a test material within a study, and is typically but not always supplied to the subject. Exposure as Collected (EC): An interventions domain that contains information about protocol-specified study treatment administrations, as collected. |
| ML | Meal Data: Information regarding the subject's meal consumption, such as fluid intake, amounts, form (solid or liquid state), frequency, etc., typically used for pharmacokinetic analysis. |
| PR | Procedures: An interventions domain that contains interventional activity intended to have diagnostic, preventive, therapeutic, or palliative effects. |
| SU | Substance Use: An interventions domain that contains substance use information that may be used to assess the efficacy and/or safety of therapies that look to mitigate the effects of chronic substance use. |

6.1.1 Procedure Agents

AG – Description/Overview

An interventions domain that contains the agents administered to the subject as part of a procedure or assessment, as opposed to drugs, medications and therapies administered with therapeutic intent.

AG – Specification

ag.xpt, Procedure Agents — Interventions, Version 1.0. One record per recorded intervention occurrence per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

AG – Assumptions

  1. AG Purpose: Some tests involve administration of substances, and it has been unclear what domain these should be represented in.
    1. The Concomitant Medications domain seemed particularly inappropriate when the substance was one that would never be given as a medication. Even substances that are medications are not being used as such when they are given as part of a testing procedure.
    2. The Exposure domain also seemed inappropriate, since although the testing procedure might be part of the study plan, these data would not be used or analyzed in the same way as data about study treatments. The Procedure Agents domain was created to fill this gap.
    3. The Procedure Agents domain has advantages over the Procedures domain for this purpose. It allows recording of multiple substance administrations for a single testing procedure. It also separates data about substance administrations from data about procedures that do not involve substance administration.
    4. Information about the conduct of the procedure with which the procedure agent administration was associated, if collected, should be represented in the Procedures (PR) domain.
  2. AG Examples and Structure
    1. Examples of agents administered as part of a procedure include a short-acting bronchodilator administered as part of a reversibility assessment and contrast agents or radio-labeled substances used in imaging studies.
    2. The structure of the AG domain is one record per agent intervention episode, or pre-specified agent assessment per subject. It is the sponsor's responsibility to define an intervention episode. This definition may vary based on the sponsor's requirements for review and analysis.
  3. Procedure Agent Description and Coding
    1. AGTRT captures the name of the agent and it is the topic variable. It is a required variable and must have a value. AGTRT should include only the agent name, and should not include dosage, formulation, or other qualifying information. For example, "ALBUTEROL 2 PUFF" is not a valid value for AGTRT. This example should be expressed as AGTRT = "ALBUTEROL", AGDOSE = "2", AGDOSU = "PUFF", and AGDOSFRM = "AEROSOL".
    2. AGMODIFY should be included if the sponsor's procedure permits modification of a verbatim term for coding.
    3. AGDECOD is the standardized agent term derived by the sponsor from the coding dictionary. It is possible that the reported term (AGTRT) or the modified term (AGMODIFY) can be coded using a standard dictionary. In this instance the sponsor is expected to provide the dictionary name and version used to map the terms utilizing the external codelist element in the Define-XML document.
  4. Pre-specified Terms; Presence or Absence of Procedure Agents
    1. AGPRESP is used to indicate whether an agent was pre-specified.
    2. AGOCCUR is used to indicate whether a pre-specified agent was used. A value of "Y" indicates that the agent was used and "N" indicates that it was not.
    3. If an agent was not pre-specified, the value of AGOCCUR should be null. AGPRESP and AGOCCUR are permissible fields and may be omitted from the dataset if all agents were collected as free text. Values of AGOCCUR may also be null for pre-specified agents if no Y/N response was collected; in this case, AGSTAT = "NOT DONE", and AGREASND could be used to describe the reason the answer was missing.
  5. Any Identifier variables, Timing variables, or Interventions general-observation-class qualifiers may be added to the AG domain.
    1. However, --INDC, although allowed, would not generally be used since substance administrations represented in AG are given as part of a testing procedure rather than with therapeutic intent.
    2. The variables --DOSTOT and --DOSRGM, although allowed, would generally not be used since procedure agents are likely to be recorded at the level of single administrations.
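
The rules above for AGTRT, AGPRESP, and AGOCCUR can be expressed as simple record-level checks. The following is a minimal sketch, assuming a record is held as a dictionary keyed by variable name; the function name and message wording are hypothetical.

```python
def check_ag_record(rec):
    """Return a list of issue descriptions for a single AG record."""
    issues = []
    # AGTRT is the topic variable and must have a value.
    if not rec.get("AGTRT"):
        issues.append("AGTRT is required")
    prespecified = rec.get("AGPRESP") == "Y"
    occur = rec.get("AGOCCUR")
    # AGOCCUR is meaningful only for pre-specified agents.
    if occur and not prespecified:
        issues.append("AGOCCUR should be null when the agent was not pre-specified")
    if occur and occur not in ("Y", "N"):
        issues.append("AGOCCUR should be Y or N")
    # A pre-specified agent with no Y/N response is represented as NOT DONE.
    if prespecified and not occur and rec.get("AGSTAT") != "NOT DONE":
        issues.append("AGSTAT should be NOT DONE when no Y/N response was collected")
    return issues

# AGTRT holds only the agent name; dose, unit, and form have their own variables.
albuterol = {
    "AGTRT": "ALBUTEROL", "AGPRESP": "Y", "AGOCCUR": "Y",
    "AGDOSE": 2, "AGDOSU": "PUFF", "AGDOSFRM": "AEROSOL",
}
print(check_ag_record(albuterol))  # []
```

A check that AGTRT contains no dose or formulation text (as in "ALBUTEROL 2 PUFF") would need a dictionary of agent names and is not attempted here.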

AG – Examples

Example

This example captures data about the allergen administered to the subject as part of a bronchial allergen challenge (BAC) test.

Prior to the BAC, the subject had a skin-prick allergen test to help identify the allergen to be used for the BAC test. It identified grass as the allergen to be used in the BAC test. Data from the allergen skin test are not shown, but the CRF for the BAC includes collection of the allergen chosen for use in the BAC. A predetermined set of ascending doses of the chosen allergen was used in the screening BAC test. The results of the screening BAC are not shown, but would be represented in the RE domain.

Row 1: The first dose given in the BAC was saline.
Rows 2-4: Three successively higher doses of grass allergen were given.

ag.xpt

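| Row | STUDYID | DOMAIN | USUBJID | AGSEQ | AGTRT | AGPRESP | AGOCCUR | AGDOSE | AGDOSU | AGROUTE | VISIT | AGENDTC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | XYZ | AG | XYZ-001-001 | 1 | SALINE | Y | Y | 0 | SQ-u/mL | RESPIRATORY (INHALATION) | SCREENING | 2010-11-07T10:56:00 |
| 2 | XYZ | AG | XYZ-001-001 | 2 | GRASS | Y | Y | 250 | SQ-u/mL | RESPIRATORY (INHALATION) | SCREENING | 2010-11-07T11:19:00 |
| 3 | XYZ | AG | XYZ-001-001 | 3 | GRASS | Y | Y | 1000 | SQ-u/mL | RESPIRATORY (INHALATION) | SCREENING | 2010-11-07T11:43:00 |
| 4 | XYZ | AG | XYZ-001-001 | 4 | GRASS | Y | Y | 2000 | SQ-u/mL | RESPIRATORY (INHALATION) | SCREENING | 2010-11-07T12:06:00 |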

Example

In this example, first there was a check that the subject had not taken a short-acting bronchodilator in the previous 4 hours (CM domain). Then the procedure agent (AG domain) was given as part of a reversibility assessment. Spirometry measurements (RE domain) were obtained before and after agent administration. An identifier was assigned to the reversibility test and this identifier was used to link data across the multiple SDTM domains in which the data are represented.

The question as to whether a short-acting bronchodilator was administered in the 4 hours prior to the reversibility assessment is represented in the Concomitant Medication (CM) domain, since this prior administration would have been for therapeutic effect, not as part of the procedure. The question asked was about the administration of any short-acting bronchodilator, rather than a specific medication, so both CMTRT and CMCAT are populated with the value "SHORT-ACTING BRONCHODILATOR", which describes a group of medications. The CMSPID value RV1 was used to indicate that this question was associated with the reversibility test.

cm.xpt

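| Row | STUDYID | DOMAIN | USUBJID | CMSEQ | CMSPID | CMTRT | CMCAT | CMPRESP | CMOCCUR | CMEVLINT |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | XYZ | CM | XYZ-001-001 | 1 | RV1 | SHORT-ACTING BRONCHODILATOR | SHORT-ACTING BRONCHODILATOR | Y | N | -PT4H |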

The administration of albuterol as part of the reversibility procedure is represented in the Procedure Agents (AG) domain. The AGSPID value RV1 was used to indicate that this administration was associated with the reversibility test.

ag.xpt

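| Row | STUDYID | DOMAIN | USUBJID | AGSEQ | AGSPID | AGTRT | AGPRESP | AGOCCUR | AGDOSE | AGDOSU | AGDOSFRM | AGDOSFRQ | AGROUTE | VISIT | AGSTDTC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | XYZ | AG | XYZ-001-001 | 1 | RV1 | ALBUTEROL | Y | Y | 2 | PUFF | AEROSOL | ONCE | RESPIRATORY (INHALATION) | VISIT 2 | 2013-06-18T10:05 |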

The sponsor populated REGRPID with RV1 to indicate that these pulmonary function tests were associated with the reversibility test. The spirometer used in the testing is identified in SPDEVID. See the SDTM Implementation Guide for Medical Devices (SDTMIG-MD) for information about representing device-related information.

Row 1: Shows the results for the pre-bronchodilator FEV1 test performed as part of a reversibility assessment. The timing reference variables RETPT, RETPTNUM, REELTM, RETPTREF, and RERFTDTC show that this test was performed 5 minutes before the bronchodilator challenge.
Row 2: Shows the results for the FEV1 test performed 20 minutes after the bronchodilator challenge.
Row 3: Since the percentage reversibility was collected on the CRF, it is included in the SDTM dataset.

re.xpt

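| Row | STUDYID | DOMAIN | USUBJID | SPDEVID | RESEQ | REGRPID | RETESTCD | RETEST | REORRES | REORRESU | RESTRESC | RESTRESN | RESTRESU | VISIT | REDTC | RETPT | RETPTNUM | REELTM | RETPTREF | RERFTDTC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | XYZ | RE | XYZ-001-001 | ABC001 | 1 | RV1 | FEV1 | Forced Expiratory Volume in 1 Second | 2.43 | L | 2.43 | 2.43 | L | VISIT 2 | 2013-06-18T10:00 | PRE-BRONCHODILATOR ADMINISTRATION | 1 | -PT5M | BRONCHODILATOR ADMINISTRATION | 2013-06-18T10:05 |
| 2 | XYZ | RE | XYZ-001-001 | ABC001 | 2 | RV1 | FEV1 | Forced Expiratory Volume in 1 Second | 2.77 | L | 2.77 | 2.77 | L | VISIT 2 | 2013-06-18T10:00 | POST-BRONCHODILATOR ADMINISTRATION | 2 | PT20M | BRONCHODILATOR ADMINISTRATION | 2013-06-18T10:05 |
| 3 | XYZ | RE | XYZ-001-001 | ABC001 | 3 | RV1 | PTCREV | Percentage Reversibility | 13.99 | % | 13.99 | 13.99 | % | VISIT 2 | 2013-06-18T10:00 |  |  |  | BRONCHODILATOR ADMINISTRATION | 2013-06-18T10:05 |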
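
In this example the percentage reversibility was collected on the CRF rather than derived. As a consistency check only, the sketch below recomputes it from the two FEV1 results. The formula (change from the pre-bronchodilator value, expressed as a percentage of that value) and the function name are assumptions made for illustration; this guide does not define the calculation.

```python
def percent_reversibility(pre_fev1: float, post_fev1: float) -> float:
    """Percentage change in FEV1 relative to the pre-bronchodilator value.

    Assumed formula (not stated in this guide): (post - pre) / pre * 100.
    """
    if pre_fev1 <= 0:
        raise ValueError("pre-bronchodilator FEV1 must be positive")
    return (post_fev1 - pre_fev1) / pre_fev1 * 100.0


# RESTRESN values taken from rows 1 and 2 of the re.xpt example
pre, post = 2.43, 2.77
computed = round(percent_reversibility(pre, post), 2)
print(computed)  # agrees with the collected PTCREV value in row 3 (13.99)
```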

The identifier for the device used in the test was established in the Device Identifier (DI) domain.

di.xpt

| Row | STUDYID | DOMAIN | SPDEVID | DISEQ | DIPARMCD | DIPARM | DIVAL |
|---|---|---|---|---|---|---|---|
| 1 | XYZ | DI | ABC001 | 1 | TYPE | Device Type | SPIROMETER |

The relationship of the test agent to the spirometry measurements obtained before and after its administration, and to the prior occurrence of short-acting bronchodilator administration, is recorded by means of a relationship in RELREC.

relrec.xpt

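| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | RELTYPE | RELID |
|---|---|---|---|---|---|---|---|
| 1 | XYZ | AG | XYZ-001-001 | AGSPID | 1 |  | 1 |
| 2 | XYZ | RE | XYZ-001-001 | REGRPID | 1 |  | 1 |
| 3 | XYZ | CM | XYZ-001-001 | CMSPID | 1 |  | 1 |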

6.1.2 Concomitant and Prior Medications

CM – Description/Overview

An interventions domain that contains concomitant and prior medications used by the subject, such as those given on an as needed basis or condition-appropriate medications.

CM – Specification

cm.xpt, Concomitant/Prior Medications — Interventions, Version 3.3. One record per recorded intervention occurrence or constant-dosing interval per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

CM – Assumptions

  1. CM Structure
    1. The structure of the CM domain is one record per medication intervention episode, constant-dosing interval, or pre-specified medication assessment per subject. It is the sponsor's responsibility to define an intervention episode. This definition may vary based on the sponsor's requirements for review and analysis. The submission dataset structure may differ from the structure used for collection. One common approach is to submit a new record when there is a change in the dosing regimen. Another approach is to collapse all records for a medication to a summary level with either a dose range or the highest dose level. Other approaches may also be reasonable as long as they meet the sponsor's evaluation requirements.
  2. Concomitant Medications Description and Coding
    1. CMTRT captures the name of the Concomitant Medications/Therapy and it is the topic variable. It is a required variable and must have a value. CMTRT should only include the medication/therapy name and should not include dosage, formulation, or other qualifying information. For example, "ASPIRIN 100MG TABLET" is not a valid value for CMTRT. This example should be expressed as CMTRT= "ASPIRIN", CMDOSE= "100", CMDOSU= "MG", and CMDOSFRM= "TABLET".
    2. CMMODIFY should be included if the sponsor's procedure permits modification of a verbatim term for coding.
    3. CMDECOD is the standardized medication/therapy term derived by the sponsor from the coding dictionary. It is expected that the reported term (CMTRT) or the modified term (CMMODIFY) will be coded using a standard dictionary. The sponsor is expected to provide the dictionary name and version used to map the terms utilizing the external codelist element in the Define-XML document.
  3. Pre-specified Terms; Presence or Absence of Concomitant Medications
    1. Information on concomitant medications is generally collected in two different ways, either by recording free text or using a pre-specified list of terms. Since the solicitation of information on specific concomitant medications may affect the frequency at which they are reported, the fact that a specific medication was solicited may be of interest to reviewers. CMPRESP and CMOCCUR are used together to indicate whether the intervention in CMTRT was pre-specified and whether it occurred, respectively.
    2. CMOCCUR is used to indicate whether a pre-specified medication was used. A value of "Y" indicates that the medication was used and "N" indicates that it was not.
    3. If a medication was not pre-specified, the value of CMOCCUR should be null. CMPRESP and CMOCCUR are permissible fields and may be omitted from the dataset if all medications were collected as free text. Values of CMOCCUR may also be null for pre-specified medications if no Y/N response was collected; in this case, CMSTAT = "NOT DONE", and CMREASND could be used to describe the reason the answer was missing.
  4. Variables for Timing Relative to a Time Point
    1. CMSTRTPT, CMSTTPT, CMENRTPT, and CMENTPT may be populated as necessary to indicate when a medication was used relative to specified time points. For example, assume a subject uses birth control medication. The subject has used the same medication for many years and continues to do so. The date the subject began using the medication (or at least a partial date) would be stored in CMSTDTC. CMENDTC is null since the end date is unknown (it hasn't happened yet). This fact can be recorded by setting CMENTPT = "2007-04-30" (the date the assessment was made) and CMENRTPT = "ONGOING".
  5. Additional Permissible Variables
    1. Any Identifier variables, Timing variables, or Interventions general-observation-class qualifiers may be added to the CM domain, but the following qualifiers would generally not be used in CM: --MOOD, --LOT.
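
The relationship between CMPRESP, CMOCCUR, and CMSTAT described in assumption 3 can be expressed as a simple record-level check. The following Python sketch is illustrative only: the function name, the dictionary representation of a record, and the wording of the findings are assumptions, and it covers only the two conditions stated in assumption 3.3.

```python
from typing import Dict, List


def check_cm_prespecified(record: Dict[str, str]) -> List[str]:
    """Return findings for one CM record; an empty list means no findings."""
    findings = []
    presp = record.get("CMPRESP") or ""
    occur = record.get("CMOCCUR") or ""
    stat = record.get("CMSTAT") or ""

    # CMOCCUR should be null when the medication was not pre-specified
    if presp != "Y" and occur:
        findings.append("CMOCCUR populated but CMPRESP is not 'Y'")

    # A pre-specified medication with no response is expected to carry
    # CMSTAT = "NOT DONE"
    if presp == "Y" and not occur and stat != "NOT DONE":
        findings.append("CMOCCUR is null but CMSTAT is not 'NOT DONE'")

    return findings


# Free-text record: neither CMPRESP nor CMOCCUR populated
print(check_cm_prespecified({"CMTRT": "ASPIRIN"}))
# Pre-specified record with no response collected
print(check_cm_prespecified({"CMTRT": "PAXIL", "CMPRESP": "Y", "CMSTAT": "NOT DONE"}))
```

Both calls print an empty list, since each record is consistent with the assumption.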

CM – Examples

Example

Sponsors collect the timing of concomitant medication use with varying specificity, depending on the pattern of use; the type, purpose, and importance of the medication; and the needs of the study. It is often unnecessary to record every unique instance of medication use, since the same information can be conveyed with start and end dates and frequency of use. If appropriate, medications taken as needed (intermittently or sporadically over a time period) may be reported with a start and end date and a frequency of "PRN".

The example below shows three subjects who took the same medication on the same days.

Rows 1-6: For this subject, each instance of aspirin use was recorded separately, and the frequency in each record (CMDOSFRQ) is "ONCE".
Rows 7-9: For a second subject, frequency was once a day ("QD") in their first and third records (where CMSEQ is "1" and "3"), but twice a day ("BID") in their second record (CMSEQ = "2").
Row 10: Records for the third subject are collapsed into a single entry that spans the relevant time period, with a frequency of "PRN". This is shown as an example only, not as a recommendation. This approach assumes that knowing exactly when aspirin was used is not important for evaluating safety and efficacy in this study.

cm.xpt

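| Row | STUDYID | DOMAIN | USUBJID | CMSEQ | CMTRT | CMDOSE | CMDOSU | CMDOSFRQ | CMSTDTC | CMENDTC |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | CM | ABC-0001 | 1 | ASPIRIN | 100 | mg | ONCE | 2004-01-01 | 2004-01-01 |
| 2 | ABC | CM | ABC-0001 | 2 | ASPIRIN | 100 | mg | ONCE | 2004-01-02 | 2004-01-02 |
| 3 | ABC | CM | ABC-0001 | 3 | ASPIRIN | 100 | mg | ONCE | 2004-01-03 | 2004-01-03 |
| 4 | ABC | CM | ABC-0001 | 4 | ASPIRIN | 100 | mg | ONCE | 2004-01-07 | 2004-01-07 |
| 5 | ABC | CM | ABC-0001 | 5 | ASPIRIN | 100 | mg | ONCE | 2004-01-07 | 2004-01-07 |
| 6 | ABC | CM | ABC-0001 | 6 | ASPIRIN | 100 | mg | ONCE | 2004-01-09 | 2004-01-09 |
| 7 | ABC | CM | ABC-0002 | 1 | ASPIRIN | 100 | mg | QD | 2004-01-01 | 2004-01-03 |
| 8 | ABC | CM | ABC-0002 | 2 | ASPIRIN | 100 | mg | BID | 2004-01-07 | 2004-01-07 |
| 9 | ABC | CM | ABC-0002 | 3 | ASPIRIN | 100 | mg | QD | 2004-01-09 | 2004-01-09 |
| 10 | ABC | CM | ABC-0003 | 1 | ASPIRIN | 100 | mg | PRN | 2004-01-01 | 2004-01-09 |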
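
The collapsed representation used for the third subject can be derived mechanically from per-instance records such as those of the first subject. The Python sketch below shows one possible way to do so; the function name and the dictionary layout are assumptions, and, like the row itself, it is an illustration rather than a recommendation. It relies on complete ISO 8601 dates (YYYY-MM-DD), which sort correctly as text.

```python
def collapse_to_prn(records):
    """Collapse one subject's records for one medication into a single PRN record."""
    if not records:
        raise ValueError("no records to collapse")
    first = records[0]
    # Only collapse records that describe the same medication and dose
    for rec in records:
        for var in ("USUBJID", "CMTRT", "CMDOSE", "CMDOSU"):
            if rec[var] != first[var]:
                raise ValueError(f"records differ in {var}")
    return {
        "USUBJID": first["USUBJID"],
        "CMSEQ": 1,
        "CMTRT": first["CMTRT"],
        "CMDOSE": first["CMDOSE"],
        "CMDOSU": first["CMDOSU"],
        "CMDOSFRQ": "PRN",
        # Complete ISO 8601 dates compare correctly as strings
        "CMSTDTC": min(rec["CMSTDTC"] for rec in records),
        "CMENDTC": max(rec["CMENDTC"] for rec in records),
    }


# Dates of use recorded for the first subject (rows 1-6)
subject_1 = [
    {"USUBJID": "ABC-0001", "CMTRT": "ASPIRIN", "CMDOSE": 100, "CMDOSU": "mg",
     "CMSTDTC": day, "CMENDTC": day}
    for day in ("2004-01-01", "2004-01-02", "2004-01-03",
                "2004-01-07", "2004-01-07", "2004-01-09")
]
summary = collapse_to_prn(subject_1)
print(summary["CMSTDTC"], summary["CMENDTC"], summary["CMDOSFRQ"])
```

The resulting record spans 2004-01-01 to 2004-01-09, the same period shown in row 10.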

Example

The example below is for a study that had a particular interest in whether subjects use any anticonvulsant medications. The medication history, dosing, etc., was not of interest; the study only asked for the anticonvulsants to which subjects were exposed.

cm.xpt

| Row | STUDYID | DOMAIN | USUBJID | CMSEQ | CMTRT | CMCAT |
|---|---|---|---|---|---|---|
| 1 | ABC123 | CM | 1 | 1 | LITHIUM | ANTI-CONVULSANT |
| 2 | ABC123 | CM | 2 | 1 | VPA | ANTI-CONVULSANT |

Example

Sponsors often are interested in whether subjects are exposed to specific concomitant medications, and collect this information using a checklist. This example is for a study that had a particular interest in the antidepressant medications that subjects used. For the study's purposes, absence is just as important as presence of a medication. This can be clearly shown using CMOCCUR.

In this example, CMPRESP shows that the subjects were specifically asked if they use any of three antidepressants (Zoloft, Prozac, and Paxil). The value of CMOCCUR indicates the response to the pre-specified medication question. CMSTAT indicates whether the response was missing for a pre-specified medication, and CMREASND shows the reason for missing response. The medication details (e.g., dose, frequency) were not of interest in this study.

Row 1: Medication use was solicited and the medication was taken.
Row 2: Medication use was solicited and the medication was not taken.
Row 3: Medication use was solicited, but data was not collected. The reason for the lack of a response was collected and is represented in CMREASND.

cm.xpt

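| Row | STUDYID | DOMAIN | USUBJID | CMSEQ | CMTRT | CMPRESP | CMOCCUR | CMSTAT | CMREASND |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123 | CM | 1 | 1 | ZOLOFT | Y | Y |  |  |
| 2 | ABC123 | CM | 1 | 2 | PROZAC | Y | N |  |  |
| 3 | ABC123 | CM | 1 | 3 | PAXIL | Y |  | NOT DONE | Didn't ask due to interruption |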

Example

In this hepatitis C study, collection of data on prior treatments included reason for discontinuation. Since hepatitis C is usually treated with a combination of medications, CMGRPID was used to group records into regimens.

Rows 1-3: This subject's treatment consisted of the three medications grouped by means of CMGRPID = "1". The subject completed the scheduled treatment.
Rows 4-6: Another subject received the same set of three medications. The medications for this subject are also grouped using CMGRPID = "1". Note, however, that the fact that the same CMGRPID value has been used for the same set of medications for subjects "ABC123-765" and "ABC123-899" is coincidence; CMGRPID groups records only within a subject. This subject stopped the regimen due to side effects.

cm.xpt

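| Row | STUDYID | DOMAIN | USUBJID | CMSEQ | CMGRPID | CMTRT | CMCAT | CMDOSFRM | CMROUTE | CMRSDISC |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123 | CM | ABC123-765 | 1 | 1 | PEGINTRON | HCV TREATMENT | INJECTION | SUBCUTANEOUS | COMPLETED SCHEDULED TREATMENT |
| 2 | ABC123 | CM | ABC123-765 | 2 | 1 | RIBAVIRIN | HCV TREATMENT | TABLET | ORAL | COMPLETED SCHEDULED TREATMENT |
| 3 | ABC123 | CM | ABC123-765 | 3 | 1 | BOCEPREVIR | HCV TREATMENT | TABLET | ORAL | COMPLETED SCHEDULED TREATMENT |
| 4 | ABC123 | CM | ABC123-899 | 1 | 1 | PEGINTRON | HCV TREATMENT | INJECTION | SUBCUTANEOUS | TOXICITY/INTOLERANCE |
| 5 | ABC123 | CM | ABC123-899 | 2 | 1 | RIBAVIRIN | HCV TREATMENT | TABLET | ORAL | TOXICITY/INTOLERANCE |
| 6 | ABC123 | CM | ABC123-899 | 3 | 1 | BOCEPREVIR | HCV TREATMENT | TABLET | ORAL | TOXICITY/INTOLERANCE |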
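
Because CMGRPID groups records only within a subject, any program that reassembles regimens needs to key on the subject as well as the group identifier. The Python sketch below illustrates this with the records from the example; the function name and the reduced set of variables are assumptions made for brevity.

```python
from collections import defaultdict


def regimens(cm_records):
    """Group CMTRT values into regimens keyed by (USUBJID, CMGRPID)."""
    groups = defaultdict(list)
    for rec in cm_records:
        # USUBJID is part of the key: the same CMGRPID value in two
        # subjects does not imply any relationship between their records
        groups[(rec["USUBJID"], rec["CMGRPID"])].append(rec["CMTRT"])
    return dict(groups)


cm = [
    {"USUBJID": subject, "CMGRPID": "1", "CMTRT": trt}
    for subject in ("ABC123-765", "ABC123-899")
    for trt in ("PEGINTRON", "RIBAVIRIN", "BOCEPREVIR")
]
by_regimen = regimens(cm)
for key, treatments in by_regimen.items():
    print(key, treatments)
```

Grouping on CMGRPID alone would merge the six records into a single regimen, which is not what the data represent.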

6.1.3 Exposure Domains

Clinical trial study designs can range from open label (where subjects and investigators know which product each subject is receiving) to blinded (where the subject, investigator, or anyone assessing the outcome is unaware of the treatment assignment(s) to reduce potential for bias). To support standardization of various collection methods and details, as well as process differences between open-label and blinded studies, two SDTM domains based on the Interventions General Observation Class are available to represent details of subject exposure to protocol-specified study treatment(s).

The two domains are introduced below.

6.1.3.1 Exposure

EX – Description/Overview

An interventions domain that contains the details of a subject's exposure to protocol-specified study treatment. Study treatment may be any intervention that is prospectively defined as a test material within a study, and is typically but not always supplied to the subject.

EX – Specification

ex.xpt, Exposure — Interventions, Version 3.3. One record per protocol-specified study treatment, constant-dosing interval, per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

EX – Assumptions
  1. EX Structure and Usage
    1. Examples of treatments represented in the EX domain include but are not limited to placebo, active comparators, and investigational products. Treatments that are not protocol-specified should be represented in the Concomitant Medication (CM) or another Interventions domain as appropriate.
    2. The EX domain is recognized in most cases as a derived dataset where EXDOSU reflects the protocol-specified unit per study treatment. Collected data points (e.g., number of tablets, total volume infused) along with additional inputs (e.g., randomization file, concentration, dosage strength, drug accountability) are used to derive records in the EX domain.
    3. The EX domain is required for all studies that include protocol-specified study treatment. Exposure records may be directly or indirectly determined; metadata should describe how the records were derived. Common methods for determining exposure (from most direct to least direct) include the following:
      1. Derived from actual observation of the administration of drug by the investigator
      2. Derived from automated dispensing device that records administrations
      3. Derived from subject recall
      4. Derived from drug accountability data
      5. Derived from the protocol
      When a study is still masked and protocol-specified study treatment doses cannot yet be reflected in the protocol-specified unit due to blinding requirements, then the EX domain is not expected to be populated.
    4. The EX domain should contain one record per constant-dosing interval per subject. "Constant-dosing interval" is sponsor defined, and may include any period of time that can be described in terms of a known treatment given at a consistent dose, frequency, infusion rate, etc. For example, for a study with once-a-week administration of a standard dose for 6 weeks, exposure may be represented as one of the following:
      1. If information about each dose is not collected, there would be a single record per subject, spanning the entire 6-week treatment phase.
      2. If the sponsor monitors each treatment administration, there could be up to six records (one for each weekly administration).
  2. Exposure Treatment Description
    1. EXTRT captures the name of the protocol-specified study treatment and is the topic variable. It is a Required variable and must have a value. EXTRT must include only the treatment name and must not include dosage, formulation, or other qualifying information. For example, "ASPIRIN 100MG TABLET" is not a valid value for EXTRT. This example should be expressed as EXTRT = "ASPIRIN", EXDOSE = "100", EXDOSU = "mg", and EXDOSFRM = "TABLET".
    2. Doses of placebo should be represented by EXTRT = "PLACEBO" and EXDOSE = "0" (indicating 0 mg of active ingredient was taken or administered).
  3. Categorization and Grouping
    1. EXCAT and EXSCAT may be used when appropriate to categorize treatments into categories and subcategories. For example, if a study contains several active comparator medications, EXCAT may be set to "ACTIVE COMPARATOR". Such categorization may not be useful in all studies, so these variables are permissible.
  4. Timing Variables
    1. The timing of exposure to study treatment is captured by the start/end date and start/end time of each constant-dosing interval. If the subject is only exposed to study medication within a clinical encounter (e.g., if an injection is administered at the clinic), VISITNUM may be added to the domain as an additional Timing variable. VISITDY and VISIT would then also be permissible Qualifiers. However, if the beginning and end of a constant-dosing interval are not confined within the time limits of a clinical encounter (e.g., if a subject takes pills at home), then it is not appropriate to include VISITNUM in the EX domain. This is because EX is designed to capture the timing of exposure to treatment, not the timing of dispensing treatment. Furthermore, VISITNUM should not be used to indicate that treatment began at a particular visit and continued for a period of time. The SDTM does not have any provision for recording "start visit" and "end visit" of exposure.
    2. For administrations considered given at a point in time (e.g., oral tablet, pre-filled syringe injection), where only an administration date/time is collected, EXSTDTC should be copied to EXENDTC as the standard representation.
  5. Collected exposure data points are to be represented in the EC domain. When the relationship between EC and EX records can be described in RELREC, then it should be defined. EX derivations must be described in the Define-XML document.
  6. Additional Interventions Qualifiers
    1. EX contains medications received; the inclusion of administrations not taken, not given or missed is under evaluation.
    2. --DOSTOT is under evaluation for potential deprecation and replacement with a mechanism to describe total dose over any interval of time (e.g., day, week, month). Sponsors considering use of EXDOSTOT may want to consider using other dose amount variables (EXDOSE or EXDOSTXT) in combination with frequency (EXDOSFRQ) and timing variables to represent the data.
    3. When the EC domain is implemented in conjunction with the EX domain, EXVAMT and EXVAMTU should not be used in EX; collected values instead would be represented in ECDOSE and ECDOSU.
    4. Any Identifier variables, Timing variables, or Interventions general-observation-class qualifiers may be added to the EX domain, but the following qualifiers would generally not be used in EX: --PRESP, --OCCUR, --STAT, and --REASND.
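
Two of the conventions above lend themselves to a short illustration: representing placebo with EXDOSE = 0 (assumption 2.2) and copying EXSTDTC to EXENDTC for administrations considered given at a point in time (assumption 4.2). The Python sketch below is a minimal, assumed implementation; the function name, the `point_in_time` flag, and the dictionary layout are hypothetical and are not part of the standard.

```python
def apply_ex_conventions(record, point_in_time):
    """Return a copy of an EX record with two EX conventions applied."""
    out = dict(record)

    # Placebo doses are represented as 0 of the active ingredient
    if out.get("EXTRT") == "PLACEBO":
        out["EXDOSE"] = 0

    # Point-in-time administrations with only a start date/time collected:
    # the end date/time is a copy of the start date/time
    if point_in_time and out.get("EXSTDTC") and not out.get("EXENDTC"):
        out["EXENDTC"] = out["EXSTDTC"]

    return out


placebo = apply_ex_conventions(
    {"EXTRT": "PLACEBO", "EXDOSU": "mg", "EXSTDTC": "2011-01-14T08:00"},
    point_in_time=True,
)
print(placebo["EXDOSE"], placebo["EXENDTC"])
```

Whether an administration is treated as a point in time is a sponsor decision; the flag is passed in rather than inferred from the record.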

6.1.3.2 Exposure as Collected

EC – Description/Overview

An interventions domain that contains information about protocol-specified study treatment administrations, as collected.

EC – Specification

ec.xpt, Exposure as Collected — Interventions, Version 3.3. One record per protocol-specified study treatment, constant-dosing interval, per subject, per mood, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

EC – Assumptions
  1. EC Definition
    1. The Exposure as Collected domain model reflects protocol-specified study treatment administrations, as collected.
      1. EC should be used in all cases where collected exposure information cannot or should not be directly represented in EX. Examples include administrations collected in tablets when the protocol-specified unit is mg, or administrations collected in mL when the protocol-specified unit is mg/kg. Drug accountability details (e.g., amount dispensed, amount returned) are represented in DA and not in EC.
      2. Collected exposure data are in most cases represented in a combination of one or more of EC, DA, or FA domains. If the entire EC dataset is an exact duplicate of the entire EX dataset, then EC is optional and at the sponsor's discretion.
    2. Collected exposure log data points descriptive of administrations typically reflect amounts at the product-level (e.g., number of tablets, number of mL).
  2. Treatment Description
    1. ECTRT is sponsor defined and should reflect how the protocol-specified study treatment is known or referred to in data collection. In an open-label study, ECTRT should store the treatment name. In a masked study, if treatment is collected and known as Tablet A to the subject or administrator, then ECTRT = "TABLET A". If in a masked study the treatment is not known by a synonym and the data are to be exchanged between sponsors, partners and/or regulatory agency(s), then assign ECTRT the value of "MASKED".
  3. ECMOOD is permissible; when implemented, it must be populated for all records.
    1. Values of ECMOOD to date include:
      1. "SCHEDULED" (for collected subject-level intended dose records)
      2. "PERFORMED" (for collected subject-level actual dose records)
    2. Qualifier variables should be populated with equal granularity across Scheduled and Performed records when known. For example, if ECDOSU and ECDOSFRQ are known at scheduling and administration, then the variables would be populated on both records. If ECLOC is determined at the time of administration, then it would be populated on the performed record only.
    3. Appropriate Timing variable(s) should be populated. Note: Details on Scheduled records may describe timing at a higher level than Performed records.
    4. ECOCCUR is generally not applicable for Scheduled records.
    5. An activity may be rescheduled or modified multiple times before being performed. Representation of Scheduled records is dependent on the collected, available data. If each rescheduled or modified activity is collected, then multiple Scheduled records may be represented. If only the final Scheduled activity is collected, then it would be the only scheduled record represented.
  4. Doses Not Taken, Not Given, or Missed
    1. The record qualifier --OCCUR, with value of "N", is available in domains based on the Interventions and Events General Observation Classes as the standard way to represent whether an intervention or event did not happen. In the EC domain, ECOCCUR value of "N" indicates a dose was not taken, not given, or missed. For example, if 0 tablets are taken within a timeframe or 0 mL is infused at a visit, then ECOCCUR = "N" is the standard representation of the collected doses not taken, not given, or missed. Dose amount variables (e.g., ECDOSE, ECDOSTXT) must not be set to zero (0) as an alternative method for indicating doses not taken, not given, or missed.
    2. Qualifier variables (e.g., Grouping, Record, Variable) and additional Timing variables (e.g., date of collection, visit, time point) for records representing information collected about doses not taken, not given, or missed should be populated with the same granularity as administered records, when known and/or applicable. Qualifiers that indicate dose amount (e.g., ECDOSE, ECDOSTXT) may be populated with positive (non-zero) values in cases where the sponsor feels it is necessary and/or appropriate to represent specific dose amounts not taken, not given, or missed.
  5. Timing Variables
    1. Timing variables in the EC domain should reflect administrations by the intervals in which they were collected (e.g., constant-dosing intervals, visits, targeted dates like first dose, last dose).
    2. For administrations considered given at a point in time (e.g., oral tablet, pre-filled syringe injection), where only an administration date/time is collected, ECSTDTC should be copied to ECENDTC.
  6. The degree of summarization of records from EC to EX is sponsor defined to support study purpose and analysis. When the relationship between EC and EX records can be described in RELREC, then it should be defined. EX derivations must be described in the Define-XML document.
  7. Additional Interventions Qualifiers
    1. --DOSTOT is under evaluation for potential deprecation and replacement with a mechanism to describe total dose over any interval of time (e.g., day, week, month). Sponsors considering ECDOSTOT may want to consider using other dose amount variables (ECDOSE or ECDOSTXT) in combination with frequency (ECDOSFRQ) and timing variables to represent the data.
    2. Any Identifier variables, Timing variables, or Interventions general-observation-class qualifiers may be added to the EC domain, but the following qualifiers would generally not be used in EC: --STAT, --REASND, --VAMT, and --VAMTU.
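
Assumption 4.1 can be illustrated with a small mapping from a collected amount to the EC occurrence and dose variables. The Python sketch below is an assumed illustration: the function name is hypothetical, and ECPRESP = "Y" is set on the assumption that the administration was solicited on the CRF.

```python
def collected_amount_to_ec(amount, unit):
    """Map a collected amount (e.g., tablets taken) to EC occurrence and dose values."""
    if amount < 0:
        raise ValueError("collected amount cannot be negative")
    if amount == 0:
        # A dose not taken is represented by ECOCCUR = "N";
        # ECDOSE is left null rather than set to zero
        return {"ECPRESP": "Y", "ECOCCUR": "N", "ECDOSE": None, "ECDOSU": unit}
    return {"ECPRESP": "Y", "ECOCCUR": "Y", "ECDOSE": amount, "ECDOSU": unit}


missed = collected_amount_to_ec(0, "TABLET")
taken = collected_amount_to_ec(2, "TABLET")
print(missed["ECOCCUR"], missed["ECDOSE"])
print(taken["ECOCCUR"], taken["ECDOSE"])
```

Where the sponsor wishes to record the specific amount that was missed (assumption 4.2), ECDOSE would instead carry a positive value on the ECOCCUR = "N" record; that case is not covered by this sketch.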

6.1.3.3 Exposure/Exposure as Collected Examples

Example

This is an example of a double-blind study comparing Drug X extended release (ER) (two 500-mg tablets once daily) vs. Drug Z (two 250-mg tablets once daily). Per example CRFs, Subject ABC1001 took 2 tablets from 2011-01-14 to 2011-01-28 and Subject ABC2001 took 2 tablets within the same timeframe but missed dosing on 2011-01-24.

Exposure CRF:

Subject: ABC1001

| Bottle | Number of Tablets Taken Daily | Reason for Variation | Start Date | End Date |
|---|---|---|---|---|
| A | 2 |  | 2011-01-14 | 2011-01-28 |

Subject: ABC2001

| Bottle | Number of Tablets Taken Daily | Reason for Variation | Start Date | End Date |
|---|---|---|---|---|
| A | 2 |  | 2011-01-14 | 2011-01-23 |
| A | 0 | Patient mistake | 2011-01-24 | 2011-01-24 |
| A | 2 |  | 2011-01-25 | 2011-01-28 |

Upon unmasking, it became known that Subject ABC1001 received Drug X and Subject ABC2001 received Drug Z. The EC dataset shows the administrations of study treatment as collected.

Rows 1-2, 4: Show treatments administered.
Row 3: Shows that the zero for Number of Tablets Taken Daily on the CRF was represented as ECOCCUR = "N".

ec.xpt

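| Row | STUDYID | DOMAIN | USUBJID | ECSEQ | ECLNKID | ECTRT | ECPRESP | ECOCCUR | ECDOSE | ECDOSU | ECDOSFRQ | EPOCH | ECSTDTC | ECENDTC | ECSTDY | ECENDY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | EC | ABC1001 | 1 | A2-20110114 | BOTTLE A | Y | Y | 2 | TABLET | QD | TREATMENT | 2011-01-14 | 2011-01-28 | 1 | 15 |
| 2 | ABC | EC | ABC2001 | 1 | A2-20110114 | BOTTLE A | Y | Y | 2 | TABLET | QD | TREATMENT | 2011-01-14 | 2011-01-23 | 1 | 10 |
| 3 | ABC | EC | ABC2001 | 2 | A0-20110124 | BOTTLE A | Y | N |  | TABLET | QD | TREATMENT | 2011-01-24 | 2011-01-24 | 11 | 11 |
| 4 | ABC | EC | ABC2001 | 3 | A2-20110125 | BOTTLE A | Y | Y | 2 | TABLET | QD | TREATMENT | 2011-01-25 | 2011-01-28 | 12 | 15 |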
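
The study day values (ECSTDY, ECENDY) in this dataset follow the SDTM study day convention, in which the reference start date is day 1 and there is no day 0. The Python sketch below reproduces them under the assumption that the reference start date (RFSTDTC) for both subjects is 2011-01-14; that date is inferred from the example and is not stated in it. The function name is hypothetical.

```python
from datetime import date


def study_day(dtc: str, rfstdtc: str) -> int:
    """Study day of a complete ISO 8601 date relative to the reference start date."""
    delta = (date.fromisoformat(dtc) - date.fromisoformat(rfstdtc)).days
    # On or after the reference date: add 1 so the reference date is day 1.
    # Before the reference date: no adjustment, so the day before is -1.
    return delta + 1 if delta >= 0 else delta


RFSTDTC = "2011-01-14"  # assumed reference start date
for dtc in ("2011-01-14", "2011-01-23", "2011-01-24", "2011-01-28"):
    print(dtc, study_day(dtc, RFSTDTC))
```

The printed values (1, 10, 11, and 15) match the ECSTDY and ECENDY values in the table.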

The reason for the ECOCCUR value of "N" was represented using a supplemental qualifier.

suppec.xpt

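| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | EC | ABC2001 | ECSEQ | 2 | ECREASOC | Reason for Occur Value | PATIENT MISTAKE | CRF |  |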

The EX dataset shows the unmasked administrations. Two tablets from Bottle A became 1000 mg of Drug X extended release for Subject ABC1001, but 500 mg of Drug Z for Subject ABC2001. Note that there is no record in the EX dataset for non-occurrence of study treatment. The non-occurrence of study drug for subject ABC2001 is reflected in the gap in time between the two EX records.

ex.xpt

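| Row | STUDYID | DOMAIN | USUBJID | EXSEQ | EXLNKID | EXTRT | EXDOSE | EXDOSU | EXDOSFRM | EXDOSFRQ | EXROUTE | EPOCH | EXSTDTC | EXENDTC | EXSTDY | EXENDY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | EX | ABC1001 | 1 | A2-20110114 | DRUG X | 1000 | mg | TABLET, EXTENDED RELEASE | QD | ORAL | TREATMENT | 2011-01-14 | 2011-01-28 | 1 | 15 |
| 2 | ABC | EX | ABC2001 | 1 | A2-20110114 | DRUG Z | 500 | mg | TABLET | QD | ORAL | TREATMENT | 2011-01-14 | 2011-01-23 | 1 | 10 |
| 3 | ABC | EX | ABC2001 | 2 | A2-20110125 | DRUG Z | 500 | mg | TABLET | QD | ORAL | TREATMENT | 2011-01-25 | 2011-01-28 | 12 | 15 |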
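
The derivation of these EX records from the EC records can be sketched in Python as follows. The tablet strengths come from the study description (500-mg tablets of Drug X, 250-mg tablets of Drug Z); the `UNMASKED` lookup, the function name, and the reduced set of variables are assumptions made for illustration, not a prescribed method.

```python
# Hypothetical lookup built after unmasking; strengths are from the study description
UNMASKED = {
    "ABC1001": {"EXTRT": "DRUG X", "EXDOSFRM": "TABLET, EXTENDED RELEASE", "mg_per_tablet": 500},
    "ABC2001": {"EXTRT": "DRUG Z", "EXDOSFRM": "TABLET", "mg_per_tablet": 250},
}

# Subset of variables from the ec.xpt example
ec = [
    {"USUBJID": "ABC1001", "ECLNKID": "A2-20110114", "ECOCCUR": "Y", "ECDOSE": 2,
     "ECSTDTC": "2011-01-14", "ECENDTC": "2011-01-28"},
    {"USUBJID": "ABC2001", "ECLNKID": "A2-20110114", "ECOCCUR": "Y", "ECDOSE": 2,
     "ECSTDTC": "2011-01-14", "ECENDTC": "2011-01-23"},
    {"USUBJID": "ABC2001", "ECLNKID": "A0-20110124", "ECOCCUR": "N", "ECDOSE": None,
     "ECSTDTC": "2011-01-24", "ECENDTC": "2011-01-24"},
    {"USUBJID": "ABC2001", "ECLNKID": "A2-20110125", "ECOCCUR": "Y", "ECDOSE": 2,
     "ECSTDTC": "2011-01-25", "ECENDTC": "2011-01-28"},
]


def derive_ex(ec_records, unmasked):
    """Build EX records from occurring EC records; non-occurrences are dropped."""
    ex_records = []
    next_seq = {}
    for rec in ec_records:
        if rec["ECOCCUR"] != "Y":
            continue  # a missed dose appears only as a gap between EX records
        info = unmasked[rec["USUBJID"]]
        seq = next_seq.get(rec["USUBJID"], 1)
        next_seq[rec["USUBJID"]] = seq + 1
        ex_records.append({
            "USUBJID": rec["USUBJID"],
            "EXSEQ": seq,
            "EXLNKID": rec["ECLNKID"],
            "EXTRT": info["EXTRT"],
            # tablets taken x strength per tablet, in the protocol-specified unit
            "EXDOSE": rec["ECDOSE"] * info["mg_per_tablet"],
            "EXDOSU": "mg",
            "EXDOSFRM": info["EXDOSFRM"],
            "EXSTDTC": rec["ECSTDTC"],
            "EXENDTC": rec["ECENDTC"],
        })
    return ex_records


ex = derive_ex(ec, UNMASKED)
for rec in ex:
    print(rec["USUBJID"], rec["EXSEQ"], rec["EXTRT"], rec["EXDOSE"], rec["EXDOSU"])
```

Carrying ECLNKID into EXLNKID keeps the one-to-one link between each EX record and the EC record from which it was derived.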

The relrec.xpt example reflects a one-to-one dataset-level relationship between EC and EX using --LNKID.

relrec.xpt

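| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | RELTYPE | RELID |
|---|---|---|---|---|---|---|---|
| 1 | ABC | EC |  | ECLNKID |  | ONE | 1 |
| 2 | ABC | EX |  | EXLNKID |  | ONE | 1 |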

Example

This example shows data from an open-label study. A subject received Drug X as a 20 mg/mL solution administered across 3 injection sites to deliver a total dose of 3 mg/kg. The subject's weight was 100 kg.

Exposure CRF

Visit: 3
Date: 2009-05-10
Injection 1
  Volume Given (mL): 5
  Location: ABDOMEN
  Side: LEFT
Injection 2
  Volume Given (mL): 5
  Location: ABDOMEN
  Side: CENTER
Injection 3
  Volume Given (mL): 5
  Location: ABDOMEN
  Side: RIGHT

The collected administration amounts, in mL, and their locations are represented in the EC dataset.

ec.xpt

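| Row | STUDYID | DOMAIN | USUBJID | ECSEQ | ECSPID | ECLNKID | ECTRT | ECPRESP | ECOCCUR | ECDOSE | ECDOSU | ECDOSFRM | ECDOSFRQ | ECROUTE | ECLOC | ECLAT | VISITNUM | VISIT | EPOCH | ECSTDTC | ECENDTC | ECSTDY | ECENDY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | EC | ABC3001 | 1 | INJ1 | V3 | DRUG X | Y | Y | 5 | mL | INJECTION | ONCE | SUBCUTANEOUS | ABDOMEN | LEFT | 3 | VISIT 3 | TREATMENT | 2009-05-10 | 2009-05-10 | 21 | 21 |
| 2 | ABC | EC | ABC3001 | 2 | INJ2 | V3 | DRUG X | Y | Y | 5 | mL | INJECTION | ONCE | SUBCUTANEOUS | ABDOMEN | CENTER | 3 | VISIT 3 | TREATMENT | 2009-05-10 | 2009-05-10 | 21 | 21 |
| 3 | ABC | EC | ABC3001 | 3 | INJ3 | V3 | DRUG X | Y | Y | 5 | mL | INJECTION | ONCE | SUBCUTANEOUS | ABDOMEN | RIGHT | 3 | VISIT 3 | TREATMENT | 2009-05-10 | 2009-05-10 | 21 | 21 |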

The sponsor considered the 3 injections to constitute a single administration, so the EX dataset shows the total dose given in the protocol-specified unit, mg/kg. EXLOC = "ABDOMEN" is included since this location was common to all injections, but EXLAT was not included. If the sponsor had chosen to represent laterality in the EX record, this would have been handled as described in Section 4.2.8.3, Multiple Values for a Non-Result Qualifier Variable.

ex.xpt

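| Row | STUDYID | DOMAIN | USUBJID | EXSEQ | EXSPID | EXLNKID | EXTRT | EXDOSE | EXDOSU | EXDOSFRM | EXDOSFRQ | EXROUTE | EXLOC | VISITNUM | VISIT | EPOCH | EXSTDTC | EXENDTC | EXSTDY | EXENDY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | EX | ABC3001 | 1 |  | V3 | DRUG X | 3 | mg/kg | INJECTION | ONCE | SUBCUTANEOUS | ABDOMEN | 3 | VISIT 3 | TREATMENT | 2009-05-10 | 2009-05-10 | 21 | 21 |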
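
The EXDOSE value of 3 mg/kg follows from the collected volumes, the solution concentration, and the subject's weight given in the example description. A short Python sketch of the arithmetic, with a hypothetical function name:

```python
def dose_mg_per_kg(volumes_ml, concentration_mg_per_ml, weight_kg):
    """Total dose in mg/kg from collected injection volumes."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    total_mg = sum(volumes_ml) * concentration_mg_per_ml
    return total_mg / weight_kg


# Three 5-mL injections of a 20 mg/mL solution in a 100-kg subject:
# 15 mL x 20 mg/mL = 300 mg; 300 mg / 100 kg = 3 mg/kg
exdose = dose_mg_per_kg([5, 5, 5], concentration_mg_per_ml=20, weight_kg=100)
print(exdose)
```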

The relrec.xpt example reflects a many-to-one dataset-level relationship between EC and EX using --LNKID.

relrec.xpt

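| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | RELTYPE | RELID |
|---|---|---|---|---|---|---|---|
| 1 | ABC | EC |  | ECLNKID |  | MANY | 1 |
| 2 | ABC | EX |  | EXLNKID |  | ONE | 1 |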

Example

The study in this example was a double-blind study comparing 10, 20, and 30 mg of Drug X once daily vs Placebo. Study treatment was given as one tablet each from Bottles A, B, and C taken together once daily. The subject in this example took:

  • 1 tablet from Bottles A, B and C from 2011-01-14 to 2011-01-20
  • 0 tablets from Bottle B on 2011-01-21, then 2 tablets on 2011-01-22
  • 1 tablet from Bottles A and C on 2011-01-21 and 2011-01-22
  • 1 tablet from Bottles A, B and C from 2011-01-23 to 2011-01-28

The EC dataset shows administrations as collected, in tablets.

ec.xpt

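| Row | STUDYID | DOMAIN | USUBJID | ECSEQ | ECTRT | ECPRESP | ECOCCUR | ECDOSE | ECDOSU | ECDOSFRQ | EPOCH | ECSTDTC | ECENDTC | ECSTDY | ECENDY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | EC | ABC4001 | 1 | BOTTLE A | Y | Y | 1 | TABLET | QD | TREATMENT | 2011-01-14 | 2011-01-28 | 1 | 15 |
| 2 | ABC | EC | ABC4001 | 2 | BOTTLE C | Y | Y | 1 | TABLET | QD | TREATMENT | 2011-01-14 | 2011-01-28 | 1 | 15 |
| 3 | ABC | EC | ABC4001 | 3 | BOTTLE B | Y | Y | 1 | TABLET | QD | TREATMENT | 2011-01-14 | 2011-01-20 | 1 | 7 |
| 4 | ABC | EC | ABC4001 | 4 | BOTTLE B | Y | N |  | TABLET | QD | TREATMENT | 2011-01-21 | 2011-01-21 | 8 | 8 |
| 5 | ABC | EC | ABC4001 | 5 | BOTTLE B | Y | Y | 2 | TABLET | QD | TREATMENT | 2011-01-22 | 2011-01-22 | 9 | 9 |
| 6 | ABC | EC | ABC4001 | 6 | BOTTLE B | Y | Y | 1 | TABLET | QD | TREATMENT | 2011-01-23 | 2011-01-28 | 10 | 15 |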

Upon unmasking, it became known that the subject was randomized to Drug X 20 mg and that:

  • Bottle A contained 10 mg/tablet.
  • Bottle B contained 10 mg/tablet.
  • Bottle C contained Placebo (i.e., 0 mg of active ingredient/tablet).

The EX dataset shows the doses administered in the protocol-specified unit (mg). The sponsor considered an administration to consist of the total amount for Bottles A, B, and C. The derivation of EX records from multiple EC records should be shown in the Define-XML document.

ex.xpt

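| Row | STUDYID | DOMAIN | USUBJID | EXSEQ | EXTRT | EXDOSE | EXDOSU | EXDOSFRM | EXDOSFRQ | EXROUTE | EPOCH | EXSTDTC | EXENDTC | EXSTDY | EXENDY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | EX | ABC4001 | 1 | DRUG X | 20 | mg | TABLET | QD | ORAL | TREATMENT | 2011-01-14 | 2011-01-20 | 1 | 7 |
| 2 | ABC | EX | ABC4001 | 2 | DRUG X | 10 | mg | TABLET | QD | ORAL | TREATMENT | 2011-01-21 | 2011-01-21 | 8 | 8 |
| 3 | ABC | EX | ABC4001 | 3 | DRUG X | 30 | mg | TABLET | QD | ORAL | TREATMENT | 2011-01-22 | 2011-01-22 | 9 | 9 |
| 4 | ABC | EX | ABC4001 | 4 | DRUG X | 20 | mg | TABLET | QD | ORAL | TREATMENT | 2011-01-23 | 2011-01-28 | 10 | 15 |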
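
One way to arrive at these constant-dosing intervals is to total the active ingredient taken on each day across the three bottles, then merge consecutive days that have the same total. The Python sketch below does this for the subject in the example. The tablet strengths are those revealed at unmasking; the function names and the tuple layout are assumptions, and the approach is a sketch of one possible derivation, not the sponsor's documented method.

```python
from datetime import date, timedelta

# Tablet strengths revealed at unmasking (from the example text)
MG_PER_TABLET = {"BOTTLE A": 10, "BOTTLE B": 10, "BOTTLE C": 0}

# (ECTRT, ECOCCUR, ECDOSE, ECSTDTC, ECENDTC) from the ec.xpt example
EC = [
    ("BOTTLE A", "Y", 1, "2011-01-14", "2011-01-28"),
    ("BOTTLE C", "Y", 1, "2011-01-14", "2011-01-28"),
    ("BOTTLE B", "Y", 1, "2011-01-14", "2011-01-20"),
    ("BOTTLE B", "N", None, "2011-01-21", "2011-01-21"),
    ("BOTTLE B", "Y", 2, "2011-01-22", "2011-01-22"),
    ("BOTTLE B", "Y", 1, "2011-01-23", "2011-01-28"),
]


def daily_totals(ec_rows, mg_per_tablet):
    """Total mg of active ingredient per day across all occurring EC records."""
    totals = {}
    for trt, occur, tablets, start, end in ec_rows:
        if occur != "Y":
            continue  # doses not taken contribute nothing
        day = date.fromisoformat(start)
        last = date.fromisoformat(end)
        while day <= last:
            totals[day] = totals.get(day, 0) + tablets * mg_per_tablet[trt]
            day += timedelta(days=1)
    return totals


def constant_dosing_intervals(totals):
    """Merge consecutive days with the same total into (dose, start, end) intervals."""
    intervals = []
    for day in sorted(totals):
        dose = totals[day]
        if intervals and intervals[-1][0] == dose and intervals[-1][2] + timedelta(days=1) == day:
            intervals[-1][2] = day  # extend the current interval by one day
        else:
            intervals.append([dose, day, day])
    return [(dose, start.isoformat(), end.isoformat()) for dose, start, end in intervals]


ex_intervals = constant_dosing_intervals(daily_totals(EC, MG_PER_TABLET))
for interval in ex_intervals:
    print(interval)
```

The four intervals produced (20 mg, 10 mg, 30 mg, and 20 mg) correspond to EXDOSE, EXSTDTC, and EXENDTC in rows 1-4 of the EX dataset.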

Example

The study in this example was an open-label study examining the tolerability of different doses of Drug A. Study drug was taken orally, daily for three months. Dose adjustments were allowed as needed in response to tolerability or efficacy issues.

The EX dataset shows administrations collected in the protocol-specified unit, mg. No EC dataset was needed since the open-label administrations were collected in the protocol-specified unit; EC would be an exact duplicate of the entire EX domain.

ex.xpt

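| Row | STUDYID | DOMAIN | USUBJID | EXSEQ | EXTRT | EXDOSE | EXDOSU | EXDOSFRM | EXDOSFRQ | EXROUTE | EXADJ | EPOCH | EXSTDTC | EXENDTC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 37841 | EX | 37841001 | 1 | DRUG A | 20 | mg | TABLET | QD | ORAL |  | TREATMENT | 2002-07-01 | 2002-10-01 |
| 2 | 37841 | EX | 37841002 | 1 | DRUG A | 20 | mg | TABLET | QD | ORAL |  | TREATMENT | 2002-04-02 | 2002-04-21 |
| 3 | 37841 | EX | 37841002 | 2 | DRUG A | 15 | mg | TABLET | QD | ORAL | Reduced due to toxicity | TREATMENT | 2002-04-22 | 2002-07-01 |
| 4 | 37841 | EX | 37841003 | 1 | DRUG A | 20 | mg | TABLET | QD | ORAL |  | TREATMENT | 2002-05-09 | 2002-06-01 |
| 5 | 37841 | EX | 37841003 | 2 | DRUG A | 25 | mg | TABLET | QD | ORAL | Increased due to suboptimal efficacy | TREATMENT | 2002-06-02 | 2002-07-01 |
| 6 | 37841 | EX | 37841003 | 3 | DRUG A | 30 | mg | TABLET | QD | ORAL | Increased due to suboptimal efficacy | TREATMENT | 2002-07-02 | 2002-08-01 |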

Example

This is an example of a double-blind study design comparing 10 and 20 mg of Drug X vs Placebo taken daily, morning and evening, for a week.

Subject ABC5001

Bottle | Time Point | Number of Tablets Taken | Start Date | End Date
A | AM | 1 | 2012-01-01 | 2012-01-08
B | PM | 1 | 2012-01-01 | 2012-01-08

Subject ABC5002

Bottle | Time Point | Number of Tablets Taken | Start Date | End Date
A | AM | 1 | 2012-02-01 | 2012-02-08
B | PM | 1 | 2012-02-01 | 2012-02-08

Subject ABC5003

Bottle | Time Point | Number of Tablets Taken | Start Date | End Date
A | AM | 1 | 2012-03-01 | 2012-03-08
B | PM | 1 | 2012-03-01 | 2012-03-08

The EC dataset shows the administrations as collected. The time point variables ECTPT and ECTPTNUM were used to describe the time of day of administration. This use of time point variables is novel, since it represents data about multiple time points, one on each day of administration, rather than data for a single time point.

ec.xpt

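Row | STUDYID | DOMAIN | USUBJID | ECSEQ | ECLNKID | ECTRT | ECPRESP | ECOCCUR | ECDOSE | ECDOSU | ECDOSFRQ | EPOCH | ECSTDTC | ECENDTC | ECSTDY | ECENDY | ECTPT | ECTPTNUM
1 | ABC | EC | ABC5001 | 1 | 20120101-20120108-AM | BOTTLE A | Y | Y | 1 | TABLET | QD | TREATMENT | 2012-01-01 | 2012-01-08 | 1 | 8 | AM | 1
2 | ABC | EC | ABC5001 | 2 | 20120101-20120108-PM | BOTTLE B | Y | Y | 1 | TABLET | QD | TREATMENT | 2012-01-01 | 2012-01-08 | 1 | 8 | PM | 2
3 | ABC | EC | ABC5002 | 1 | 20120201-20120208-AM | BOTTLE A | Y | Y | 1 | TABLET | QD | TREATMENT | 2012-02-01 | 2012-02-08 | 1 | 8 | AM | 1
4 | ABC | EC | ABC5002 | 2 | 20120201-20120208-PM | BOTTLE B | Y | Y | 1 | TABLET | QD | TREATMENT | 2012-02-01 | 2012-02-08 | 1 | 8 | PM | 2
5 | ABC | EC | ABC5003 | 1 | 20120301-20120308-AM | BOTTLE A | Y | Y | 1 | TABLET | QD | TREATMENT | 2012-03-01 | 2012-03-08 | 1 | 8 | AM | 1
6 | ABC | EC | ABC5003 | 2 | 20120301-20120308-PM | BOTTLE B | Y | Y | 1 | TABLET | QD | TREATMENT | 2012-03-01 | 2012-03-08 | 1 | 8 | PM | 2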

The EX dataset shows the unmasked administrations in the protocol-specified unit, mg. Amounts of placebo were represented as 0 mg. The sponsor chose to represent the administrations at the time point level.

Rows 1-2: Show administrations for a subject who was randomized to the 20 mg Drug X arm.
Rows 3-4: Show administrations for a subject who was randomized to the 10 mg Drug X arm.
Rows 5-6: Show administrations for a subject who was randomized to the Placebo arm.

ex.xpt

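Row | STUDYID | DOMAIN | USUBJID | EXSEQ | EXLNKID | EXTRT | EXDOSE | EXDOSU | EXDOSFRM | EXDOSFRQ | EXROUTE | EPOCH | EXSTDTC | EXENDTC | EXSTDY | EXENDY | EXTPT | EXTPTNUM
1 | ABC | EX | ABC5001 | 1 | 20120101-20120108-AM | DRUG X | 10 | mg | TABLET | QD | ORAL | TREATMENT | 2012-01-01 | 2012-01-08 | 1 | 8 | AM | 1
2 | ABC | EX | ABC5001 | 2 | 20120101-20120108-PM | DRUG X | 10 | mg | TABLET | QD | ORAL | TREATMENT | 2012-01-01 | 2012-01-08 | 1 | 8 | PM | 2
3 | ABC | EX | ABC5002 | 1 | 20120201-20120208-AM | DRUG X | 10 | mg | TABLET | QD | ORAL | TREATMENT | 2012-02-01 | 2012-02-08 | 1 | 8 | AM | 1
4 | ABC | EX | ABC5002 | 2 | 20120201-20120208-PM | PLACEBO | 0 | mg | TABLET | QD | ORAL | TREATMENT | 2012-02-01 | 2012-02-08 | 1 | 8 | PM | 2
5 | ABC | EX | ABC5003 | 1 | 20120301-20120308-AM | PLACEBO | 0 | mg | TABLET | QD | ORAL | TREATMENT | 2012-03-01 | 2012-03-08 | 1 | 8 | AM | 1
6 | ABC | EX | ABC5003 | 2 | 20120301-20120308-PM | PLACEBO | 0 | mg | TABLET | QD | ORAL | TREATMENT | 2012-03-01 | 2012-03-08 | 1 | 8 | PM | 2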

The relrec.xpt example reflects a one-to-one dataset-level relationship between EC and EX using --LNKID.

relrec.xpt

Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | RELTYPE | RELID
1 | ABC | EC |  | ECLNKID |  | ONE | 1
2 | ABC | EX |  | EXLNKID |  | ONE | 1
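
A dataset-level one-to-one relationship of this kind can be checked programmatically. The sketch below is only an illustration: the `is_one_to_one` helper and the list-of-tuples layout are assumptions made for the example, and the key values are the USUBJID and --LNKID values from the EC and EX datasets in this example.

```python
from collections import Counter

# (USUBJID, --LNKID) pairs taken from the ec.xpt and ex.xpt rows of this example.
ec_links = [
    ("ABC5001", "20120101-20120108-AM"), ("ABC5001", "20120101-20120108-PM"),
    ("ABC5002", "20120201-20120208-AM"), ("ABC5002", "20120201-20120208-PM"),
    ("ABC5003", "20120301-20120308-AM"), ("ABC5003", "20120301-20120308-PM"),
]
ex_links = list(ec_links)  # EX carries the same link values in EXLNKID


def is_one_to_one(left_keys, right_keys):
    """True when each key occurs exactly once on both sides and the key sets match."""
    left, right = Counter(left_keys), Counter(right_keys)
    return (
        set(left) == set(right)
        and all(count == 1 for count in left.values())
        and all(count == 1 for count in right.values())
    )


print(is_one_to_one(ec_links, ex_links))
```

Dropping or duplicating a record on either side makes the check fail, which is the situation in which RELTYPE values of "ONE" and "ONE" would no longer describe the data.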

Example

The study in this example was a single-crossover study comparing once daily oral administration of Drug A 20 mg capsules with Drug B 30 mg coated tablets. Study drug was taken for 3 consecutive mornings, 30 minutes prior to a standardized breakfast. There was a 6-day washout period between treatments.

The following CRFs show data for two subjects.

Subject 56789001

Period | Day | Bottle 1 # of capsules | Bottle 2 # of tablets | Start Date/Time | End Date/Time
Period 1 | 1 | 1 | 1 | 2002-07-01T07:30 | 2002-07-01T07:30
Period 1 | 2 | 1 | 1 | 2002-07-02T07:30 | 2002-07-02T07:30
Period 1 | 3 | 1 | 1 | 2002-07-03T07:32 | 2002-07-03T07:32
Period 2 | 1 | 1 | 1 | 2002-07-09T07:30 | 2002-07-09T07:30
Period 2 | 2 | 1 | 1 | 2002-07-10T07:30 | 2002-07-10T07:30
Period 2 | 3 | 1 | 1 | 2002-07-11T07:34 | 2002-07-11T07:34

Subject 56789003

Period | Day | Bottle 1 # of capsules | Bottle 2 # of tablets | Start Date/Time | End Date/Time
Period 1 | 1 | 1 | 1 | 2002-07-03T07:30 | 2002-07-03T07:30
Period 1 | 2 | 1 | 1 | 2002-07-04T07:24 | 2002-07-04T07:24
Period 1 | 3 | 1 | 1 | 2002-07-05T07:24 | 2002-07-05T07:24
Period 2 | 1 | 1 | 1 | 2002-07-11T07:30 | 2002-07-11T07:30
Period 2 | 2 | 1 | 1 | 2002-07-12T07:43 | 2002-07-12T07:43
Period 2 | 3 | 1 | 1 | 2002-07-13T07:38 | 2002-07-13T07:38

The EC dataset shows administrations as collected.

ec.xpt

Row | STUDYID | DOMAIN | USUBJID | ECSEQ | ECTRT | ECPRESP | ECOCCUR | ECDOSE | ECDOSU | ECDOSFRM | ECDOSFRQ | ECROUTE | EPOCH | ECSTDTC | ECENDTC | ECSTDY | ECENDY | ECTPT | ECELTM | ECTPTREF
1 | 56789 | EC | 56789001 | 1 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-01T07:30 | 2002-07-01T07:30 | 1 | 1 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
2 | 56789 | EC | 56789001 | 2 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-01T07:30 | 2002-07-01T07:30 | 1 | 1 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
3 | 56789 | EC | 56789001 | 3 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-02T07:30 | 2002-07-02T07:30 | 2 | 2 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
4 | 56789 | EC | 56789001 | 4 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-02T07:30 | 2002-07-02T07:30 | 2 | 2 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
5 | 56789 | EC | 56789001 | 5 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-03T07:32 | 2002-07-03T07:32 | 3 | 3 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
6 | 56789 | EC | 56789001 | 6 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-03T07:32 | 2002-07-03T07:32 | 3 | 3 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
7 | 56789 | EC | 56789001 | 7 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-09T07:30 | 2002-07-09T07:30 | 9 | 9 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
8 | 56789 | EC | 56789001 | 8 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-09T07:30 | 2002-07-09T07:30 | 9 | 9 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
9 | 56789 | EC | 56789001 | 9 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-10T07:30 | 2002-07-10T07:30 | 10 | 10 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
10 | 56789 | EC | 56789001 | 10 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-10T07:30 | 2002-07-10T07:30 | 10 | 10 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
11 | 56789 | EC | 56789001 | 11 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-11T07:34 | 2002-07-11T07:34 | 11 | 11 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
12 | 56789 | EC | 56789001 | 12 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-11T07:34 | 2002-07-11T07:34 | 11 | 11 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
13 | 56789 | EC | 56789003 | 1 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-03T07:30 | 2002-07-03T07:30 | 1 | 1 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
14 | 56789 | EC | 56789003 | 2 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-03T07:30 | 2002-07-03T07:30 | 1 | 1 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
15 | 56789 | EC | 56789003 | 3 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-04T07:24 | 2002-07-04T07:24 | 2 | 2 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
16 | 56789 | EC | 56789003 | 4 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-04T07:24 | 2002-07-04T07:24 | 2 | 2 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
17 | 56789 | EC | 56789003 | 5 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-05T07:24 | 2002-07-05T07:24 | 3 | 3 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
18 | 56789 | EC | 56789003 | 6 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-05T07:24 | 2002-07-05T07:24 | 3 | 3 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
19 | 56789 | EC | 56789003 | 7 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-11T07:30 | 2002-07-11T07:30 | 9 | 9 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
20 | 56789 | EC | 56789003 | 8 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-11T07:30 | 2002-07-11T07:30 | 9 | 9 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
21 | 56789 | EC | 56789003 | 9 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-12T07:43 | 2002-07-12T07:43 | 10 | 10 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
22 | 56789 | EC | 56789003 | 10 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-12T07:43 | 2002-07-12T07:43 | 10 | 10 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
23 | 56789 | EC | 56789003 | 11 | BOTTLE 1 | Y | Y | 1 | CAPSULE | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-13T07:38 | 2002-07-13T07:38 | 11 | 11 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
24 | 56789 | EC | 56789003 | 12 | BOTTLE 2 | Y | Y | 1 | TABLET, COATED | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-13T07:38 | 2002-07-13T07:38 | 11 | 11 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST

The EX dataset shows the unblinded administrations.

Rows 1-12: Unblinding revealed that the first subject received placebo coated tablets during the first treatment epoch and placebo capsules during the second treatment epoch.
Rows 13-24: Unblinding revealed that the second subject received placebo capsules during the first treatment epoch and placebo coated tablets during the second treatment epoch.

ex.xpt

Row | STUDYID | DOMAIN | USUBJID | EXSEQ | EXTRT | EXDOSE | EXDOSU | EXDOSFRM | EXDOSFRQ | EXROUTE | EPOCH | EXSTDTC | EXENDTC | EXSTDY | EXENDY | EXTPT | EXELTM | EXTPTREF
1 | 56789 | EX | 56789001 | 1 | DRUG A | 20 | mg | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-01T07:30 |  | 1 | 1 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
2 | 56789 | EX | 56789001 | 2 | PLACEBO | 0 | mg | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-01T07:30 |  | 1 | 1 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
3 | 56789 | EX | 56789001 | 3 | DRUG A | 20 | mg | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-02T07:30 |  | 2 | 2 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
4 | 56789 | EX | 56789001 | 4 | PLACEBO | 0 | mg | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-02T07:30 |  | 2 | 2 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
5 | 56789 | EX | 56789001 | 5 | DRUG A | 20 | mg | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-03T07:32 |  | 3 | 3 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
6 | 56789 | EX | 56789001 | 6 | PLACEBO | 0 | mg | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-03T07:32 |  | 3 | 3 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
7 | 56789 | EX | 56789001 | 7 | PLACEBO | 0 | mg | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-09T07:30 |  | 9 | 9 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
8 | 56789 | EX | 56789001 | 8 | DRUG B | 30 | mg | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-09T07:30 |  | 9 | 9 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
9 | 56789 | EX | 56789001 | 9 | PLACEBO | 0 | mg | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-10T07:30 |  | 10 | 10 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
10 | 56789 | EX | 56789001 | 10 | DRUG B | 30 | mg | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-10T07:30 |  | 10 | 10 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
11 | 56789 | EX | 56789001 | 11 | PLACEBO | 0 | mg | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-11T07:34 |  | 11 | 11 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
12 | 56789 | EX | 56789001 | 12 | DRUG B | 30 | mg | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-11T07:34 |  | 11 | 11 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
13 | 56789 | EX | 56789003 | 1 | PLACEBO | 0 | mg | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-03T07:30 |  | 1 | 1 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
14 | 56789 | EX | 56789003 | 2 | DRUG B | 30 | mg | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-03T07:30 |  | 1 | 1 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
15 | 56789 | EX | 56789003 | 3 | PLACEBO | 0 | mg | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-04T07:24 |  | 2 | 2 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
16 | 56789 | EX | 56789003 | 4 | DRUG B | 30 | mg | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-04T07:24 |  | 2 | 2 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
17 | 56789 | EX | 56789003 | 5 | PLACEBO | 0 | mg | CAPSULE | QD | ORAL | TREATMENT 1 | 2002-07-05T07:24 |  | 3 | 3 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
18 | 56789 | EX | 56789003 | 6 | DRUG B | 30 | mg | TABLET, COATED | QD | ORAL | TREATMENT 1 | 2002-07-05T07:24 |  | 3 | 3 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
19 | 56789 | EX | 56789003 | 7 | DRUG A | 20 | mg | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-11T07:30 |  | 9 | 9 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
20 | 56789 | EX | 56789003 | 8 | PLACEBO | 0 | mg | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-11T07:30 |  | 9 | 9 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
21 | 56789 | EX | 56789003 | 9 | DRUG A | 20 | mg | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-12T07:43 |  | 10 | 10 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
22 | 56789 | EX | 56789003 | 10 | PLACEBO | 0 | mg | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-12T07:43 |  | 10 | 10 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
23 | 56789 | EX | 56789003 | 11 | DRUG A | 20 | mg | CAPSULE | QD | ORAL | TREATMENT 2 | 2002-07-13T07:38 |  | 11 | 11 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST
24 | 56789 | EX | 56789003 | 12 | PLACEBO | 0 | mg | TABLET, COATED | QD | ORAL | TREATMENT 2 | 2002-07-13T07:38 |  | 11 | 11 | 30 MINUTES PRIOR | -PT30M | STD BREAKFAST

Example

The study in this example involved weekly infusions of Drug Z 10 mg/kg. If a subject experienced a dose-limiting toxicity (DLT), the intended dose could be reduced to 7.5 mg/kg.

The example CRF below was for Subject ABC123-0201, who weighed 55 kg. The CRF shows that:

  • The subject's first administration of Drug Z was on 2009-02-13; the intended dose was 10 mg/kg, but the actual amount given was 99 mL at 5.5 mg/mL, so the actual dose was 9.9 mg/kg.
  • The subject's second administration of Drug Z occurred on 2009-02-20; the intended dose was reduced to 7.5 mg/kg due to dose-limiting toxicity, and the infusion was stopped early due to an injection site reaction. The actual amount given was 35 mL at a concentration of 4.12 mg/mL, so the calculated actual dose was 2.6 mg/kg.
  • The subject's third administration was intended to occur on 2009-02-27; the intended dose was 7.5 mg/kg but due to a personal reason, the administration did not occur.
Visit | 1 | 2 | 3
Intended Dose
  • 10 mg/kg
  • 7.5 mg/kg
  • 10 mg/kg
  • 7.5 mg/kg
  • 10 mg/kg
  • 7.5 mg/kg
Reason for Dose Adjustment
  • Dose-limiting toxicity
  • Dose-limiting toxicity
  • Dose-limiting toxicity
Dose Administered
  • Yes
  • No

If no, give reason:

  • Treatment discontinued due to disease progression
  • Other, specify: ________________________
  • Yes
  • No

If no, give reason:

  • Treatment discontinued due to disease progression
  • Other, specify: ________________________
  • Yes
  • No

If no, give reason:

  • Treatment discontinued due to disease progression
  • Other, specify: Personal reason
Date | 13-FEB-2009 | 20-FEB-2009 | 27-FEB-2009
Start Time (24 hour clock) | 10:00 | 11:00 |
End Time (24 hour clock) | 10:45 | 11:20 |
Amount (mL) | 99 mL | 35 mL | 0 mL
Concentration | 5.5 mg/mL | 4.12 mg/mL | 4.12 mg/mL
If dose was adjusted, what was the reason:
  • Injection site reaction
  • Adverse event
  • Other, specify: ______________________
  • Injection site reaction
  • Adverse event
  • Other, specify: ____________________
  • Injection site reaction
  • Adverse event
  • Other, specify: ______________________

The EC dataset shows both intended and actual doses of Drug Z, as collected.

Rows 1, 3, 5: Show the collected intended dose levels (mg/kg) and ECMOOD is "SCHEDULED". Scheduled dose is represented in mg/kg.
Rows 2, 4, 6: Show the collected actual administration amounts (mL) and ECMOOD is "PERFORMED". Actual doses are represented using dose in mL and concentration (pharmaceutical strength) in mg/mL.

ec.xpt

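Row | STUDYID | DOMAIN | USUBJID | ECSEQ | ECLNKID | ECLNKGRP | ECTRT | ECMOOD | ECPRESP | ECOCCUR | ECDOSE | ECDOSU | ECPSTRG | ECPSTRGU | ECADJ | VISITNUM | VISIT | EPOCH | ECSTDTC | ECENDTC | ECSTDY | ECENDY
1 | ABC123 | EC | ABC123-0201 | 1 |  | V1 | DRUG Z | SCHEDULED |  |  | 10 | mg/kg |  |  |  | 1 | VISIT 1 | TREATMENT | 2009-02-13 | 2009-02-13 | 1 | 1
2 | ABC123 | EC | ABC123-0201 | 2 | 20090213T1000 | V1 | DRUG Z | PERFORMED | Y | Y | 99 | mL | 5.5 | mg/mL |  | 1 | VISIT 1 | TREATMENT | 2009-02-13T10:00 | 2009-02-13T10:45 | 1 | 1
3 | ABC123 | EC | ABC123-0201 | 3 |  | V2 | DRUG Z | SCHEDULED |  |  | 7.5 | mg/kg |  |  | Dose limiting toxicity | 2 | VISIT 2 | TREATMENT | 2009-02-20 | 2009-02-20 | 8 | 8
4 | ABC123 | EC | ABC123-0201 | 4 | 20090220T1100 | V2 | DRUG Z | PERFORMED | Y | Y | 35 | mL | 4.12 | mg/mL |  | 2 | VISIT 2 | TREATMENT | 2009-02-20T11:00 | 2009-02-20T11:20 | 8 | 8
5 | ABC123 | EC | ABC123-0201 | 5 |  | V3 | DRUG Z | SCHEDULED |  |  | 7.5 | mg/kg |  |  |  | 3 | VISIT 3 | TREATMENT | 2009-02-27 | 2009-02-27 | 15 | 15
6 | ABC123 | EC | ABC123-0201 | 6 | 20090227 | V3 | DRUG Z | PERFORMED | Y | N |  | mL | 4.12 | mg/mL |  | 3 | VISIT 3 | TREATMENT | 2009-02-27 | 2009-02-27 | 15 | 15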

The reason that ECOCCUR was "N" was represented in a supplemental qualifier.

suppec.xpt

Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL
1 | ABC123 | EC | ABC123-0201 | ECSEQ | 6 | ECREASOC | Reason for Occur Value | PERSONAL REASON | CRF |

The EX dataset shows the administrations in the protocol-specified unit (mg/kg). There is no record for the intended third dose that was not given. Intended doses in EC (records with ECMOOD = "SCHEDULED") can be compared with actual doses in EX.

Row 1: Shows the subject's first dose.
Row 2: Shows the subject's second dose. The collected explanation for the adjusted dose amount administered at Visit 2 is in EXADJ.

ex.xpt

Row | STUDYID | DOMAIN | USUBJID | EXSEQ | EXLNKID | EXLNKGRP | EXTRT | EXDOSE | EXDOSU | EXDOSFRM | EXDOSFRQ | EXROUTE | EXADJ | VISITNUM | VISIT | EPOCH | EXSTDTC | EXENDTC | EXSTDY | EXENDY
1 | ABC123 | EX | ABC123-0201 | 1 | 20090213T1000 | V1 | DRUG Z | 9.9 | mg/kg | SOLUTION | CONTINUOUS | INTRAVENOUS |  | 1 | VISIT 1 | TREATMENT | 2009-02-13T10:00 | 2009-02-13T10:45 | 1 | 1
2 | ABC123 | EX | ABC123-0201 | 2 | 20090220T1100 | V2 | DRUG Z | 2.6 | mg/kg | SOLUTION | CONTINUOUS | INTRAVENOUS | Injection site reaction | 2 | VISIT 2 | TREATMENT | 2009-02-20T11:00 | 2009-02-20T11:20 | 8 | 8

The sponsor wished to represent the doses in mg, as well as in mg/kg. Since a dose includes both a numeric value and a unit, the data could not be represented in a supplemental qualifier, so it was represented in an FA dataset. See Section 6.4.1, When to Use Findings About.

fa.xpt

Row | STUDYID | DOMAIN | USUBJID | FASEQ | FALNKID | FATESTCD | FATEST | FAOBJ | FAORRES | FAORRESU | FASTRESC | FASTRESN | FASTRESU | VISITNUM | VISIT | EPOCH
1 | ABC123 | FA | ABC123-0201 | 1 | 20090213T1000 | DOSEALT | Dose in Alternative Unit | DRUG Z | 544.5 | mg | 544.5 | 544.5 | mg | 1 | VISIT 1 | TREATMENT
2 | ABC123 | FA | ABC123-0201 | 2 | 20090220T1100 | DOSEALT | Dose in Alternative Unit | DRUG Z | 144.2 | mg | 144.2 | 144.2 | mg | 2 | VISIT 2 | TREATMENT
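
The arithmetic behind the doses in this example can be written out as a short Python sketch. It assumes the collected values stated for Subject ABC123-0201 (weight 55 kg, volume in mL, pharmaceutical strength in mg/mL); the function names and the rounding of mg/kg to one decimal place are assumptions made for illustration, not rules from the SDTMIG.

```python
from decimal import Decimal, ROUND_HALF_UP

WEIGHT_KG = Decimal("55")  # subject weight stated in the example


def total_dose_mg(volume_ml, strength_mg_per_ml):
    """Total dose (mg) = administered volume (mL) x strength (mg/mL)."""
    return Decimal(volume_ml) * Decimal(strength_mg_per_ml)


def dose_mg_per_kg(total_mg, weight_kg=WEIGHT_KG):
    """Weight-based dose, rounded to one decimal place (rounding is an assumption)."""
    return (total_mg / weight_kg).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


visit1_mg = total_dose_mg("99", "5.5")    # 99 mL at 5.5 mg/mL
visit2_mg = total_dose_mg("35", "4.12")   # 35 mL at 4.12 mg/mL

print(visit1_mg, dose_mg_per_kg(visit1_mg))  # total mg, then mg/kg, for Visit 1
print(visit2_mg, dose_mg_per_kg(visit2_mg))  # total mg, then mg/kg, for Visit 2
```

`Decimal` is used rather than binary floating point so that products such as 35 x 4.12 are computed exactly before rounding.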

The RELREC dataset represents relationships between EC, EX, and FA.

Rows 1-2: Represent the one-to-one relationship between "PERFORMED" records in EC and records in EX, using --LNKID.
Rows 3-4: Represent the many-to-one relationship between records (both "SCHEDULED" and "PERFORMED") in EC and records in EX, using --LNKGRP.
Rows 5-6: Represent the one-to-one relationship between records in EX and records in FA, using --LNKID.

relrec.xpt

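Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | RELTYPE | RELID
1 | ABC123 | EC |  | ECLNKID |  | ONE | 1
2 | ABC123 | EX |  | EXLNKID |  | ONE | 1
3 | ABC123 | EC |  | ECLNKGRP |  | MANY | 2
4 | ABC123 | EX |  | EXLNKGRP |  | ONE | 2
5 | ABC123 | EX |  | EXLNKID |  | ONE | 3
6 | ABC123 | FA |  | FALNKID |  | ONE | 3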

Example

In this example, a 100 mg tablet is scheduled to be taken daily. Start and end of dosing were collected, along with deviations from the planned daily dosing. Note: This method of data collection design is not consistent with current CDASH standards.

First Dose Date | Last Dose Date
2012-01-13 | 2012-01-20

Date | Number of Doses Daily If/When Deviated from Plan
2012-01-15 | 0
2012-01-16 | 2

The EC dataset shows administrations as collected.

Row 1: Shows the overall dosing interval from first dose date to last dose date.
Row 2: Shows the missed dose on 2012-01-15, which falls within the overall dosing interval.
Row 3: Shows a doubled dose on 2012-01-16, which also falls within the overall dosing interval.

ec.xpt

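Row | STUDYID | DOMAIN | USUBJID | ECSEQ | ECTRT | ECCAT | ECPRESP | ECOCCUR | ECDOSE | ECDOSU | ECDOSFRQ | EPOCH | ECSTDTC | ECENDTC | ECSTDY | ECENDY
1 | ABC | EC | ABC7001 | 1 | BOTTLE A | FIRST TO LAST DOSE INTERVAL | Y | Y | 1 | TABLET | QD | TREATMENT | 2012-01-13 | 2012-01-20 | 1 | 8
2 | ABC | EC | ABC7001 | 2 | BOTTLE A | EXCEPTION DOSE | Y | N |  | TABLET | QD | TREATMENT | 2012-01-15 | 2012-01-15 | 3 | 3
3 | ABC | EC | ABC7001 | 3 | BOTTLE A | EXCEPTION DOSE | Y | Y | 2 | TABLET | QD | TREATMENT | 2012-01-16 | 2012-01-16 | 4 | 4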
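
The ECSTDY and ECENDY values in this dataset follow the SDTM study-day convention, in which there is no day 0. A minimal Python sketch of that calculation is shown here; it assumes that the subject's reference start date is the first dose date (2012-01-13), which this example does not state explicitly, and the function name is illustrative.

```python
from datetime import date

# Assumption for illustration: the reference start date equals the first dose date.
REFERENCE_START = date(2012, 1, 13)


def study_day(event_date, reference=REFERENCE_START):
    """Study day with no day 0: on or after the reference date add 1, before it do not."""
    delta = (event_date - reference).days
    return delta + 1 if delta >= 0 else delta


for d in (date(2012, 1, 13), date(2012, 1, 15), date(2012, 1, 16), date(2012, 1, 20)):
    print(d.isoformat(), study_day(d))
```

With that reference date, the missed dose on 2012-01-15 falls on study day 3 and the last dose date on study day 8.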

The EX dataset shows the unmasked treatment for this subject, "DRUG X", and represents dosing in non-overlapping intervals of time. There is no EX record for the missed dose, but the missed dose is reflected in a gap between dates in the EX records.

Row 1: Shows the administration from first dose date to the day before the missed dose.
Row 2: Shows the doubled dose.
Row 3: Shows the remaining administrations to the last dose date.

ex.xpt

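Row | STUDYID | DOMAIN | USUBJID | EXSEQ | EXTRT | EXDOSE | EXDOSU | EXDOSFRM | EXDOSFRQ | EXROUTE | EPOCH | EXSTDTC | EXENDTC | EXSTDY | EXENDY
1 | ABC | EX | ABC7001 | 1 | DRUG X | 100 | mg | TABLET | QD | ORAL | TREATMENT | 2012-01-13 | 2012-01-14 | 1 | 2
2 | ABC | EX | ABC7001 | 2 | DRUG X | 200 | mg | TABLET | QD | ORAL | TREATMENT | 2012-01-16 | 2012-01-16 | 4 | 4
3 | ABC | EX | ABC7001 | 3 | DRUG X | 100 | mg | TABLET | QD | ORAL | TREATMENT | 2012-01-17 | 2012-01-20 | 5 | 8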
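
One way to turn the collected overall interval plus exception doses into non-overlapping EX intervals is sketched below in Python. The function name, the dictionary layout, and the constants are assumptions made for this illustration; the logic simply applies each day's exception (if any) over the planned one tablet per day, skips days with no dose, and merges consecutive days that have the same dose.

```python
from datetime import date, timedelta

PLANNED_TABLETS_PER_DAY = 1   # planned regimen in this example
MG_PER_TABLET = 100           # unmasked strength of the tablet


def ex_intervals(first_dose, last_dose, exceptions):
    """Build non-overlapping constant-dose intervals; zero-dose days leave a gap."""
    rows = []
    day = first_dose
    while day <= last_dose:
        tablets = exceptions.get(day, PLANNED_TABLETS_PER_DAY)
        dose_mg = tablets * MG_PER_TABLET
        if dose_mg > 0:
            previous = rows[-1] if rows else None
            if (previous and previous["dose_mg"] == dose_mg
                    and previous["end"] + timedelta(days=1) == day):
                previous["end"] = day  # extend the running interval
            else:
                rows.append({"dose_mg": dose_mg, "start": day, "end": day})
        day += timedelta(days=1)
    return rows


# Exception doses collected on the CRF: date -> number of doses taken that day.
exceptions = {date(2012, 1, 15): 0, date(2012, 1, 16): 2}
intervals = ex_intervals(date(2012, 1, 13), date(2012, 1, 20), exceptions)
for row in intervals:
    print(row["dose_mg"], row["start"].isoformat(), row["end"].isoformat())
```

The result has three intervals, with the missed dose on 2012-01-15 appearing only as a gap between the first and second intervals.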

6.1.4 Meal Data

ML – Description/Overview

Information regarding the subject's meal consumption, such as fluid intake, amounts, form (solid or liquid state), frequency, etc., typically used for pharmacokinetic analysis.

ML – Specification

ml.xpt, Meal Data — Interventions, Version 1.0. One record per food product occurrence or constant intake interval per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

ML – Assumptions

  1. The ML Domain is used to represent consumption of any food or nutritional item that would not be represented in EC/EX, CM, AG, or SU. Examples of nutritional items that would be represented in other domains:

    • Investigational nutritional products represented in EC/EX
    • Food or drink used to treat hypoglycemic events represented in CM
    • Glucose given as part of a glucose tolerance test represented in AG
    • Caffeinated drinks represented in SU

    The nutritional items represented in ML may be prospectively defined within a protocol, collected retrospectively as potential precipitants of clinical events, and/or collected to describe nutritional intake.

  2. Additional Timing Variables
    1. Any additional Timing variables may be added to this domain.
    2. Consumption of a food product is considered to occur over an interval of time (as opposed to a point in time). If start and end date/times are collected, they should be represented in MLSTDTC and MLENDTC, respectively. If only a start date/time is collected, it should not be copied to MLENDTC.
  3. Any Identifier variables, Timing variables, or Interventions general-observation-class qualifiers may be added to the ML domain, but the following qualifiers would generally not be used in ML: --MOOD, --LOT, --LOC, --LAT, --DIR, --PORTOT.

ML – Examples

Example

This example shows meal data collected in an effort to understand the causes of two different kinds of event.

  • Data was collected about the last meal before each hypoglycemic event.
  • Data was collected about the occurrence of pre-specified foods prior to a suspected event of drug-induced liver injury (DILI).

Meal Log CRF

Record the last type of meal/food consumption prior to the hypoglycemic event:

Type | If Nutritional Drink, volume (ounces) | Start Date | Start Time | Event ID
[X] Snack  [ ] Nutritional drink  [ ] Meal |  | 2015 Jun 03 | 14:15 | CE001
[ ] Snack  [X] Nutritional drink  [ ] Meal | 8 oz | 2015 Sep 03 | 8:30 | CE002
[ ] Snack  [ ] Nutritional drink  [X] Meal |  | 2015 Dec 31 | 19:00 | CE003
Click here to add a row:

DILI Meal CRF

If suspected DILI, did you consume any of the following in the past week?

Type | Occurrence | If yes, Date
Wild mushrooms | [X] Yes  [ ] No | 2015 DEC 24
Ackee fruit | [ ] Yes  [X] No |
Cycad seeds | [ ] Yes  [X] No |

Note that in this example MLENDTC is null. Since no end date was collected, the collected start date/time was represented in MLSTDTC only and was not copied to MLENDTC, as described in Assumption 2b.

Rows 1-3: Show the last meal data for three hypoglycemic events.
Rows 4-6: Show the meal data collected relative to the suspected DILI.

ml.xpt

Row | STUDYID | DOMAIN | USUBJID | MLSEQ | MLTRT | MLCAT | MLPRESP | MLOCCUR | MLDOSE | MLDOSU | MLDTC | MLSTDTC | MLENDTC | MLEVLINT | RELMIDS | MIDS | MIDSDTC
1 | XYZ | ML | XYZ-001-001 | 1 | SNACK | HYPOGLYCEMIA EVALUATION | Y | Y |  |  |  | 2015-06-03T14:15 |  |  | LAST MEAL PRIOR TO | HYPO1 | 2015-06-03T19:20
2 | XYZ | ML | XYZ-001-001 | 2 | NUTRITIONAL DRINK | HYPOGLYCEMIA EVALUATION | Y | Y | 8 | oz |  | 2015-09-03T08:30 |  |  | LAST MEAL PRIOR TO | HYPO2 | 2015-09-03T17:00
3 | XYZ | ML | XYZ-001-001 | 3 | MEAL | HYPOGLYCEMIA EVALUATION | Y | Y |  |  |  | 2015-12-31T19:00 |  |  | LAST MEAL PRIOR TO | HYPO3 | 2016-01-01T10:30
4 | XYZ | ML | XYZ-001-001 | 4 | WILD MUSHROOMS | DILI EVALUATION | Y | Y |  |  | 2015-12-27 | 2015-12-24 |  | -P1W |  |  |
5 | XYZ | ML | XYZ-001-001 | 5 | ACKEE FRUIT | DILI EVALUATION | Y | N |  |  | 2015-12-27 |  |  | -P1W |  |  |
6 | XYZ | ML | XYZ-001-001 | 6 | CYCAD SEEDS | DILI EVALUATION | Y | N |  |  | 2015-12-27 |  |  | -P1W |  |  |

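For the DILI rows, MLEVLINT holds the evaluation interval "-P1W", the past week relative to the collection date in MLDTC. The Python sketch below shows one way to turn that pair into a date window and check a collected start date against it. The helper is hypothetical, handles only negative whole-week or whole-day durations, and treats both ends of the window as inclusive, which is an assumption.

```python
import re
from datetime import date, timedelta


def evaluation_window(dtc, evlint):
    """Return (start, end) dates for a negative ISO 8601 duration like '-P1W' or '-P3D'."""
    match = re.fullmatch(r"-P(\d+)([WD])", evlint)
    if match is None:
        raise ValueError(f"unsupported evaluation interval: {evlint!r}")
    count, unit = int(match.group(1)), match.group(2)
    days = count * 7 if unit == "W" else count
    end = date.fromisoformat(dtc)
    return end - timedelta(days=days), end


window_start, window_end = evaluation_window("2015-12-27", "-P1W")
mushroom_date = date.fromisoformat("2015-12-24")  # MLSTDTC from Row 4
in_window = window_start <= mushroom_date <= window_end
print(window_start.isoformat(), window_end.isoformat(), in_window)
```

Under these assumptions the window runs from 2015-12-20 to 2015-12-27, and the wild mushroom consumption date falls inside it.
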
Example

This example describes a study that examines the impact of physical modifications in a cafeteria on selection/consumption among school students.

Group | Arms | Details
1 | Control | Students received standard meals in a standard cafeteria environment.
2 | Experimental: choice architecture | Students were exposed to modifications to the physical environment in the cafeteria to "nudge" students towards healthier choices. Physical modifications included:
  • Placing vegetables at the beginning of the lunch line.
  • Placing fruits in attractive bowls, trays lined with appealing fabric, and fruit options next to cash registers.
  • Promoting fruits and vegetables with prominently displayed signage and images.
  • Placing white milk more prominently than chocolate milk (e.g., displaying white milk in front of chocolate milk).

Food-card data was collected over a 7-month period by students receiving a school meal one day per week. Students who brought a lunch from home or those not eating lunch in the cafeteria on a study day were excluded.

The dataset below shows the food-card data collected for the first 3 weeks for a subject.

ml.xpt

Row | STUDYID | DOMAIN | USUBJID | MLSEQ | MLTRT | VISITNUM | VISIT | MLSTDTC
1 | ABC123 | ML | ABC123-001 | 1 | FRUIT ROLLUP | 1 | WEEK 1 | 2015-09-09
2 | ABC123 | ML | ABC123-001 | 2 | WHITE MILK | 1 | WEEK 1 | 2015-09-09
3 | ABC123 | ML | ABC123-001 | 3 | PEANUT BUTTER SANDWICH | 1 | WEEK 1 | 2015-09-09
4 | ABC123 | ML | ABC123-001 | 4 | BANANA | 2 | WEEK 2 | 2015-09-17
5 | ABC123 | ML | ABC123-001 | 5 | CHOCOLATE MILK | 2 | WEEK 2 | 2015-09-17
6 | ABC123 | ML | ABC123-001 | 6 | PIZZA | 2 | WEEK 2 | 2015-09-17
7 | ABC123 | ML | ABC123-001 | 7 | APPLE | 3 | WEEK 3 | 2015-09-22
8 | ABC123 | ML | ABC123-001 | 8 | WHITE MILK | 3 | WEEK 3 | 2015-09-22
9 | ABC123 | ML | ABC123-001 | 9 | SALAD | 3 | WEEK 3 | 2015-09-22

6.1.5 Procedures

PR – Description/Overview

An interventions domain that contains interventional activity intended to have diagnostic, preventive, therapeutic, or palliative effects.

PR – Specification

pr.xpt, Procedures — Interventions, Version 3.3. One record per recorded procedure per occurrence per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

PR – Assumptions

  1. Some example procedures by type include the following:
    1. Disease screening (e.g., mammogram, pap smear)
    2. Endoscopic examinations (e.g., arthroscopy, diagnostic colonoscopy, therapeutic colonoscopy, diagnostic laparoscopy, therapeutic laparoscopy)
    3. Diagnostic tests (e.g., amniocentesis, biopsy, catheterization, cutaneous oximetry, finger stick, fluorophotometry, imaging techniques (e.g., DXA scan, CT scan, MRI), phlebotomy, pulmonary function test, skin test, stress test, tympanometry)
    4. Therapeutic procedures (e.g., ablation therapy, catheterization, cryotherapy, mechanical ventilation, phototherapy, radiation therapy/radiotherapy, thermotherapy)
    5. Surgical procedures (e.g., curative surgery, diagnostic surgery, palliative surgery, therapeutic surgery, prophylactic surgery, resection, stenting, hysterectomy, tubal ligation, implantation)

    The Procedures domain is based on the Interventions Observation Class. The extent of physiological effect may range from observable to microscopic. Regardless of the extent of effect or whether it is collected in the study, all collected procedures are represented in this domain. The protocol design should pre-specify whether procedure information will be collected.

    Measurements obtained from procedures are to be represented in their respective Findings domain(s). For example, a biopsy may be performed to obtain a tissue sample that is then evaluated histopathologically. In this case, details of the biopsy procedure can be represented in the PR domain and the histopathology findings in the MI domain. Describing the relationship between PR and MI records (in RELREC) in this example is dependent on whether the relationship is collected, either explicitly or implicitly.

  2. In the Findings Observation Class, the test method is represented in the --METHOD variable (e.g., electrophoresis, gram stain, polymerase chain reaction). At times, the test method overlaps with diagnostic/therapeutic procedures (e.g., ultrasound, MRI, X-ray) in-scope for the PR domain. The following is recommended: If timing (start, end or duration) or an indicator populating PROCCUR, PRSTAT, or PRREASND is collected, then a PR record should be created. If only the findings from a procedure are collected, then --METHOD in the Findings domain(s) may be sufficient to reflect the procedure and a related PR record is optional. It is at the sponsor's discretion whether to represent the procedure as both a test method (--METHOD) and related PR record.
  3. PRINDC is used to represent a medical indication, a medical condition which makes a treatment advisable. The reason for a procedure may be something other than a medical indication. For example, an X-ray might be taken to determine whether a fracture was present. Reasons other than medical indications should be represented using the supplemental qualifier PRREAS (see Appendix C2, Supplemental Qualifiers Name Codes).
  4. Any Identifier variables, Timing variables, or Interventions general-observation-class qualifiers may be added to the PR domain, but the following qualifiers would generally not be used in PR: --MOOD, --LOT.
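
The recommendation in Assumption 2 can be read as a simple decision rule. The Python sketch below is one possible reading, written only for illustration: the item labels, the function name, and the returned strings are assumptions and are not defined by the SDTMIG.

```python
# Collected items that, per Assumption 2, call for a PR record.
TIMING_ITEMS = {"start", "end", "duration"}
INDICATOR_VARIABLES = {"PROCCUR", "PRSTAT", "PRREASND"}


def pr_record_guidance(collected_items):
    """Map what was collected about a procedure to the recommendation in Assumption 2."""
    collected = set(collected_items)
    if collected & (TIMING_ITEMS | INDICATOR_VARIABLES):
        return "create PR record"
    if "findings" in collected:
        return "PR record optional; --METHOD in the Findings domain may suffice"
    return "no procedure data collected"


print(pr_record_guidance({"findings", "start"}))
print(pr_record_guidance({"findings"}))
```

Whether to represent a procedure both as --METHOD and as a PR record remains at the sponsor's discretion, so a rule like this can only flag the minimum expectation.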

PR – Examples

Example

A procedures log CRF may collect verbatim values (procedure names) and dates performed. This example shows a subject who had five procedures collected and represented in the PR domain.

pr.xpt

Row | STUDYID | DOMAIN | USUBJID | PRSEQ | PRTRT | PRSTDTC | PRENDTC
1 | XYZ | PR | XYZ789-002 | 1 | Wisdom Teeth Extraction | 2010-06-08 | 2010-06-08
2 | XYZ | PR | XYZ789-002 | 2 | Reset Broken Arm | 2010-08-06 | 2010-08-06
3 | XYZ | PR | XYZ789-002 | 3 | Prostate Examination | 2010-12-12 | 2010-12-12
4 | XYZ | PR | XYZ789-002 | 4 | Endoscopy | 2010-12-12 | 2010-12-12
5 | XYZ | PR | XYZ789-002 | 5 | Heart Transplant | 2011-08-29 | 2011-08-29

Example

This example shows data from a 24-hour Holter monitor, an ambulatory electrocardiography device that records a continuous electrocardiographic rhythm pattern.

The start and end of the Holter monitoring procedure are represented in the PR domain.

pr.xpt

Row | STUDYID | DOMAIN | USUBJID | PRSEQ | PRLNKID | PRTRT | PRPRESP | PROCCUR | PRSTDTC | PRENDTC
1 | ABC123 | PR | ABC123-001 | 1 | 20110101_20110102 | 24-HOUR HOLTER MONITOR | Y | Y | 2011-01-01T08:00 | 2011-01-02T09:45

The heart rate findings from the procedure are represented in the EG domain.

eg.xpt

Row | STUDYID | DOMAIN | USUBJID | EGSEQ | EGLNKID | EGTESTCD | EGTEST | EGORRES | EGORRESU | EGMETHOD | EGDTC | EGENDTC
1 | ABC123 | EG | ABC123-001 | 1 | 20110101_20110102 | EGHRMIN | ECG Minimum Heart Rate | 70 | beats/min | HOLTER CONTINUOUS ECG RECORDING | 2011-01-01T08:00 | 2011-01-02T09:45
2 | ABC123 | EG | ABC123-001 | 2 | 20110101_20110102 | EGHRMAX | ECG Maximum Heart Rate | 100 | beats/min | HOLTER CONTINUOUS ECG RECORDING | 2011-01-01T08:00 | 2011-01-02T09:45
3 | ABC123 | EG | ABC123-001 | 3 | 20110101_20110102 | EGHRMEAN | ECG Mean Heart Rate | 75 | beats/min | HOLTER CONTINUOUS ECG RECORDING | 2011-01-01T08:00 | 2011-01-02T09:45

The relrec.xpt reflects a one-to-many dataset-level relationship between PR and EG using --LNKID.

relrec.xpt

Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | RELTYPE | RELID
1 | ABC123 | PR |  | PRLNKID |  | ONE | 1
2 | ABC123 | EG |  | EGLNKID |  | MANY | 1

Example

Data for three subjects who had on-study radiotherapy are below. Dose, dose unit, location, and timing are represented.

pr.xpt

Row | STUDYID | DOMAIN | USUBJID | PRSEQ | PRTRT | PRDOSE | PRDOSU | PRLOC | PRLAT | PRSTDTC | PRENDTC
1 | ABC123 | PR | ABC123-1001 | 1 | External beam radiation therapy | 70 | Gy | BREAST | RIGHT | 2011-06-01 | 2011-06-25
2 | ABC123 | PR | ABC123-2002 | 1 | Brachytherapy | 25 | Gy | PROSTATE |  | 2011-07-15 | 2011-07-15
3 | ABC123 | PR | ABC123-3003 | 1 | Radiotherapy | 300 | cGy | BONE |  | 2011-08-19 | 2011-08-22

6.1.6 Substance Use

SU – Description/Overview

An interventions domain that contains substance use information that may be used to assess the efficacy and/or safety of therapies that look to mitigate the effects of chronic substance use.

SU – Specification

su.xpt, Substance Use — Interventions, Version 3.3. One record per substance type per reported occurrence per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

SU – Assumptions

  1. SU Definition
    1. This information may be independent of planned study evaluations, or may be a key outcome (e.g., planned evaluation) of a clinical trial.
    2. In many clinical trials, detailed substance use information as provided for in the domain model above may not be required (e.g., the only information collected may be a response to the question "Have you ever smoked tobacco?"); in such cases, many of the Qualifier variables would not be submitted.
    3. SU may contain responses to questions about use of pre-specified substances as well as records of substance use collected as free text.
  2. Substance Use Description and Coding
    1. SUTRT captures the verbatim or the pre-specified text collected for the substance. It is the topic variable for the SU dataset. SUTRT is a required variable and must have a value.
    2. SUMODIFY is a permissible variable and should be included if coding is performed and the sponsor's procedure permits modification of a verbatim substance use term for coding. The modified term is listed in SUMODIFY. The variable may be populated as per the sponsor's procedures.
    3. SUDECOD is the preferred term derived by the sponsor from the coding dictionary if coding is performed. It is a permissible variable. Where deemed necessary by the sponsor, the verbatim term (SUTRT) should be coded using a standard dictionary such as WHO Drug. The sponsor is expected to provide the dictionary name and version used to map the terms utilizing the external codelist element in the Define-XML document.
  3. Additional Categorization and Grouping
    1. SUCAT and SUSCAT should not be redundant with the domain code or dictionary classification provided by SUDECOD, or with SUTRT. That is, they should provide a different means of defining or classifying SU records. For example, a sponsor may wish to identify all substances that the investigator feels might represent opium use and to collect such use on a separate CRF page. This categorization might differ from the categorization derived from the coding dictionary.
    2. SUGRPID may be used to link (or associate) different records together to form a block of related records within SU at the subject level (see Section 4.2.6, Grouping Variables and Categorization). It should not be used in place of SUCAT or SUSCAT.
  4. Timing Variables
    1. SUSTDTC and SUENDTC may be populated as required.
    2. If substance use information is collected more than once within the CRF (indicating that the data are visit-based) then VISITNUM would be added to the domain as an additional timing variable. VISITDY and VISIT would then be permissible variables.
  5. Additional Permissible Interventions Qualifiers
    1. Any additional Qualifiers from the Interventions Class may be added to this domain, but the following qualifiers would generally not be used in SU: --MOOD, --LOT.

SU – Examples

Example

The example below illustrates how typical substance use data could be populated. Here, the CRF collected:

  • Smoking data
    • Smoking status of "previous", "current", or "never"
    • If a current or past smoker, how many packs per day
    • If a former smoker, what year did the subject quit
  • Current caffeine use
    • What caffeine drinks have been consumed today
    • How many cups today

SUCAT allows the records to be grouped into smoking-related data and caffeine-related data. In this example, the treatments are pre-specified on the CRF page, so SUTRT does not require a standardized SUDECOD equivalent.

Not shown: A subject who never smoked does not have a tobacco record. Alternatively, a row for the subject could have been included with SUOCCUR = "N" and null dosing and timing fields; the interpretation would be the same. A subject who did not drink any caffeinated drinks on the day of the assessment does not have any caffeine records. A subject who never smoked and did not drink caffeinated drinks on the day of the assessment does not appear in the dataset.

Row 1: This subject is a 2-pack/day current smoker. "Current" implies that smoking started sometime before the time the question was asked (SUSTTPT = "2006-01-01", SUSTRTPT = "BEFORE") and had not ended as of that date (SUENTPT = "2006-01-01", SUENRTPT = "ONGOING"). See Section 4.4.7, Use of Relative Timing Variables for the use of these variables. Both the beginning and ending reference time points for this question are the date of the assessment.
Row 2: The same subject drank three cups of coffee on the day of the assessment.
Row 3: A second subject is a former smoker. The date the subject began smoking is unknown, but we know that it was sometime before the assessment date. This is shown by the values of SUSTTPT and SUSTRTPT. The end date of smoking was collected, so SUENTPT and SUENRTPT are not populated. Instead, the end date is in SUENDTC.
Row 4: This second subject drank tea on the day of the assessment.
Row 5: This second subject drank coffee on the day of the assessment.
Row 6: A third subject had missing data for the smoking questions. This is indicated by SUSTAT = "NOT DONE". The reason is in SUREASND.
Row 7: This third subject also had missing data for all of the caffeine questions.

su.xpt

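| Row | STUDYID | DOMAIN | USUBJID | SUSEQ | SUTRT | SUCAT | SUSTAT | SUREASND | SUDOSE | SUDOSU | SUDOSFRQ | SUSTDTC | SUENDTC | SUSTTPT | SUSTRTPT | SUENTPT | SUENRTPT |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1234 | SU | 1234005 | 1 | CIGARETTES | TOBACCO | | | 2 | PACK | PER DAY | | | 2006-01-01 | BEFORE | 2006-01-01 | ONGOING |
| 2 | 1234 | SU | 1234005 | 2 | COFFEE | CAFFEINE | | | 3 | CUP | PER DAY | 2006-01-01 | 2006-01-01 | | | | |
| 3 | 1234 | SU | 1234006 | 1 | CIGARETTES | TOBACCO | | | 1 | PACK | PER DAY | | 2003 | 2006-03-15 | BEFORE | | |
| 4 | 1234 | SU | 1234006 | 2 | TEA | CAFFEINE | | | 1 | CUP | PER DAY | 2006-03-15 | 2006-03-15 | | | | |
| 5 | 1234 | SU | 1234006 | 3 | COFFEE | CAFFEINE | | | 2 | CUP | PER DAY | 2006-03-15 | 2006-03-15 | | | | |
| 6 | 1234 | SU | 1234007 | 1 | CIGARETTES | TOBACCO | NOT DONE | Subject left office before CRF was completed | | | | | | | | | |
| 7 | 1234 | SU | 1234007 | 2 | CAFFEINE | CAFFEINE | NOT DONE | Subject left office before CRF was completed | | | | | | | | | |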
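
The tobacco rows in this example follow a simple mapping from the collected smoking status to the relative timing variables. The following Python sketch is one hedged way to express that mapping; the function name `su_tobacco_record` and its parameters are hypothetical and are not part of the SDTMIG, and only the variables discussed in the row notes are populated.

```python
from typing import Optional


def su_tobacco_record(status: str, assessment_date: str,
                      packs_per_day: Optional[float] = None,
                      quit_year: Optional[str] = None) -> Optional[dict]:
    """Map a CRF smoking status to SU variables (illustrative sketch only)."""
    status = status.lower()
    if status == "never":
        # In the example, a subject who never smoked has no tobacco record.
        return None
    if status not in ("current", "previous"):
        raise ValueError(f"unexpected smoking status: {status!r}")

    record = {
        "SUTRT": "CIGARETTES",
        "SUCAT": "TOBACCO",
        "SUDOSE": packs_per_day,
        "SUDOSU": "PACK",
        "SUDOSFRQ": "PER DAY",
        "SUENDTC": None,
        # The start date is unknown; it is only known to precede the assessment.
        "SUSTTPT": assessment_date,
        "SUSTRTPT": "BEFORE",
        "SUENTPT": None,
        "SUENRTPT": None,
    }
    if status == "current":
        # Smoking had not ended as of the assessment date.
        record["SUENTPT"] = assessment_date
        record["SUENRTPT"] = "ONGOING"
    else:
        # Former smoker: the collected quit year goes in SUENDTC instead.
        record["SUENDTC"] = quit_year
    return record
```

Calling `su_tobacco_record("previous", "2006-03-15", 1, "2003")` yields the timing values shown in Row 3, with SUENTPT and SUENRTPT left null.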








6.2 Models for Events Domains

Most subject-level observations collected during the study should be represented according to one of the three SDTM general observation classes. This is the list of domains corresponding to the Events class.

| Domain Code | Domain Description |
| --- | --- |
| AE | Adverse Events: An events domain that contains data describing untoward medical occurrences in a patient or subjects that are administered a pharmaceutical product and which may not necessarily have a causal relationship with the treatment. |
| CE | Clinical Events: An events domain that contains clinical events of interest that would not be classified as adverse events. |
| DS | Disposition: An events domain that contains information encompassing and representing data related to subject disposition. |
| DV | Protocol Deviations: An events domain that contains protocol violations and deviations during the course of the study. |
| HO | Healthcare Encounters: An events domain that contains data for inpatient and outpatient healthcare events (e.g., hospitalization, nursing home stay, rehabilitation facility stay, ambulatory surgery). |
| MH | Medical History: The medical history dataset includes the subject's prior history at the start of the trial. Examples of subject medical history information could include general medical history, gynecological history, and primary diagnosis. |

6.2.1 Adverse Events

AE – Description/Overview

An events domain that contains data describing untoward medical occurrences in a patient or subjects that are administered a pharmaceutical product and which may not necessarily have a causal relationship with the treatment.

AE – Specification

ae.xpt, Adverse Events — Events, Version 3.3. One record per adverse event per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

AE – Assumptions

  1. The Adverse Events dataset includes clinical data describing "any untoward medical occurrence in a patient or clinical investigation subject administered a pharmaceutical product and which does not necessarily have to have a causal relationship with this treatment" (ICH E2A). In consultation with regulatory authorities, sponsors may extend or limit the scope of adverse event collection (e.g., collecting pre-treatment events related to trial conduct, not collecting events that are assessed as efficacy endpoints). The events included in the AE dataset should be consistent with the protocol requirements. Adverse events may be captured either as free text or via a pre-specified list of terms.
  2. Adverse Event Description and Coding
    1. AETERM captures the verbatim term collected for the event. It is the topic variable for the AE dataset. AETERM is a required variable and must have a value.
    2. AEMODIFY is a permissible variable and should be included if the sponsor's procedure permits modification of a verbatim term for coding. The modified term is listed in AEMODIFY. The variable should be populated as per the sponsor's procedures.
    3. AEDECOD is the preferred term derived by the sponsor from the coding dictionary. It is a required variable and must have a value. It is expected that the reported term (AETERM) will be coded using a standard dictionary such as MedDRA. The sponsor is expected to provide the dictionary name and version used to map the terms utilizing the external codelist element in the Define-XML document.
    4. AEBODSYS is the system organ class from the coding dictionary associated with the adverse event by the sponsor. This value may differ from the primary system organ class designated in the coding dictionary's standard hierarchy. It is expected that this variable will be populated.
    5. Sponsors may include the values of additional levels from the coding dictionary's hierarchy (i.e., High-Level Group Term, High-Level Term, Lower-Level Term) in the SUPPAE dataset as described in Appendix C2, Supplemental Qualifiers Name Codes and in Section 8.4, Relating Non-Standard Variables Values to a Parent Domain.
  3. Additional Categorization and Grouping
    1. AECAT and AESCAT should not be redundant with the domain code or dictionary classification provided by AEDECOD and AEBODSYS (i.e., they should provide a different means of defining or classifying AE records). AECAT and AESCAT are intended for categorizations that are defined in advance. For example, a sponsor may have a separate CRF page for AEs of special interest and then another page for all other AEs. AECAT and AESCAT should not be used for after-the-fact categorizations such as clinically significant. In cases where a category of AEs of special interest resembles a part of the dictionary hierarchy (e.g., "CARDIAC EVENTS"), the categorization represented by AECAT and AESCAT may differ from the categorization derived from the coding dictionary.
    2. AEGRPID may be used to link (or associate) different records together to form a block of related records at the subject level within the AE domain. See Section 4.2.6, Grouping Variables and Categorization for discussion of grouping variables.
  4. Pre-Specified Terms; Presence or Absence of Events
    1. Adverse events are generally collected in two different ways, either by recording free text or using a pre-specified list of terms. In the latter case, the solicitation of information on specific adverse events may affect the frequency at which they are reported; therefore, the fact that a specific adverse event was solicited may be of interest to reviewers. An AEPRESP value of "Y" is used to indicate that the event in AETERM was pre-specified on the CRF.
    2. If it is important to know which adverse events from a pre-specified list were not reported as well as those that did occur, these data should be submitted in a Findings class dataset such as Findings About Events and Interventions (see Section 6.4, Findings About Events or Interventions). A record should be included in that Findings dataset for each pre-specified adverse-event term. Records for adverse events that actually occurred should also exist in the AE dataset with AEPRESP set to "Y."
    3. If a study collects both pre-specified adverse events as well as free-text events, the value of AEPRESP should be "Y" for all pre-specified events and null for events reported as free text. AEPRESP is a permissible field and may be omitted from the dataset if all adverse events were collected as free text.
    4. When adverse events are collected with the recording of free text, a record may be entered into the sponsor's data management system to indicate "no adverse events" for a specific subject. For these subjects, do not include a record in the AE submission dataset to indicate that there were no events. Records should be included in the submission AE dataset only for adverse events that have actually occurred.
  5. Timing Variables
    1. Relative timing assessment "Ongoing" is common in the collection of Adverse Event information. AEENRF may be used when this relative timing assessment is made coincident with the end of the study reference period for the subject represented in the Demographics dataset (RFENDTC). AEENRTPT with AEENTPT may be used when "Ongoing" is relative to another date, such as the final safety follow-up visit date. See Section 4.4.7, Use of Relative Timing Variables.
    2. Additional timing variables (such as AEDTC) may be used when appropriate.
  6. Other Qualifier Variables
    1. If categories of serious events are collected secondarily to a leading question, as in the example below, the values of the variables that capture reasons an event is considered serious (i.e., AESCAN, AESCONG, etc.) may be null.

      For example, if Serious is answered "No", the values for these variables may be null. However, if Serious is answered "Yes", at least one of them will have a "Y" response. Others may be "N" or null, according to the sponsor's convention.

      Serious?
      If yes, check all that apply

      On the other hand, if the CRF is structured so that a response is collected for each seriousness category, all category variables (e.g., AESDTH, AESHOSP) would be populated and AESER would be derived.

    2. The serious categories "Involves cancer" (AESCAN) and "Occurred with overdose" (AESOD) are not part of the ICH definition of a serious adverse event, but these categories are available for use in studies conducted under guidelines that existed prior to the FDA's adoption of the ICH definition.
    3. When a description of Other Medically Important Serious Adverse Events category is collected on a CRF, sponsors should place the description in the SUPPAE dataset using the standard supplemental qualifier name code AESOSP as described in Section 8.4, Relating Non-Standard Variables Values to a Parent Domain and in Appendix C2, Supplemental Qualifiers Name Codes.
    4. In studies using toxicity grade according to a standard toxicity scale such as Common Terminology Criteria for Adverse Events v3.0 (CTCAE), published by the NCI (National Cancer Institute) at https://ctep.cancer.gov/protocoldevelopment/electronic_applications/docs/ctcaev3.pdf, AETOXGR should be used instead of AESEV. In most cases, either AESEV or AETOXGR is populated but not both. There may be cases when a sponsor may need to populate both variables. The sponsor is expected to provide the dictionary name and version used to map the terms utilizing the external codelist element in the Define-XML document.
    5. AE Structure
      The structure of the AE domain is one record per adverse event per subject. It is the sponsor's responsibility to define an event. This definition may vary based on the sponsor's requirements for characterizing and reporting product safety and is usually described in the protocol. For example, the sponsor may submit one record that covers an adverse event from start to finish. Alternatively, if there is a need to evaluate AEs at a more granular level, a sponsor may submit a new record when severity, causality, or seriousness changes or worsens. By submitting these individual records, the sponsor indicates that each is considered to represent a different event. The submission dataset structure may differ from the structure at the time of collection. For example, a sponsor might collect data at each visit in order to meet operational needs, but submit records that summarize the event and contain the highest level of severity, causality, seriousness, etc. Examples of dataset structure:
      1. One record per adverse event per subject for each unique event. Multiple adverse event records reported by the investigator are submitted as summary records "collapsed" to the highest level of severity, causality, seriousness, and the final outcome.
      2. One record per adverse event per subject. Changes over time in severity, causality, or seriousness are submitted as separate events. Alternatively, these changes may be submitted in a separate dataset based on the Findings About Events and Interventions model (see Section 6.4, Findings About Events or Interventions).
      3. Other approaches may also be reasonable as long as they meet the sponsor's safety evaluation requirements and each submitted record represents a unique event. The domain-level metadata (see Section 3.2, Using the CDISC Domain Models in Regulatory Submissions — Dataset Metadata) should clarify the structure of the dataset.
  7. Use of EPOCH and TAETORD
    When EPOCH is included in the Adverse Event domain, it should be the Epoch of the start of the adverse event. In other words, it should be based on AESTDTC, rather than AEENDTC. The computational method for EPOCH in the Define-XML document should describe any assumptions made to handle cases where an adverse event starts on the same day that a subject starts an Epoch, if AESTDTC and SESTDTC are not captured with enough precision to determine the epoch of the onset of the adverse event unambiguously. Similarly, if TAETORD is included in the Adverse Events domain, it should be the value for the start of the adverse event, and the computational method in the Define-XML document should describe any assumptions.
  8. Any additional Identifier variables may be added to the AE domain.
  9. Additional Events Qualifiers
    The following Qualifiers would not be used in AE: --OCCUR, --STAT, and --REASND. They are the only Qualifiers from the SDTM Events Class not in the AE domain. They are not permitted because the AE domain contains only records for adverse events that actually occurred. See Assumption 4b above for information on how to deal with negative responses or missing responses to probing questions for pre-specified adverse events.
  10. Variable order in the domain should follow the rules as described in Section 4.1.4, Order of the Variables and the order described in Section 1.1, Purpose.
  11. The addition of AELLT, AELLTCD, AEPTCD, AEHLT, AEHLTCD, AEHLGT, AEHLGTCD, AEBDSYCD, AESOC, and AESOCCD is applicable to submissions coded in MedDRA only. Data items are not expected for non-MedDRA coding.
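
The relationship between AESER and the seriousness category variables described in Assumption 6a lends itself to a simple consistency check. The sketch below is a hedged illustration only: the function name `seriousness_issues` is hypothetical, and the second check (no category is "Y" when AESER is "N") is an assumption that follows from the leading-question CRF design rather than a rule stated in the text.

```python
from typing import Dict, List, Optional

# Seriousness category variables named in the AE assumptions.
SERIOUSNESS_FLAGS = (
    "AESCAN", "AESCONG", "AESDISAB", "AESDTH",
    "AESHOSP", "AESLIFE", "AESOD", "AESMIE",
)


def seriousness_issues(record: Dict[str, Optional[str]]) -> List[str]:
    """Return messages for AESER/category combinations that look inconsistent."""
    issues = []
    flagged = [name for name in SERIOUSNESS_FLAGS if record.get(name) == "Y"]
    aeser = record.get("AESER")
    if aeser == "Y" and not flagged:
        # Stated in the text: a serious event has at least one "Y" category.
        issues.append("AESER is 'Y' but no seriousness category is 'Y'")
    if aeser == "N" and flagged:
        # Assumption: with a leading question, categories are only checked
        # when the event is serious.
        issues.append("AESER is 'N' but these categories are 'Y': "
                      + ", ".join(flagged))
    return issues
```

Null or "N" values for the remaining categories are accepted either way, since the text leaves that choice to the sponsor's convention.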

AE – Examples

Example

This example illustrates data from an AE CRF that collected AE terms as free text. AEs were coded using MedDRA, and the sponsor's procedures include the possibility of modifying the reported term to aid in coding. The CRF was structured so that seriousness category variables (e.g., AESDTH, AESHOSP) were checked only when AESER is answered "Y." In this study, the study reference period started at the start of study treatment. Three AEs were reported for this subject.

Rows 1-2: Show examples of modifying the reported term for coding purposes, with the modified term in AEMODIFY. These adverse events were not serious, so the seriousness criteria variables are null. Note that for the event in row 2, AESTDY = "1". Since Day 1 was the day treatment started, the AE start and end times, as well as dates, were collected to allow comparison of the AE timing to the start of treatment.
Row 3: Shows an example of the overall seriousness question AESER answered with "Y" and the relevant corresponding seriousness category variables (AESHOSP and AESLIFE) answered "Y". The other seriousness category variables are left blank. This row also shows AEENRF being populated because the AE was marked as "Continuing" as of the end of the study reference period for the subject (see Section 4.4.7, Use of Relative Timing Variables).

ae.xpt

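| Row | STUDYID | DOMAIN | USUBJID | AESEQ | AETERM | AEMODIFY | AEDECOD | AEBODSYS | AESEV | AESER | AEACN | AEREL | AEOUT | AESCONG | AESDISAB | AESDTH | AESHOSP | AESLIFE | AESMIE | EPOCH | AESTDTC | AEENDTC | AESTDY | AEENDY | AEENRF |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | AE | 123101 | 1 | POUNDING HEADACHE | HEADACHE | Headache | Nervous system disorders | SEVERE | N | NOT APPLICABLE | DEFINITELY NOT RELATED | RECOVERED/RESOLVED | | | | | | | SCREENING | 2005-10-12 | 2005-10-12 | -1 | -1 | |
| 2 | ABC123 | AE | 123101 | 2 | BACK PAIN FOR 6 HOURS | BACK PAIN | Back pain | Musculoskeletal and connective tissue disorders | MODERATE | N | DOSE REDUCED | PROBABLY RELATED | RECOVERED/RESOLVED | | | | | | | TREATMENT | 2005-10-13T13:05 | 2005-10-13T19:00 | 1 | 1 | |
| 3 | ABC123 | AE | 123101 | 3 | PULMONARY EMBOLISM | | Pulmonary embolism | Vascular disorders | MODERATE | Y | DOSE REDUCED | PROBABLY NOT RELATED | RECOVERING/RESOLVING | | | | Y | Y | | TREATMENT | 2005-10-21 | | 9 | | AFTER |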
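
The AESTDY and AEENDY values in this example are consistent with a study-day calculation in which the reference start date is Day 1 and the day before it is Day -1, with no Day 0. The sketch below illustrates that arithmetic under stated assumptions: the function name `study_day` is hypothetical, the no-Day-0 convention comes from the guide's general study-day rules rather than from this section, and the reference start date of 2005-10-13 is inferred from Row 2.

```python
from datetime import date
from typing import Optional


def study_day(dtc: str, rfstdtc: str) -> Optional[int]:
    """Compute a study day from two ISO 8601 date or datetime strings."""
    # A complete date (YYYY-MM-DD) is needed on both sides; otherwise no value.
    if len(dtc) < 10 or len(rfstdtc) < 10:
        return None
    event = date.fromisoformat(dtc[:10])      # drop any time component
    reference = date.fromisoformat(rfstdtc[:10])
    delta = (event - reference).days
    # On or after the reference date: add 1, so the reference date is Day 1.
    # Before the reference date: use the difference as-is, so there is no Day 0.
    return delta + 1 if delta >= 0 else delta
```

With a reference start date of "2005-10-13", the three start dates in the table give -1, 1, and 9.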

Example

In this example, a CRF module included at several visits asked whether nausea, vomiting, or diarrhea occurred. The responses to the probing questions ("Yes", "No", or "Not Done") were represented in the Findings About (FA) domain (see Section 6.4, Findings About Events or Interventions). If "Yes", the investigator was instructed to complete the Adverse Event CRF. In the Adverse Events dataset, AEs solicited by means of terms pre-specified on the CRF have an AEPRESP value of "Y". For AEs solicited by a general question, AEPRESP is null. RELREC may be used to relate AE records and FA records.

Rows 1-2: Show that nausea and vomiting were pre-specified on a CRF, as indicated by AEPRESP = "Y". The subject did not experience diarrhea, so no record for that term exists in the AE dataset.
Row 3: Shows an example of an AE (headache) that was not pre-specified on a CRF as indicated by a null value for AEPRESP.

ae.xpt

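| Row | STUDYID | DOMAIN | USUBJID | AESEQ | AETERM | AEDECOD | AEPRESP | AEBODSYS | AESEV | AESER | AEACN | AEREL | AEOUT | EPOCH | AESTDTC | AEENDTC | AESTDY | AEENDY |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | AE | 123101 | 1 | NAUSEA | Nausea | Y | Gastrointestinal disorders | SEVERE | N | DOSE REDUCED | RELATED | RECOVERED/RESOLVED | TREATMENT | 2005-10-12 | 2005-10-13 | 2 | 3 |
| 2 | ABC123 | AE | 123101 | 2 | VOMITING | Vomiting | Y | Gastrointestinal disorders | MODERATE | N | DOSE REDUCED | RELATED | RECOVERED/RESOLVED | TREATMENT | 2005-10-13T13:00 | 2005-10-13T19:00 | 3 | 3 |
| 3 | ABC123 | AE | 123101 | 3 | HEADACHE | Headache | | Nervous system disorders | MILD | N | DOSE NOT CHANGED | POSSIBLY RELATED | RECOVERED/RESOLVED | TREATMENT | 2005-10-21 | 2005-10-21 | 11 | 11 |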
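
The way AEPRESP is populated in this example can be sketched in a few lines of Python. This is an illustration only, not a prescribed derivation: the function name `build_ae_records` and the shape of its inputs (a mapping of pre-specified terms to "Y"/"N"/"NOT DONE" responses, plus a list of free-text terms) are assumptions made for the sketch.

```python
from typing import Dict, List, Optional


def build_ae_records(solicited: Dict[str, str],
                     free_text_terms: List[str]) -> List[Dict[str, Optional[object]]]:
    """Create minimal AE records from probing responses and free-text terms."""
    records: List[Dict[str, Optional[object]]] = []
    for term, response in solicited.items():
        # Only events that actually occurred get an AE record; "N" and
        # "NOT DONE" responses stay in the Findings About dataset.
        if response == "Y":
            records.append({"AETERM": term, "AEPRESP": "Y"})
    for term in free_text_terms:
        # Events reported via a general question have a null AEPRESP.
        records.append({"AETERM": term, "AEPRESP": None})
    for seq, record in enumerate(records, start=1):
        record["AESEQ"] = seq
    return records
```

For the subject above, `build_ae_records({"NAUSEA": "Y", "VOMITING": "Y", "DIARRHEA": "N"}, ["HEADACHE"])` produces three records, with no record for diarrhea.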

Example

In this example, a CRF module that asked whether or not nausea, vomiting, or diarrhea occurred was included in the study only once. In the context of this study, the conditions that occurred were reportable as Adverse Events. No additional data about these events was collected. No other adverse event information was collected via general questions. The responses to the probing questions ("Yes", "No", or "Not Done") were represented in the Findings About (FA) domain (see Section 6.4, Findings About Events or Interventions). This is an example of unusually sparse AE data collection; the AE dataset is populated with the term and the flag indicating that it was pre-specified, but timing information is limited to the date of collection, and other expected qualifiers are not available. RELREC may be used to relate AE records and FA records.

The subject shown in this example experienced nausea and vomiting. The subject did not experience diarrhea, so no record for that term exists in the AE dataset.

ae.xpt

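| Row | STUDYID | DOMAIN | USUBJID | AESEQ | AETERM | AEDECOD | AEPRESP | AEBODSYS | AESER | AEACN | AEREL | AEDTC | AESTDTC | AEENDTC | AEDY |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | AE | 123101 | 1 | NAUSEA | Nausea | Y | Gastrointestinal disorders | | | | 2005-10-29 | | | 19 |
| 2 | ABC123 | AE | 123101 | 2 | VOMITING | Vomiting | Y | Gastrointestinal disorders | | | | 2005-10-29 | | | 19 |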

Example

In this example, the investigator was instructed to create a new adverse-event record each time the severity of an adverse event changed. The sponsor used AEGRPID to identify the group of records related to a single event for a subject.

Row 1: Shows an adverse event of nausea, whose severity was moderate.
Rows 2-4: Show AEGRPID used to group records related to a single event of "VOMITING".
Rows 5-6: Show AEGRPID used to group records related to a single event of "DIARRHEA".

ae.xpt

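| Row | STUDYID | DOMAIN | USUBJID | AESEQ | AEGRPID | AETERM | AEBODSYS | AESEV | AESER | AEACN | AEREL | AESTDTC | AEENDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | AE | 123101 | 1 | | NAUSEA | Gastrointestinal disorders | MODERATE | N | DOSE NOT CHANGED | RELATED | 2005-10-13 | 2005-10-14 |
| 2 | ABC123 | AE | 123101 | 2 | 1 | VOMITING | Gastrointestinal disorders | MILD | N | DOSE NOT CHANGED | POSSIBLY RELATED | 2005-10-14 | 2005-10-16 |
| 3 | ABC123 | AE | 123101 | 3 | 1 | VOMITING | Gastrointestinal disorders | SEVERE | N | DOSE NOT CHANGED | POSSIBLY RELATED | 2005-10-16 | 2005-10-17 |
| 4 | ABC123 | AE | 123101 | 4 | 1 | VOMITING | Gastrointestinal disorders | MILD | N | DOSE NOT CHANGED | POSSIBLY RELATED | 2005-10-17 | 2005-10-20 |
| 5 | ABC123 | AE | 123101 | 5 | 2 | DIARRHEA | Gastrointestinal disorders | SEVERE | N | DOSE NOT CHANGED | POSSIBLY RELATED | 2005-10-16 | 2005-10-17 |
| 6 | ABC123 | AE | 123101 | 6 | 2 | DIARRHEA | Gastrointestinal disorders | MODERATE | N | DOSE NOT CHANGED | POSSIBLY RELATED | 2005-10-17 | 2005-10-21 |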
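
This example keeps one record per severity change. The AE Structure assumption also describes an alternative in which such records are "collapsed" into a summary record at the highest severity. The sketch below shows one hedged way to derive that alternative from records grouped by AEGRPID. The function name `collapse_by_group` and the severity ranking are assumptions for illustration, only severity and dates are summarized, and the date comparison relies on all dates being complete ISO 8601 dates of the same precision.

```python
from typing import Dict, List

# Assumed ordering of AESEV values, lowest to highest.
SEVERITY_RANK = {"MILD": 1, "MODERATE": 2, "SEVERE": 3}


def collapse_by_group(records: List[dict]) -> List[dict]:
    """Collapse records sharing USUBJID and AEGRPID into one summary record."""
    groups: Dict[tuple, List[dict]] = {}
    for rec in records:
        # Records without an AEGRPID are treated as standalone events.
        if rec.get("AEGRPID"):
            key = (rec["USUBJID"], "GRP", rec["AEGRPID"])
        else:
            key = (rec["USUBJID"], "SEQ", rec["AESEQ"])
        groups.setdefault(key, []).append(rec)

    collapsed = []
    for recs in groups.values():
        worst = max(recs, key=lambda r: SEVERITY_RANK[r["AESEV"]])
        collapsed.append({
            "USUBJID": worst["USUBJID"],
            "AETERM": worst["AETERM"],
            "AESEV": worst["AESEV"],                      # highest severity
            "AESTDTC": min(r["AESTDTC"] for r in recs),   # earliest start
            "AEENDTC": max(r["AEENDTC"] for r in recs),   # latest end
        })
    return collapsed
```

Applied to the six rows above, this would give three summary records: moderate nausea, severe vomiting from 2005-10-14 to 2005-10-20, and severe diarrhea from 2005-10-16 to 2005-10-21.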

6.2.2 Clinical Events

CE – Description/Overview

An events domain that contains clinical events of interest that would not be classified as adverse events.

CE – Specification

ce.xpt, Clinical Events — Events, Version 3.3. One record per event per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

CE – Assumptions

  1. The determination of events to be considered clinical events versus adverse events should be done carefully and with reference to regulatory guidelines or consultation with a regulatory review division. Note that all reportable adverse events that would contribute to AE incidence tables in a clinical study report must be included in the AE domain.
    1. Events considered to be clinical events may include episodes of symptoms of the disease under study (often known as signs and symptoms), or events that do not constitute adverse events in themselves, though they might lead to the identification of an adverse event. For example, in a study of an investigational treatment for migraine headaches, migraine headaches may not be considered to be adverse events per protocol. The occurrence of migraines or associated signs and symptoms might be reported in CE.
    2. In vaccine trials, certain serious adverse events may be considered to be signs or symptoms and accordingly determined to be clinical events. In this case the serious variable (--SER) and the serious adverse event flags (--SCAN, --SCONG, --SDTH, --SHOSP, --SDISAB, --SLIFE, --SOD, --SMIE) would be required in the CE domain.
    3. Other studies might track the occurrence of specific events as efficacy endpoints. For example, in a study of an investigational treatment for prevention of ischemic stroke, all occurrences of TIA, stroke and death might be captured as clinical events and assessed as to whether they meet endpoint criteria. Note that other information about these events may be reported in other datasets. For example, the event leading to death would be reported in AE; death would also be a reason for study discontinuation in DS.
  2. CEOCCUR and CEPRESP are used together to indicate whether the event in CETERM was pre-specified and whether it occurred. CEPRESP can be used to separate records that correspond to probing questions for pre-specified events from those that represent spontaneously reported events, while CEOCCUR contains the responses to such questions. The table below shows how these variables are populated in various situations.
    | Situation | Value of CEPRESP | Value of CEOCCUR | Value of CESTAT |
    | --- | --- | --- | --- |
    | Spontaneously reported event occurrence | | | |
    | Pre-specified event occurred | Y | Y | |
    | Pre-specified event did not occur | Y | N | |
    | Pre-specified event has no response | Y | | NOT DONE |
  3. The collection of write-in events on a Clinical Events CRF should be considered with caution. Sponsors must ensure that all adverse events are recorded in the AE domain.
  4. Any identifier variable may be added to the CE domain.
  5. Timing variables
    1. Relative timing assessments "Prior" or "Ongoing" are common in the collection of Clinical Event information. CESTRF or CEENRF may be used when this timing assessment is relative to the study reference period for the subject represented in the Demographics dataset (RFENDTC). CESTRTPT with CESTTPT, and/or CEENRTPT with CEENTPT may be used when "Prior" or "Ongoing" are relative to specific dates other than the start and end of the study reference period. See Section 4.4.7, Use of Relative Timing Variables.
    2. Additional Timing variables may be used when appropriate.
  6. The clinical events domain is based on the Events general observation class and thus can use any variables in the Events class, including those found in the Adverse Events (AE) domain specification table.
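
The four situations covered in Assumption 2 can be expressed as a small lookup. The following sketch is illustrative only; the function name `ce_occurrence_flags` and its inputs are hypothetical, and any response other than "Y" or "N" for a pre-specified event is treated here as no response.

```python
from typing import Dict, Optional


def ce_occurrence_flags(prespecified: bool,
                        response: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Return CEPRESP, CEOCCUR, and CESTAT for one clinical event record."""
    if not prespecified:
        # Spontaneously reported event: all three variables are null.
        return {"CEPRESP": None, "CEOCCUR": None, "CESTAT": None}
    if response in ("Y", "N"):
        # Pre-specified event with an answer to the probing question.
        return {"CEPRESP": "Y", "CEOCCUR": response, "CESTAT": None}
    # Pre-specified event with no response to the probing question.
    return {"CEPRESP": "Y", "CEOCCUR": None, "CESTAT": "NOT DONE"}
```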

CE – Examples

Example

In this example:

  • Data were collected about pre-specified events that, in the context of this study, were not reportable as Adverse Events.
  • The data collected included the "event-like" timing variable start date.
  • Data about pre-specified clinical events were collected in a log independent of visits, rather than in visit-based CRF modules.
  • No "Yes/No" data on the occurrence of the event was collected.

CRF:

Record start dates of any of the following signs that occur.

| Clinical Sign | Start Date |
| --- | --- |
| Rash | |
| Wheezing | |
| Edema | |
| Conjunctivitis | |

This example shows records for clinical events for which start dates were recorded. Since conjunctivitis was not observed, no start date was recorded and there is no CE record.

ce.xpt

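| Row | STUDYID | DOMAIN | USUBJID | CESEQ | CETERM | CEPRESP | CEOCCUR | CESTDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | CE | 123 | 1 | Rash | Y | Y | 2006-05-03 |
| 2 | ABC123 | CE | 123 | 2 | Wheezing | Y | Y | 2006-05-03 |
| 3 | ABC123 | CE | 123 | 3 | Edema | Y | Y | 2006-05-03 |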

Example

In this example:

  • The CRF included both questions about pre-specified clinical events (events not reportable as AEs in the context of this study) and spaces for the investigator to write in additional clinical events.
  • Data collected are start and end dates, which are "event-like," and severity, which is a Qualifier in the Events general observation class.

CRF:

| Event | Date Started | Date Ended | Severity |
| --- | --- | --- | --- |
| Nausea | _ _ / _ _ _ / _ _ _ _ (dd/mmm/yyyy) | _ _ / _ _ _ / _ _ _ _ (dd/mmm/yyyy) | |
| Vomit | _ _ / _ _ _ / _ _ _ _ (dd/mmm/yyyy) | _ _ / _ _ _ / _ _ _ _ (dd/mmm/yyyy) | |
| Diarrhea | _ _ / _ _ _ / _ _ _ _ (dd/mmm/yyyy) | _ _ / _ _ _ / _ _ _ _ (dd/mmm/yyyy) | |
| Other, Specify: ____________ | _ _ / _ _ _ / _ _ _ _ (dd/mmm/yyyy) | _ _ / _ _ _ / _ _ _ _ (dd/mmm/yyyy) | |

Row 1: Shows a record for the pre-specified clinical event "Nausea". The CEPRESP value of "Y" indicates that there was a probing question; the response to the probe (CEOCCUR) was "Yes". The record includes additional data about the event.
Row 2: Shows a record for the pre-specified clinical event "Vomit". The CEPRESP value of "Y" indicates that there was a probing question; the response to the question (CEOCCUR) was "No".
Row 3: Shows a record for the pre-specified clinical event "Diarrhea". The value "Y" for CEPRESP indicates it was pre-specified. The CESTAT value of "NOT DONE" indicates that the probing question was not asked or that there was no answer.
Row 4: Shows a record for a write-in Clinical Event recorded in the "Other, Specify" space. Because this event was not pre-specified, CEPRESP and CEOCCUR are null. See Section 4.2.7, Submitting Free Text from the CRF for further information on populating the Topic variable when "Other, Specify" is used on the CRF.

ce.xpt

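| Row | STUDYID | DOMAIN | USUBJID | CESEQ | CETERM | CEPRESP | CEOCCUR | CESTAT | CESEV | CESTDTC | CEENDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | CE | 123 | 1 | NAUSEA | Y | Y | | MODERATE | 2005-10-12 | 2005-10-15 |
| 2 | ABC123 | CE | 123 | 2 | VOMIT | Y | N | | | | |
| 3 | ABC123 | CE | 123 | 3 | DIARRHEA | Y | | NOT DONE | | | |
| 4 | ABC123 | CE | 123 | 4 | SEVERE HEAD PAIN | | | | SEVERE | 2005-10-09 | 2005-10-11 |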

6.2.3 Disposition

DS – Description/Overview

An events domain that contains information encompassing and representing data related to subject disposition.

DS – Specification

ds.xpt, Disposition — Events, Version 3.3. One record per disposition status or protocol milestone per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

DS – Assumptions

  1. DS Definition
    The Disposition dataset provides an accounting for all subjects who entered the study and may include protocol milestones, such as randomization, as well as the subject's completion status or reason for discontinuation for the entire study or each phase or segment of the study, including screening and post-treatment follow-up. Sponsors may choose which disposition events and milestones to submit for a study. See ICH E3: Section 10.1 for information about disposition events.
  2. Categorization
    1. DSCAT is used to distinguish between disposition events, protocol milestones and other events. The controlled terminology for DSCAT consists of "DISPOSITION EVENT", "PROTOCOL MILESTONE", and "OTHER EVENT".
    2. An event with DSCAT = "DISPOSITION EVENT" describes either disposition of study participation or of a study treatment. It describes whether a subject completed study participation or a study treatment, and if not, the reason they did not complete it. Dispositions may be described for each Epoch (e.g., screening, initial treatment, washout, cross-over treatment, follow-up) or for the study as a whole. If disposition events for both study participation and study treatment(s) are to be represented, then DSSCAT provides this distinction. The value of DSSCAT is based on the sponsor's controlled terminology; however, for records with DSCAT = "DISPOSITION EVENT":
      1. DSSCAT = "STUDY PARTICIPATION" is used to represent disposition of study participation.
      2. DSSCAT = "STUDY TREATMENT" can be used as a generic identifier when a study has only a single treatment.
      3. If a study has multiple treatments, then DSSCAT should name the individual treatment.
    3. An event with DSCAT = "PROTOCOL MILESTONE" is a protocol-specified, "point-in-time" event. Common protocol milestones include "INFORMED CONSENT OBTAINED" and "RANDOMIZED." DSSCAT may be used for subcategories of protocol milestones.
    4. An event with DSCAT = "OTHER EVENT" is another important event that occurred during a trial, but was not driven by protocol requirements and was not captured in another Events or Interventions class dataset. "TREATMENT UNBLINDED" is an example of an event that would be represented with DSCAT = "OTHER EVENT".
  3. DS Description and Coding
    1. DSDECOD values are drawn from controlled terminology. The controlled terminology depends on the value of DSCAT.
    2. When DSCAT = "DISPOSITION EVENT", DSTERM contains either "COMPLETED" or, if the subject did not complete, specific verbatim information about the reason for non-completion.
      1. When DSTERM = "COMPLETED", DSDECOD is the term "COMPLETED" from the controlled terminology codelist NCOMPLT.
      2. When DSTERM contains verbatim text, DSDECOD will use the extensible controlled terminology codelist NCOMPLT. For example, DSTERM = "Subject moved" might be coded to DSDECOD = "LOST TO FOLLOW-UP".
    3. When DSCAT = "PROTOCOL MILESTONE", DSTERM contains the verbatim (as collected) and/or standardized text, and DSDECOD uses the extensible controlled terminology codelist PROTMLST.
    4. When DSCAT = "OTHER EVENT", DSDECOD uses sponsor terminology.
      1. If a reason for the event was collected, the reason for the event is in DSTERM and the DSDECOD is a term from sponsor terminology. For example, if treatment was unblinded due to investigator error, this might be represented in a record with DSTERM = "INVESTIGATOR ERROR" and DSDECOD = "TREATMENT UNBLINDED".
      2. If no reason was collected then DSTERM should be populated with the value in DSDECOD.
  4. Timing Variables
    1. DSSTDTC is expected and is used for the date/time of the disposition event. Events represented in the DS domain do not have end dates since disposition events do not span an interval but occur at a single date/time (e.g., randomization date, disposition of study participation or study treatment).
    2. DSSTDTC documents the date/time that a protocol milestone, disposition event, or other event occurred. For an event with DSCAT = "DISPOSITION EVENT" where DSTERM is not "COMPLETED", the reason for non-completion may be related to an observation reported in another dataset. DSSTDTC is the date/time that the Epoch was completed and is not necessarily the same as the date/time, start date/time, or end date/time of the observation that led to discontinuation.

      For example, a subject reported severe vertigo on June 1, 2006 (AESTDTC). After ruling out other possible causes, the investigator decided to discontinue study treatment on June 6, 2006 (DSSTDTC). The subject reported that the vertigo had resolved on June 8, 2006 (AEENDTC).

    3. EPOCH may be included as a timing variable as in other general-observation-class domains. In DS, EPOCH is based on DSSTDTC. The values of EPOCH are drawn from the Trial Arms (TA) dataset (Section 7.2.1, Trial Arms).
  5. Reasons for Termination: ICH E3: Section 10.1 indicates that "the specific reason for discontinuation" should be presented, and that summaries should be "grouped by treatment and by major reason." The CDISC SDS Team interprets this guidance as requiring one standardized disposition term (DSDECOD) per disposition event. If multiple reasons are reported, the sponsor should identify a primary reason and use that to populate DSTERM and DSDECOD. Additional reasons should be submitted in SUPPDS.

    For example, in a case where DSTERM = "SEVERE NAUSEA" and DSDECOD = "ADVERSE EVENT" the supplemental qualifiers dataset might include records with

    SUPPDS QNAM = "DSTERM1", SUPPDS QLABEL = "Reported Term for Disposition Event 1", and SUPPDS QVAL = "SUBJECT REFUSED FURTHER TREATMENT"

    SUPPDS QNAM = "DSDECOD1", SUPPDS QLABEL = "Standardized Disposition Term 1", and SUPPDS QVAL = "WITHDREW CONSENT"

  6. Any Identifier variables, Timing variables, or Events general-observation-class qualifiers may be added to the DS domain, but the following Qualifiers would generally not be used in DS: --PRESP, --OCCUR, --STAT, --REASND, --BODSYS, --LOC, --SEV, --SER, --ACN, --ACNOTH, --REL, --RELNST, --PATT, --OUT, --SCAN, --SCONG, --SDISAB, --SDTH, --SHOSP, --SLIFE, --SOD, --SMIE, --CONTRT, --TOXGR.
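
The categorization and coding rules in Assumptions 2, 3, and 6 lend themselves to simple record-level checks. The following Python sketch is illustrative only and is not part of the standard: the function name check_ds_record, the dictionary-per-record representation, and the wording of the findings are assumptions made for this example. It does not verify DSDECOD values against the NCOMPLT or PROTMLST codelists.

```python
from typing import Dict, List

# DSCAT terms listed in Assumption 2.1.
DSCAT_VALUES = {"DISPOSITION EVENT", "PROTOCOL MILESTONE", "OTHER EVENT"}

# Qualifier suffixes that Assumption 6 says would generally not be used in DS.
UNUSUAL_SUFFIXES = {
    "PRESP", "OCCUR", "STAT", "REASND", "BODSYS", "LOC", "SEV", "SER",
    "ACN", "ACNOTH", "REL", "RELNST", "PATT", "OUT", "SCAN", "SCONG",
    "SDISAB", "SDTH", "SHOSP", "SLIFE", "SOD", "SMIE", "CONTRT", "TOXGR",
}


def check_ds_record(rec: Dict[str, str]) -> List[str]:
    """Return findings for one DS record (hypothetical helper; empty list = none)."""
    findings: List[str] = []
    dscat = rec.get("DSCAT", "")
    dsterm = rec.get("DSTERM", "")
    dsdecod = rec.get("DSDECOD", "")

    if dscat not in DSCAT_VALUES:
        findings.append(f"DSCAT value {dscat!r} is not a listed DSCAT term")

    # Assumption 3.2.1: DSTERM "COMPLETED" pairs with DSDECOD "COMPLETED".
    if dscat == "DISPOSITION EVENT" and dsterm == "COMPLETED" and dsdecod != "COMPLETED":
        findings.append("DSTERM is COMPLETED but DSDECOD is not COMPLETED")

    # Assumption 3.4.2: DSTERM is populated even when no reason was collected.
    if dscat == "OTHER EVENT" and not dsterm:
        findings.append("DSTERM is empty for an OTHER EVENT record")

    # Assumption 6: flag qualifiers that would generally not be used in DS.
    for name in rec:
        if name.startswith("DS") and name[2:] in UNUSUAL_SUFFIXES:
            findings.append(f"{name} would generally not be used in DS")

    return findings
```

A finding from this sketch is a prompt for review, not a conformance failure; sponsors may have valid reasons for the values flagged.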

DS – Examples

Example

In this example, disposition of study participation was collected for each EPOCH of a trial. Disposition of study participation is indicated by DSCAT = "DISPOSITION EVENT". EPOCH was taken from the case report form, which asked about completion of each epoch of the study. Data about disposition of study treatment was not collected, but the sponsor populated DSSCAT with "STUDY PARTICIPATION" to emphasize that these represent disposition of study participation.

Data were also collected about several protocol milestones represented with DSCAT = "PROTOCOL MILESTONE".

Rows 1, 2, 6, 8, 9, 12, 13, 17, 18: Show records for protocol milestones. DSTERM and DSDECOD are populated with the same value, the name of the milestone. Note that for randomization events, EPOCH = "SCREENING", since randomization occurred before the start of treatment, during the screening epoch.
Rows 3-5: Show three records for a subject who completed three stages of the study, "SCREENING", "TREATMENT", and "FOLLOW-UP".
Row 7: Shows disposition of a subject who was a screen failure. The verbatim reason the subject was a screen failure is represented in DSTERM. Since the subject did not complete the screening epoch, DSDECOD is not "COMPLETED" but another appropriate controlled term, "PROTOCOL VIOLATION". The date of discontinuation is in DSSTDTC. The protocol deviation event itself would be represented in the DV dataset.
Rows 10-11: Show disposition of a subject who completed the screening stage but did not complete the treatment stage. For completed epochs, both DSTERM and DSDECOD are "COMPLETED". For epochs that were not completed, the verbatim reason for non-completion of the treatment epoch is in DSTERM, while the value from controlled terminology is in DSDECOD.
Rows 14-16: Show disposition of a subject who completed treatment, but did not complete follow-up. Note that for the final disposition event, the date of collection of the event information, DSDTC, was different from the date of the disposition event (the subject's death), DSSTDTC.
Rows 19-21: Show disposition of a subject who discontinued the treatment epoch due to an AE, but who went on to complete the follow-up phase of the trial.

ds.xpt

| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | DSSCAT | EPOCH | DSDTC | DSSTDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | DS | 123101 | 1 | INFORMED CONSENT OBTAINED | INFORMED CONSENT OBTAINED | PROTOCOL MILESTONE | | SCREENING | 2003-09-21 | 2003-09-21 |
| 2 | ABC123 | DS | 123101 | 2 | RANDOMIZED | RANDOMIZED | PROTOCOL MILESTONE | | SCREENING | 2003-09-30 | 2003-09-30 |
| 3 | ABC123 | DS | 123101 | 3 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | SCREENING | 2003-09-30 | 2003-09-29 |
| 4 | ABC123 | DS | 123101 | 4 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | TREATMENT | 2003-10-31 | 2003-10-31 |
| 5 | ABC123 | DS | 123101 | 5 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | FOLLOW-UP | 2003-11-15 | 2003-11-15 |
| 6 | ABC123 | DS | 123102 | 1 | INFORMED CONSENT OBTAINED | INFORMED CONSENT OBTAINED | PROTOCOL MILESTONE | | SCREENING | 2003-11-21 | 2003-11-21 |
| 7 | ABC123 | DS | 123102 | 2 | SUBJECT DENIED MRI PROCEDURE | PROTOCOL VIOLATION | DISPOSITION EVENT | STUDY PARTICIPATION | SCREENING | 2003-11-22 | 2003-11-20 |
| 8 | ABC123 | DS | 123103 | 1 | INFORMED CONSENT OBTAINED | INFORMED CONSENT OBTAINED | PROTOCOL MILESTONE | | SCREENING | 2003-09-15 | 2003-09-15 |
| 9 | ABC123 | DS | 123103 | 2 | RANDOMIZED | RANDOMIZED | PROTOCOL MILESTONE | | SCREENING | 2003-09-30 | 2003-09-30 |
| 10 | ABC123 | DS | 123103 | 3 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | SCREENING | 2003-09-30 | 2003-09-22 |
| 11 | ABC123 | DS | 123103 | 4 | SUBJECT MOVED | LOST TO FOLLOW-UP | DISPOSITION EVENT | STUDY PARTICIPATION | TREATMENT | 2003-10-31 | 2003-10-31 |
| 12 | ABC123 | DS | 123104 | 1 | INFORMED CONSENT OBTAINED | INFORMED CONSENT OBTAINED | PROTOCOL MILESTONE | | SCREENING | 2003-09-15 | 2003-09-15 |
| 13 | ABC123 | DS | 123104 | 3 | RANDOMIZED | RANDOMIZED | PROTOCOL MILESTONE | | SCREENING | 2003-09-30 | 2003-09-30 |
| 14 | ABC123 | DS | 123104 | 2 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | SCREENING | 2003-09-30 | 2003-09-22 |
| 15 | ABC123 | DS | 123104 | 4 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | TREATMENT | 2003-10-15 | 2003-10-15 |
| 16 | ABC123 | DS | 123104 | 5 | AUTOMOBILE ACCIDENT | DEATH | DISPOSITION EVENT | STUDY PARTICIPATION | FOLLOW-UP | 2003-10-31 | 2003-10-29 |
| 17 | ABC123 | DS | 123105 | 1 | INFORMED CONSENT OBTAINED | INFORMED CONSENT OBTAINED | PROTOCOL MILESTONE | | SCREENING | 2003-09-28 | 2003-09-28 |
| 18 | ABC123 | DS | 123105 | 2 | RANDOMIZED | RANDOMIZED | PROTOCOL MILESTONE | | SCREENING | 2003-10-02 | 2003-10-02 |
| 19 | ABC123 | DS | 123105 | 3 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | SCREENING | 2003-10-02 | 2003-10-02 |
| 20 | ABC123 | DS | 123105 | 4 | ANEMIA | ADVERSE EVENT | DISPOSITION EVENT | STUDY PARTICIPATION | TREATMENT | 2003-10-17 | 2003-10-17 |
| 21 | ABC123 | DS | 123105 | 5 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | FOLLOW-UP | 2003-11-02 | 2003-11-02 |

Example

In this example, the sponsor has chosen to simply submit whether or not the subject completed the study, so there is only one record per subject. The sponsor did not collect disposition of treatment and did not include DSSCAT. EPOCH was populated as a timing variable, and represents the epoch during which the subject discontinued.

Row 1: Subject who completed the study. EPOCH = "FOLLOW-UP" since that was the last epoch in the design of this study.
Rows 2-3: Subjects who discontinued. Both discontinued participation during the treatment epoch.

ds.xpt

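| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | EPOCH | DSSTDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC456 | DS | 456101 | 1 | COMPLETED | COMPLETED | DISPOSITION EVENT | FOLLOW-UP | 2003-09-21 |
| 2 | ABC456 | DS | 456102 | 1 | SUBJECT TAKING STUDY MED ERRATICALLY | PROTOCOL VIOLATION | DISPOSITION EVENT | TREATMENT | 2003-09-29 |
| 3 | ABC456 | DS | 456103 | 1 | LOST TO FOLLOW-UP | LOST TO FOLLOW-UP | DISPOSITION EVENT | TREATMENT | 2003-10-15 |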

Example

In this study, disposition of study participation was collected for the treatment and follow-up epochs. For these records, the value in EPOCH was taken from the CRF. Data on screen failures were not submitted for this study, so all submitted subjects completed screening; the sponsor chose not to submit data on disposition of the screening epoch.

Data on protocol milestones were not collected, but data were collected if a subject's treatment was unblinded. For these records, EPOCH represents the epoch during which the blind was broken.

Rows 1, 2: Subject completed the treatment and follow-up phase.
Rows 3, 5: Subject did not complete the treatment phase but did complete the follow-up phase.
Row 4: Subject's treatment is unblinded. The date of the unblinding is represented in DSSTDTC. Maintaining the blind as per protocol is not considered to be an event since there is no change in the subject's state.

ds.xpt

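| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | EPOCH | DSSTDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC789 | DS | 789101 | 1 | COMPLETED | COMPLETED | DISPOSITION EVENT | TREATMENT | 2004-09-12 |
| 2 | ABC789 | DS | 789101 | 2 | COMPLETED | COMPLETED | DISPOSITION EVENT | FOLLOW-UP | 2004-12-20 |
| 3 | ABC789 | DS | 789102 | 1 | SKIN RASH | ADVERSE EVENT | DISPOSITION EVENT | TREATMENT | 2004-09-30 |
| 4 | ABC789 | DS | 789102 | 2 | SUBJECT HAD SEVERE RASH | TREATMENT UNBLINDED | OTHER EVENT | TREATMENT | 2004-10-01 |
| 5 | ABC789 | DS | 789102 | 3 | COMPLETED | COMPLETED | DISPOSITION EVENT | FOLLOW-UP | 2004-12-28 |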

Example

In this example, the CRF included collection of an AE number when study participation was incomplete due to an adverse event. The relationship between the DS record and the AE record was represented in a RELREC dataset.

The DS domain represents the end of the subject's participation in the study, due to their death from heart failure. In this case, the disposition was collected (DSDTC) on the same day that death occurred and the subject's study participation ended (DSSTDTC).

ds.xpt

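| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | EPOCH | DSDTC | DSSTDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | DS | 123102 | 1 | Heart Failure | DEATH | DISPOSITION EVENT | TREATMENT | 2003-09-29 | 2003-09-29 |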

The heart failure is represented as an adverse event. In order to save space, only two of the MedDRA coding variables for the adverse event have been included.

ae.xpt

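| Row | STUDYID | DOMAIN | USUBJID | AESEQ | AETERM | AESTDTC | AEENDTC | AEDECOD | AESOC | AESEV | AESER | AEACN | AEREL | AEOUT | AESCAN | AESCONG | AESDISAB | AESDTH | AESHOSP | AESLIFE | AESOD | AESMIE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | AE | 123102 | 1 | Heart Failure | 2003-09-29 | 2003-09-29 | HEART FAILURE | CARDIOVASCULAR SYSTEM | SEVERE | Y | NOT APPLICABLE | DEFINITELY NOT RELATED | FATAL | N | N | N | Y | N | N | N | N |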

The relationship between the DS and AE records is represented in RELREC.

relrec.xpt

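| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | RELTYPE | RELID |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | DS | 123102 | DSSEQ | 1 | | 1 |
| 2 | ABC123 | AE | 123102 | AESEQ | 1 | | 1 |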
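
The way RELREC ties the DS and AE records together can be illustrated with a short Python sketch. It is a minimal illustration, not a reference implementation: the function names resolve and related_groups and the list-of-dictionaries representation are assumptions made for this example, and only a few variables from each dataset are carried. Records that share STUDYID, USUBJID, and RELID are treated as related, and each RELREC row is resolved by matching the variable named in IDVAR against IDVARVAL.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]

ds = [{"STUDYID": "ABC123", "DOMAIN": "DS", "USUBJID": "123102",
       "DSSEQ": "1", "DSTERM": "Heart Failure", "DSDECOD": "DEATH"}]
ae = [{"STUDYID": "ABC123", "DOMAIN": "AE", "USUBJID": "123102",
       "AESEQ": "1", "AETERM": "Heart Failure", "AEOUT": "FATAL"}]
relrec = [
    {"STUDYID": "ABC123", "RDOMAIN": "DS", "USUBJID": "123102",
     "IDVAR": "DSSEQ", "IDVARVAL": "1", "RELTYPE": "", "RELID": "1"},
    {"STUDYID": "ABC123", "RDOMAIN": "AE", "USUBJID": "123102",
     "IDVAR": "AESEQ", "IDVARVAL": "1", "RELTYPE": "", "RELID": "1"},
]


def resolve(rel: Record, datasets: Dict[str, List[Record]]) -> List[Record]:
    """Return the records in the RDOMAIN dataset identified by IDVAR = IDVARVAL."""
    return [
        rec for rec in datasets[rel["RDOMAIN"]]
        if rec["STUDYID"] == rel["STUDYID"]
        and rec["USUBJID"] == rel["USUBJID"]
        and rec[rel["IDVAR"]] == rel["IDVARVAL"]
    ]


def related_groups(
    rels: List[Record], datasets: Dict[str, List[Record]]
) -> Dict[Tuple[str, str, str], List[Record]]:
    """Group resolved records by (STUDYID, USUBJID, RELID)."""
    groups: Dict[Tuple[str, str, str], List[Record]] = {}
    for rel in rels:
        key = (rel["STUDYID"], rel["USUBJID"], rel["RELID"])
        groups.setdefault(key, []).extend(resolve(rel, datasets))
    return groups


groups = related_groups(relrec, {"DS": ds, "AE": ae})
```

For the data in this example, the single group keyed by ("ABC123", "123102", "1") holds the DS death record and the AE heart failure record.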

The subject's DM record is not shown, but included DTHFL = "Y" and the date of death.

Example

The example below represents a multi-drug (isoniazid and levofloxacin) investigational treatment trial for multidrug-resistant tuberculosis (MDR-TB). The protocol allows for a subject to discontinue levofloxacin and continue single treatment of isoniazid throughout the remainder of the study. Disposition of study participation and disposition of each drug was collected. Whether a record with DSCAT = "DISPOSITION EVENT" represents disposition of the subject's participation in the study or disposition of a study treatment is represented in DSSCAT. In this example, disposition of the study and of each drug a subject received is represented for each of the study's two treatment epochs.

Row 1: Indicates that the physician, per protocol, removed levofloxacin treatment due to high-level positive cultures. This record represents the treatment discontinuation for levofloxacin, for the first treatment epoch. Note that since this subject did not receive levofloxacin during the second treatment epoch, there is no record for DSSCAT = "LEVOFLOXACIN" with EPOCH = "TREATMENT 2".
Rows 2, 4: Represent the treatment continuation and completion for isoniazid for each treatment epoch, as indicated by DSSCAT = "ISONIAZID".
Rows 3, 5: Represent the study disposition for each treatment epoch, as indicated by DSSCAT = "STUDY PARTICIPATION".

ds.xpt

| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | DSSCAT | DSSTDTC | EPOCH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XXX | DS | XXX-767-001 | 1 | PERSISTENT HIGH-LEVEL POSITIVE CULTURES, PER PROTOCOL, LEVOFLOXACIN REMOVAL RECOMMENDED | PHYSICIAN DECISION | DISPOSITION EVENT | LEVOFLOXACIN | 2016-02-15 | TREATMENT 1 |
| 2 | XXX | DS | XXX-767-001 | 2 | COMPLETED | COMPLETED | DISPOSITION EVENT | ISONIAZID | 2016-02-15 | TREATMENT 1 |
| 3 | XXX | DS | XXX-767-001 | 3 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-02-25 | TREATMENT 1 |
| 4 | XXX | DS | XXX-767-001 | 4 | COMPLETED | COMPLETED | DISPOSITION EVENT | ISONIAZID | 2016-03-14 | TREATMENT 2 |
| 5 | XXX | DS | XXX-767-001 | 5 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-03-24 | TREATMENT 2 |

Example

This example is for a study of a multi-drug (isoniazid and levofloxacin) investigational treatment for multidrug-resistant tuberculosis (MDR-TB). The protocol allowed a subject to discontinue levofloxacin and continue single treatment of isoniazid throughout the remainder of the study. Disposition of study participation and of each study treatment was collected. For records of disposition of the subject's participation in the study DSSCAT = "STUDY PARTICIPATION", while for records of disposition of a study treatment DSSCAT is the name of the treatment.

Row 1: Represents the final treatment disposition for levofloxacin, as indicated by DSSCAT = "LEVOFLOXACIN". The physician removed levofloxacin treatment due to high-level positive cultures, as allowed by the protocol.
Row 2: Represents the final treatment completion of isoniazid within the trial, which is indicated by DSSCAT = "ISONIAZID".
Row 3: Represents the final study completion within the trial, which is indicated by DSSCAT = "STUDY PARTICIPATION".

ds.xpt

| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | DSSCAT | DSSTDTC | EPOCH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XXX | DS | XXX-767-001 | 1 | PERSISTENT HIGH-LEVEL POSITIVE CULTURES, PER PROTOCOL, LEVOFLOXACIN REMOVAL RECOMMENDED | PHYSICIAN DECISION | DISPOSITION EVENT | LEVOFLOXACIN | 2016-02-15 | TREATMENT 1 |
| 2 | XXX | DS | XXX-767-001 | 2 | COMPLETED | COMPLETED | DISPOSITION EVENT | ISONIAZID | 2016-03-14 | TREATMENT 2 |
| 3 | XXX | DS | XXX-767-001 | 3 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-03-24 | TREATMENT 2 |

Example

The example below is for a trial with a single investigative treatment. The sponsor used the generic DSSCAT value "STUDY TREATMENT" rather than the name of the treatment. This subject discontinued both treatment and study participation due to an adverse event.

Rows 1, 3: Represent the disposition of treatment for each treatment epoch, as indicated by DSSCAT = "STUDY TREATMENT".
Rows 2, 4: Represent the disposition of study participation for each treatment epoch, as indicated by DSSCAT = "STUDY PARTICIPATION".

ds.xpt

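| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | DSSCAT | DSSTDTC | EPOCH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XXX | DS | XXX-767-001 | 1 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY TREATMENT | 2016-02-15 | TREATMENT 1 |
| 2 | XXX | DS | XXX-767-001 | 2 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-02-15 | TREATMENT 1 |
| 3 | XXX | DS | XXX-767-001 | 3 | SKIN RASH | ADVERSE EVENT | DISPOSITION EVENT | STUDY TREATMENT | 2016-03-14 | TREATMENT 2 |
| 4 | XXX | DS | XXX-767-001 | 4 | SKIN RASH | ADVERSE EVENT | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-03-14 | TREATMENT 2 |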

Example

The example below represents data for an ongoing blinded study in which each subject received two treatments, identified by the sponsor as "BLINDED DRUG A" and "BLINDED DRUG B". Disposition of study participation and of each of the two blinded treatments was collected for each of the two treatment epochs in the study. The subject in this example completed study participation and both treatments for both treatment epochs.

Rows 1, 2, 4, 5: Represent the disposition of the blinded treatments for each of the two treatment epochs for each of the two treatments, indicated by DSSCAT = "BLINDED DRUG A" and DSSCAT = "BLINDED DRUG B".
Rows 3, 6: Represent the disposition of study participation for each of the two treatment epochs, as indicated by DSSCAT = "STUDY PARTICIPATION".

ds.xpt

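| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | DSSCAT | DSSTDTC | EPOCH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XXX | DS | XXX-767-001 | 1 | COMPLETED | COMPLETED | DISPOSITION EVENT | BLINDED DRUG A | 2016-02-15 | TREATMENT 1 |
| 2 | XXX | DS | XXX-767-001 | 2 | COMPLETED | COMPLETED | DISPOSITION EVENT | BLINDED DRUG B | 2016-02-15 | TREATMENT 1 |
| 3 | XXX | DS | XXX-767-001 | 3 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-02-25 | TREATMENT 1 |
| 4 | XXX | DS | XXX-767-001 | 4 | COMPLETED | COMPLETED | DISPOSITION EVENT | BLINDED DRUG A | 2016-03-14 | TREATMENT 2 |
| 5 | XXX | DS | XXX-767-001 | 5 | COMPLETED | COMPLETED | DISPOSITION EVENT | BLINDED DRUG B | 2016-03-14 | TREATMENT 2 |
| 6 | XXX | DS | XXX-767-001 | 6 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-03-24 | TREATMENT 2 |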

Example

This example is for a study in which multiple informed consents were collected. DSTERM is populated with a full description of the informed consent; DSDECOD is populated with the standardized value "INFORMED CONSENT OBTAINED" from the codelist "Protocol Milestone" (PROTMLST). For all informed consent records, DSCAT = "PROTOCOL MILESTONE". The sponsor chose to include the EPOCH timing variable to indicate the epoch during which each protocol milestone occurred.

Row 1: Shows the obtaining of the initial study informed consent.
Row 2: Shows randomization, another event with DSCAT = "PROTOCOL MILESTONE".
Rows 3-5: Show three additional informed consents obtained during the trial.

ds.xpt

| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | EPOCH | DSSTDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XXX | DS | XXX-767-001 | 1 | INFORMED CONSENT FOR STUDY ENROLLMENT OBTAINED | INFORMED CONSENT OBTAINED | PROTOCOL MILESTONE | SCREENING | 2016-02-22 |
| 2 | XXX | DS | XXX-767-001 | 2 | RANDOMIZED | RANDOMIZED | PROTOCOL MILESTONE | SCREENING | 2016-02-26 |
| 3 | XXX | DS | XXX-767-001 | 3 | INFORMED CONSENT FOR AMENDMENT ONE OBTAINED | INFORMED CONSENT OBTAINED | PROTOCOL MILESTONE | TREATMENT 1 | 2016-04-12 |
| 4 | XXX | DS | XXX-767-001 | 4 | INFORMED CONSENT FOR PHARMACOGENETIC RESEARCH OBTAINED | INFORMED CONSENT OBTAINED | PROTOCOL MILESTONE | TREATMENT 2 | 2016-06-08 |
| 5 | XXX | DS | XXX-767-001 | 5 | INFORMED CONSENT FOR PK SUB-STUDY OBTAINED | INFORMED CONSENT OBTAINED | PROTOCOL MILESTONE | TREATMENT 2 | 2016-06-23 |

Example

The example represents data for two subjects who participated in a study with multiple treatment periods. During the first treatment period, subjects were randomized to "DRUG1" or "DRUG2". The second treatment phase added the investigational drug to "DRUG1" and "DRUG2". Disposition of study drugs and study participation was collected at the end of each epoch. DSSCAT was used to distinguish between disposition of study drugs vs. study participation. The supporting Demographics (DM), Exposure (EX), Trial Elements (TE), Trial Arms (TA), and Subject Elements (SE) datasets have been provided for additional context. Not all records are shown in the supporting example datasets.

The elements used in the TA dataset are defined in the TE dataset.

Row 1: Shows the screening element.
Rows 2, 3: Show the elements for treatment with either "DRUG1" or "DRUG2". These appear in the first treatment epoch in the TA dataset.
Rows 4, 5: Show the elements for treatment with either "DG1INDG" or "DG2INDG". These appear in the second treatment epoch in the TA dataset.
Row 6: Shows the follow-up element.

te.xpt

| Row | STUDYID | DOMAIN | ETCD | ELEMENT | TESTRL | TEENRL | TEDUR |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XYZ | TE | SCRN | Screen | Informed Consent | 1 week after start of Element | P7D |
| 2 | XYZ | TE | DRUG1 | Drug 1 | First dose of Drug 1 | 4 weeks after start of Element | P28D |
| 3 | XYZ | TE | DRUG2 | Drug 2 | First dose of Drug 2 | 4 weeks after start of Element | P28D |
| 4 | XYZ | TE | DG1INDG | Drug 1 + Investigation Drug | First dose of Investigational Drug, where Investigational Drug is given with Drug 1. | 1 week after start of Element | P7D |
| 5 | XYZ | TE | DG2INDG | Drug 2 + Investigation Drug | First dose of Investigational Drug, where Investigational Drug is given with Drug 2. | 1 week after start of Element | P7D |
| 6 | XYZ | TE | FU | Follow-up | One day after last administration of study drug. | | |

The TA dataset describes the design of the study.

Rows 1, 5: Screening portion of the trial arm.
Rows 2, 6: Represent the planned initial treatment arm of either "DRUG1" or "DRUG2".
Rows 3, 7: Represent the planned second treatment arm of either "DG1INDG" or "DG2INDG".
Rows 4, 8: Follow-up portion of the trial arm.

ta.xpt

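| Row | STUDYID | DOMAIN | ARMCD | ARM | TAETORD | ETCD | ELEMENT | TABRANCH | TATRANS | EPOCH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XYZ | TA | DG1INDG | Drug-1+Investigation-Drug | 1 | SCRN | Screen | Randomized to DG1INDG | | SCREENING |
| 2 | XYZ | TA | DG1INDG | Drug-1+Investigation-Drug | 2 | DRUG1 | Drug-1 | | | TREATMENT 1 |
| 3 | XYZ | TA | DG1INDG | Drug-1+Investigation-Drug | 3 | DG1INDG | Drug 1 + Investigation Drug | | | TREATMENT 2 |
| 4 | XYZ | TA | DG1INDG | Drug-1+Investigation-Drug | 4 | FU | Follow-up | | | FOLLOW-UP |
| 5 | XYZ | TA | DG2INDG | Drug-2+Investigation-Drug | 1 | SCRN | Screen | Randomized to DG2INDG | | SCREENING |
| 6 | XYZ | TA | DG2INDG | Drug-2+Investigation-Drug | 2 | DRUG2 | Drug-2 | | | TREATMENT 1 |
| 7 | XYZ | TA | DG2INDG | Drug-2+Investigation-Drug | 3 | DG2INDG | Drug 2 + Investigation Drug | | | TREATMENT 2 |
| 8 | XYZ | TA | DG2INDG | Drug-2+Investigation-Drug | 4 | FU | Follow-up | | | FOLLOW-UP |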

The Demographics (DM) dataset includes the arm to which the subjects were randomized, and the dates of informed consent, start of study treatment, end of study treatment, and end of study participation.

dm.xpt

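| Row | STUDYID | DOMAIN | USUBJID | SUBJID | RFXSTDTC | RFXENDTC | RFICDTC | RFPENDTC | SITEID | INVNAM | ARMCD | ARM | ACTARMCD | ACTARM | ARMNRS | ACTARMUD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XYZ | DM | XYZ-767-001 | 001 | 2016-02-14 | 2016-04-19 | 2016-02-02 | 2016-04-24 | 01 | ADAMS, M | DG1INDG | Drug-1+Investigation-Drug | DG1INDG | Drug-1+Investigation-Drug | | |
| 2 | XYZ | DM | XYZ-767-002 | 002 | 2016-02-21 | 2016-04-24 | 2016-02-04 | 2016-04-29 | 01 | ADAMS, M | DG2INDG | Drug-2+Investigation-Drug | DG2INDG | Drug-2+Investigation-Drug | | |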

The Exposure (EX) dataset shows the administration of study treatments.

ex.xpt

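| Row | STUDYID | DOMAIN | USUBJID | EXSEQ | EXTRT | EXDOSE | EXDOSU | EPOCH | EXSTDTC | EXENDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XYZ | EX | XYZ-767-001 | 1 | Drug 1 | 500 | mg | TREATMENT 1 | 2016-02-14 | 2016-03-13 |
| 2 | XYZ | EX | XYZ-767-001 | 2 | Drug 1 | 500 | mg | TREATMENT 2 | 2016-03-14 | 2016-04-19 |
| 3 | XYZ | EX | XYZ-767-001 | 3 | Investigational Drug | 1000 | mg | TREATMENT 2 | 2016-03-14 | 2016-04-19 |
| 4 | XYZ | EX | XYZ-767-002 | 1 | Drug 2 | 500 | mg | TREATMENT 1 | 2016-02-21 | 2016-03-23 |
| 5 | XYZ | EX | XYZ-767-002 | 2 | Drug 2 | 500 | mg | TREATMENT 2 | 2016-03-24 | 2016-04-24 |
| 6 | XYZ | EX | XYZ-767-002 | 3 | Investigational Drug | 1000 | mg | TREATMENT 2 | 2016-03-24 | 2016-04-24 |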

The Subject Elements (SE) dataset shows the dates for the elements for each subject.

Rows 1, 5: Represent the subjects' actual screening elements.
Rows 2, 6: Represent the subjects' actual first treatment epochs. The two subjects were in different elements in the first treatment epoch.
Rows 3, 7: Represent the subjects' actual second treatment epochs.
Rows 4, 8: Represent the subjects' actual follow-up elements.

se.xpt

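| Row | STUDYID | DOMAIN | USUBJID | SESEQ | ETCD | ELEMENT | SESTDTC | SEENDTC | TAETORD | EPOCH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XYZ | SE | XYZ-767-001 | 1 | SCREEN | Screen | 2016-02-02 | 2016-02-14 | 1 | SCREENING |
| 2 | XYZ | SE | XYZ-767-001 | 2 | DRUG1 | Drug-1 | 2016-02-14 | 2016-03-14 | 2 | TREATMENT 1 |
| 3 | XYZ | SE | XYZ-767-001 | 3 | DG1INDG | Drug 1 + Investigational Drug | 2016-03-14 | 2016-04-24 | 3 | TREATMENT 2 |
| 4 | XYZ | SE | XYZ-767-001 | 4 | FU | Follow-up | 2016-04-24 | 2016-04-24 | 4 | FOLLOW-UP |
| 5 | XYZ | SE | XYZ-767-002 | 1 | SCREEN | Screen | 2016-02-04 | 2016-02-21 | 1 | SCREENING |
| 6 | XYZ | SE | XYZ-767-002 | 2 | DRUG2 | Drug-2 | 2016-02-21 | 2016-03-24 | 2 | TREATMENT 1 |
| 7 | XYZ | SE | XYZ-767-002 | 3 | DG2INDG | Drug 2 + Investigational Drug | 2016-03-24 | 2016-04-29 | 3 | TREATMENT 2 |
| 8 | XYZ | SE | XYZ-767-002 | 4 | FU | Follow-up | 2016-04-29 | 2016-04-29 | 4 | FOLLOW-UP |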

The Disposition (DS) dataset shows the disposition events and protocol milestones for each subject.

Rows 1, 8: Show randomization to either "DRUG1" or "DRUG2" in the study.
Rows 2, 9: Represent the completion of the screening phase of the study. Note that although a form describing disposition of the screening epoch may be filled out before treatment starts, the screening epoch does not end until treatment begins.
Rows 3, 5, 10, 12: Represent the completion of drug for each EPOCH, where DSSCAT has the name of the drug(s). The DSSTDTC is the end date of study treatment for the EPOCH.
Rows 4, 6, 11, 13: Represent the completion of study participation for each EPOCH, where DSSCAT has the value "STUDY PARTICIPATION". The DSSTDTC is the end date of study participation for the EPOCH. There is a one-day evaluation post-treatment.
Rows 7, 14: Represent the completion of study participation for the follow-up EPOCH, where DSSCAT has the value "STUDY PARTICIPATION". The DSSTDTC is the end date of study participation for the EPOCH.

ds.xpt

| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | DSSCAT | DSSTDTC | EPOCH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XYZ | DS | XYZ-767-001 | 1 | RANDOMIZED | RANDOMIZED | PROTOCOL MILESTONE | | 2016-02-13 | SCREENING |
| 2 | XYZ | DS | XYZ-767-001 | 2 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-02-13 | SCREENING |
| 3 | XYZ | DS | XYZ-767-001 | 3 | COMPLETED | COMPLETED | DISPOSITION EVENT | DRUG1 | 2016-03-13 | TREATMENT 1 |
| 4 | XYZ | DS | XYZ-767-001 | 4 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-03-14 | TREATMENT 1 |
| 5 | XYZ | DS | XYZ-767-001 | 5 | COMPLETED | COMPLETED | DISPOSITION EVENT | DG1INDG | 2016-04-19 | TREATMENT 2 |
| 6 | XYZ | DS | XYZ-767-001 | 6 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-04-20 | TREATMENT 2 |
| 7 | XYZ | DS | XYZ-767-001 | 7 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-04-24 | FOLLOW-UP |
| 8 | XYZ | DS | XYZ-767-002 | 1 | RANDOMIZED | RANDOMIZED | PROTOCOL MILESTONE | | 2016-02-20 | SCREENING |
| 9 | XYZ | DS | XYZ-767-002 | 2 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-02-20 | SCREENING |
| 10 | XYZ | DS | XYZ-767-002 | 3 | COMPLETED | COMPLETED | DISPOSITION EVENT | DRUG2 | 2016-03-23 | TREATMENT 1 |
| 11 | XYZ | DS | XYZ-767-002 | 4 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-03-24 | TREATMENT 1 |
| 12 | XYZ | DS | XYZ-767-002 | 5 | COMPLETED | COMPLETED | DISPOSITION EVENT | DG2INDG | 2016-04-24 | TREATMENT 2 |
| 13 | XYZ | DS | XYZ-767-002 | 6 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-04-25 | TREATMENT 2 |
| 14 | XYZ | DS | XYZ-767-002 | 7 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2016-04-29 | FOLLOW-UP |
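
The date relationships described for this example can be cross-checked against the EX data. The Python sketch below is illustrative and specific to this example study: it assumes that the disposition date of a study drug equals the last EXENDTC in the same epoch, and that disposition of study participation for a treatment epoch falls one day later (the one-day post-treatment evaluation). The function names and the tuple layout are hypothetical, and only complete ISO 8601 dates are handled.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

# (USUBJID, EPOCH, EXENDTC) taken from the ex.xpt example above.
EX_ROWS = [
    ("XYZ-767-001", "TREATMENT 1", "2016-03-13"),
    ("XYZ-767-001", "TREATMENT 2", "2016-04-19"),
    ("XYZ-767-001", "TREATMENT 2", "2016-04-19"),
    ("XYZ-767-002", "TREATMENT 1", "2016-03-23"),
    ("XYZ-767-002", "TREATMENT 2", "2016-04-24"),
    ("XYZ-767-002", "TREATMENT 2", "2016-04-24"),
]

# (USUBJID, DSSCAT, EPOCH, DSSTDTC) for the treatment-epoch rows of ds.xpt.
DS_ROWS = [
    ("XYZ-767-001", "DRUG1", "TREATMENT 1", "2016-03-13"),
    ("XYZ-767-001", "STUDY PARTICIPATION", "TREATMENT 1", "2016-03-14"),
    ("XYZ-767-001", "DG1INDG", "TREATMENT 2", "2016-04-19"),
    ("XYZ-767-001", "STUDY PARTICIPATION", "TREATMENT 2", "2016-04-20"),
    ("XYZ-767-002", "DRUG2", "TREATMENT 1", "2016-03-23"),
    ("XYZ-767-002", "STUDY PARTICIPATION", "TREATMENT 1", "2016-03-24"),
    ("XYZ-767-002", "DG2INDG", "TREATMENT 2", "2016-04-24"),
    ("XYZ-767-002", "STUDY PARTICIPATION", "TREATMENT 2", "2016-04-25"),
]


def last_dose_dates(ex_rows) -> Dict[Tuple[str, str], date]:
    """Latest EXENDTC per (USUBJID, EPOCH)."""
    last: Dict[Tuple[str, str], date] = {}
    for usubjid, epoch, exendtc in ex_rows:
        end = date.fromisoformat(exendtc)
        key = (usubjid, epoch)
        if key not in last or end > last[key]:
            last[key] = end
    return last


def find_mismatches(ds_rows, ex_rows, evaluation_days: int = 1) -> List[tuple]:
    """Return DS rows whose DSSTDTC differs from the date this sketch expects."""
    last = last_dose_dates(ex_rows)
    mismatches = []
    for usubjid, dsscat, epoch, dsstdtc in ds_rows:
        key = (usubjid, epoch)
        if key not in last:
            continue  # no exposure in this epoch (e.g., SCREENING, FOLLOW-UP)
        expected = last[key]
        if dsscat == "STUDY PARTICIPATION":
            expected += timedelta(days=evaluation_days)
        if date.fromisoformat(dsstdtc) != expected:
            mismatches.append((usubjid, dsscat, epoch, dsstdtc))
    return mismatches


mismatches = find_mismatches(DS_ROWS, EX_ROWS)
```

For the records in this example, the list of mismatches is empty. Other studies would need their own rule for the offset between end of treatment and end of the epoch.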

Example

The study in this example had four cycles of treatment within the treatment epoch, and each cycle was represented as an element. While it is not a general requirement that each cycle be represented as a distinct element, doing so was important for this study. The study compared a current standard treatment with Drugs A and B to treatment with Drugs A, B, and C. The protocol allowed for drug doses to be reduced under specified criteria. For Drug C, these dose modifications could include dropping the drug. When Drug C is dropped, the subject may transition to treatment with Drugs A and B or to Follow-up.

The TE dataset shows the elements of the trial.

te.xpt

| Row | STUDYID | DOMAIN | ETCD | ELEMENT | TESTRL | TEENRL | TEDUR |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | DS10 | TE | SCRN | Screen | Informed Consent | Screening assessments are complete, up to 2 weeks after start of Element | |
| 2 | DS10 | TE | AB | Trt AB | First dose of treatment Element, where treatment is AB | 4 weeks after start of Element | P4W |
| 3 | DS10 | TE | ABC | Trt ABC | First dose of treatment Element, where treatment is AB + C | 4 weeks after start of Element | P4W |
| 4 | DS10 | TE | FU | Follow-up | Four weeks after start of last treatment element | Death, withdrawal of consent, or loss to follow-up. | |

The TA dataset shows the trial design. The sponsor has chosen to number elements starting with zero for the screening element. For the AB Arm, the TAETORD values match the cycle numbers. For the ABC Arm, if Drug C is dropped, the subject may transition to an AB element or Follow-up. TAETORD values are not chronological for this Arm such that elements with TAETORD values of "2" or "5" would be during "Cycle 2", elements with TAETORD values of "3" or "6" would be during "Cycle 3", and elements with TAETORD values of "4" or "7" would be during "Cycle 4".

ta.xpt

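| Row | STUDYID | DOMAIN | ARMCD | ARM | TAETORD | ETCD | ELEMENT | TABRANCH | TATRANS | EPOCH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | DS10 | TA | AB | AB | 0 | SCRN | Screen | Randomized to AB | | SCREENING |
| 2 | DS10 | TA | AB | AB | 1 | AB | Trt AB | | If disease progression, go to follow-up epoch. | TREATMENT |
| 3 | DS10 | TA | AB | AB | 2 | AB | Trt AB | | If disease progression, go to follow-up epoch. | TREATMENT |
| 4 | DS10 | TA | AB | AB | 3 | AB | Trt AB | | If disease progression, go to follow-up epoch. | TREATMENT |
| 5 | DS10 | TA | AB | AB | 4 | AB | Trt AB | | | TREATMENT |
| 6 | DS10 | TA | AB | AB | 5 | FU | Follow-up | | | FOLLOW-UP |
| 7 | DS10 | TA | ABC | ABC | 0 | SCRN | Screen | Randomized to ABC | | SCREENING |
| 8 | DS10 | TA | ABC | ABC | 1 | ABC | Trt ABC | | If disease progression, go to follow-up epoch. If drug C is dropped, go to element with TAETORD = "5". | TREATMENT |
| 9 | DS10 | TA | ABC | ABC | 2 | ABC | Trt ABC | | If disease progression, go to follow-up epoch. If drug C is dropped, go to element with TAETORD = "6". | TREATMENT |
| 10 | DS10 | TA | ABC | ABC | 3 | ABC | Trt ABC | | If disease progression, go to follow-up epoch. If drug C is dropped, go to element with TAETORD = "7". | TREATMENT |
| 11 | DS10 | TA | ABC | ABC | 4 | ABC | Trt ABC | | Go to follow-up epoch. | TREATMENT |
| 12 | DS10 | TA | ABC | ABC | 5 | AB | Trt AB | | | TREATMENT |
| 13 | DS10 | TA | ABC | ABC | 6 | AB | Trt AB | | | TREATMENT |
| 14 | DS10 | TA | ABC | ABC | 7 | AB | Trt AB | | | TREATMENT |
| 15 | DS10 | TA | ABC | ABC | 8 | FU | Follow-up | | | FOLLOW-UP |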
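
Because TAETORD is not chronological in the ABC Arm, the cycle number has to be looked up rather than read directly from TAETORD. The following Python sketch expresses the relationship described for this example; the function name abc_arm_cycle is hypothetical, and the mapping applies only to the ABC Arm of this study design.

```python
from typing import Optional


def abc_arm_cycle(taetord: int) -> Optional[int]:
    """Map a TAETORD value in the ABC Arm to its treatment cycle number.

    TAETORD 1-4 are the "Trt ABC" elements for Cycles 1-4.
    TAETORD 5-7 are the "Trt AB" elements used after Drug C is dropped,
    corresponding to Cycles 2-4.
    TAETORD 0 (screening) and 8 (follow-up) are not treatment cycles.
    """
    if 1 <= taetord <= 4:
        return taetord
    if 5 <= taetord <= 7:
        return taetord - 3
    return None
```

For instance, abc_arm_cycle(6) returns 3, which matches the statement that elements with TAETORD values of "3" or "6" occur during Cycle 3.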

This example shows data for a subject who was randomized to Treatment ABC. Drug C was dropped after Cycle 2 due to toxicity associated with Drug C. Treatment with Drugs A and B was stopped after Cycle 3 due to disease progression. The subject died during follow-up.

The SE dataset records the elements this subject experienced.

Rows 1-4: The subject participated in the screening epoch and three elements of the treatment epoch.
Row 5: The subject's fifth element was not "ABC" or "AB", as would have been expected if they received all four cycles of therapy, but "FU".

se.xpt

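| Row | STUDYID | DOMAIN | USUBJID | SESEQ | ETCD | SESTDTC | SEENDTC | SEUPDES | TAETORD | EPOCH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | DS10 | SE | 101 | 1 | SCRN | 2015-01-21 | 2015-02-01 | | 0 | SCREENING |
| 2 | DS10 | SE | 101 | 2 | ABC | 2015-02-01 | 2015-03-01 | | 1 | TREATMENT |
| 3 | DS10 | SE | 101 | 3 | ABC | 2015-03-01 | 2015-03-29 | | 2 | TREATMENT |
| 4 | DS10 | SE | 101 | 4 | AB | 2015-03-29 | 2015-04-26 | | 6 | TREATMENT |
| 5 | DS10 | SE | 101 | 5 | FU | 2015-04-26 | 2015-09-19 | | 8 | FOLLOW-UP |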

In this study, disposition of each treatment was collected, and disposition of study participation was collected for each epoch of the trial. The date of disposition for study treatment was defined as the date of the last dose of that treatment.

Rows 1-2: Show that informed consent was obtained and randomization occurred during the screening epoch.
Row 3: Shows disposition of study participation for the screening epoch. The subject completed this epoch.
Row 4: Shows that Drug C was ended during the second cycle (TAETORD = "2") of the treatment epoch.
Rows 5-6: Show that Drugs A and B were ended on the same day during the third cycle (TAETORD = "6") of the treatment epoch.
Row 7: Shows disposition of study participation in the treatment epoch. The subject did not complete treatment, due to disease progression. The date of disposition of the treatment epoch, DSSTDTC, is the date the subject started the follow-up epoch. For this study, that was defined as four weeks after the start of the last treatment element. This means that although the subject's last dose of treatment was on "2015-04-14", the end of the treatment period was later, on "2015-04-26", when the subject started the follow-up epoch.
Row 8: Shows disposition of study participation in the follow-up epoch. The subject died.
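
The date arithmetic behind Row 7 can be checked with a short sketch. The function name `treatment_epoch_end` is hypothetical; the four-week rule and the dates are taken from this example and would differ in another study.

```python
from datetime import date, timedelta


def treatment_epoch_end(last_element_start: date, weeks: int = 4) -> date:
    """Return the end of the treatment epoch under this study's rule:
    a fixed number of weeks after the start of the last treatment element."""
    return last_element_start + timedelta(weeks=weeks)


# Start of the last treatment element (ETCD = "AB"), from se.xpt Row 4.
last_element_start = date(2015, 3, 29)
epoch_end = treatment_epoch_end(last_element_start)

# Matches SEENDTC of the "AB" element and DSSTDTC of the
# study-participation record for the treatment epoch.
print(epoch_end.isoformat())  # 2015-04-26
```

The last dose date ("2015-04-14") plays no part in this calculation, which is why the treatment disposition date and the epoch disposition date differ.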

ds.xpt

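| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | DSSCAT | DSSTDTC | TAETORD | EPOCH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | DS10 | DS | 101 | 1 | INFORMED CONSENT OBTAINED | INFORMED CONSENT OBTAINED | PROTOCOL MILESTONE | | 2015-01-21 | 1 | SCREENING |
| 2 | DS10 | DS | 101 | 2 | RANDOMIZED | RANDOMIZED | PROTOCOL MILESTONE | | 2015-02-01 | 1 | SCREENING |
| 3 | DS10 | DS | 101 | 3 | COMPLETED | COMPLETED | DISPOSITION EVENT | STUDY PARTICIPATION | 2015-02-01 | 1 | SCREENING |
| 4 | DS10 | DS | 101 | 4 | Toxicity | ADVERSE EVENT | DISPOSITION EVENT | DRUG C | 2015-03-06 | 2 | TREATMENT |
| 5 | DS10 | DS | 101 | 5 | Disease progression | PROGRESSIVE DISEASE | DISPOSITION EVENT | DRUGS A & B | 2015-04-14 | 6 | TREATMENT |
| 7 | DS10 | DS | 101 | 6 | Disease progression | PROGRESSIVE DISEASE | DISPOSITION EVENT | STUDY PARTICIPATION | 2015-04-26 | 6 | TREATMENT |
| 8 | DS10 | DS | 101 | 7 | Death due to cancer | DEATH | DISPOSITION EVENT | STUDY PARTICIPATION | 2015-09-19 | 8 | FOLLOW-UP |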

6.2.4 Protocol Deviations

DV – Description/Overview

An events domain that contains protocol violations and deviations during the course of the study.

DV – Specification

dv.xpt, Protocol Deviations — Events, Version 3.3. One record per protocol deviation per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

DV – Assumptions

  1. The DV domain is an Events model for collected protocol deviations and not for derived protocol deviations that are more likely to be part of analysis. Events typically include what the event was, captured in --TERM (the topic variable), and when it happened (captured in its start and/or end dates). The intent of the domain model is to capture protocol deviations that occurred during the course of the study (see ICH E3: Section 10.2 at http://www.ich.org/fileadmin/Public_Web_Site/ICH_Products/Guidelines/Efficacy/E3/E3_Guideline.pdf). Usually these are deviations that occur after the subject has been randomized or received the first treatment.
  2. This domain should not be used to collect entry criteria information. Violated inclusion/exclusion criteria are stored in IE. The Deviations domain is for more general deviation data. A protocol may indicate that violating an inclusion/exclusion criterion during the course of the study (after first dose) is a protocol violation. In this case, this information would go into DV.
  3. Any Identifier variables, Timing variables, or Events general-observation-class qualifiers may be added to the DV domain, but the following Qualifiers would generally not be used in DV: --PRESP, --OCCUR, --STAT, --REASND, --BODSYS, --LOC, --SEV, --SER, --ACN, --ACNOTH, --REL, --RELNST, --PATT, --OUT, --SCAN, --SCONG, --SDISAB, --SDTH, --SHOSP, --SLIFE, --SOD, --SMIE, --CONTRT, --TOXGR.
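
As an illustration of Assumption 3, a sponsor might screen a DV dataset's column names against the listed qualifiers. The sketch below is not part of the standard: the function name `unexpected_qualifiers` and the sample column list are hypothetical, and because the assumption says only that these qualifiers would "generally" not be used, a hit is a prompt for review rather than an error.

```python
# Qualifier suffixes that DV Assumption 3 lists as generally not used in DV.
DV_UNEXPECTED_SUFFIXES = {
    "PRESP", "OCCUR", "STAT", "REASND", "BODSYS", "LOC", "SEV", "SER",
    "ACN", "ACNOTH", "REL", "RELNST", "PATT", "OUT", "SCAN", "SCONG",
    "SDISAB", "SDTH", "SHOSP", "SLIFE", "SOD", "SMIE", "CONTRT", "TOXGR",
}


def unexpected_qualifiers(columns, domain="DV", suffixes=DV_UNEXPECTED_SUFFIXES):
    """Return the column names formed from the domain prefix plus a listed suffix."""
    return sorted(
        name for name in columns
        if name.startswith(domain) and name[len(domain):] in suffixes
    )


# Hypothetical column list: DVSEV has been added to an otherwise typical DV dataset.
columns = ["STUDYID", "DOMAIN", "USUBJID", "DVSEQ", "DVTERM",
           "DVDECOD", "DVSEV", "EPOCH", "DVSTDTC"]
print(unexpected_qualifiers(columns))  # ['DVSEV']
```

The same function could be reused for other domains by passing a different prefix and suffix set.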

DV – Examples

Example

This is an example of data that was collected on a protocol-deviations CRF. The DVDECOD column is for controlled terminology, whereas the DVTERM is free text.

Rows 1, 3: Show examples of a TREATMENT DEVIATION type of protocol deviation.
Row 2: Shows an example of a deviation due to the subject taking a prohibited concomitant medication.
Row 4: Shows an example of a medication that should not be taken during the study.

dv.xpt

| Row | STUDYID | DOMAIN | USUBJID | DVSEQ | DVTERM | DVDECOD | EPOCH | DVSTDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | DV | 123101 | 1 | IVRS PROCESS DEVIATION - NO DOSE CALL PERFORMED. | TREATMENT DEVIATION | TREATMENT | 2003-09-21 |
| 2 | ABC123 | DV | 123103 | 1 | DRUG XXX ADMINISTERED DURING STUDY TREATMENT PERIOD | EXCLUDED CONCOMITANT MEDICATION | TREATMENT | 2003-10-30 |
| 3 | ABC123 | DV | 123103 | 2 | VISIT 3 DOSE <15 MG | TREATMENT DEVIATION | TREATMENT | 2003-10-30 |
| 4 | ABC123 | DV | 123104 | 1 | TOOK ASPIRIN | PROHIBITED MEDS | TREATMENT | 2003-11-30 |

6.2.5 Healthcare Encounters

HO – Description/Overview

An events domain that contains data for inpatient and outpatient healthcare events (e.g., hospitalization, nursing home stay, rehabilitation facility stay, ambulatory surgery).

HO – Specification

ho.xpt, Healthcare Encounters — Events, Version 3.3. One record per healthcare encounter per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

HO – Assumptions

  1. HO Definition
    The Healthcare Encounters dataset includes inpatient and outpatient healthcare events (e.g., hospitalizations, nursing home stays, rehabilitation facility stays, ambulatory surgery).
  2. Values of HOTERM typically describe the location or place of the healthcare encounter (e.g., HOSPITAL instead of HOSPITALIZATION). HOSTDTC should represent the start or admission date and HOENDTC the end or discharge date.
  3. Data collected about healthcare encounters may include the reason for the encounter. The following supplemental qualifiers may be appropriate for representing such data.
    1. The supplemental qualifier with QNAM = "HOINDC", would be used to represent the indication/medical condition for the encounter (e.g., stroke). Note that --INDC is an interventions class variable, so is not a standard variable for HO, which is an events class domain.
    2. The supplemental qualifier with QNAM = "HOREAS", would be used to represent a reason for the encounter other than a medical condition (e.g., annual checkup). Note that --REAS is a non-standard variable listed in Appendix C2, Supplemental Qualifiers Name Codes.
  4. If collected data includes the name of the provider or the facility where the encounter took place, this may be represented using the supplemental qualifier with QNAM = "HONAM". Note that --NAM is a findings class variable, so is not a standard variable for HO, which is an events class domain.
  5. Any Identifier variables, Timing variables, or Events general-observation-class qualifiers may be added to the HO domain, but the following Qualifiers would generally not be used in HO: --SER, --ACN, --ACNOTH, --REL, --RELNST, --SCAN, --SCONG, --SDISAB, --SDTH, --SHOSP, --SLIFE, --SOD, --SMIE, --BODSYS, --LOC, --SEV, --TOX, --TOXGR, --PATT, --CONTRT.
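
Assumptions 3 and 4 describe non-standard variables that travel in SUPPHO rather than in HO itself. The following is a minimal sketch of how such values might be turned into supplemental qualifier records keyed to the parent record by HOSEQ. The function name `to_suppho` and the input layout are hypothetical; the QNAM and QLABEL values follow the assumptions and the example in this section, and QORIG = "CRF" is assumed because the data in the example were collected on a CRF.

```python
# QNAM -> QLABEL for the supplemental qualifiers named in HO Assumptions 3 and 4.
SUPPHO_LABELS = {
    "HOINDC": "Indication",
    "HOREAS": "Reason",
    "HONAM": "Provider Name",
}


def to_suppho(ho_record, nonstandard, labels=SUPPHO_LABELS):
    """Build one SUPPHO row per populated non-standard value of an HO record."""
    rows = []
    for qnam, qval in nonstandard.items():
        if qval in (None, ""):
            continue  # nothing collected, so no supplemental record
        rows.append({
            "STUDYID": ho_record["STUDYID"],
            "RDOMAIN": "HO",
            "USUBJID": ho_record["USUBJID"],
            "IDVAR": "HOSEQ",
            "IDVARVAL": str(ho_record["HOSEQ"]),
            "QNAM": qnam,
            "QLABEL": labels[qnam],
            "QVAL": qval,
            "QORIG": "CRF",  # assumption: value was collected on the CRF
        })
    return rows


parent = {"STUDYID": "ABC", "USUBJID": "ABC123101", "HOSEQ": 1}
collected = {"HOINDC": "CONGESTIVE HEART FAILURE", "HOREAS": "", "HONAM": None}
for row in to_suppho(parent, collected):
    print(row["QNAM"], row["IDVARVAL"], row["QVAL"])
```

Only populated values produce records, so a parent record with no reason, indication, or provider name contributes nothing to SUPPHO.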

HO – Examples

Example

In this example, a healthcare encounter CRF collects verbatim descriptions of the encounter.

Rows 1-2: Subject ABC123101 was hospitalized and then moved to a nursing home.
Rows 3-5: Subject ABC123102 was in a hospital in the general ward and then in the intensive care unit. This same subject was transferred to a rehabilitation facility.
Rows 6-7: Subject ABC123103 has two hospitalization records.
Row 8: Subject ABC123104 was seen in the cardiac catheterization laboratory.
Rows 9-12: Subject ABC123105 and subject ABC123106 were each seen in the cardiac catheterization laboratory and then transferred to another hospital.

ho.xpt

| Row | STUDYID | DOMAIN | USUBJID | HOSEQ | HOTERM | EPOCH | HOSTDTC | HOENDTC | HODUR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC | HO | ABC123101 | 1 | HOSPITAL | TREATMENT | 2011-06-08 | 2011-06-13 | |
| 2 | ABC | HO | ABC123101 | 2 | NURSING HOME | TREATMENT | | | P6D |
| 3 | ABC | HO | ABC123102 | 1 | GENERAL WARD | TREATMENT | 2011-08-06 | 2011-08-08 | |
| 4 | ABC | HO | ABC123102 | 2 | INTENSIVE CARE | TREATMENT | 2011-08-08 | 2011-08-15 | |
| 5 | ABC | HO | ABC123102 | 3 | REHABILITATION FACILITY | TREATMENT | 2011-08-15 | 2011-08-20 | |
| 6 | ABC | HO | ABC123103 | 1 | HOSPITAL | TREATMENT | 2011-09-09 | 2011-09-11 | |
| 7 | ABC | HO | ABC123103 | 2 | HOSPITAL | TREATMENT | 2011-09-11 | 2011-09-15 | |
| 8 | ABC | HO | ABC123104 | 1 | CARDIAC CATHETERIZATION LABORATORY | TREATMENT | 2011-10-10 | 2011-10-10 | |
| 9 | ABC | HO | ABC123105 | 1 | CARDIAC CATHETERIZATION LABORATORY | TREATMENT | 2011-10-11 | 2011-10-11 | |
| 10 | ABC | HO | ABC123105 | 2 | HOSPITAL | TREATMENT | 2011-10-11 | 2011-10-15 | |
| 11 | ABC | HO | ABC123106 | 1 | CARDIAC CATHETERIZATION LABORATORY | FOLLOW-UP | 2011-11-07 | 2011-11-07 | |
| 12 | ABC | HO | ABC123106 | 2 | HOSPITAL | FOLLOW-UP | 2011-11-07 | 2011-11-09 | |

Row 1: For the first encounter recorded for subject ABC123101, the indication/medical condition for hospitalization was recorded.
Row 2: For the second encounter recorded for subject ABC123101, the reason for admission to a nursing home was for rehabilitation.
Rows 3-4: For the two encounters recorded for subject ABC123103, the names of the facilities were recorded.
Row 5: For the first encounter for subject ABC123105, the indication/medical condition for the hospitalization was recorded.
Row 6: For the second encounter for subject ABC123105, the name of the hospital was recorded.

suppho.xpt

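| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC | HO | ABC123101 | HOSEQ | 1 | HOINDC | Indication | CONGESTIVE HEART FAILURE | CRF | |
| 2 | ABC | HO | ABC123101 | HOSEQ | 2 | HOREAS | Reason | REHABILITATION | CRF | |
| 3 | ABC | HO | ABC123103 | HOSEQ | 1 | HONAM | Provider Name | GENERAL HOSPITAL | CRF | |
| 4 | ABC | HO | ABC123103 | HOSEQ | 2 | HONAM | Provider Name | EMERSON HOSPITAL | CRF | |
| 5 | ABC | HO | ABC123105 | HOSEQ | 1 | HOINDC | Indication | ATRIAL FIBRILLATION | CRF | |
| 6 | ABC | HO | ABC123105 | HOSEQ | 2 | HONAM | Provider Name | ROOSEVELT HOSPITAL | CRF | |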
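
A reviewer reading these two datasets together would typically attach each supplemental qualifier to its parent HO record using USUBJID, IDVAR, and IDVARVAL. The sketch below shows one way this could be done with plain dictionaries; the function name `merge_suppho` is hypothetical, and the records are abbreviated versions of the rows for subject ABC123103.

```python
def merge_suppho(ho_records, suppho_records):
    """Return copies of the HO records with each matching QNAM added as a key."""
    # Index parents by (USUBJID, HOSEQ as text), since IDVARVAL is character data.
    merged = {(r["USUBJID"], str(r["HOSEQ"])): dict(r) for r in ho_records}
    for supp in suppho_records:
        if supp["IDVAR"] != "HOSEQ":
            continue  # this sketch handles only qualifiers keyed by HOSEQ
        key = (supp["USUBJID"], supp["IDVARVAL"])
        if key in merged:
            merged[key][supp["QNAM"]] = supp["QVAL"]
    return list(merged.values())


ho = [
    {"USUBJID": "ABC123103", "HOSEQ": 1, "HOTERM": "HOSPITAL"},
    {"USUBJID": "ABC123103", "HOSEQ": 2, "HOTERM": "HOSPITAL"},
]
suppho = [
    {"USUBJID": "ABC123103", "IDVAR": "HOSEQ", "IDVARVAL": "1",
     "QNAM": "HONAM", "QVAL": "GENERAL HOSPITAL"},
    {"USUBJID": "ABC123103", "IDVAR": "HOSEQ", "IDVARVAL": "2",
     "QNAM": "HONAM", "QVAL": "EMERSON HOSPITAL"},
]
for record in merge_suppho(ho, suppho):
    print(record["HOSEQ"], record["HOTERM"], record["HONAM"])
```

Comparing HOSEQ as text avoids a mismatch between the numeric sequence variable and the character-valued IDVARVAL.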

Example

In this example, the dates of an initial hospitalization are collected as well as the date/time of ICU stay. Subsequent to discharge from the initial hospitalization, follow-up healthcare encounters, including admission to a rehabilitation facility, visits with healthcare providers, and home nursing visits were collected. Repeat hospitalizations are categorized separately.

ho.xpt

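| Row | STUDYID | DOMAIN | USUBJID | HOSEQ | HOTERM | HOCAT | HOSTDTC | HOENDTC | HOENRTPT | HOENTPT |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC | HO | ABC123101 | 1 | HOSPITAL | INITIAL HOSPITALIZATION | 2011-06-08 | 2011-06-12 | | |
| 2 | ABC | HO | ABC123101 | 2 | ICU | INITIAL HOSPITALIZATION | 2011-06-08T11:00 | 2011-06-09T14:30 | | |
| 3 | ABC | HO | ABC123101 | 3 | REHABILITATION FACILITY | FOLLOW-UP CARE | 2011-06-12 | 2011-06-22 | | |
| 4 | ABC | HO | ABC123101 | 4 | CARDIOLOGY UNIT | FOLLOW-UP CARE | 2011-06-25 | 2011-06-25 | | |
| 5 | ABC | HO | ABC123101 | 5 | OUTPATIENT PHYSICAL THERAPY | FOLLOW-UP CARE | 2011-06-27 | 2011-06-27 | | |
| 6 | ABC | HO | ABC123101 | 6 | OUTPATIENT PHYSICAL THERAPY | FOLLOW-UP CARE | 2011-07-12 | 2011-07-12 | | |
| 7 | ABC | HO | ABC123101 | 7 | HOSPITAL | REPEAT HOSPITALIZATION | 2011-07-23 | 2011-07-24 | | |
| 8 | ABC | HO | ABC123102 | 1 | HOSPITAL | INITIAL HOSPITALIZATION | 2011-06-19 | 2011-07-02 | | |
| 9 | ABC | HO | ABC123102 | 2 | ICU | INITIAL HOSPITALIZATION | 2011-06-19T22:00 | 2011-06-23T09:30 | | |
| 10 | ABC | HO | ABC123102 | 3 | ICU | INITIAL HOSPITALIZATION | 2011-06-25T10:00 | 2011-06-29T19:30 | | |
| 11 | ABC | HO | ABC123102 | 4 | SKILLED NURSING FACILITY | FOLLOW-UP CARE | 2011-07-02 | | ONGOING | END OF STUDY |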

The indication/medical condition for subject ABC123101's repeat hospitalization was represented as a supplemental qualifier.

suppho.xpt

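| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC | HO | ABC123101 | HOSEQ | 7 | HOINDC | Indication | STROKE | CRF | |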

6.2.6 Medical History

MH – Description/Overview

The medical history dataset includes the subject's prior history at the start of the trial. Examples of subject medical history information could include general medical history, gynecological history, and primary diagnosis.

MH – Specification

mh.xpt, Medical History — Events, Version 3.3. One record per medical history event per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parentheses).

MH – Assumptions

  1. Prior Treatments
    1. Prior treatments, including prior medications and procedures, should be submitted in an appropriate dataset from the Interventions class (e.g., CM or PR).
  2. Medical History Description and Coding
    1. MHTERM captures the verbatim term collected for the condition or event. It is the topic variable for the MH dataset. MHTERM is a required variable and must have a value.
    2. MHMODIFY is a permissible variable and should be included if the sponsor's procedure permits modification of a verbatim term for coding. The modified term is listed in MHMODIFY. The variable should be populated as per the sponsor's procedures; null values are permitted.
    3. If the sponsor codes the reported term (MHTERM) using a standard dictionary, then MHDECOD will be populated with the preferred term derived from the dictionary. The sponsor is expected to provide the dictionary name and version used to map the terms utilizing the external codelist element in the Define-XML document.
    4. MHBODSYS is the system organ class from the coding dictionary associated with the medical history event by the sponsor. This value may differ from the primary system organ class designated in the coding dictionary's standard hierarchy.
    5. If a CRF collects medical history by pre-specified body systems and the sponsor also codes reported terms using a standard dictionary, then MHDECOD and MHBODSYS are populated using the standard dictionary. MHCAT and MHSCAT should be used for the pre-specified body systems.
  3. Additional Categorization and Grouping
    1. MHCAT and MHSCAT may be populated with the sponsor's pre-defined categorization of medical history events, which are often pre-specified on the CRF. Note that even if the sponsor uses the body system terminology from the standard dictionary, MHBODSYS and MHCAT may differ, since MHBODSYS is derived from the coding system, while MHCAT is effectively assigned when the investigator records a condition under the pre-specified category.
      1. This categorization should not group all records (within the MH Domain) into one generic group such as "Medical History" or "General Medical History" because this is redundant information with the domain code. If no smaller categorization can be applied, then it is not necessary to include or populate this variable.
      2. Examples of MHCAT could include "General Medical History" (see the assumption above: if "General Medical History" is an MHCAT value, then there should be other MHCAT values), "Allergy Medical History," and "Reproductive Medical History".
    2. MHGRPID may be used to link (or associate) different records together to form a block of related records at the subject level within the MH domain. It should not be used in place of MHCAT or MHSCAT, which are used to group data across subjects. For example, if a group of syndromes reported for a subject were related to a particular disease then the MHGRPID variable could be populated with the appropriate text.
  4. Pre-Specified Terms; Presence or Absence of Events
    1. Information on medical history is generally collected in two different ways, either by recording free text or using a pre-specified list of terms. The solicitation of information on specific medical history events may affect the frequency at which they are reported; therefore, the fact that a specific medical history event was solicited may be of interest to reviewers. MHPRESP and MHOCCUR are used together to indicate whether the condition in MHTERM was pre-specified and whether it occurred, respectively. A value of "Y" in MHPRESP indicates that the term was pre-specified.
    2. MHOCCUR is used to indicate whether a pre-specified medical condition occurred; a value of "Y" indicates that the event occurred and "N" indicates that it did not.
    3. If a medical history event was reported using free text, the values of MHPRESP and MHOCCUR should be null. MHPRESP and MHOCCUR are permissible fields and may be omitted from the dataset if all medical history events were collected as free text.
    4. MHSTAT and MHREASND provide information about pre-specified medical history questions for which no response was collected. MHSTAT and MHREASND are permissible fields and may be omitted from the dataset if all medical history events were collected as free text or if all pre-specified conditions had responses in MHOCCUR.
      | Situation | Value of MHPRESP | Value of MHOCCUR | Value of MHSTAT |
      | --- | --- | --- | --- |
      | Spontaneously reported event occurred | | | |
      | Pre-specified event occurred | Y | Y | |
      | Pre-specified event did not occur | Y | N | |
      | Pre-specified event has no response | Y | | NOT DONE |
    5. When medical history events are collected with the recording of free text, a record may be entered into the data management system to indicate "no medical history" for a specific subject or pre-specified body system category (e.g., Gastrointestinal). For these subjects or categories within subject, do not include a record in the MH dataset to indicate that there were no events.
  5. Timing Variables
    1. Relative timing assessments such as "Ongoing" or "Active" are common in the collection of Medical History information. MHENRF may be used when this relative timing assessment is coincident with the start of the study reference period for the subject represented in the Demographics dataset (RFSTDTC). MHENRTPT and MHENTPT may be used when "Ongoing" is relative to another date such as the screening visit date. See examples below and Section 4.4.7, Use of Relative Timing Variables.
    2. Additional timing variables (such as MHSTRF) may be used when appropriate.
  6. Medical History Event Date Type
    1. MHEVDTYP is a domain-specific variable that can be used to indicate the aspect of the event that is represented in the event start and/or end date/times (MHSTDTC and/or MHENDTC). If start and/or end dates are collected without further specification of what constitutes the start or end of the event, then MHEVDTYP is not needed. However, when data collection specifies how the start or end date is to be reported, MHEVDTYP can be used to provide this information. For example, the date of diagnosis may be collected, in which case the date of diagnosis would be used to populate MHSTDTC and MHEVDTYP would be populated with "DIAGNOSIS". If MHEVDTYP is not needed for any collected data, it need not be included in the dataset. If MHEVDTYP is included in the dataset, it should be populated only when the data collection specifies the aspect of the event that is to be used to populate the start and/or end date; otherwise it should be null.
    2. When data collected about an event includes two different dates that could be considered the start or end of an event, then an MH record will be created for each. For example, if data collection included both a date of onset of symptoms and a date of diagnosis, there would be two records for the event, one with MHSTDTC the date of onset of symptoms and MHEVDTYP = "SYMPTOM ONSET" and a second with MHSTDTC the date of diagnosis and MHEVDTYP = "DIAGNOSIS". In such a case, it is recommended that the two records be linked by means such as a common value of MHSPID or MHGRPID.
  7. Any identifiers, timing variables, or events general observation class qualifiers may be added to the MH domain, but the following Qualifiers would generally not be used in MH: --SER, --ACN, --ACNOTH, --REL, --RELNST, --OUT, --SCAN, --SCONG, --SDISAB, --SDTH, --SHOSP, --SLIFE, --SOD, --SMIE.
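
The four situations tabulated under Assumption 4 can be expressed as a small decision function. This is a sketch only: the function name `mh_presence_flags` and its arguments are hypothetical, and `None` stands for a null value in the dataset.

```python
def mh_presence_flags(prespecified, response=None):
    """Return MHPRESP, MHOCCUR, and MHSTAT for one medical history term.

    prespecified -- True if the term was on a pre-specified list on the CRF
    response     -- "Y", "N", or None when no response was collected
    """
    if not prespecified:
        # Spontaneously reported (free-text) event: all three are null.
        return {"MHPRESP": None, "MHOCCUR": None, "MHSTAT": None}
    if response in ("Y", "N"):
        # Pre-specified event with a response: MHOCCUR carries the answer.
        return {"MHPRESP": "Y", "MHOCCUR": response, "MHSTAT": None}
    # Pre-specified event with no response collected.
    return {"MHPRESP": "Y", "MHOCCUR": None, "MHSTAT": "NOT DONE"}


print(mh_presence_flags(False))      # spontaneously reported
print(mh_presence_flags(True, "Y"))  # pre-specified, occurred
print(mh_presence_flags(True, "N"))  # pre-specified, did not occur
print(mh_presence_flags(True))       # pre-specified, no response
```

MHREASND is left out of the sketch because it is populated only when a reason for the missing response was actually collected.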

MH – Examples

Example

In this example, a General Medical History CRF collected verbatim descriptions of conditions and events by body system (e.g., Endocrine, Metabolic), did not collect start date, but asked whether or not the condition was ongoing at the time of the visit. Another CRF page was used for cardiac history events. This page asked for date of onset of symptoms and date of diagnosis, but did not include the ongoing question.

Rows 1-3: MHCAT indicates that these data were collected on the General Medical History CRF, and MHSCAT indicates the body system for which the event was collected. The reported events were coded using a standard dictionary. MHDECOD and MHBODSYS display the preferred term and body system assigned through the coding process. MHENRTPT was populated based on the response to the "Ongoing" question on the General Medical History CRF. MHENTPT displays the reference date for MHENRTPT, that is, the date the information was collected. If "Yes" was specified for Ongoing, MHENRTPT = "ONGOING"; if "No" was checked, MHENRTPT = "BEFORE". See Section 4.4.7, Use of Relative Timing Variables, for further guidance.
Rows 4-5: MHCAT indicates that these data were collected on the Cardiac Medical History CRF. Since two kinds of start date were collected for congestive heart failure, there are two records for this event, one with the start date for which MHEVDTYP = "SYMPTOM ONSET" and one with the start date for which MHEVDTYP = "DIAGNOSIS". The sponsor grouped these two records using the MHGRPID value "CHF".

mh.xpt

| Row | STUDYID | DOMAIN | USUBJID | MHSEQ | MHGRPID | MHTERM | MHDECOD | MHEVDTYP | MHCAT | MHSCAT | MHBODSYS | MHSTDTC | MHENRTPT | MHENTPT |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | MH | 123101 | 1 | | ASTHMA | Asthma | | GENERAL MEDICAL HISTORY | RESPIRATORY | Respiratory system disorders | | ONGOING | 2004-09-18 |
| 2 | ABC123 | MH | 123101 | 2 | | FREQUENT HEADACHES | Headache | | GENERAL MEDICAL HISTORY | CNS | Central and peripheral nervous system disorders | | ONGOING | 2004-09-18 |
| 3 | ABC123 | MH | 123101 | 3 | | BROKEN LEG | Bone fracture | | GENERAL MEDICAL HISTORY | OTHER | Musculoskeletal system disorders | | BEFORE | 2004-09-18 |
| 4 | ABC123 | MH | 123101 | 4 | CHF | CONGESTIVE HEART FAILURE | Cardiac failure congestive | SYMPTOM ONSET | CARDIAC MEDICAL HISTORY | | Cardiac disorders | 2004-09-17 | | |
| 5 | ABC123 | MH | 123101 | 5 | CHF | CONGESTIVE HEART FAILURE | Cardiac failure congestive | DIAGNOSIS | CARDIAC MEDICAL HISTORY | | Cardiac disorders | 2004-09-19 | | |
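
Rows 4-5 illustrate the practice of creating one MH record per collected date type and linking the records with MHGRPID. A minimal sketch of that step follows; the function name `mh_records_by_date_type` is hypothetical, and assignment of MHSEQ is assumed to happen later in the sponsor's process.

```python
def mh_records_by_date_type(base, dates_by_type, group_id):
    """Create one MH record per (MHEVDTYP, MHSTDTC) pair, linked by MHGRPID."""
    records = []
    for evdtyp, stdtc in dates_by_type.items():
        record = dict(base)  # copy the shared variables for each record
        record["MHGRPID"] = group_id
        record["MHEVDTYP"] = evdtyp
        record["MHSTDTC"] = stdtc
        records.append(record)
    return records


base = {
    "STUDYID": "ABC123",
    "DOMAIN": "MH",
    "USUBJID": "123101",
    "MHTERM": "CONGESTIVE HEART FAILURE",
    "MHCAT": "CARDIAC MEDICAL HISTORY",
}
collected_dates = {"SYMPTOM ONSET": "2004-09-17", "DIAGNOSIS": "2004-09-19"}

for record in mh_records_by_date_type(base, collected_dates, group_id="CHF"):
    print(record["MHGRPID"], record["MHEVDTYP"], record["MHSTDTC"])
```

Both records carry the same MHTERM and MHGRPID, so the pair can be recognized as two views of a single event.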

Example

In this example, data from three CRF modules related to medical history were collected:

  • A general medical history CRF collected descriptions of conditions and events by body system (e.g., Endocrine, Metabolic) and asked whether or not the conditions were ongoing at study start. The reported events were coded using a standard dictionary.
  • A second CRF collected stroke history. Terms were selected from a list of terms taken from the standard dictionary.
  • A third CRF asked whether or not the subject had any of a list of four specific risk factors.

In all of the records shown below, MHCAT is populated with the CRF module (general medical history, stroke history, or risk factors) through which the data were collected. MHPRESP and MHOCCUR were populated only when the term was pre-specified, in keeping with MH Assumption 4.

Rows 1-3: Show records from the general medical history CRF. MHSCAT displays the body systems specified on the CRF. The coded terms are represented in MHDECOD. MHENRF has been populated based on the response to the "Ongoing at Study Start" question on the CRF. If "Yes" was specified, MHENRF = "DURING/AFTER"; if "No" was checked, MHENRF = "BEFORE". See Section 4.4.7, Use of Relative Timing Variables, for further guidance on using --STRF and --ENRF.
Row 4: Shows the record from the stroke history CRF. MHSTDTC was populated with the date and time at which the event occurred.
Rows 5-8: Show records from the risk factors CRF. MHPRESP values of "Y" indicate that each risk factor was pre-specified on the CRF. MHOCCUR is populated with "Y" or "N", corresponding to the CRF response to the questions for the four pre-specified risk factors. The terms used to describe these risk factors were chosen to have associated codes in the standard dictionary.
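
The mapping from an "Ongoing" response to relative timing variables, as used in this example (MHENRF) and in the first example of this section (MHENRTPT with MHENTPT), might be sketched as follows. The function name `relative_end_timing` is hypothetical, and returning nothing for an unanswered question is an assumption of the sketch rather than a rule from this guide.

```python
def relative_end_timing(ongoing, reference_date=None):
    """Map a CRF "Ongoing" answer ("Yes" or "No") to relative timing variables.

    With no reference_date, the answer is treated as relative to the study
    reference period and MHENRF is populated. With a reference_date (for
    example, the date the information was collected), MHENRTPT and MHENTPT
    are populated instead.
    """
    if ongoing not in ("Yes", "No"):
        return {}  # assumption: an unanswered question populates nothing
    if reference_date is None:
        return {"MHENRF": "DURING/AFTER" if ongoing == "Yes" else "BEFORE"}
    return {
        "MHENRTPT": "ONGOING" if ongoing == "Yes" else "BEFORE",
        "MHENTPT": reference_date,
    }


print(relative_end_timing("Yes"))               # ongoing at study start
print(relative_end_timing("No"))                # ended before study start
print(relative_end_timing("Yes", "2004-09-18")) # ongoing as of a collection date
```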

mh.xpt

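| Row | STUDYID | DOMAIN | USUBJID | MHSEQ | MHTERM | MHDECOD | MHCAT | MHSCAT | MHPRESP | MHOCCUR | MHBODSYS | MHSTDTC | MHENRF |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | MH | 123101 | 1 | ASTHMA | Asthma | GENERAL MEDICAL HISTORY | RESPIRATORY | | | Respiratory system disorders | | DURING/AFTER |
| 2 | ABC123 | MH | 123101 | 2 | FREQUENT HEADACHES | Headache | GENERAL MEDICAL HISTORY | CNS | | | Central and peripheral nervous system disorders | | DURING/AFTER |
| 3 | ABC123 | MH | 123101 | 3 | BROKEN LEG | Bone fracture | GENERAL MEDICAL HISTORY | OTHER | | | Musculoskeletal system disorders | | BEFORE |
| 4 | ABC123 | MH | 123101 | 4 | ISCHEMIC STROKE | Ischaemic Stroke | STROKE HISTORY | | | | | 2004-09-17T07:30 | |
| 5 | ABC123 | MH | 123101 | 5 | DIABETES | Diabetes mellitus | RISK FACTORS | | Y | Y | | | |
| 6 | ABC123 | MH | 123101 | 6 | HYPERCHOLESTEROLEMIA | Hypercholesterolemia | RISK FACTORS | | Y | Y | | | |
| 7 | ABC123 | MH | 123101 | 7 | HYPERTENSION | Hypertension | RISK FACTORS | | Y | Y | | | |
| 8 | ABC123 | MH | 123101 | 8 | TIA | Transient ischaemic attack | RISK FACTORS | | Y | N | | | |
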
Example

This is an example of a medical history CRF where the history of specific (pre-specified) conditions is solicited. The conditions were not coded using a standard dictionary. The data were collected as part of the Screening visit.

Rows 1-9: MHPRESP = "Y" indicates that these conditions were specifically queried. Presence or absence of the condition is represented in MHOCCUR.
Row 10: There was also a specific question about ASTHMA, as indicated by MHPRESP = "Y", but this question was not asked. Since the question was not asked, MHOCCUR is null and MHSTAT = "NOT DONE". In this case, a reason for the absence of a response was collected, and this is represented in MHREASND.

mh.xpt

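| Row | STUDYID | DOMAIN | USUBJID | MHSEQ | MHTERM | MHDECOD | MHPRESP | MHOCCUR | MHSTAT | MHREASND | VISITNUM | VISIT | MHDTC | MHDY |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ABC123 | MH | 101002 | 1 | HISTORY OF EARLY CORONARY ARTERY DISEASE (<55 YEARS OF AGE) | Coronary Artery Disease | Y | N | | | 1 | SCREEN | 2006-04-22 | -5 |
| 2 | ABC123 | MH | 101002 | 2 | CONGESTIVE HEART FAILURE | Congestive Heart Failure | Y | N | | | 1 | SCREEN | 2006-04-22 | -5 |
| 3 | ABC123 | MH | 101002 | 3 | PERIPHERAL VASCULAR DISEASE | Peripheral Vascular Disease | Y | N | | | 1 | SCREEN | 2006-04-22 | -5 |
| 4 | ABC123 | MH | 101002 | 4 | TRANSIENT ISCHEMIC ATTACK | Transient Ischemic Attack | Y | Y | | | 1 | SCREEN | 2006-04-22 | -5 |
| 5 | ABC123 | MH | 101002 | 5 | ASTHMA | Asthma | Y | Y | | | 1 | SCREEN | 2006-04-22 | -5 |
| 6 | ABC123 | MH | 101003 | 1 | HISTORY OF EARLY CORONARY ARTERY DISEASE (<55 YEARS OF AGE) | Coronary Artery Disease | Y | Y | | | 1 | SCREEN | 2006-05-03 | -3 |
| 7 | ABC123 | MH | 101003 | 2 | CONGESTIVE HEART FAILURE | Congestive Heart Failure | Y | N | | | 1 | SCREEN | 2006-05-03 | -3 |
| 8 | ABC123 | MH | 101003 | 3 | PERIPHERAL VASCULAR DISEASE | Peripheral Vascular Disease | Y | Y | | | 1 | SCREEN | 2006-05-03 | -3 |
| 9 | ABC123 | MH | 101003 | 4 | TRANSIENT ISCHEMIC ATTACK | Transient Ischemic Attack | Y | N | | | 1 | SCREEN | 2006-05-03 | -3 |
| 10 | ABC123 | MH | 101003 | 5 | ASTHMA | Asthma | Y | | NOT DONE | FORGOT TO ASK | 1 | SCREEN | 2006-05-03 | -3 |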
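
The MHDY values in this table are study days relative to the subject's reference start date (RFSTDTC in Demographics). The sketch below applies the usual SDTM study-day convention, in which there is no day 0. The reference dates used here are assumptions chosen to be consistent with the MHDY values shown; RFSTDTC is not given in this example.

```python
from datetime import date


def study_day(dtc: date, rfstdtc: date) -> int:
    """Return the study day of dtc relative to rfstdtc (no day 0).

    Dates on or after the reference date count from 1; earlier dates
    count backward from -1.
    """
    delta = (dtc - rfstdtc).days
    return delta + 1 if delta >= 0 else delta


# Assumed reference start dates (hypothetical; not shown in the example).
rfstdtc_101002 = date(2006, 4, 27)
rfstdtc_101003 = date(2006, 5, 6)

print(study_day(date(2006, 4, 22), rfstdtc_101002))  # -5
print(study_day(date(2006, 5, 3), rfstdtc_101003))   # -3
```

A collection date equal to the reference start date would give a study day of 1, and the day before it would give -1.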

Example

This diabetes study included subjects with both Type 1 diabetes and Type 2 diabetes. Data collection included which kind of diabetes the subject had and the date of diagnosis of the condition.

Rows 1-2: Show that subject XYZ-001-001 had Type 1 diabetes, and did not have Type 2 diabetes. The fact that the start date in Row 1 is the date of diagnosis is indicated by MHEVDTYP = "DIAGNOSIS". Since this subject did not have Type 2 diabetes, no start date for Type 2 diabetes was collected, so MHEVDTYP in Row 2 is blank.
Rows 3-4: Show that subject XYZ-001-002 had Type 2 diabetes, and did not have Type 1 diabetes. The fact that the start date in Row 4 is the date of diagnosis is indicated by MHEVDTYP = "DIAGNOSIS".

mh.xpt

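| Row | STUDYID | DOMAIN | USUBJID | MHSEQ | MHTERM | MHDECOD | MHEVDTYP | MHCAT | MHPRESP | MHOCCUR | MHDTC | MHSTDTC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | XYZ | MH | XYZ-001-001 | 1 | TYPE 1 DIABETES MELLITUS | Type 1 diabetes mellitus | DIAGNOSIS | DIABETES | Y | Y | 2010-09-26 | 2010-03-25 |
| 2 | XYZ | MH | XYZ-001-001 | 2 | TYPE 2 DIABETES MELLITUS | Type 2 diabetes mellitus | | DIABETES | Y | N | 2010-09-26 | |
| 3 | XYZ | MH | XYZ-001-002 | 1 | TYPE 1 DIABETES MELLITUS | Type 1 diabetes mellitus | | DIABETES | Y | N | 2010-10-26 | |
| 4 | XYZ | MH | XYZ-001-002 | 2 | TYPE 2 DIABETES MELLITUS | Type 2 diabetes mellitus | DIAGNOSIS | DIABETES | Y | Y | 2010-10-26 | 2010-04-25 |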

6.3 Models for Findings Domains

Most subject-level observations collected during the study should be represented according to one of the three SDTM general observation classes. This is the list of domains corresponding to the Findings class.

Domain CodeDomain Description
DA

Drug Accountability

A findings domain that contains the accountability of study drug, such as information on the receipt, dispensing, return, and packaging.

DD

Death Details

A findings domain that contains the diagnosis of the cause of death for a subject.

EG

ECG Test Results

A findings domain that contains ECG data, including position of the subject, method of evaluation, all cycle measurements and all findings from the ECG including an overall interpretation if collected or derived.

IE

Inclusion/Exclusion Criteria Not Met

A findings domain that contains those criteria that cause the subject to be in violation of the inclusion/exclusion criteria.

IS

Immunogenicity Specimen Assessments

A findings domain for assessments that determine whether a therapy induced an immune response.

LB

Laboratory Test Results

A findings domain that contains laboratory test data such as hematology, clinical chemistry and urinalysis. This domain does not include microbiology or pharmacokinetic data, which are stored in separate domains.

MB and MS

Microbiology Domains

Microbiology Specimen (MB)

A findings domain that represents non-host organisms identified including bacteria, viruses, parasites, protozoa and fungi.

Microbiology Susceptibility (MS)

A findings domain that represents drug susceptibility testing results only. This includes phenotypic testing (where drug is added directly to a culture of organisms) and genotypic tests that provide results in terms of susceptible or resistant. Drug susceptibility testing may occur on a wide variety of non-host organisms, including bacteria, viruses, fungi, protozoa and parasites.

MI

Microscopic Findings

A findings domain that contains histopathology findings and microscopic evaluations.

MO

Morphology

A domain relevant to the science of the form and structure of an organism or of its parts.

The MO domain was originally created to hold all macroscopic results, but is expected to be deprecated in a later version of the SDTMIG. Submissions using that later SDTMIG version would represent morphology results in the appropriate body system-based physiology/morphology domain.

For data prepared using a version of the SDTMIG that includes both the MO domain and body system-based physiology/morphology domains, morphology findings may be represented in either the MO domain or in a body-system based physiology/morphology domain. Custom body system-based domains may be used if the appropriate body system-based domain is not included in the SDTMIG version being used.

CV, MK, NV, OE, RP, RE and UR

Morphology/Physiology Domains

Cardiovascular System Findings (CV)

A findings domain that contains physiological and morphological findings related to the cardiovascular system, including the heart, blood vessels and lymphatic vessels.

Musculoskeletal System Findings (MK)

A findings domain that contains physiological and morphological findings related to the system of muscles, tendons, ligaments, bones, joints, and associated tissues.

Nervous System Findings (NV)

A findings domain that contains physiological and morphological findings related to the nervous system, including the brain, spinal cord, the cranial and spinal nerves, autonomic ganglia and plexuses.

Ophthalmic Examinations (OE)

A findings domain that contains tests that measure a person's ocular health and visual status, to detect abnormalities in the components of the visual system, and to determine how well the person can see.

Reproductive System Findings (RP)

A findings domain that contains physiological and morphological findings related to the male and female reproductive systems.

Respiratory System Findings (RE)

A findings domain that contains physiological and morphological findings related to the respiratory system, including the organs that are involved in breathing such as the nose, throat, larynx, trachea, bronchi and lungs.

Urinary System Findings (UR)

A findings domain that contains physiological and morphological findings related to the urinary tract, including the organs involved in the creation and excretion of urine such as the kidneys, ureters, bladder and urethra.

PC and PP

Pharmacokinetics

Pharmacokinetics Concentrations (PC)

A findings domain that contains concentrations of drugs or metabolites in fluids or tissues as a function of time.

Pharmacokinetics Parameters (PP)

A findings domain that contains pharmacokinetic parameters derived from pharmacokinetic concentration-time (PC) data.

PE

Physical Examination

A findings domain that contains findings observed during a physical examination where the body is evaluated by inspection, palpation, percussion, and auscultation.

FT, QS, and RS

Questionnaires, Ratings and Scales

Functional Tests (FT)

A findings domain that contains data for named, stand-alone, task-based evaluations designed to provide an assessment of mobility, dexterity, or cognitive ability.

Questionnaires (QS)

A findings domain that contains data for named, stand-alone instruments designed to provide an assessment of a concept. Questionnaires have a defined standard structure, format, and content; consist of conceptually related items that are typically scored; and have documented methods for administration and analysis.

Disease Response and Clin Classification (RS)

A findings domain for the assessment of disease response to therapy, or clinical classification based on published criteria.

SC

Subject Characteristics

A findings domain that contains subject-related data not collected in other domains.

SS

Subject Status

A findings domain that contains general subject characteristics that are evaluated periodically to determine if they have changed.

TU and TR

Tumor/Lesion Domains

Tumor/Lesion Identification (TU)

A findings domain that represents data that uniquely identifies tumors or lesions under study.

Tumor/Lesion Results (TR)

A findings domain that represents quantitative measurements and/or qualitative assessments of the tumors or lesions identified in the tumor/lesion identification (TU) domain.

VS

Vital Signs

A findings domain that contains measurements including but not limited to blood pressure, temperature, respiration, body surface area, body mass index, height and weight.

6.3.1 Drug Accountability

DA – Description/Overview

A findings domain that contains the accountability of study drug, such as information on the receipt, dispensing, return, and packaging.

DA – Specification

da.xpt, Drug Accountability — Findings, Version 3.3. One record per drug accountability finding per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parenthesis).

DA – Assumptions

  1. One way a sponsor may choose to distinguish types of medications (e.g., study medication, rescue medication, run-in medication) is to use DACAT.
  2. DAREFID and DASPID are both available for capturing label information.
  3. Any Identifiers, Timing variables, or Findings general observation class qualifiers may be added to the DA domain, but the following Qualifiers would not generally be used in DA: --MODIFY, --POS, --BODSYS, --ORNRLO, --ORNRHI, --STNRLO, --STNRHI, --STNRC, --NRIND, --RESCAT, --XFN, --NAM, --LOINC, --SPEC, --SPCCND, --METHOD, --BLFL, --FAST, --DRVFL, --TOX, --TOXGR, --SEV.

DA – Examples

Example

This example shows drug accounting for a study with two study meds and one rescue med, all of which were measured in tablets. The sponsor chose to add EPOCH from the list of timing variables and to use DASPID and DAREFID for code numbers that appeared on the label.

da.xpt

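| Row | STUDYID | DOMAIN | USUBJID | DASEQ | DAREFID | DASPID | DATESTCD | DATEST | DACAT | DASCAT | DAORRES | DAORRESU | DASTRESC | DASTRESN | DASTRESU | VISITNUM | EPOCH | DADTC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | DA | ABC-01001 | 1 | XBYCC-E990A | A375827 | DISAMT | Dispensed Amount | Study Medication | Bottle A | 30 | TABLET | 30 | 30 | TABLET | 1 | Study Med Period 1 | 2004-06-15 |
| 2 | ABC | DA | ABC-01001 | 2 | XBYCC-E990A | A375827 | RETAMT | Returned Amount | Study Medication | Bottle A | 5 | TABLET | 5 | 5 | TABLET | 2 | Study Med Period 1 | 2004-07-15 |
| 3 | ABC | DA | ABC-01001 | 3 | XBYCC-E990B | A227588 | DISAMT | Dispensed Amount | Study Medication | Bottle B | 15 | TABLET | 15 | 15 | TABLET | 1 | Study Med Period 1 | 2004-06-15 |
| 4 | ABC | DA | ABC-01001 | 4 | XBYCC-E990B | A227588 | RETAMT | Returned Amount | Study Medication | Bottle B | 0 | TABLET | 0 | 0 | TABLET | 2 | Study Med Period 1 | 2004-07-15 |
| 5 | ABC | DA | ABC-01001 | 5 | | | DISAMT | Dispensed Amount | Rescue Medication | | 10 | TABLET | 10 | 10 | TABLET | 1 | Study Med Period 1 | 2004-06-15 |
| 6 | ABC | DA | ABC-01001 | 6 | | | RETAMT | Returned Amount | Rescue Medication | | 10 | TABLET | 10 | 10 | TABLET | 2 | Study Med Period 1 | 2004-07-15 |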
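
As an illustration only, dispensed and returned amounts such as those in this example could be reconciled per medication and container. The Python sketch below is not part of the SDTMIG: the list-of-dicts layout and the helper name `net_amounts` are assumptions made for the example, and the resulting difference is a review aid rather than a DA variable.

```python
from collections import defaultdict

# Hypothetical in-memory copy of a few DA variables from the example.
records = [
    {"DATEST": "Dispensed Amount", "DACAT": "Study Medication", "DASCAT": "Bottle A", "DASTRESN": 30},
    {"DATEST": "Returned Amount", "DACAT": "Study Medication", "DASCAT": "Bottle A", "DASTRESN": 5},
    {"DATEST": "Dispensed Amount", "DACAT": "Study Medication", "DASCAT": "Bottle B", "DASTRESN": 15},
    {"DATEST": "Returned Amount", "DACAT": "Study Medication", "DASCAT": "Bottle B", "DASTRESN": 0},
    {"DATEST": "Dispensed Amount", "DACAT": "Rescue Medication", "DASCAT": "", "DASTRESN": 10},
    {"DATEST": "Returned Amount", "DACAT": "Rescue Medication", "DASCAT": "", "DASTRESN": 10},
]


def net_amounts(rows):
    """Return dispensed minus returned amount per (DACAT, DASCAT)."""
    totals = defaultdict(int)
    for row in rows:
        # Dispensed amounts add to the total; returned amounts subtract.
        sign = 1 if row["DATEST"] == "Dispensed Amount" else -1
        totals[(row["DACAT"], row["DASCAT"])] += sign * row["DASTRESN"]
    return dict(totals)


net = net_amounts(records)
```

Keying on DACAT and DASCAT follows the grouping used in this example; a sponsor using DAREFID or DASPID to identify containers might group on those instead.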

Example

In this study, drug containers, rather than their contents, were being accounted for and the sponsor did not track returns. In this case, the purpose of the accountability tracking is to verify that the containers dispensed were consistent with the randomization. The sponsor chose to use DASPID to record the identifying number of the container dispensed.

da.xpt

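| Row | STUDYID | DOMAIN | USUBJID | DASEQ | DASPID | DATESTCD | DATEST | DACAT | DASCAT | DAORRES | DAORRESU | DASTRESC | DASTRESN | DASTRESU | VISITNUM | DADTC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | DA | ABC/01001 | 1 | AB001 | DISPAMT | Dispensed Amount | Study Medication | Drug A | 1 | CONTAINER | 1 | 1 | CONTAINER | 1 | 2004-06-15 |
| 2 | ABC | DA | ABC/01001 | 2 | AB002 | DISPAMT | Dispensed Amount | Study Medication | Drug B | 1 | CONTAINER | 1 | 1 | CONTAINER | 1 | 2004-06-15 |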

6.3.2 Death Details

DD – Description/Overview

A findings domain that contains the diagnosis of the cause of death for a subject.

The domain is designed to hold supplemental data that are typically collected when a death occurs, such as the official cause of death. It does not replace existing data such as the SAE details in AE. Furthermore, it does not introduce a new requirement to collect information that is not already indicated as Good Clinical Practice or defined in regulatory guidelines. Instead, it provides a consistent place within SDTM to hold information that previously did not have a clearly defined home.

DD – Specification

dd.xpt, Death Details — Findings, Version 3.3. One record per finding per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parenthesis).

DD – Assumptions

  1. There may be more than one cause of death. If so, these may be separated into primary and secondary causes and/or other appropriate designations. DD may also include other details about the death, such as where the death occurred and whether it was witnessed.
  2. Death details are typically collected on designated CRF pages. The DD domain is not intended to collate data that are collected in standard variables in other domains, such as AE.AEOUT (Outcome of Adverse Event), AE.AESDTH (Results in Death) or DS.DSTERM (Reported Term for the Disposition Event). Data from other domains that relates to the death can be linked to DD using RELREC.
  3. This domain is not intended to include data obtained from autopsy. An autopsy is a procedure from which there will usually be findings. Autopsy information should be handled as per recommendations in the Procedures domain.
  4. Any Identifiers, Timing variables, or Findings general observation class qualifiers may be added to the DD domain, but the following qualifiers would not generally be used in DD: --MODIFY, --POS, --BODSYS, --ORNRLO, --ORNRHI, --STNRLO, --STNRHI, --STNRC, --NRIND, --RESCAT, --NAM, --LOINC, --SPEC, --SPCCND, --LOBXFL, --BLFL, --FAST, --DRVFL, --TOX, --TOXGR, --SEV.

DD – Examples

Example

This example shows the primary cause of death for three subjects. The CRF also collected the location of the subject's death and a secondary cause of death.

Rows 1-2: Show the primary cause of death and location of death for a subject. DDDTC is the date of assessment.
Rows 3-4: Show records for primary cause of death and location of death for another subject for whom the information was not known.
Rows 5-7: Show primary and secondary cause of death and location of death for a third subject.

dd.xpt

| Row | STUDYID | DOMAIN | USUBJID | DDSEQ | DDTESTCD | DDTEST | DDORRES | DDSTRESC | DDDTC |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123 | DD | ABC12301001 | 1 | PRCDTH | Primary Cause of Death | SUDDEN CARDIAC DEATH | SUDDEN CARDIAC DEATH | 2011-01-12 |
| 2 | ABC123 | DD | ABC12301001 | 2 | LOCDTH | Location of Death | HOME | HOME | 2011-01-12 |
| 3 | ABC123 | DD | ABC12301002 | 1 | PRCDTH | Primary Cause of Death | UNKNOWN | UNKNOWN | 2011-03-15 |
| 4 | ABC123 | DD | ABC12301002 | 2 | LOCDTH | Location of Death | UNKNOWN | UNKNOWN | 2011-03-15 |
| 5 | ABC123 | DD | ABC12301023 | 1 | PRCDTH | Primary Cause of Death | CARDIAC ARRHYTHMIA | CARDIAC ARRHYTHMIA | 2011-09-09 |
| 6 | ABC123 | DD | ABC12301023 | 2 | SECDTH | Secondary Cause of Death | CHF | CONGESTIVE HEART FAILURE | 2011-09-09 |
| 7 | ABC123 | DD | ABC12301023 | 3 | LOCDTH | Location of Death | MEMORIAL HOSPITAL | HOSPITAL | 2011-09-09 |

Example

This example illustrates how the DD, DS, and AE data for a subject are linked using RELREC. Note that each of these domains serves a different purpose, even though the information is related. This subject had a fatal adverse event, represented in the AE domain.

ae.xpt

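| Row | STUDYID | DOMAIN | USUBJID | AESEQ | AETERM | AESTDTC | AEENDTC | AEDECOD | AEBODSYS | AEOUT | AESER | AESDTH |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123 | AE | ABC12301001 | 6 | SUDDEN CARDIAC DEATH | 2011-01-10 | 2011-01-10 | SUDDEN CARDIAC DEATH | CARDIOVASCULAR SYSTEM | FATAL | Y | Y |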

The primary cause of death was collected and is represented in DD. In this case, the result for primary cause of death is the same as the term in the AE record.

dd.xpt

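| Row | STUDYID | DOMAIN | USUBJID | DDSEQ | DDTESTCD | DDTEST | DDORRES | DDSTRESC | DDDTC |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123 | DD | ABC12301001 | 1 | PRCDTH | Primary Cause of Death | SUDDEN CARDIAC DEATH | SUDDEN CARDIAC DEATH | 2011-01-12 |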

The subject's death was also represented in the DS domain as the reason for their withdrawal from the study.

Rows 1-3: Show typical protocol milestones and disposition events.
Row 4: Shows the date the death event occurred (DSSTDTC) and was recorded (DSDTC).

ds.xpt

| Row | STUDYID | DOMAIN | USUBJID | DSSEQ | DSTERM | DSDECOD | DSCAT | DSDTC | DSSTDTC |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC123 | DS | ABC12301001 | 1 | INFORMED CONSENT OBTAINED | INFORMED CONSENT OBTAINED | PROTOCOL MILESTONE | 2011-01-02 | 2011-01-02 |
| 2 | ABC123 | DS | ABC12301001 | 2 | COMPLETED | COMPLETED | DISPOSITION EVENT | 2011-01-03 | 2011-01-03 |
| 3 | ABC123 | DS | ABC12301001 | 3 | RANDOMIZED | RANDOMIZED | PROTOCOL MILESTONE | 2011-01-03 | 2011-01-03 |
| 4 | ABC123 | DS | ABC12301001 | 4 | SUDDEN CARDIAC DEATH | DEATH | DISPOSITION EVENT | 2011-01-10 | 2011-01-10 |

The relationship between the DS, AE, and DD records that reflect the subject's death is represented in RELREC.

relrec.xpt

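| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | RELTYPE | RELID |
|---|---|---|---|---|---|---|---|
| 1 | ABC123 | DS | ABC12301001 | DSSEQ | 4 | | 1 |
| 2 | ABC123 | AE | ABC12301001 | AESEQ | 6 | | 1 |
| 3 | ABC123 | DD | ABC12301001 | DDSEQ | 1 | | 1 |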
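
To illustrate how the RELREC rows tie the three records together, the following Python sketch collects every record that shares a RELID for a subject. It is a minimal illustration, not a prescribed algorithm; the list-of-dicts layout and the function name `related_records` are assumptions made for the example.

```python
# Hypothetical in-memory copy of the RELREC rows in this example.
relrec = [
    {"RDOMAIN": "DS", "USUBJID": "ABC12301001", "IDVAR": "DSSEQ", "IDVARVAL": "4", "RELID": "1"},
    {"RDOMAIN": "AE", "USUBJID": "ABC12301001", "IDVAR": "AESEQ", "IDVARVAL": "6", "RELID": "1"},
    {"RDOMAIN": "DD", "USUBJID": "ABC12301001", "IDVAR": "DDSEQ", "IDVARVAL": "1", "RELID": "1"},
]


def related_records(rows, usubjid, relid):
    """List (domain, key variable, key value) for one subject and RELID."""
    return [
        (row["RDOMAIN"], row["IDVAR"], row["IDVARVAL"])
        for row in rows
        if row["USUBJID"] == usubjid and row["RELID"] == relid
    ]


links = related_records(relrec, "ABC12301001", "1")
```

Each returned tuple names a domain and the key value needed to look up the related record there, for example the DS record with DSSEQ = 4.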

6.3.3 ECG Test Results

EG – Description/Overview

A findings domain that contains ECG data, including position of the subject, method of evaluation, all cycle measurements and all findings from the ECG including an overall interpretation if collected or derived.

EG – Specification

eg.xpt, ECG Test Results — Findings, Version 3.3. One record per ECG observation per replicate per time point or one record per ECG observation per beat per visit per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parenthesis).

EG – Assumptions

  1. EGREFID is intended to store an identifier (e.g., UUID) for the associated ECG tracing. EGXFN is intended to store the name of and path to the ECG waveform file when it is submitted.
  2. There are separate codelists for tests and results based on regular 10-second ECGs and for tests and results based on Holter monitoring.
  3. For non-individual ECG beat data and for aggregate ECG parameter results (e.g., "QT interval", "RR", "PR", "QRS"), EGREFID is populated for all unique ECGs, so that submitted SDTM data can be matched to the actual ECGs stored in the ECG warehouse. Therefore, this variable is expected for these types of records.
  4. For individual-beat parameter results, waveform data will not be stored in the warehouse, so there will be no associated identifier for these beats.
  5. The method for QT interval correction is specified in the test name by controlled terminology: EGTESTCD = "QTCFAG" and EGTEST = "QTcF Interval, Aggregate" is used for Fridericia's formula; EGTESTCD = "QTCBAG" and EGTEST = "QTcB Interval, Aggregate", is used for Bazett's formula.
  6. EGBEATNO is used to differentiate between beats in beat-to-beat records.
  7. EGREPNUM is used to differentiate between multiple repetitions of a test within a given time frame.
  8. EGNRIND can be added to indicate where a result falls with respect to reference range defined by EGORNRLO and EGORNRHI. Examples: "HIGH", "LOW". Clinical significance would be represented as described in Section 4.5.5, Clinical Significance for Findings Observation Class Data as a record in SUPPEG with a QNAM of EGCLSIG (see also EG Example 1).
  9. When "QTcF Interval, Aggregate" or "QTcB Interval, Aggregate" is derived by the sponsor, the derived flag (EGDRVFL) is set to "Y". However, when the "QTcF Interval, Aggregate" or "QTcB Interval, Aggregate" is received from a central provider or vendor, the value would go into EGORRES and EGDRVFL would be null (see Section 4.1.8.1, Origin Metadata for Variables).
  10. If this domain is used in conjunction with the ECG QT Correction Model Data (QT) domain:
    1. For each QT correction method used in the study, values of EGTESTCD and EGTEST are assigned at the study level.
    2. The sponsor should assign values for EGTESTCD/EGTEST appropriately with clear documentation on what each test code represents. For example, if the protocol calls for computing the top two best fit models, the sponsor could choose to name the top best fit model QTCIAG1 and the second best fit model QTCIAG2, in rank order.
  11. Any Identifiers, Timing variables, or Findings general observation class qualifiers may be added to the EG domain, but the following qualifiers would not generally be used in EG: --MODIFY, --BODSYS, --SPEC, --SPCCND, --FAST, --SEV. It is recommended that --LOINC not be used.
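
Assumption 5 names Bazett's and Fridericia's formulas without stating them. As a reminder only, the commonly cited forms divide the QT interval by the square root (Bazett) or the cube root (Fridericia) of the RR interval expressed in seconds. The Python sketch below is an illustration under that assumption; the function names are hypothetical, and a study's actual derivation rules belong in the sponsor's documentation.

```python
import math


def qtc_bazett(qt_msec, rr_msec):
    """QTcB in msec: QT divided by the square root of RR in seconds."""
    rr_sec = rr_msec / 1000.0
    return qt_msec / math.sqrt(rr_sec)


def qtc_fridericia(qt_msec, rr_msec):
    """QTcF in msec: QT divided by the cube root of RR in seconds."""
    rr_sec = rr_msec / 1000.0
    return qt_msec / rr_sec ** (1.0 / 3.0)


# With RR = 1000 msec (60 beats/min) both corrections leave QT unchanged.
unchanged = (qtc_bazett(400.0, 1000.0), qtc_fridericia(400.0, 1000.0))
```

A sponsor-derived value computed this way would carry EGDRVFL = "Y", consistent with assumption 9.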

EG – Examples

Example

This example shows ECG measurements and other findings from one ECG for one subject. EGCAT has been used to group tests.

Row 1: Shows a measurement of ventricular rate.
Rows 2-4: These interval measurements were collected in seconds. However, in this submission, the standard unit for these tests was milliseconds, so the results have been converted in EGSTRESC and EGSTRESN.
Rows 5-6: Show "QTcB Interval, Aggregate" and "QTcF Interval, Aggregate". These results were derived by the sponsor, as indicated by the "Y" in the EGDRVFL column. Note that EGORRES is null for these derived records.
Rows 7-10: Show results from tests looking for certain kinds of abnormalities, which have been grouped using EGCAT = "FINDING".
Row 11: Shows a technical problem represented as the result of the test "Technical Quality". Results of this test can be important to the overall understanding of an ECG, but are not truly findings or interpretations about the subject's heart function.
Row 12: Shows the result of the test "Interpretation" (i.e., the interpretation of the ECG strip as a whole), which for this ECG was "ABNORMAL".

eg.xpt

| Row | STUDYID | DOMAIN | USUBJID | EGSEQ | EGREFID | EGTESTCD | EGTEST | EGCAT | EGPOS | EGORRES | EGORRESU | EGSTRESC | EGSTRESN | EGSTRESU | EGXFN | EGNAM | EGDRVFL | EGEVAL | VISITNUM | VISIT | EGDTC | EGDY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | XYZ | EG | XYZ-US-701-002 | 1 | 334PT89 | EGHRMN | ECG Mean Heart Rate | MEASUREMENT | SUPINE | 62 | beats/min | 62 | 62 | beats/min | PQW436789-07.xml | Test Lab | | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
| 2 | XYZ | EG | XYZ-US-701-002 | 2 | 334PT89 | PRAG | PR Interval, Aggregate | INTERVAL | SUPINE | 0.15 | sec | 150 | 150 | msec | PQW436789-07.xml | Test Lab | | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
| 3 | XYZ | EG | XYZ-US-701-002 | 3 | 334PT89 | QRSAG | QRS Duration, Aggregate | INTERVAL | SUPINE | 0.103 | sec | 103 | 103 | msec | PQW436789-07.xml | Test Lab | | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
| 4 | XYZ | EG | XYZ-US-701-002 | 4 | 334PT89 | QTAG | QT Interval, Aggregate | INTERVAL | SUPINE | 0.406 | sec | 406 | 406 | msec | PQW436789-07.xml | Test Lab | | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
| 5 | XYZ | EG | XYZ-US-701-002 | 5 | 334PT89 | QTCBAG | QTcB Interval, Aggregate | INTERVAL | SUPINE | | | 469 | 469 | msec | PQW436789-07.xml | Test Lab | Y | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
| 6 | XYZ | EG | XYZ-US-701-002 | 6 | 334PT89 | QTCFAG | QTcF Interval, Aggregate | INTERVAL | SUPINE | | | 446 | 446 | msec | PQW436789-07.xml | Test Lab | Y | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
| 7 | XYZ | EG | XYZ-US-701-002 | 7 | 334PT89 | SPRTARRY | Supraventricular Tachyarrhythmias | FINDING | SUPINE | ATRIAL FIBRILLATION | | ATRIAL FIBRILLATION | | | PQW436789-07.xml | Test Lab | | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
| 8 | XYZ | EG | XYZ-US-701-002 | 8 | 334PT89 | SPRTARRY | Supraventricular Tachyarrhythmias | FINDING | SUPINE | ATRIAL FLUTTER | | ATRIAL FLUTTER | | | PQW436789-07.xml | Test Lab | | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
| 9 | XYZ | EG | XYZ-US-701-002 | 9 | 334PT89 | STSTWUW | ST Segment, T wave, and U wave | FINDING | SUPINE | PROLONGED QT | | PROLONGED QT | | | PQW436789-07.xml | Test Lab | | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
| 10 | XYZ | EG | XYZ-US-701-002 | 10 | 334PT89 | CHYPTENL | Chamber Hypertrophy or Enlargement | FINDING | SUPINE | LEFT VENTRICULAR HYPERTROPHY | | LEFT VENTRICULAR HYPERTROPHY | | | PQW436789-07.xml | Test Lab | | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
| 11 | XYZ | EG | XYZ-US-701-002 | 11 | 334PT89 | TECHQUAL | Technical Quality | | SUPINE | OTHER INCORRECT ELECTRODE PLACEMENT | | OTHER INCORRECT ELECTRODE PLACEMENT | | | PQW436789-07.xml | Test Lab | | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
| 12 | XYZ | EG | XYZ-US-701-002 | 12 | 334PT89 | INTP | Interpretation | | SUPINE | ABNORMAL | | ABNORMAL | | | | | | | 1 | Screening 1 | 2003-04-15T11:58 | -36 |
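
The conversion described for Rows 2-4 (collected in seconds, standardized to milliseconds) can be sketched as follows. This is an illustrative Python helper only; the function name `to_msec` and the choice to round to whole milliseconds are assumptions, not SDTMIG requirements.

```python
def to_msec(value, unit):
    """Convert an interval collected in 'sec' or 'msec' to whole milliseconds."""
    if unit == "sec":
        # round() absorbs binary floating-point noise such as 150.00000000000003.
        return round(float(value) * 1000)
    if unit == "msec":
        return round(float(value))
    raise ValueError(f"unsupported unit: {unit}")


# EGORRES / EGORRESU pairs from Rows 2-4 of this example.
converted = [to_msec("0.15", "sec"), to_msec("0.103", "sec"), to_msec("0.406", "sec")]
```

The three results correspond to the EGSTRESN values shown in Rows 2-4 with EGSTRESU = "msec".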

For some tests, clinical significance was collected. These assessments of clinical significance were represented in supplemental qualifier records.

Row 1: Shows that the record in the EG dataset with EGSEQ = "1" (the record showing a ventricular rate of 62 bpm), was assessed as having a value of "N" for the variable EGCLSIG. In other words, the result was not clinically significant.
Row 2: Shows that the record in the EG dataset with EGSEQ = "2" (the record showing a PR interval of 0.15 sec), was assessed as being clinically significant.

suppeg.xpt

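| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | XYZ | EG | XYZ-US-701-002 | EGSEQ | 1 | EGCLSIG | Clinically Significant | N | CRF | |
| 2 | XYZ | EG | XYZ-US-701-002 | EGSEQ | 2 | EGCLSIG | Clinically Significant | Y | CRF | |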
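
As a rough illustration of how supplemental qualifier records relate back to their parent records, the Python sketch below attaches each QNAM/QVAL pair to the EG record identified by USUBJID, IDVAR, and IDVARVAL. The list-of-dicts layout and the function name `merge_supp` are assumptions made for this example.

```python
# Hypothetical subset of the parent EG records and the SUPPEG rows shown here.
eg = [
    {"USUBJID": "XYZ-US-701-002", "EGSEQ": 1, "EGTESTCD": "EGHRMN"},
    {"USUBJID": "XYZ-US-701-002", "EGSEQ": 2, "EGTESTCD": "PRAG"},
    {"USUBJID": "XYZ-US-701-002", "EGSEQ": 3, "EGTESTCD": "QRSAG"},
]
suppeg = [
    {"USUBJID": "XYZ-US-701-002", "IDVAR": "EGSEQ", "IDVARVAL": "1", "QNAM": "EGCLSIG", "QVAL": "N"},
    {"USUBJID": "XYZ-US-701-002", "IDVAR": "EGSEQ", "IDVARVAL": "2", "QNAM": "EGCLSIG", "QVAL": "Y"},
]


def merge_supp(parents, supps):
    """Return copies of the parent records with matching QNAM = QVAL pairs added."""
    merged = [dict(parent) for parent in parents]
    for supp in supps:
        for parent in merged:
            same_subject = parent["USUBJID"] == supp["USUBJID"]
            # IDVARVAL is character, so compare against the key as text.
            same_key = str(parent.get(supp["IDVAR"])) == supp["IDVARVAL"]
            if same_subject and same_key:
                parent[supp["QNAM"]] = supp["QVAL"]
    return merged


merged = merge_supp(eg, suppeg)
```

Records without a supplemental qualifier, such as the one with EGSEQ = 3 in this sketch, are returned unchanged.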

Example

This example shows ECG results where only the overall assessment was collected. Results are for one subject across multiple visits. In addition, the ECG interpretation was provided by the investigator and, when necessary, by a cardiologist. EGGRPID is used to group the overall assessments collected on each ECG.

Rows 1-3: Show interpretations performed by the principal investigator on three different occasions. The ECG at Visit "SCREEN II" has been flagged as the last observation before start of study treatment.
Rows 4-5: Show interpretations of the same ECG by both the investigator and a cardiologist. EGGRPID has been used to group these two records to emphasize their relationship.

eg.xpt

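| Row | STUDYID | DOMAIN | USUBJID | EGSEQ | EGGRPID | EGTESTCD | EGTEST | EGPOS | EGORRES | EGSTRESC | EGSTRESN | EGLOBXFL | EGEVAL | VISITNUM | VISIT | VISITDY | EGDTC | EGDY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | EG | ABC-99-CA-456 | 1 | | INTP | Interpretation | SUPINE | NORMAL | NORMAL | | | PRINCIPAL INVESTIGATOR | 1 | SCREEN I | -2 | 2003-11-26 | -2 |
| 2 | ABC | EG | ABC-99-CA-456 | 2 | | INTP | Interpretation | SUPINE | ABNORMAL | ABNORMAL | | Y | PRINCIPAL INVESTIGATOR | 2 | SCREEN II | -1 | 2003-11-27 | -1 |
| 3 | ABC | EG | ABC-99-CA-456 | 3 | | INTP | Interpretation | SUPINE | ABNORMAL | ABNORMAL | | | PRINCIPAL INVESTIGATOR | 3 | DAY 10 | 10 | 2003-12-07T09:02 | 10 |
| 4 | ABC | EG | ABC-99-CA-456 | 4 | Comp 1 | INTP | Interpretation | SUPINE | ABNORMAL | ABNORMAL | | | PRINCIPAL INVESTIGATOR | 4 | DAY 15 | 15 | 2003-12-12 | 15 |
| 5 | ABC | EG | ABC-99-CA-456 | 5 | Comp 1 | INTP | Interpretation | SUPINE | ABNORMAL | ABNORMAL | | | CARDIOLOGIST | 4 | DAY 15 | 15 | 2003-12-12 | 15 |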

Example

This example shows 10-second ECG replicates extracted from a continuous recording. The example shows one subject's (USUBJID = "2324-P0001") extracted 10-second ECG replicate results. Three replicates were extracted for planned time points "1 HR" and "2 HR"; EGREPNUM is used to identify the replicates. Summary mean measurements are reported for the 10 seconds of extracted data for each replicate. EGDTC is the date/time of the first individual beat in the extracted 10-second ECG. In order to save space, some permissible variables (EGREFID, VISITDY, EGTPTNUM, EGTPTREF, EGRFTDTC) have been omitted, as marked by ellipses.

eg.xpt

| Row | STUDYID | DOMAIN | USUBJID | EGSEQ | EGTESTCD | EGTEST | EGCAT | EGPOS | EGORRES | EGORRESU | EGSTRESC | EGSTRESN | EGSTRESU | EGLEAD | EGMETHOD | VISITNUM | VISIT | EGDTC | EGTPT | EGREPNUM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | STUDY01 | EG | 2324-P0001 | 1 | PRAG | PR Interval, Aggregate | INTERVAL | SUPINE | 176 | msec | 176 | 176 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:00:21 | 1 HR | 1 |
| 2 | STUDY01 | EG | 2324-P0001 | 2 | RRAG | RR Interval, Aggregate | INTERVAL | SUPINE | 658 | msec | 658 | 658 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:00:21 | 1 HR | 1 |
| 3 | STUDY01 | EG | 2324-P0001 | 3 | QRSAG | QRS Duration, Aggregate | INTERVAL | SUPINE | 97 | msec | 97 | 97 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:00:21 | 1 HR | 1 |
| 4 | STUDY01 | EG | 2324-P0001 | 4 | QTAG | QT Interval, Aggregate | INTERVAL | SUPINE | 440 | msec | 440 | 440 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:00:21 | 1 HR | 1 |
| 5 | STUDY01 | EG | 2324-P0001 | 5 | PRAG | PR Interval, Aggregate | INTERVAL | SUPINE | 176 | msec | 176 | 176 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:01:35 | 1 HR | 2 |
| 6 | STUDY01 | EG | 2324-P0001 | 6 | RRAG | RR Interval, Aggregate | INTERVAL | SUPINE | 679 | msec | 679 | 679 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:01:35 | 1 HR | 2 |
| 7 | STUDY01 | EG | 2324-P0001 | 7 | QRSAG | QRS Duration, Aggregate | INTERVAL | SUPINE | 95 | msec | 95 | 95 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:01:35 | 1 HR | 2 |
| 8 | STUDY01 | EG | 2324-P0001 | 8 | QTAG | QT Interval, Aggregate | INTERVAL | SUPINE | 389 | msec | 389 | 389 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:01:35 | 1 HR | 2 |
| 9 | STUDY01 | EG | 2324-P0001 | 9 | PRAG | PR Interval, Aggregate | INTERVAL | SUPINE | 169 | msec | 169 | 169 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:02:14 | 1 HR | 3 |
| 10 | STUDY01 | EG | 2324-P0001 | 10 | RRAG | RR Interval, Aggregate | INTERVAL | SUPINE | 661 | msec | 661 | 661 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:02:14 | 1 HR | 3 |
| 11 | STUDY01 | EG | 2324-P0001 | 11 | QRSAG | QRS Duration, Aggregate | INTERVAL | SUPINE | 90 | msec | 90 | 90 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:02:14 | 1 HR | 3 |
| 12 | STUDY01 | EG | 2324-P0001 | 12 | QTAG | QT Interval, Aggregate | INTERVAL | SUPINE | 377 | msec | 377 | 377 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T10:02:14 | 1 HR | 3 |
| 13 | STUDY01 | EG | 2324-P0001 | 13 | PRAG | PR Interval, Aggregate | INTERVAL | SUPINE | 176 | msec | 176 | 176 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:00:21 | 2 HR | 1 |
| 14 | STUDY01 | EG | 2324-P0001 | 14 | RRAG | RR Interval, Aggregate | INTERVAL | SUPINE | 771 | msec | 771 | 771 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:00:21 | 2 HR | 1 |
| 15 | STUDY01 | EG | 2324-P0001 | 15 | QRSAG | QRS Duration, Aggregate | INTERVAL | SUPINE | 100 | msec | 100 | 100 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:00:21 | 2 HR | 1 |
| 16 | STUDY01 | EG | 2324-P0001 | 16 | QTAG | QT Interval, Aggregate | INTERVAL | SUPINE | 379 | msec | 379 | 379 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:00:21 | 2 HR | 1 |
| 17 | STUDY01 | EG | 2324-P0001 | 17 | PRAG | PR Interval, Aggregate | INTERVAL | SUPINE | 179 | msec | 179 | 179 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:01:31 | 2 HR | 2 |
| 18 | STUDY01 | EG | 2324-P0001 | 18 | RRAG | RR Interval, Aggregate | INTERVAL | SUPINE | 749 | msec | 749 | 749 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:01:31 | 2 HR | 2 |
| 19 | STUDY01 | EG | 2324-P0001 | 19 | QRSAG | QRS Duration, Aggregate | INTERVAL | SUPINE | 103 | msec | 103 | 103 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:01:31 | 2 HR | 2 |
| 20 | STUDY01 | EG | 2324-P0001 | 20 | QTAG | QT Interval, Aggregate | INTERVAL | SUPINE | 402 | msec | 402 | 402 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:01:31 | 2 HR | 2 |
| 21 | STUDY01 | EG | 2324-P0001 | 21 | PRAG | PR Interval, Aggregate | INTERVAL | SUPINE | 175 | msec | 175 | 175 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:02:40 | 2 HR | 3 |
| 22 | STUDY01 | EG | 2324-P0001 | 22 | RRAG | RR Interval, Aggregate | INTERVAL | SUPINE | 771 | msec | 771 | 771 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:02:40 | 2 HR | 3 |
| 23 | STUDY01 | EG | 2324-P0001 | 23 | QRSAG | QRS Duration, Aggregate | INTERVAL | SUPINE | 98 | msec | 98 | 98 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:02:40 | 2 HR | 3 |
| 24 | STUDY01 | EG | 2324-P0001 | 24 | QTAG | QT Interval, Aggregate | INTERVAL | SUPINE | 356 | msec | 356 | 356 | msec | LEAD II | 12 LEAD STANDARD | 2 | VISIT 2 | 2014-03-22T11:02:40 | 2 HR | 3 |

Example

The example shows one subject's continuous beat-to-beat EG results. Only 3 beats are shown, but there could be measurements for, as an example, 101,000 complexes in 24 hours. The actual number of complexes in 24 hours can be variable and depends on average heart rate. The results are mapped to the EG (ECG Test Results) domain using EGBEATNO. If there is no result to be reported, then the row would not be included.

Rows 1-2: Show the first beat recorded. The first beat was considered to be the beat for which the recording contained a complete P-wave. It was assigned EGBEATNO = "1". There is no RR measurement for this beat because RR is measured as the duration (time) between the peak of the R-wave in the reported single beat and peak of the R-wave in the preceding single beat, and the partial recording that preceded EGBEATNO = "1" did not contain an R-wave. EGDTC was the date/time of the individual beat.
Rows 3-5: EGBEATNO = "2" had an RR measurement, since the R-wave of the preceding beat (EGBEATNO = "1") was recorded.
Rows 6-8: There is a 1-hour gap between beats 2 and 3 due to electrical interference or other artifacts that prevented measurements from being recorded. Note that EGBEATNO = "3" does have an RR measurement because the partial beat preceding EGBEATNO = "3" contained an R-wave.

eg.xpt

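| Row | STUDYID | DOMAIN | USUBJID | EGSEQ | EGTESTCD | EGTEST | EGCAT | EGPOS | EGBEATNO | EGORRES | EGORRESU | EGSTRESC | EGSTRESN | EGSTRESU | EGLEAD | EGMETHOD | VISITNUM | VISIT | VISITDY | EGDTC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | STUDY01 | EG | 2324-P0001 | 1 | PRSB | PR Interval, Single Beat | INTERVAL | SUPINE | 1 | 176 | msec | 176 | 176 | msec | LEAD II | 12 LEAD STANDARD | 1 | SCREENING | -7 | 2014-02-11T14:32:12.3 |
| 2 | STUDY01 | EG | 2324-P0001 | 2 | QRSSB | QRS Duration, Single Beat | INTERVAL | SUPINE | 1 | 97 | msec | 97 | 97 | msec | LEAD II | 12 LEAD STANDARD | 1 | SCREENING | -7 | 2014-02-11T14:32:12.3 |
| 3 | STUDY01 | EG | 2324-P0001 | 3 | PRSB | PR Interval, Single Beat | INTERVAL | SUPINE | 2 | 176 | msec | 176 | 176 | msec | LEAD II | 12 LEAD STANDARD | 1 | SCREENING | -7 | 2014-02-11T14:32:13.3 |
| 4 | STUDY01 | EG | 2324-P0001 | 4 | RRSM | RR Interval, Single Measurement | INTERVAL | SUPINE | 2 | 679 | msec | 679 | 679 | msec | LEAD II | 12 LEAD STANDARD | 1 | SCREENING | -7 | 2014-02-11T14:32:13.3 |
| 5 | STUDY01 | EG | 2324-P0001 | 5 | QRSSB | QRS Duration, Single Beat | INTERVAL | SUPINE | 2 | 95 | msec | 95 | 95 | msec | LEAD II | 12 LEAD STANDARD | 1 | SCREENING | -7 | 2014-02-11T14:32:13.3 |
| 6 | STUDY01 | EG | 2324-P0001 | 6 | PRSB | PR Interval, Single Beat | INTERVAL | SUPINE | 3 | 169 | msec | 169 | 169 | msec | LEAD II | 12 LEAD STANDARD | 1 | SCREENING | -7 | 2014-02-11T15:32:14.2 |
| 7 | STUDY01 | EG | 2324-P0001 | 7 | RRSM | RR Interval, Single Measurement | INTERVAL | SUPINE | 3 | 661 | msec | 661 | 661 | msec | LEAD II | 12 LEAD STANDARD | 1 | SCREENING | -7 | 2014-02-11T15:32:14.2 |
| 8 | STUDY01 | EG | 2324-P0001 | 8 | QRSSB | QRS Duration, Single Beat | INTERVAL | SUPINE | 3 | 90 | msec | 90 | 90 | msec | LEAD II | 12 LEAD STANDARD | 1 | SCREENING | -7 | 2014-02-11T15:32:14.2 |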
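
The reason the first beat carries no RR measurement can be shown with a small Python sketch. It assumes, purely for illustration, a list of R-wave peak times in milliseconds for consecutive beats with no gaps in the recording; the function name `rr_intervals` and the sample peak times are hypothetical.

```python
def rr_intervals(r_peaks_msec):
    """Map 1-based beat number to RR in msec (this R-peak minus the preceding one).

    The first beat is omitted because no preceding R-peak is available.
    """
    return {
        index + 1: r_peaks_msec[index] - r_peaks_msec[index - 1]
        for index in range(1, len(r_peaks_msec))
    }


# Hypothetical peak times for three consecutive beats.
intervals = rr_intervals([0, 679, 1340])
```

In the example above the situation differs slightly for beat 3: its RR value comes from an R-wave in the partial beat that precedes it, which a gap-free list like this one does not model.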

6.3.4 Inclusion/Exclusion Criteria Not Met

IE – Description/Overview

A findings domain that contains those criteria that cause the subject to be in violation of the inclusion/exclusion criteria.

IE – Specification

ie.xpt, Inclusion/Exclusion Criteria Not Met — Findings, Version 3.3. One record per inclusion/exclusion criterion not met per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parenthesis).

IE – Assumptions

  1. The intent of the domain model is to collect responses to only those criteria that the subject did not meet, and not the responses to all criteria. The complete list of Inclusion/Exclusion criteria can be found in the Trial Inclusion/Exclusion Criteria (TI) dataset described in Section 7.4.1, Trial Inclusion/Exclusion Criteria.
  2. This domain should be used to document the exceptions to inclusion or exclusion criteria at the time that eligibility for study entry is determined (e.g., at the end of a run-in period or immediately before randomization). This domain should not be used to collect protocol deviations/violations incurred during the course of the study, typically after randomization or start of study medication. See Section 6.2.4, Protocol Deviations, for the Protocol Deviations (DV) events domain model that is used to submit protocol deviations/violations.
  3. IETEST is to be used only for the verbatim description of the inclusion or exclusion criteria. If the text is no more than 200 characters, it goes in IETEST; if the text is more than 200 characters, put meaningful text in IETEST and describe the full text in the study metadata. See Section 4.5.3.1, Test Name (--TEST) Greater than 40 Characters, for further information.
  4. Additional findings qualifiers: The following Qualifiers would generally not be used in IE: --MODIFY, --POS, --BODSYS, --ORRESU, --ORNRLO, --ORNRHI, --STRESN, --STRESU, --STNRLO, --STNRHI, --STNRC, --NRIND, --RESCAT, --XFN, --NAM, --LOINC, --SPEC, --SPCCND, --LOC, --METHOD, --BLFL, --LOBXFL, --FAST, --DRVFL, --TOX, --TOXGR, --SEV, --STAT.
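
The length rule in assumption 3 lends itself to a simple check. The following Python sketch flags criterion text that exceeds 200 characters and would therefore need its full wording described in the study metadata. The constant and function names are assumptions made for this illustration.

```python
IETEST_MAX_LENGTH = 200  # limit stated in assumption 3


def needs_metadata_text(criterion_text):
    """Return True when the verbatim criterion is too long to fit in IETEST."""
    return len(criterion_text) > IETEST_MAX_LENGTH


short_criterion = "Acceptable mammogram from local radiologist?"
fits_in_ietest = not needs_metadata_text(short_criterion)
```

How the shortened, meaningful text for IETEST is worded remains a sponsor decision; the sketch only identifies which criteria need it.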

IE – Examples

Example

This example shows records for three subjects who failed to meet all inclusion/exclusion criteria but who were included in the study.

Rows 1-2: Show data for a subject with two inclusion/exclusion exceptions.
Rows 3-4: Show data for two other subjects, both of whom failed the same inclusion criterion.

ie.xpt

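| Row | STUDYID | DOMAIN | USUBJID | IESEQ | IESPID | IETESTCD | IETEST | IECAT | IEORRES | IESTRESC | VISITNUM | VISIT | VISITDY | IEDTC | IEDY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | XYZ | IE | XYZ-0007 | 1 | 17 | EXCL17 | Ventricular Rate | EXCLUSION | Y | Y | 1 | WEEK -8 | -56 | 1999-01-10 | -58 |
| 2 | XYZ | IE | XYZ-0007 | 2 | 3 | INCL03 | Acceptable mammogram from local radiologist? | INCLUSION | N | N | 1 | WEEK -8 | -56 | 1999-01-10 | -58 |
| 3 | XYZ | IE | XYZ-0047 | 1 | 3 | INCL03 | Acceptable mammogram from local radiologist? | INCLUSION | N | N | 1 | WEEK -8 | -56 | 1999-01-12 | -56 |
| 4 | XYZ | IE | XYZ-0096 | 1 | 3 | INCL03 | Acceptable mammogram from local radiologist? | INCLUSION | N | N | 1 | WEEK -8 | -56 | 1999-01-13 | -55 |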

6.3.5 Immunogenicity Specimen Assessments

IS – Description/Overview

A findings domain for assessments that determine whether a therapy induced an immune response.

IS – Specification

is.xpt, Immunogenicity Specimen Assessments — Findings, Version 3.3. One record per test per visit per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parenthesis).

IS – Assumptions

  1. The Immunogenicity Specimen Assessments (IS) domain model holds assessments that describe whether a therapy provoked/caused/induced an immune response. The response can be either positive or negative. For example, a vaccine is expected to induce a beneficial immune response, while a cellular therapy such as erythropoiesis-stimulating agents may cause an adverse immune response.
  2. Any Identifier variables, Timing variables, or Findings general observation class qualifiers may be added to the IS domain, but the following Qualifiers would not generally be used in IS: --POS, --BODSYS, --ORNRLO, --ORNRHI, --STNRLO, --STNRHI, --STNRC, --NRIND, --RESCAT, --XFN, --LOINC, --SPCCND, --FAST, --TOX, --TOXGR, --SEV.

IS – Examples

Example

In this example, subjects were dosed with a Hepatitis C vaccine. Note that information about administration of the vaccine is found in the Exposure domain, not the Immunogenicity domain, so it is not included here.

is.xpt

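| Row | STUDYID | DOMAIN | USUBJID | ISSEQ | ISTESTCD | ISTEST | ISCAT | ISORRES | ISORRESU | ISSTRESC | ISSTRESN | ISSTRESU | ISSPEC | ISMETHOD | ISLOBXFL | ISLLOQ | VISITNUM | VISIT | ISDTC | ISDY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC-123 | IS | 123457 | 1 | HCAB | Hepatitis C Virus Antibody | SEROLOGY | 3115.016 | gpELISA unit/mL | 3115.016 | 3115.016 | gpELISA unit/mL | SERUM | ENZYME IMMUNOASSAY | Y | 100 | 1 | VISIT 1 | 2008-10-10 | 1 |
| 2 | ABC-123 | IS | 123457 | 2 | HCAB | Hepatitis C Virus Antibody | SEROLOGY | 1772.78 | gpELISA unit/mL | 1772.78 | 1772.78 | gpELISA unit/mL | SERUM | ENZYME IMMUNOASSAY | | 100 | 2 | VISIT 2 | 2008-11-21 | 43 |
| 3 | ABC-123 | IS | 123460 | 1 | HCAB | Hepatitis C Virus Antibody | SEROLOGY | 217.218 | gpELISA unit/mL | 217.218 | 217.218 | gpELISA unit/mL | SERUM | ENZYME IMMUNOASSAY | Y | 100 | 1 | VISIT 1 | 2008-09-01 | 1 |
| 4 | ABC-123 | IS | 123460 | 2 | HCAB | Hepatitis C Virus Antibody | SEROLOGY | 203.88 | gpELISA unit/mL | 203.88 | 203.88 | gpELISA unit/mL | SERUM | ENZYME IMMUNOASSAY | | 100 | 2 | VISIT 2 | 2008-10-02 | 31 |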

Example

In this example, the subject was dosed with the study product consisting of 0.5 mL of varicella vaccine. The immunogenic response to the study product, as well as to the pneumococcal vaccine that was given concomitantly, was measured to ensure that immunogenicity of both vaccines was sufficient to provide protection. The measurements of antibody to the vaccines were represented in the IS domain.

is.xpt

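| Row | STUDYID | DOMAIN | USUBJID | ISSEQ | ISTESTCD | ISTEST | ISCAT | ISORRES | ISORRESU | ISSTRESC | ISSTRESN | ISSTRESU | ISSPEC | ISMETHOD | ISLOBXFL | ISLLOQ | VISITNUM | VISIT | ISDY | ISDTC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | GHJ-456 | IS | 6017 | 1 | PNPSAB14 | Pneumococcal Polysacch AB Serotype 14 | SEROLOGY | 9.715 | ug/mL | 9.715 | 9.715 | ug/mL | SERUM | ENZYME IMMUNOASSAY | Y | 2.5 | 1 | VISIT 1 | 1 | 2010-02-06 |
| 2 | GHJ-456 | IS | 6017 | 2 | VZVAB | Varicella-Zoster Virus Antibody | SEROLOGY | 141.616 | gpELISA unit/mL | 141.616 | 141.616 | gpELISA unit/mL | SERUM | ENZYME IMMUNOASSAY | Y | 2.5 | 1 | VISIT 1 | 1 | 2010-02-06 |
| 3 | GHJ-456 | IS | 6017 | 3 | PNPSAB14 | Pneumococcal Polysacch AB Serotype 14 | SEROLOGY | 13.244 | ug/mL | 13.244 | 13.244 | ug/mL | SERUM | ENZYME IMMUNOASSAY | | 2.5 | 2 | VISIT 2 | 31 | 2010-03-09 |
| 4 | GHJ-456 | IS | 6017 | 4 | VZVAB | Varicella-Zoster Virus Antibody | SEROLOGY | 870.871 | gpELISA unit/mL | 870.871 | 870.871 | gpELISA unit/mL | SERUM | ENZYME IMMUNOASSAY | | 2.5 | 2 | VISIT 2 | 31 | 2010-03-09 |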

6.3.6 Laboratory Test Results

LB – Description/Overview

A findings domain that contains laboratory test data such as hematology, clinical chemistry and urinalysis. This domain does not include microbiology or pharmacokinetic data, which are stored in separate domains.

LB – Specification

lb.xpt, Laboratory Test Results — Findings, Version 3.3. One record per lab test per time point per visit per subject, Tabulation.

¹ In this column, * indicates the variable may be subject to controlled terminology, and CDISC/NCI codelist code values are enclosed in (parenthesis).

LB – Assumptions

  1. LB definition: This domain captures laboratory data collected on the CRF or received from a central provider or vendor.
  2. For lab tests that do not have continuous numeric results (e.g., urine protein as measured by dipstick, descriptive tests such as urine color), LBSTNRC could be populated either with normal range values that are a range of character values for an ordinal scale (e.g., "NEGATIVE to TRACE") or a delimited set of values that are considered to be normal (e.g., "YELLOW", "AMBER"). LBORNRLO, LBORNRHI, LBSTNRLO, and LBSTNRHI should be null for these types of tests.
  3. LBNRIND can be added to indicate where a result falls with respect to reference range defined by LBORNRLO and LBORNRHI. Examples: "HIGH", "LOW". Clinical significance would be represented as described in Section 4.5.5, Clinical Significance for Findings Observation Class Data, as a record in SUPPLB with a QNAM of LBCLSIG (see also LB Example 1 below).
  4. For lab tests where the specimen is collected over time, e.g., a 24-hour urine collection, the start date/time of the collection goes into LBDTC and the end date/time of collection goes into LBENDTC. See Section 4.4.8, Date and Time Reported in a Domain Based on Findings.
  5. Any Identifiers, Timing variables, or Findings general observation class qualifiers may be added to the LB domain, but the following Qualifiers would not generally be used in LB: --BODSYS, --SEV.
  6. A value derived by a central lab according to their procedures is considered collected rather than derived. See Section 4.1.8.1, Origin Metadata for Variables.
  7. The variable LBORRESU uses the UNIT codelist. This means that sponsors should submit a term from the column "CDISC Submission Value" in the published Controlled Terminology List that is maintained for CDISC by NCI EVS. When sponsors have units that are not in this column, they should first check whether their unit is mathematically synonymous with an existing unit and, if so, submit their lab values using that unit. (Example: "g/L" and "mg/mL" are mathematically synonymous, but only "g/L" is in the CDISC Unit codelist.) If no synonymous unit exists, a New-Term Request Form should be submitted.
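
The synonym check in assumption 7 could be sketched as follows. This is a minimal illustration, not a CDISC-provided tool: the function name `to_submission_unit` and both lookup tables are hypothetical, the synonym table holds only the "mg/mL" and "g/L" pair named above, and the codelist subset is limited to units that appear in the examples in this section. A real check would be run against the published UNIT codelist.

```python
# Hypothetical lookup of sponsor units that are mathematically synonymous
# with a CDISC Submission Value. Only the mg/mL entry comes from the guide.
SYNONYMOUS_UNITS = {"mg/mL": "g/L"}

# Illustrative subset of submission values, NOT the full UNIT codelist.
UNIT_CODELIST = {"g/L", "g/dL", "mg/dL", "mmol/L", "IU/L", "%", "10^9/L"}


def to_submission_unit(sponsor_unit):
    """Return a codelist unit, or None when no synonym is known.

    A None result corresponds to the case where a New-Term Request Form
    would be submitted.
    """
    if sponsor_unit in UNIT_CODELIST:
        return sponsor_unit
    return SYNONYMOUS_UNITS.get(sponsor_unit)


if __name__ == "__main__":
    for unit in ("g/L", "mg/mL", "widgets/L"):
        print(unit, "->", to_submission_unit(unit))
```

Because the synonymous units are numerically identical, the result value itself is unchanged; only the unit term is replaced.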

LB – Examples

Example

Row 1: Shows a value collected in one unit, but converted to a selected standard unit. See Section 4.5.1, Original and Standardized Results of Findings and Tests Not Done, for additional examples of populating Result Qualifiers.
Rows 1, 3, 5-8: LBLOBXFL = "Y" indicates that these were the last observations before exposure to study treatment.
Rows 2-3: Show two records for Alkaline Phosphatase done at the same visit, one day apart.
Row 4: Shows a derived record (the average of the records in Rows 2 and 3), flagged as derived (LBDRVFL = "Y").
Rows 6-7: Show a suggested use of the LBSCAT variable. It could be used to further classify types of tests within a laboratory panel (e.g., "DIFFERENTIAL").
Row 9: Shows the proper use of the LBSTAT variable to indicate "NOT DONE", where a reason was collected when a test was not done.
Row 10: The subject had cholesterol measured. The normal range for this test is <200 mg/dL. Note that the sponsor has decided to make LBSTNRHI = "199"; another sponsor may have chosen a different value.
Row 12: Shows use of LBSTNRC for Urine Protein that is not reported as a continuous numeric result.

lb.xpt

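| Row | STUDYID | DOMAIN | USUBJID | LBSEQ | LBTESTCD | LBTEST | LBCAT | LBSCAT | LBORRES | LBORRESU | LBORNRLO | LBORNRHI | LBSTRESC | LBSTRESN | LBSTRESU | LBSTNRLO | LBSTNRHI | LBSTNRC | LBNRIND | LBSTAT | LBREASND | LBLOBXFL | LBFAST | LBDRVFL | VISITNUM | VISIT | LBDTC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | LB | ABC-001-001 | 1 | ALB | Albumin | CHEMISTRY | | 30 | g/L | 35 | 50 | 3.0 | 3.0 | g/dL | 3.5 | 5 | | LOW | | | Y | Y | | 1 | Week 1 | 1999-06-19 |
| 2 | ABC | LB | ABC-001-001 | 2 | ALP | Alkaline Phosphatase | CHEMISTRY | | 398 | IU/L | 40 | 160 | 398 | 398 | IU/L | 40 | 160 | | | | | | Y | | 1 | Week 1 | 1999-06-19 |
| 3 | ABC | LB | ABC-001-001 | 3 | ALP | Alkaline Phosphatase | CHEMISTRY | | 350 | IU/L | 40 | 160 | 350 | 350 | IU/L | 40 | 160 | | | | | Y | Y | | 1 | Week 1 | 1999-06-20 |
| 4 | ABC | LB | ABC-001-001 | 4 | ALP | Alkaline Phosphatase | CHEMISTRY | | | | | | 374 | 374 | IU/L | 40 | 160 | | | | | | Y | Y | 1 | Week 1 | 1999-06-19 |
| 5 | ABC | LB | ABC-001-001 | 5 | WBC | Leukocytes | HEMATOLOGY | | 5.9 | 10^9/L | 4 | 11 | 5.9 | 5.9 | 10^9/L | 4 | 11 | | | | | Y | Y | | 1 | Week 1 | 1999-06-19 |
| 6 | ABC | LB | ABC-001-001 | 6 | LYMLE | Lymphocytes | HEMATOLOGY | DIFFERENTIAL | 6.7 | % | 25 | 40 | 6.7 | 6.7 | % | 25 | 40 | | LOW | | | Y | Y | | 1 | Week 1 | 1999-06-19 |
| 7 | ABC | LB | ABC-001-001 | 7 | NEUT | Neutrophils | HEMATOLOGY | DIFFERENTIAL | 5.1 | 10^9/L | 2 | 8 | 5.1 | 5.1 | 10^9/L | 2 | 8 | | | | | Y | Y | | 1 | Week 1 | 1999-06-19 |
| 8 | ABC | LB | ABC-001-001 | 8 | PH | pH | URINALYSIS | | 7.5 | | 5.0 | 9.0 | 7.5 | | | 5.00 | 9.00 | | | | | Y | Y | | 1 | Week 1 | 1999-06-19 |
| 9 | ABC | LB | ABC-001-001 | 9 | ALB | Albumin | CHEMISTRY | | | | | | | | | | | | | NOT DONE | INSUFFICIENT SAMPLE | | | | 2 | Week 2 | 1999-07-21 |
| 10 | ABC | LB | ABC-001-001 | 10 | CHOL | Cholesterol | CHEMISTRY | | 229 | mg/dL | 0 | <200 | 229 | 229 | mg/dL | 0 | 199 | | | | | | | | 2 | Week 2 | 1999-07-21 |
| 11 | ABC | LB | ABC-001-001 | 11 | WBC | Leukocytes | HEMATOLOGY | | 5.9 | 10^9/L | 4 | 11 | 5.9 | 5.9 | 10^9/L | 4 | 11 | | | | | | Y | | 2 | Week 2 | 1999-07-21 |
| 12 | ABC | LB | ABC-001-001 | 12 | PROT | Protein | URINALYSIS | | MODERATE | | | | MODERATE | | | | | NEGATIVE to TRACE | ABNORMAL | | | | | | 2 | Week 2 | 1999-07-21 |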
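
The derived record in Row 4 can be reproduced with a short sketch. The code below is an illustration under stated assumptions, not a prescribed derivation: the function name `derive_average_record` is hypothetical, records are modeled as plain dictionaries holding only a few of the LB variables, and the choice to copy the remaining values from the first source record and to null the original-result variables simply mirrors what the example shows.

```python
def derive_average_record(records, new_seq):
    """Build a derived record that averages LBSTRESN over the given records."""
    values = [rec["LBSTRESN"] for rec in records]
    mean = sum(values) / len(values)
    # Assumption: identifiers, units, ranges, and timing come from the first record.
    derived = dict(records[0])
    derived.update({
        "LBSEQ": new_seq,
        "LBORRES": None,    # a derived record has no original result
        "LBORRESU": None,
        "LBSTRESC": format(mean, "g"),  # character form of the numeric result
        "LBSTRESN": mean,
        "LBDRVFL": "Y",     # flag the record as derived
    })
    return derived


if __name__ == "__main__":
    row2 = {"LBSEQ": 2, "LBTESTCD": "ALP", "LBORRES": "398", "LBORRESU": "IU/L",
            "LBSTRESC": "398", "LBSTRESN": 398, "LBSTRESU": "IU/L",
            "LBDTC": "1999-06-19"}
    row3 = {"LBSEQ": 3, "LBTESTCD": "ALP", "LBORRES": "350", "LBORRESU": "IU/L",
            "LBSTRESC": "350", "LBSTRESN": 350, "LBSTRESU": "IU/L",
            "LBDTC": "1999-06-20"}
    print(derive_average_record([row2, row3], new_seq=4))
```

With the Row 2 and Row 3 values (398 and 350 IU/L), the average is 374 IU/L, which matches LBSTRESN in Row 4.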

The SUPPLB dataset example shows clinical significance assigned by the investigator for test results where LBNRIND (reference range indicator) is populated.

supplb.xpt

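| Row | STUDYID | RDOMAIN | USUBJID | IDVAR | IDVARVAL | QNAM | QLABEL | QVAL | QORIG | QEVAL |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | LB | ABC-001-001 | LBSEQ | 1 | LBCLSIG | Clinical Significance | N | CRF | INVESTIGATOR |
| 2 | ABC | LB | ABC-001-001 | LBSEQ | 6 | LBCLSIG | Clinical Significance | N | CRF | INVESTIGATOR |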
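
A SUPPLB record points back to its parent LB record through IDVAR and IDVARVAL. As a rough sketch of how a consumer might re-attach the qualifier, the following code matches on STUDYID, USUBJID, and the variable named in IDVAR. The function name `attach_supplemental` and the dictionary-based record layout are assumptions made for illustration; IDVARVAL is compared as text because supplemental qualifier values are character data.

```python
def attach_supplemental(parent_records, supp_records):
    """Return copies of the parent records with matching QNAM/QVAL pairs added."""
    merged = [dict(rec) for rec in parent_records]
    for supp in supp_records:
        for rec in merged:
            same_subject = (rec["STUDYID"] == supp["STUDYID"]
                            and rec["USUBJID"] == supp["USUBJID"])
            # IDVAR names the parent variable; IDVARVAL holds its value as text.
            same_key = str(rec.get(supp["IDVAR"])) == str(supp["IDVARVAL"])
            if same_subject and same_key:
                rec[supp["QNAM"]] = supp["QVAL"]
    return merged


if __name__ == "__main__":
    lb = [{"STUDYID": "ABC", "USUBJID": "ABC-001-001", "LBSEQ": seq}
          for seq in (1, 2, 6)]
    supplb = [{"STUDYID": "ABC", "USUBJID": "ABC-001-001", "IDVAR": "LBSEQ",
               "IDVARVAL": val, "QNAM": "LBCLSIG", "QVAL": "N"}
              for val in ("1", "6")]
    for rec in attach_supplemental(lb, supplb):
        print(rec)
```

In this usage, the records with LBSEQ 1 and 6 gain LBCLSIG = "N", while the record with LBSEQ 2 is left without the qualifier.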

Example

Row 1: Shows an example of a pre-dose urine collection interval (from 4 hours prior to dosing until 15 minutes prior to dosing) with a negative value for LBELTM that reflects the end of the interval in reference to the fixed reference LBTPTREF, the date of which is recorded in LBRFTDTC.
Rows 2-3: Show an example of postdose urine collection intervals with values for LBELTM that reflect the end of the intervals in reference to the fixed reference LBTPTREF, the date of which is recorded in LBRFTDTC.

lb.xpt

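| Row | STUDYID | DOMAIN | USUBJID | LBSEQ | LBTESTCD | LBTEST | LBCAT | LBORRES | LBORRESU | LBORNRLO | LBORNRHI | LBSTRESC | LBSTRESN | LBSTRESU | LBSTNRLO | LBSTNRHI | LBNRIND | VISITNUM | VISIT | LBDTC | LBENDTC | LBTPT | LBTPTNUM | LBELTM | LBTPTREF | LBRFTDTC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC | LB | ABC-001-001 | 1 | GLUC | Glucose | URINALYSIS | 7 | mg/dL | 1 | 15 | 0.39 | 0.39 | mmol/L | 0.1 | 0.8 | NORMAL | 2 | INITIAL DOSING | 1999-06-19T04:00 | 1999-06-19T07:45 | Pre-dose | 1 | -PT15M | Dosing | 1999-06-19T08:00 |
| 2 | ABC | LB | ABC-001-001 | 2 | GLUC | Glucose | URINALYSIS | 11 | mg/dL | 1 | 15 | 0.61 | 0.61 | mmol/L | 0.1 | 0.8 | NORMAL | 2 | INITIAL DOSING | 1999-06-19T08:00 | 1999-06-19T16:00 | 0-8 hours after dosing | 2 | PT8H | Dosing | 1999-06-19T08:00 |
| 3 | ABC | LB | ABC-001-001 | 3 | GLUC | Glucose | URINALYSIS | 9 | mg/dL | 1 | 15 | 0.5 | 0.5 | mmol/L | 0.1 | 0.8 | NORMAL | 2 | INITIAL DOSING | 1999-06-19T16:00 | 1999-06-20T00:00 | 8-16 hours after dosing | 3 | PT16H | Dosing | 1999-06-19T08:00 |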
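
The LBELTM values in this example are consistent with the difference between LBENDTC and LBRFTDTC, written as an ISO 8601 duration. The sketch below shows one way to compute that difference. It is not a CDISC-defined algorithm: the function name `elapsed_iso8601` is hypothetical, the input format assumes minute precision as in the table above, and the output uses only hours and minutes (no day component), which is a simplification chosen to match the values shown.

```python
from datetime import datetime

DTC_FORMAT = "%Y-%m-%dT%H:%M"  # assumption: minute-precision values, as in the example


def elapsed_iso8601(end_dtc, ref_dtc):
    """Return end_dtc minus ref_dtc as an ISO 8601 duration in hours and minutes."""
    delta = datetime.strptime(end_dtc, DTC_FORMAT) - datetime.strptime(ref_dtc, DTC_FORMAT)
    total_minutes = int(delta.total_seconds() // 60)
    sign = "-" if total_minutes < 0 else ""  # negative means before the reference
    hours, minutes = divmod(abs(total_minutes), 60)
    parts = ""
    if hours:
        parts += f"{hours}H"
    if minutes or not hours:
        parts += f"{minutes}M"
    return f"{sign}PT{parts}"


if __name__ == "__main__":
    reference = "1999-06-19T08:00"  # LBRFTDTC
    for end in ("1999-06-19T07:45", "1999-06-19T16:00", "1999-06-20T00:00"):
        print(end, elapsed_iso8601(end, reference))
```

Applied to the three rows, this yields "-PT15M", "PT8H", and "PT16H", matching the LBELTM column.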

Example

This is an example of pregnancy test records: one with a result, and one with no result because the test was not performed, as the subject was male.