Data Curation and Data Standards Programmer


We have an exciting opportunity to join our team as a Data Curation and Data Standards Programmer. 

EPPIC-Net is a network that will include a Data Coordinating Center (DCC), a Clinical Coordinating Center (CCC), and ten Specialized Clinical Centers (SCCs). The DCC will work with Federal government partners to conduct high-quality, efficient clinical trials to develop non-addictive treatments for pain as part of the Helping to End Addiction Long-term (HEAL) initiative. The DCC will provide a range of coordinating services for this NIH-funded network, including statistical and data management; collection of clinical, biomarker, and imaging data; and establishment of repositories for biospecimens and biomedical images.
The Data Curation and Data Standards Programmer will work closely with the Principal Investigator, Data Management Leadership, and key stakeholders to support the DCC's data curation and dataset archival needs. In addition, s/he will help develop, document, and implement the EPPIC-Net meta-data standards and data dictionary, including the ontologies and data formats developed by the DCC. The successful candidate will have experience in retrieving and assembling clinical data from multiple sources, creating derived datasets, and implementing data specifications to transform study data into CDISC-compliant datasets. S/he will have experience in using SAS or similar programming languages for data transformation and analysis. Specifically, s/he will be skilled in developing programs to import complex external data from a variety of sources into SAS and to export SAS output files to other formats. In addition, s/he will have experience in report programming, output validation, and project documentation. Lastly, s/he will possess working knowledge of accepted data standards such as the Clinical Data Interchange Standards Consortium (CDISC) Clinical Data Acquisition Standards Harmonization (CDASH) and Study Data Tabulation Model (SDTM).

Detailed Description
•Archive and curate datasets from EPPIC-Net supported studies and assets from academic and industry partners of the HEAL initiative; develop SAS programs/routines to transform datasets into analysis-ready formats that comply with EPPIC-Net meta-data standards and ontologies.
•Establish EPPIC-Net's data standards/conventions in close collaboration with the Principal Investigator and Data Management leadership; develop and update data standards user guides/manuals; serve as in-house subject matter expert in mapping study data to EPPIC-Net's meta-data standards, ontologies, and required formats; lead the enforcement and governance of data standards/specifications.
•Develop specifications and document procedures for extracting data from the EPPIC-Net HPC storage infrastructure; develop, maintain, and document SAS programming standards for creating derived datasets that conform to pre-specified analysis data models.
•Assist in establishing procedures and best practices for implementation of data standards; develop and implement extract, transform, and load (ETL) specifications to facilitate interim storage of data from EPPIC-Net supported studies in staging repositories; document specifications and develop extraction programs/routines to facilitate transfer and retrieval of data from completed studies to the NIH STRIDES platform.
•Review protocols for standards (e.g., CDISC) conformance.
•Review Case Report Forms (CRFs) and electronic CRFs (eCRFs) for standards-conformant (e.g., CDISC CDASH and SDTM) elements.
•Write and manage SAS code for mapping clinical data to data structures in conformance with the EPPIC-Net DCC's standards implementation guidelines.
•Review and QC NIH STRIDES submission-ready datasets, define.xml, and supporting documentation, as appropriate.