Introduction
Purveyor: Centers for Medicare and Medicaid Services
Years in the DataCore: 2012-2017
Years of data owned: 2012-2017
Unit of data: Claim
Dataset website: https://www.resdac.org/cms-data/files/carrier-ffs
General description: This file is also known as the Physician/Supplier Part B claims file. It contains final action fee-for-service claims submitted on a CMS-1500 claim form. Most of the claims are from non-institutional providers, such as physicians, physician assistants, clinical social workers, nurse practitioners, and free-standing facilities.
Common Key Linking Variables
CLM_ID is the unique identifier for a given claim.
Hospital Linking:
- CARR_NUM can be used to identify the carrier from which the claim was submitted.
Provider Linking:
- NPI can be used to uniquely identify providers.
Geographic Linking:
- ZIP code data are provided for every claim.
Carrier Structure
Base Claim File
Every row of the claim file represents a claim submitted to CMS.
The primary key of the claim file is CLM_ID.
Line File
Every row of the line file represents a line-level record; a given claim can have more than one such row.
The primary key of the line file is the combination of CLM_ID and CLM_LN.
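The relationship between the two files can be illustrated with a small sketch. This is not DataCore code: it uses Python's built-in sqlite3 module as a stand-in for the SQL database, the table names claim and line and the LINE_NOTE column are hypothetical, and the rows are invented for illustration. Only CLM_ID and CLM_LN come from the description above.

```python
import sqlite3

# Toy in-memory database. Table names and LINE_NOTE are hypothetical;
# only CLM_ID and CLM_LN are taken from the dataset description.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE claim (
        CLM_ID TEXT PRIMARY KEY          -- one row per claim
    );
    CREATE TABLE line (
        CLM_ID TEXT NOT NULL REFERENCES claim (CLM_ID),
        CLM_LN INTEGER NOT NULL,         -- line number within the claim
        LINE_NOTE TEXT,                  -- placeholder for line-level fields
        PRIMARY KEY (CLM_ID, CLM_LN)     -- composite key
    );
    """
)

# Invented example rows, not real CMS data.
conn.executemany("INSERT INTO claim (CLM_ID) VALUES (?)", [("A1",), ("B2",)])
conn.executemany(
    "INSERT INTO line (CLM_ID, CLM_LN, LINE_NOTE) VALUES (?, ?, ?)",
    [("A1", 1, "first line"), ("A1", 2, "second line"), ("B2", 1, "only line")],
)
conn.commit()


def lines_per_claim(connection):
    """Return {CLM_ID: number of line rows} by joining claim to line."""
    rows = connection.execute(
        """
        SELECT c.CLM_ID, COUNT(l.CLM_LN)
        FROM claim AS c
        LEFT JOIN line AS l ON l.CLM_ID = c.CLM_ID
        GROUP BY c.CLM_ID
        ORDER BY c.CLM_ID
        """
    ).fetchall()
    return dict(rows)


def insert_is_rejected(connection, clm_id, clm_ln):
    """Return True if inserting (clm_id, clm_ln) violates the composite key."""
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(
                "INSERT INTO line (CLM_ID, CLM_LN) VALUES (?, ?)",
                (clm_id, clm_ln),
            )
    except sqlite3.IntegrityError:
        return True
    return False


counts = lines_per_claim(conn)
```

In this sketch, joining on CLM_ID pairs each claim with all of its line rows, and the composite key allows a CLM_LN value to repeat across claims but not within one claim.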
DataCore Staff Errata
5/28/2019: No data errata, data exceptions or data corrections have been issued.
DataCore Purveyor Errata
5/28/2019: No data errata, data exceptions or data corrections have been implemented.
Provenance
CMS sent the claims files as comma-separated values (.csv) files, along with a SAS load script and a data dictionary. The data dictionary files turned out to be incorrect and could not be used to load the data into SQL, so the process below was used instead.
For the code used for these processes, email datacore@osumc.edu.
- The .csv files were loaded into SAS using the provided SAS load files.
- SQL tables were created using the proc sql "create table like" command in SAS.
- SAS was then used to convert the .csv files into tab-separated values (.tsv) files.
- A bulk copy program (BCP) was used to upload the .tsv files into SQL.
- The provided data dictionary was used to generate metadata about the dataset fields, which was then used to generate the data dictionary.
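For readers who want a feel for the .csv to .tsv conversion step, here is a minimal sketch. The DataCore process used SAS for this step; the Python below is a substitute illustration only, not the code that was run (that code is available by email as noted above). The function name csv_to_tsv, the NOTE column, and the sample row are all hypothetical. The sketch assumes no field should contain a tab or line break, and it raises an error rather than guessing how to handle one.

```python
import csv
import io


def csv_to_tsv(src, dst):
    """Copy rows from a CSV text stream to a TSV text stream.

    Returns the number of rows written. Raises ValueError if any field
    contains a tab or line break, since that would corrupt a plain
    tab-delimited file.
    """
    count = 0
    for row in csv.reader(src):
        for field in row:
            if "\t" in field or "\n" in field or "\r" in field:
                raise ValueError(
                    f"row {count + 1}: field contains a tab or line break"
                )
        dst.write("\t".join(row) + "\n")
        count += 1
    return count


# Invented sample input; the quoted field contains a comma, which is safe
# once the delimiter is a tab.
sample = io.StringIO('CLM_ID,CLM_LN,NOTE\nA1,1,"office visit, established"\n')
out = io.StringIO()
rows_written = csv_to_tsv(sample, out)
```

With real files, the same function could be called on handles opened with `open(path, newline="")` for the input, as the csv module documentation recommends.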