CMS Medicaid Analytic Extract (MaxFile) Medicaid Claims Data: 100 Percent of Claims for 14 Southern States, 2004-2007

Resource Type
Dataset : administrative records data
Creators
  • Rust, George (Morehouse School of Medicine)
Publication Date
Funding Reference
  • United States Department of Health and Human Services. Agency for Healthcare Research and Quality
    • Award Number: 1R24HS01947001
Free Keywords
Schema: ICPSR
health care access; health care services; Medicaid; Medicare; patients
  • Abstract

    Purpose. This was a Data Capacity-Building Project to build a robust comparative effectiveness research infrastructure, agenda, and collaborative partnerships focused on eliminating health disparities. Specifically, a database was built comprising all Medicaid enrollees and claims in the states that share both adverse minority health outcomes and the historical roots of racial health disparities in the South.

    Setting and Participants. A 100 percent sample of four years (2004-2007) of Medicaid Analytic Extract (MAX-file) data (plus Medicare-linked claims for dual-eligibles) was obtained from the Centers for Medicare and Medicaid Services (CMS) for fourteen southern states. The data represent 3.8 to 5.4 million persons each year: one-third of all United States Medicaid enrollees, including nearly half (48 percent) of African American and 21 percent of Latino Medicaid enrollees in the United States. This region is the epicenter of the Black-White health disparities epidemic, and has also experienced a recent and rapid influx of Latino immigrants. This project provided support for personnel and infrastructure needed to efficiently organize and analyze these data to support minority investigators. The HBCU-based team had extensive previous experience training health services researchers (especially minority investigators) to use Medicaid claims data for research.

    Specific Aims: Using Medicaid Claims Data

    1. To build a Medicaid claims dataset (including socioeconomic, contextual, and geospatial analytic variables, NDC cross-walk data and therapeutic class codes, as well as certain Medicare data for dual-eligibles) to support projects focused on the intersection between disparities research and comparative effectiveness research in clinically and socially complex patient populations.

    2. To create an efficient process for assisting non-Morehouse investigators to develop research protocols, analysis plans, CMS data re-use requests, and analytic files for collaborative research.

    3. To train, develop, cultivate, and support emerging minority investigators (especially at historically Black colleges and universities [HBCUs] and other minority-serving institutions) as independently funded health services researchers who are increasingly proficient in multivariate analysis of Medicaid and Medicare claims data.

    4. To cultivate comparative effectiveness and disparities research collaborations with Georgia Tech experts in mathematics, complexity science, simulation modeling, and interactive computing.

    Relevance. Medicaid patients are characterized by clinical and social complexity -- the very characteristics which often exclude them from clinical trials and yet drive health disparities. This Medicaid-based dataset populates studies that help users understand how local area, provider-level, and patient-level differences in treatment (natural experiments in comparative effectiveness) influence clinical and economic outcomes. Variation implies that disparities are not inevitable. The comparative impact of this natural variation can be measured in meaningful outcomes such as emergency department visits, hospital admissions, inpatient bed-days, deaths, and total Medicaid expenditures, as well as community-level disparity rate-ratios. Medicaid data allow users to follow a complex patient (e.g., comorbid diabetes and schizophrenia or COPD and CHF) from treatment to outcomes through every billable service in the health care system (i.e., from doctor's visit to lab tests to prescriptions to emergency room visits or hospital admissions). Morehouse School of Medicine has a unique ability to develop a new cadre of minority investigators to conduct and interpret the results of health services research with a racially sensitive, culturally competent perspective.

    Data Overview. The Centers for Medicare and Medicaid Services produces the MAX-files from Medicaid Statistical Information System (MSIS) data submitted by each state, with some data cleaning and validation by CMS sub-contractors before data are released to researchers.

    The MAX-file data from CMS were loaded onto encrypted, secure servers at Morehouse School of Medicine. Research analytic files were created for each sub-project, including sickle cell disease; diabetes and schizophrenia; asthma; dementia; and congestive heart failure. For specific sub-projects, contextual variables from census data or the Area Resource File were linked by county FIPS code.
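As a rough illustration of linking contextual variables by county FIPS code, the sketch below joins county-level values onto person-level rows. It is not the project's SAS code: the field names (`person_id`, `state_fips`, `county_fips`, `pct_poverty`) are hypothetical, and the poverty values are made up for the example. The main point is that state and county parts should be zero-padded into a 5-digit key before matching.

```python
# Hypothetical field names and made-up values; real MAX and census layouts differ.

def county_fips(state_code, county_code):
    """Build a 5-digit county FIPS key (2-digit state + 3-digit county)."""
    return f"{int(state_code):02d}{int(county_code):03d}"


def link_context(person_rows, county_rows):
    """Left-join county-level context onto person-level rows by FIPS key."""
    by_fips = {row["fips"]: row for row in county_rows}
    linked = []
    for person in person_rows:
        key = county_fips(person["state_fips"], person["county_fips"])
        context = by_fips.get(key, {})
        merged = dict(person)
        merged["fips"] = key
        # Unmatched counties get None so the gap stays visible downstream.
        merged["pct_poverty"] = context.get("pct_poverty")
        linked.append(merged)
    return linked


people = [
    {"person_id": "A1", "state_fips": "13", "county_fips": "121"},
    {"person_id": "B2", "state_fips": 1, "county_fips": 73},  # numeric input
    {"person_id": "C3", "state_fips": "48", "county_fips": "999"},  # no match
]
counties = [
    {"fips": "13121", "pct_poverty": 15.0},  # illustrative value only
    {"fips": "01073", "pct_poverty": 17.5},  # illustrative value only
]
linked = link_context(people, counties)
```

Keeping unmatched rows (a left join) rather than dropping them makes it easier to audit how many enrollees lack contextual data for a given county.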

    Data Access. The data cannot be made publicly available. Data are stored on Morehouse School of Medicine encrypted servers, and may be used only for projects covered within the aims of the original research protocol and Centers for Medicare and Medicaid Services (CMS)-approved data use agreement. Data sharing is allowed only for research protocols approved under data re-use requests by the CMS privacy board. The CMS process for data re-use requests is described at the ResDAC Web site.

    Because of limited research staff within the Morehouse National Center for Primary Care and the limits of the existing CMS data use agreement, only re-use requests consistent with the original aims of the approved research protocol are considered. Those aims concern temporal and geographic variation in racial-ethnic disparities in quality, access, and outcomes for Medicaid enrollees in 14 southern states. The specific aims of the current research protocol define which research questions can be answered and which sub-projects can be developed within the existing research protocol and data use agreement (see the "Specific Aims" section above). A worksheet for developing an analysis plan for a specific research question is attached. Parties interested in the data should contact George Rust, MD, MPH (

    Six SAS program syntax files used for data analysis, however, are available on the ICPSR site.

    Aside from data re-use requests, the Morehouse National Center for Primary Care is open to collaborations that address these research aims and are consistent with its health equity research priorities. In such collaborations, analyses could be performed by the Morehouse National Center for Primary Care research team, with papers authored or co-authored by faculty from other minority-serving institutions or faculty affiliated with the Research Centers in Minority Institutions (RCMI) Translational Research Network (RTRN).

  • Methods

    Presence of Common Scales: Emergency department visits, hospital admissions, total inpatient bed-days, total Medicaid charges ($), and Elixhauser comorbidity index.
  • Abstract

    Datasets:
      DS0: Study-Level Files
      DS1: Capture Count ED Visits
      DS2: Calculate a Comorbidity Index
      DS3: Identify Hospital Admits and ED Visits re: Ambulatory Care Sensitive Conditions (ACSC)
      DS4: Identify Persons By Diagnosis Diabetes Schizophrenia CHF Asthma Sickle Cell Examples
      DS5: Identify Person Having RX Claim of (B-Blocker ACE ARB, Inhaled Corticosteroid, SABA, and Antidepressant)
      DS6: Identify Medicare Hospital Admit Records for a Dual-Eligible Person
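To give a sense of what a program like DS4 (identifying persons by diagnosis) does, here is a minimal sketch in Python. It does not reproduce the SAS syntax files distributed with this study. The record layout (`person_id`, `dx_codes`) is hypothetical, and the ICD-9-CM prefixes are deliberately simplified; published case definitions are typically broader and often require more than one qualifying claim.

```python
from collections import defaultdict

# Simplified ICD-9-CM prefixes (decimal point removed). Assumption: real case
# definitions used in the study may include additional codes and claim rules.
CONDITION_PREFIXES = {
    "diabetes": ("250",),
    "schizophrenia": ("295",),
    "chf": ("428",),
    "asthma": ("493",),
    "sickle_cell": ("2826",),
}


def normalize(code):
    """Strip the decimal point and whitespace so '250.00' matches '25000'."""
    return code.replace(".", "").strip().upper()


def flag_conditions(claims):
    """Map each person_id to the set of conditions seen on any claim."""
    flags = defaultdict(set)
    for claim in claims:
        for raw in claim["dx_codes"]:
            code = normalize(raw)
            for condition, prefixes in CONDITION_PREFIXES.items():
                if code.startswith(prefixes):
                    flags[claim["person_id"]].add(condition)
    return dict(flags)


# Hypothetical claim records for illustration only.
claims = [
    {"person_id": "A1", "dx_codes": ["250.00", "4280"]},
    {"person_id": "A1", "dx_codes": ["29530"]},
    {"person_id": "B2", "dx_codes": ["493.90"]},
    {"person_id": "C3", "dx_codes": ["4019"]},
]
flags = flag_conditions(claims)
```

Persons with no matching diagnosis (here `C3`) are simply absent from the result, so a comorbid cohort such as diabetes with schizophrenia can be selected by checking that both labels appear in a person's set.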
Temporal Coverage
  • 2004-01-01 / 2007-12-31
    Time Period: 2004-01-01 to 2007-12-31
Geographic Coverage
  • North Carolina
  • United States
  • Tennessee
  • Kentucky
  • Alabama
  • Florida
  • Arkansas
  • South Carolina
  • Mississippi
  • Texas
  • Missouri
  • Louisiana
  • Georgia
  • Virginia
  • Maryland
Sampled Universe
All Medicaid claims for all Medicaid enrollees in 14 states for the years 2004-2007.
Smallest Geographic Unit: All records can be linked to the individual's place of residence at the ZIP code or county level.
100 percent of claims for 100 percent of enrollees.
Collection Mode
  • Dr. Robert Levine (Meharry Medical College) and Dr. Usha Sambamoorthi (West Virginia University) were special collaborators.

One or more files in this study are not available for download due to special restrictions; consult the study documentation to learn more on how to obtain the data.
  • Is version of
    DOI: 10.3886/ICPSR34353

Update Metadata: 2020-11-18 | Issue Number: 8 | Registration Date: 2015-06-16

Rust, George (2013): CMS Medicaid Analytic Extract (MaxFile) Medicaid Claims Data: 100 Percent of Claims for 14 Southern States, 2004-2007. Version: v1. ICPSR - Inter-university Consortium for Political and Social Research. Dataset.