Databases Containing Claims for Payment

Medicare claims: Medicare claims come in several forms: random 5% and 20% samples, the full 100% file (very hard to get), condition-specific files pre-assembled in the Chronic Conditions Warehouse, and custom data pulls available on request (e.g., all patients who had cataract surgery in the last 2 years).

Medicare Claims Link: https://www.resdac.org/

Medicare Claims Pro Tip: The 5% and 20% random samples are drawn anew each year, so you cannot use, e.g., 10 years of the 20% random sample to follow specific individuals over 10 years. However, you can request an “enhanced 5% sample” that does allow this.
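Before treating a stack of yearly sample files as a panel, it can help to measure how many patient identifiers actually persist from year to year. The sketch below is a minimal illustration of that check, not a description of any official file layout: the `ids_by_year` mapping and the single-letter identifiers are hypothetical stand-ins for the encrypted beneficiary IDs you would read from your own extracts.

```python
from typing import Dict, Iterable, Set


def followable_ids(ids_by_year: Dict[int, Iterable[str]]) -> Set[str]:
    """Return the identifiers that appear in every yearly file."""
    yearly_sets = [set(ids) for ids in ids_by_year.values()]
    if not yearly_sets:
        return set()
    return set.intersection(*yearly_sets)


def retention_from_first_year(ids_by_year: Dict[int, Iterable[str]]) -> Dict[int, float]:
    """Share of the first year's identifiers still present in each year."""
    years = sorted(ids_by_year)
    if not years:
        return {}
    base = set(ids_by_year[years[0]])
    if not base:
        return {year: 0.0 for year in years}
    return {year: len(base & set(ids_by_year[year])) / len(base) for year in years}


# Hypothetical identifiers, for illustration only.
ids_by_year = {
    2015: {"A", "B", "C", "D"},
    2016: {"A", "B", "E", "F"},
    2017: {"A", "G", "H", "I"},
}

print(followable_ids(ids_by_year))
print(retention_from_first_year(ids_by_year))
```

If the retention share falls off sharply after the first year, the files are behaving like independent cross-sections rather than a cohort, and a longitudinal design would need a different extract.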


Truven Health (IBM) MarketScan Databases: These are a set of databases of insurance claims compiled from health insurers, large employers, and government programs. Truven reports that, since 1995, the claims of 240 million unique individuals have been incorporated into the databases. The actual number of patients included varies year to year, but patients can be followed across years and insurers (as long as they stay with an insurer or employer in the dataset). Claims include physician, hospital, and pharmacy claims.

Truven/IBM link: http://truvenhealth.com/markets/life-sciences/products/data-tools/marketscan-databases

Truven Health MarketScan Databases Pro Tip: Providers (e.g., physicians, hospitals) cannot be identified. Furthermore, because different insurers use different identifiers for the same provider, claims cannot be aggregated up to the provider level, even with the encrypted provider identifier.


Optum Insight Databases: These are a set of databases of insurance claims compiled from health insurers, large employers, and government programs. Optum reports that, since 1993, the claims of 216 million unique individuals have been incorporated into the databases. The actual number of patients included varies year to year, but patients can be followed across years and insurers (as long as they stay with an insurer or employer in the dataset). Claims include physician, hospital, and pharmacy claims.

Optum Insight link: https://www.optum.com/solutions/data-analytics/data/real-world-data-analytics-a-cpl/claims-data.html

Optum Insight Pro Tip: Optum offers seven different “views” of their data, and the choice of view matters a lot. For example, as of March 2018, you cannot get a view that provides both an indicator for patient death and a patient geographic location smaller than the state in which the patient resides. Therefore, you cannot, for example, calculate mortality in a ZIP code, city, or county.
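To make the constraint concrete, here is a minimal sketch of the finest-grained mortality calculation that such a view would support: a crude proportion of deaths per state. The record layout is an assumption made for illustration; the `state` and `died` field names are hypothetical and are not Optum's actual column names.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def crude_mortality_by_state(patients: Iterable[Mapping]) -> Dict[str, float]:
    """Return deaths divided by patients for each state.

    Each record is assumed to carry a 'state' code and a boolean 'died'
    flag (hypothetical field names). No finer geography is used because
    none is assumed to be available alongside the death indicator.
    """
    totals = defaultdict(int)
    deaths = defaultdict(int)
    for patient in patients:
        state = patient["state"]
        totals[state] += 1
        if patient["died"]:
            deaths[state] += 1
    return {state: deaths[state] / totals[state] for state in totals}


# Made-up records, for illustration only.
patients = [
    {"state": "MI", "died": True},
    {"state": "MI", "died": False},
    {"state": "OH", "died": False},
    {"state": "OH", "died": False},
]

print(crude_mortality_by_state(patients))
```

This is an unadjusted proportion; a real analysis would normally account for age, sex, and enrollment time. Grouping by anything below the state level would require a geographic field that, per the tip above, cannot be obtained together with the death indicator.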


All Payer Claims Databases (APCDs)/Multipayer Claims Databases (MPCDs): These are databases compiled by individual state agencies or state-designated organizations that aggregate claims from all or most of the payers (insurers and large employers plus Medicaid and often plus Medicare) in a state. Patients can usually be followed for years, even if they move from employer-based insurance to Medicaid to Medicare. Claims usually include physician, hospital, and pharmacy claims. The APCD Council, the National Association of Health Data Organizations, and the University of New Hampshire maintain a list of which states have APCDs.

Map Listing States with APCDs: https://www.apcdcouncil.org/state/map

APCD Pro Tip: APCDs capture the vast majority of a state's residents in a database that allows them to be followed over time. The exception is patients without insurance: because no claims are filed for them, nothing is reported to an APCD.