How To Evaluate A Healthcare Program

This article outlines a practical method for evaluating healthcare programs using standards commonly applied in CMS evaluations. The goal is to separate true program effects from background noise.

1. Establish the logic model

Before running any analysis, define the expected pathway from intervention to outcome. For example: early outreach and medication reconciliation can reduce avoidable acute events, which then lowers inpatient utilization and total spend.

  • List each operational mechanism and the expected downstream outcome.
  • Define timing assumptions: how quickly should each effect appear?
  • Identify outcomes that should change first vs. later.
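A lightweight way to make these assumptions testable is to write the logic model down as data. The sketch below is illustrative only; the mechanism names and lag windows are placeholders for this example, not CMS-prescribed values.

```python
from dataclasses import dataclass

@dataclass
class Pathway:
    """One hypothesized mechanism-to-outcome link in the logic model."""
    mechanism: str            # operational activity the program performs
    outcome: str              # measurable downstream outcome
    expected_lag_months: int  # how soon the effect should be visible

# Illustrative pathways; the names and lags are assumptions for this example.
LOGIC_MODEL = [
    Pathway("early outreach", "ED visits", expected_lag_months=3),
    Pathway("medication reconciliation", "30-day readmissions", expected_lag_months=6),
    Pathway("fewer avoidable acute events", "PMPM total cost", expected_lag_months=12),
]

for p in LOGIC_MODEL:
    print(f"{p.mechanism} -> {p.outcome}: expect movement by month {p.expected_lag_months}")
```

Writing the model down this way also tells you, later on, which outcomes should have moved first; a program whose twelve-month outcomes move before its three-month outcomes deserves skepticism.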

2. Choose an evaluation design

When randomized assignment is not feasible, as in most real-world settings, a difference-in-differences (DiD) design is often the strongest practical approach.

  • Treatment trend: pre/post change among participants.
  • Comparison trend: pre/post change among similar non-participants during the same period.
  • Program effect: the difference between those two changes.

This structure helps net out system-wide effects such as policy shifts, coding changes, or pandemic-era shocks.
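This two-by-two arithmetic maps directly onto an interaction regression. Below is a minimal sketch using statsmodels; the file name and column names (pmpm, treated, post, member_id) are assumptions about your data layout, not a required schema.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical member-month panel; column names are assumptions:
#   pmpm      - outcome (per-member-per-month spend)
#   treated   - 1 for program participants, 0 for comparison members
#   post      - 1 for months after program launch, 0 before
#   member_id - used to cluster standard errors
df = pd.read_csv("panel.csv")

# The coefficient on treated:post is the difference-in-differences estimate:
# (pre/post change among participants) minus (pre/post change among comparators).
model = smf.ols("pmpm ~ treated + post + treated:post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["member_id"]}
)
print(model.summary().tables[1])
```

Clustering standard errors by member matters here: repeated observations on the same person are not independent, and ignoring that understates uncertainty.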

3. Build a valid comparison group

The comparison group must look like the treatment group before program launch.

  • Use propensity score matching on age, sex, geography, chronic burden, and baseline utilization.
  • Check pre-period parallel trends in the core outcomes.
  • Document balance diagnostics and unmatched exclusions.
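A minimal matching-and-balance sketch follows, using scikit-learn. The covariate and column names mirror the list above but are placeholders; a production analysis would typically add a caliper, consider matching without replacement, and still check parallel trends afterward.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Hypothetical extract; 'treated' flags participants. Column names are assumptions.
covariates = ["age", "sex", "region", "chronic_condition_count", "baseline_admits"]
df = pd.read_csv("members.csv")

# 1. Estimate propensity scores with logistic regression.
X = pd.get_dummies(df[covariates], drop_first=True)
df["pscore"] = LogisticRegression(max_iter=1000).fit(X, df["treated"]).predict_proba(X)[:, 1]

# 2. 1:1 nearest-neighbor matching on the propensity score (with replacement, no caliper).
treated = df[df["treated"] == 1]
controls = df[df["treated"] == 0]
_, idx = NearestNeighbors(n_neighbors=1).fit(controls[["pscore"]]).kneighbors(treated[["pscore"]])
matched = controls.iloc[idx.ravel()]

# 3. Balance diagnostic: standardized mean difference per covariate
#    (|SMD| < 0.1 is a commonly used threshold for acceptable balance).
for col in X.columns:
    t, c = X.loc[treated.index, col], X.loc[matched.index, col]
    smd = (t.mean() - c.mean()) / np.sqrt((t.var() + c.var()) / 2)
    print(f"{col}: SMD = {smd:+.3f}")
```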

4. Measure both outcomes and implementation

Quantitative measures tell you what changed. Qualitative evidence helps explain why.

  • Spending: PMPM total cost and service-specific costs.
  • Utilization: admissions, ED visits, SNF days, readmissions.
  • Quality: potentially avoidable events and evidence-based process measures.
  • Health outcomes: mortality and days at home.
  • Implementation: clinician interviews, workflow assessment, and patient feedback.
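For the spending measures, the PMPM arithmetic is simple but easy to get wrong when enrollment varies month to month: divide allowed spend by member-months, not by a headcount. A brief sketch, assuming hypothetical claims and eligibility extracts with illustrative column names:

```python
import pandas as pd

# Hypothetical inputs; column names are assumptions.
claims = pd.read_csv("claims.csv")       # one row per claim: member_id, month, allowed_amt
eligibility = pd.read_csv("elig.csv")    # one row per enrolled member-month

# PMPM = total allowed spend / total member-months, computed per calendar month.
monthly_spend = claims.groupby("month")["allowed_amt"].sum()
member_months = eligibility.groupby("month")["member_id"].count()
pmpm = (monthly_spend / member_months).rename("pmpm")
print(pmpm)
```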

5. Anticipate real-world data constraints

  • The risk of unmeasured confounding grows with program duration, because treatment and comparison populations can diverge over time in ways baseline matching does not capture.
  • Small samples produce wide confidence intervals that can mask meaningful effects (see the sketch below).
  • Administrative claims are often preferable to program-collected data: they are captured the same way for treatment and comparison members, and they keep the analysis reproducible.
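To see how sample size limits what an evaluation can detect, compute the minimum detectable effect (MDE) for a two-arm comparison of means. The sketch below uses the standard normal-approximation formula; the dollar figures are illustrative, not benchmarks.

```python
from scipy.stats import norm

def minimum_detectable_effect(n_per_arm, sd, alpha=0.05, power=0.8):
    """Smallest true difference in means a two-arm comparison can reliably detect.

    MDE = (z_{1-alpha/2} + z_{power}) * sd * sqrt(2 / n_per_arm)
    """
    z_alpha = norm.ppf(1 - alpha / 2)
    z_power = norm.ppf(power)
    return (z_alpha + z_power) * sd * (2 / n_per_arm) ** 0.5

# Illustrative only: monthly spend standard deviation of $900 per member.
for n in (200, 1000, 5000):
    print(f"n={n:>5} per arm -> MDE of about ${minimum_detectable_effect(n, sd=900):.0f} PMPM")
```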

6. Interpret beyond a yes/no result

  • Magnitude: Is the effect practically meaningful?
  • Uncertainty: What does the confidence interval imply for risk-bearing decisions?
  • Probability framing: What is the chance of at least modest savings?
  • Subgroups: Which populations drive effect heterogeneity?
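One simple way to operationalize the probability framing: treat the point estimate and standard error from the DiD model as an approximately normal summary of the evidence (a simplifying assumption, not a full Bayesian analysis) and read off the chance that savings exceed a threshold. The numbers below are illustrative only.

```python
from scipy.stats import norm

def prob_savings_at_least(estimate, std_err, threshold):
    """P(true savings >= threshold), treating the estimate as approximately
    normal around the true effect (a simplifying assumption, not a full
    Bayesian analysis). Savings are entered as positive numbers."""
    return 1 - norm.cdf(threshold, loc=estimate, scale=std_err)

# Illustrative numbers only: point estimate of $45 PMPM saved, SE of $30.
est, se = 45.0, 30.0
print(f"P(any savings)         = {prob_savings_at_least(est, se, 0):.0%}")
print(f"P(savings >= $25 PMPM) = {prob_savings_at_least(est, se, 25):.0%}")
```

Framed this way, a result that fails a 5% significance test can still imply a high probability of at least modest savings, which is often the more decision-relevant quantity for a risk-bearing organization.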
