How To Evaluate A Healthcare Program
This article outlines a practical method for evaluating healthcare programs using standards commonly applied in CMS evaluations. The goal is to separate true program effects from background noise.
1. Establish the logic model
Before running any analysis, define the expected pathway from intervention to outcome. For example: early outreach and medication reconciliation can reduce avoidable acute events, in turn lowering inpatient utilization and total spend.
- List each operational mechanism and the expected downstream outcome.
- Define timing assumptions: how quickly should each effect appear?
- Identify which outcomes should change first and which later; the sketch below encodes this ordering.
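One way to make the logic model concrete before any statistics run is to write it down as plain data. A minimal sketch in Python; the mechanisms, outcomes, and lags below are hypothetical examples, not a required schema:

```python
# A logic model written down as plain data before any statistics are run.
# Mechanism names, outcomes, and lags are hypothetical examples only.
LOGIC_MODEL = [
    # (mechanism, expected downstream outcome, expected lag in months)
    ("early outreach",            "fewer missed follow-ups",        3),
    ("medication reconciliation", "fewer avoidable acute events",   6),
    ("fewer acute events",        "lower inpatient utilization",    9),
    ("lower utilization",         "lower total PMPM spend",        12),
]

# Sorting by lag gives the order in which outcomes should move, i.e. what
# to check at each evaluation wave.
for mechanism, outcome, lag in sorted(LOGIC_MODEL, key=lambda row: row[2]):
    print(f"{lag:>2} mo: {mechanism} -> {outcome}")
```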
2. Choose an evaluation design
In most real-world settings, where randomization is rarely feasible, a difference-in-differences (DiD) design is the strongest practical approach.
- Treatment trend: pre/post change among participants.
- Comparison trend: pre/post change among similar non-participants during the same period.
- Program effect: the difference between those two changes.
This structure helps net out system-wide effects such as policy shifts, coding changes, or pandemic-era shocks.
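A minimal sketch of the design as a regression, using statsmodels on synthetic person-period data. The column names (`y`, `treated`, `post`, `person_id`) and the synthetic effect sizes are assumptions for illustration; the coefficient on the treated-by-post interaction is the program effect defined above:

```python
# A minimal difference-in-differences sketch on synthetic person-period
# data. Column names and effect sizes are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000  # person-periods
df = pd.DataFrame({
    "person_id": rng.integers(0, 500, n),
    "treated": rng.integers(0, 2, n),
    "post": rng.integers(0, 2, n),
})
# Synthetic outcome: a system-wide post-period shift, a baseline difference
# between groups, and a true -$40 PMPM program effect on the interaction.
df["y"] = (
    1000
    + 50 * df["post"]                   # system-wide post-period shift
    + 20 * df["treated"]                # baseline group difference
    - 40 * df["treated"] * df["post"]   # true program effect
    + rng.normal(0, 100, n)
)

# The coefficient on treated:post is the DiD estimate: the participants'
# pre/post change net of the comparison group's pre/post change.
did = smf.ols("y ~ treated * post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["person_id"]}
)
print(did.params["treated:post"])
```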
3. Build a valid comparison group
The comparison group must look like the treatment group before program launch.
- Use propensity score matching on age, sex, geography, chronic burden, and baseline utilization (see the matching sketch after this list).
- Check pre-period parallel trends in the core outcomes.
- Document balance diagnostics and unmatched exclusions.
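A minimal 1:1 matching sketch with scikit-learn. The column names and the with-replacement nearest-neighbor choice are illustrative assumptions, not the only defensible design:

```python
# A minimal 1:1 propensity-score matching sketch with scikit-learn.
# Column names and the with-replacement nearest-neighbor choice are
# illustrative assumptions.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def match_comparison(df: pd.DataFrame, covariates: list[str]) -> pd.DataFrame:
    """Pair each treated member with the nearest-propensity non-participant."""
    X = df[covariates].to_numpy()
    t = df["treated"].to_numpy()
    pscore = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]

    treated, pool = df[t == 1], df[t == 0]
    nn = NearestNeighbors(n_neighbors=1).fit(pscore[t == 0].reshape(-1, 1))
    _, idx = nn.kneighbors(pscore[t == 1].reshape(-1, 1))
    matched = pool.iloc[idx.ravel()]  # comparators may repeat (with replacement)

    # Balance diagnostic: standardized mean differences, ideally |SMD| < 0.1.
    for c in covariates:
        smd = (treated[c].mean() - matched[c].mean()) / df[c].std()
        print(f"{c}: SMD after matching = {smd:+.3f}")
    return pd.concat([treated, matched])
```

Matching alone does not guarantee parallel trends; the pre-period trend check in the second bullet still has to be run on the matched sample.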
4. Measure both outcomes and implementation
Quantitative measures tell you what changed. Qualitative evidence helps explain why.
- Spending: per-member-per-month (PMPM) total cost and service-specific costs (a PMPM sketch follows this list).
- Utilization: admissions, emergency department (ED) visits, skilled nursing facility (SNF) days, and readmissions.
- Quality: potentially avoidable events and evidence-based process measures.
- Health outcomes: mortality and days at home.
- Implementation: clinician interviews, workflow assessment, and patient feedback.
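As one concrete example from the spending bullet, PMPM is total paid amount divided by member-months. A minimal sketch, assuming hypothetical `claims` and `eligibility` tables:

```python
# A minimal PMPM sketch. The schemas are hypothetical: `claims` has one
# row per claim line with `month` and `paid_amount`; `eligibility` has one
# row per member-month of enrollment with `member_id` and `month`.
import pandas as pd

def pmpm_by_month(claims: pd.DataFrame, eligibility: pd.DataFrame) -> pd.Series:
    paid = claims.groupby("month")["paid_amount"].sum()                   # numerator
    member_months = eligibility.groupby("month")["member_id"].nunique()  # denominator
    return (paid / member_months).rename("pmpm")
```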
5. Anticipate real-world data constraints
- The risk of unmeasured confounding grows with program duration, as matched groups drift apart over longer follow-up.
- Small samples can hide meaningful effects when confidence intervals are wide; the minimum-detectable-effect sketch below makes this concrete.
- Administrative claims are often the most practical data source: they are captured the same way for treatment and comparison members and support reproducible analysis.
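To make the small-sample point concrete, here is a rough minimum-detectable-effect calculation under standard two-group assumptions; the sample size and standard deviation below are illustrative numbers, not benchmarks:

```python
# A rough minimum-detectable-effect (MDE) sketch for a two-group mean
# comparison with equal arms. Sample size and SD are illustrative numbers.
from scipy import stats

def mde(n_per_group: int, sd: float, alpha: float = 0.05, power: float = 0.8) -> float:
    z_alpha = stats.norm.ppf(1 - alpha / 2)   # two-sided test
    z_power = stats.norm.ppf(power)
    se = sd * (2 / n_per_group) ** 0.5        # SE of a difference in means
    return (z_alpha + z_power) * se

# e.g., 500 members per arm with an $800 PMPM standard deviation:
print(f"MDE is roughly ${mde(500, 800):.0f} PMPM")  # about $142
```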
6. Interpret beyond a yes/no result
- Magnitude: Is the effect practically meaningful?
- Uncertainty: What does the confidence interval imply for risk-bearing decisions?
- Probability framing: What is the chance of at least modest savings? (A sketch of this calculation follows the list.)
- Subgroups: Which populations drive effect heterogeneity?
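A sketch of the probability framing under a normal approximation of the effect estimate. The point estimate, standard error, and the $25 PMPM threshold for "modest savings" are all illustrative assumptions, not results:

```python
# Probability framing under a normal approximation of the effect estimate.
# The point estimate, SE, and the $25 PMPM savings threshold are all
# illustrative assumptions.
from scipy import stats

estimate, se = -40.0, 30.0   # e.g., a -$40 PMPM DiD estimate with SE $30
threshold = -25.0            # "at least modest savings" = $25+ PMPM saved

p_modest = stats.norm.cdf(threshold, loc=estimate, scale=se)
print(f"P(savings of at least $25 PMPM) is roughly {p_modest:.0%}")  # ~69%
```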