How To Account For Outliers In Spending Data
In healthcare program evaluation, outliers can materially distort spending estimates. This guide describes practical, reproducible methods to control that risk while keeping the analysis decision-ready.
1. Use a 99th-percentile reset (winsorization)
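A common approach is to cap spending at a high percentile threshold. For example, values above the 99th percentile are reset to the 99th-percentile value. This is a one-sided (upper-tail) winsorization: high-cost records stay in the data, but their influence on the mean is bounded.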
- Purpose: limit distortion from rare catastrophic events.
- Rule: apply the exact same cap logic to all cohorts.
- Reporting: show both uncapped and capped results when possible.
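The capping rule above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function names `percentile_cap` and `winsorize_upper` are hypothetical, the spending figures are invented, and computing a single cap from the pooled cohorts is an assumption (a contract may define the cap differently).

```python
import statistics

def percentile_cap(values, pct=99):
    """Return the pct-th percentile (linear interpolation, inclusive method)."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[pct - 1]

def winsorize_upper(values, cap):
    """Reset every value above cap to cap; leave other values unchanged."""
    return [min(v, cap) for v in values]

# Hypothetical annual spending per member (illustrative numbers only).
comparison = [1000.0 + 10.0 * i for i in range(100)]
treatment = [1000.0 + 10.0 * i for i in range(99)] + [400000.0]  # one catastrophic case

# Assumption: one cap, computed on the pooled data, applied to both cohorts.
cap = percentile_cap(treatment + comparison, pct=99)
treatment_capped = winsorize_upper(treatment, cap)
comparison_capped = winsorize_upper(comparison, cap)

# Report uncapped and capped means side by side for each cohort.
for name, raw, capped in [
    ("treatment", treatment, treatment_capped),
    ("comparison", comparison, comparison_capped),
]:
    print(f"{name}: uncapped mean = {statistics.mean(raw):.2f}, "
          f"capped mean = {statistics.mean(capped):.2f}")
```

Percentile definitions differ slightly between tools, so the interpolation method is worth recording alongside the threshold.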
2. Run site-level sensitivity tests
In multi-site programs, one location can dominate aggregate outcomes.
- Use leave-one-site-out testing: remove one site at a time and re-estimate effects.
- Check whether effect direction and magnitude remain stable.
- If one site drives most of the signal, flag concentration risk in conclusions.
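A leave-one-site-out check along these lines might look like the sketch below. It assumes the effect is a simple difference in mean spending (treatment minus comparison); a real evaluation would substitute its own estimator. The record layout, the site names, and the numbers are all hypothetical.

```python
from statistics import mean

def effect(records):
    """Difference in mean spending: treatment minus comparison."""
    treated = [r["spend"] for r in records if r["arm"] == "treatment"]
    control = [r["spend"] for r in records if r["arm"] == "comparison"]
    return mean(treated) - mean(control)

def leave_one_site_out(records):
    """Re-estimate the effect once per site, with that site removed."""
    sites = sorted({r["site"] for r in records})
    return {s: effect([r for r in records if r["site"] != s]) for s in sites}

# Hypothetical data: site C shows much larger savings than sites A and B.
raw = {
    "A": {"treatment": [900, 1000, 1100], "comparison": [1000, 1050, 1100]},
    "B": {"treatment": [950, 1050, 1000], "comparison": [1000, 1100, 1050]},
    "C": {"treatment": [400, 500, 450], "comparison": [1000, 1100, 1050]},
}
records = [
    {"site": site, "arm": arm, "spend": spend}
    for site, arms in raw.items()
    for arm, spends in arms.items()
    for spend in spends
]

overall = effect(records)
loso = leave_one_site_out(records)

print(f"all sites: {overall:.2f}")
for site, est in loso.items():
    print(f"without site {site}: {est:.2f}")
print(f"range: {min(loso.values()):.2f} to {max(loso.values()):.2f}")
```

In this invented example the effect stays negative whichever site is removed, but it shrinks sharply without site C, which is the kind of concentration worth flagging.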
3. Quantify uncertainty created by outliers
- Outliers increase variance, widen confidence intervals, and reduce statistical power.
- Small samples are especially sensitive to high-cost members, because a single member can shift the mean and inflate the standard error.
- Interpret non-significant results in the context of variance and sample size, not only p-values.
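To make the variance point concrete, the sketch below compares the 95% confidence-interval half-width for mean spending in a small sample with and without one high-cost member. It uses a normal approximation (z = 1.96), which is itself rough for small, skewed samples; the helper name `ci_half_width` and all figures are hypothetical.

```python
import math
import statistics

def ci_half_width(values, z=1.96):
    """Approximate 95% CI half-width for the mean: z * s / sqrt(n)."""
    return z * statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical sample of 20 members with typical spending.
typical = [1000.0 + 50.0 * i for i in range(20)]

# Same sample, with the last member replaced by one catastrophic case.
with_outlier = typical[:-1] + [150000.0]

hw_typical = ci_half_width(typical)
hw_outlier = ci_half_width(with_outlier)

print(f"mean without outlier: {statistics.mean(typical):.2f} +/- {hw_typical:.2f}")
print(f"mean with outlier:    {statistics.mean(with_outlier):.2f} +/- {hw_outlier:.2f}")
print(f"interval is {hw_outlier / hw_typical:.1f}x wider with the outlier")
```

A wider interval means a true effect of a given size is harder to detect, which is why a non-significant result in a small, outlier-heavy sample is weak evidence of no effect.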
4. Align methods with contract language
If the work supports payer/provider contracting, outlier policy should be explicitly documented in the contract or statement of work.
- Define whether spending is uncapped, trimmed, winsorized, or member-level capped.
- Specify the percentile/threshold, service categories included, and runout treatment.
- Require consistent methodology across baseline and performance periods.
- State the primary estimate and required sensitivity analyses.
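One way to keep the analysis aligned with the contract is to record the outlier policy as a small, explicit structure and check that the baseline and performance periods use the same one. The sketch below is illustrative only: the class `OutlierPolicy`, its field names, the method labels, and the example values (service categories, a three-month runout) are assumptions, not terms drawn from any particular contract.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ALLOWED_METHODS = {"uncapped", "trimmed", "winsorized", "member_cap"}

@dataclass(frozen=True)
class OutlierPolicy:
    method: str                           # one of ALLOWED_METHODS
    percentile: Optional[float]           # e.g. 99.0; None when not applicable
    service_categories: Tuple[str, ...]   # categories included in spending
    runout_months: int                    # claims runout allowed before cutoff

    def __post_init__(self):
        if self.method not in ALLOWED_METHODS:
            raise ValueError(f"unknown outlier method: {self.method}")
        if self.method in {"trimmed", "winsorized"} and self.percentile is None:
            raise ValueError("percentile is required for trimmed or winsorized spending")

def check_consistent(baseline, performance):
    """Raise if the two periods do not use an identical outlier policy."""
    if baseline != performance:
        raise ValueError("outlier policy differs between baseline and performance periods")

# Hypothetical policies for the two measurement periods.
baseline_policy = OutlierPolicy(
    "winsorized", 99.0, ("inpatient", "outpatient", "pharmacy"), 3
)
performance_policy = OutlierPolicy(
    "winsorized", 99.0, ("inpatient", "outpatient", "pharmacy"), 3
)

check_consistent(baseline_policy, performance_policy)
print("outlier policy is consistent across periods")
```

Frozen dataclasses compare field by field, so any difference in threshold, categories, or runout is caught by the equality check.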
Recommended reporting format
- Primary result (pre-specified method).
- Uncapped sensitivity result.
- 99th-percentile winsorized sensitivity result.
- Leave-one-site-out range of estimates.