Ways to Account for Risk
- When to use Risk Adjustment and/or Stratification
Risk stratification refers to reporting outcomes separately for different groups, unadjusted by the risk factor associated with the grouping. For example, if we use a variable representing the social risk factor of federal poverty level (FPL), and this variable has four levels (e.g., under 100% of the FPL, 100% to under 200% of the FPL, 200% to under 300% of the FPL, and 300% of the FPL or above), then we would conduct risk stratification by running the statistical model of measured entity performance separately for each of the four FPL levels, without including the FPL variable in the model. There are two circumstances when risk adjustment in combination with stratification might be the most appropriate risk adjustment strategy (i.e., when the goal of the strategy is fair comparisons). The first is when patient factors are not independent of the quality construct. The second is when there is treatment heterogeneity, another case in which the patient factors and quality construct are not independent, but for legitimate clinical reasons.
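To make the distinction concrete, the following minimal sketch (Python, assuming a hypothetical analytic file with illustrative columns such as `outcome`, `age`, `comorbidity_count`, and `fpl_level`) contrasts stratification, which fits the performance model separately within each FPL level, with adjustment, which includes FPL as a covariate in a single model.

```python
# Illustrative sketch only; the analytic file and column names
# (outcome, age, comorbidity_count, fpl_level) are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("measure_data.csv")

# Risk stratification: fit the performance model separately for each FPL level,
# leaving the FPL variable out of the model itself.
stratified_models = {
    level: smf.logit("outcome ~ age + comorbidity_count", data=group).fit(disp=0)
    for level, group in df.groupby("fpl_level")
}

# Risk adjustment, for contrast: a single model with FPL included as a covariate.
adjusted_model = smf.logit(
    "outcome ~ age + comorbidity_count + C(fpl_level)", data=df
).fit(disp=0)
```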
As part of a risk adjustment strategy, the CMS Consensus-Based Entity (CBE) suggests using risk models in conjunction with risk stratification when use of a risk model alone would obscure important gaps in care. Industry guidance suggests, at a minimum, stratifying risk-adjusted measures by social and functional risk factors to improve the ability to detect gaps in care and differential outcomes, even when not adjusting for social or functional characteristics (NQF, 2022). If stratification is not possible and the patient factors are non-observable, reliability adjustment can mitigate the bias introduced by low case volume (Dimick et al., 2012). Risk adjustment is appropriate when the patient factors are correlated with the outcome but not with the quality construct. The table below provides a high-level framework for risk adjustment strategies.
Framework for Risk Adjustment Strategies
| Relationship of Patient Factors and Quality Construct | Patient Factors Observable | Patient Factors Non-observable |
| --- | --- | --- |
| Correlated (not independent) | Patient group stratification | Peer group stratification |
| Uncorrelated (independent) | Risk adjustment | Reliability adjustment |

The Assistant Secretary for Planning and Evaluation (ASPE) recommendation for when to adjust for social risk is based on the type of measure and the program (ASPE, 2020, p. 34).
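Reliability adjustment, referenced in the framework above, is often implemented as shrinkage of each measured entity's observed rate toward the overall rate in proportion to a reliability weight. The sketch below is a minimal illustration of that idea using hypothetical event counts and case volumes; it is not a prescribed method.

```python
# Minimal illustration of reliability adjustment as empirical Bayes-style shrinkage;
# the entity event counts and case volumes are hypothetical.
import numpy as np

events = np.array([3, 40, 7, 120])      # observed events per measured entity
cases = np.array([20, 400, 35, 1000])   # case volume per measured entity

observed_rate = events / cases
overall_rate = events.sum() / cases.sum()

# Between-entity (signal) and within-entity (noise) variance estimates.
signal_var = observed_rate.var(ddof=1)
noise_var = overall_rate * (1 - overall_rate) / cases

# Reliability weight: near 1 for high-volume entities, near 0 for low-volume ones.
reliability = signal_var / (signal_var + noise_var)

# Shrink low-volume entities toward the overall rate.
adjusted_rate = reliability * observed_rate + (1 - reliability) * overall_rate
```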
Measure developers should always consider what method would be most appropriate to account for social risk factors (e.g., risk adjustment, stratification by groups within a quality measure, stratification at the quality measure level). Stratification at the quality measure level may be similar to peer group stratification, in which patient factors are unobserved and correlated with observed measured entity characteristics. CMS suggests exploring use of a risk adjustment strategy, i.e., use of a statistical risk adjustment model, and, if necessary, risk stratification for selected populations. For CMS to accept an outcome measure and the CMS CBE to endorse it, the measure developer must demonstrate appropriate use of a risk adjustment and/or stratification strategy. Measure developers should provide rationale and strong evidence if an outcome measure is not risk adjusted or risk stratified.
It is the measure developer's responsibility to determine whether to account for variation in factors intrinsic to the patient before comparing outcomes and to determine how to best apply these factors in the quality measure specifications. Vogel and Chen (2018) noted that "failure to address risk adjustment in an adequate manner can lead to biased conclusions that may adversely impact decision-making in both research and policy contexts" (p. 1).
Clinical Quality Language (CQL) has the capability to model risk adjustment and stratification. Historically, for risk-adjusted eCQMs, measure developers have been
- describing risk adjustment methodology in the metadata
- using quality measure data post hoc to risk adjust
- representing variables for risk adjustment as required supplemental data
- including the logic or algorithm in the risk adjustment section of the Health Quality Measure Format (HQMF)
Measure developers may model stratification in CQL. See the Measure Authoring Development Integrated Environment (MADiE) User Guide for guidance. The measure developer may add multiple strata to a single stratification.
- Features of Risk Adjustment Models
The measure developer must evaluate the need for a risk adjustment strategy (i.e., risk adjustment, stratification, or both) for all potential outcome measures and statistically assess the adequacy of any strategies used. In general, a risk adjustment model possesses features such as those listed in the table below, which was partially derived from a description of preferred features of models used for publicly reported outcomes (Krumholz et al., 2006). While some of the descriptions target risk adjustment models specifically, the features are not exclusive to risk-adjusted measures or risk adjustment models. See the table for descriptions of each feature, with more detail provided in the subsections below.
Features of Risk Adjustment Models
| Feature | Description |
| --- | --- |
| Sample definition | Clearly define the sample(s), ensure they are clinically appropriate for the quality measure's risk adjustment, and ensure they are large enough for sufficient statistical power and precision. |
| Appropriate time intervals | Clearly define the time intervals for model variables; ensure they are sufficiently long to observe an outcome and recent enough to retain clinical credibility. |
| High data quality | Data should be reliable, valid, complete, comprehensive, and rely on as few proxy measures as possible. |
| Appropriate variable selection | Selected adjustment or stratification variables should be clinically meaningful. |
| Appropriate analytic approach | The analytic approach must be scientifically rigorous and defensible and consider multilevel or clustered organization of data (if necessary). |
| Complete documentation | Fully document risk adjustment and/or stratification details and the model's performance, and disclose all known issues. |
- Guidance for Measure Developers
Define the Appropriate Measure Development Sample
Distributions of characteristics and their interactions within a measure development sample (i.e., the population used to develop the model) should be representative of the overall population to which the measure developer is applying the risk model. Measure developers should clearly and explicitly define the sample(s) as well as all inclusion, exclusion, and exception criteria they used to select the sample.
Risk adjustment models generalize well (i.e., fit the parent population) to the extent that the samples used to develop, calibrate, and validate them appropriately represent the parent population. Measure developers need to explain their rationale for using selected samples and offer justification of each sample's appropriateness.
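One way to support that justification is a simple comparison of characteristic distributions between the development sample and the parent population. The sketch below (Python, with hypothetical file and column names) computes standardized mean differences as a rough screen for representativeness.

```python
# Sketch of a representativeness check; dataframes and column names are hypothetical.
import pandas as pd

def standardized_mean_difference(sample: pd.Series, population: pd.Series) -> float:
    """Standardized difference in means between the development sample and parent population."""
    pooled_sd = ((sample.var() + population.var()) / 2) ** 0.5
    return (sample.mean() - population.mean()) / pooled_sd

development_sample = pd.read_csv("development_sample.csv")
parent_population = pd.read_csv("parent_population.csv")

for col in ["age", "comorbidity_count", "prior_admissions"]:
    smd = standardized_mean_difference(development_sample[col], parent_population[col])
    print(f"{col}: standardized mean difference = {smd:.2f}")  # large |SMD| values warrant review
```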
Appropriate Time Intervals
The time interval is the time frame the measure developer uses to determine cases for inclusion in the population of interest and outcome of interest and includes an index event and a period of time. The measure developer should clearly state the criteria used to formulate decisions regarding the selection of the time interval and explain these criteria in the quality measure documentation. The time interval criteria used to identify risk factors for the stated outcomes should be clinically appropriate and clearly stated (e.g., the risk factor occurs within 24 hours of admission). Risk factors should be present at the start of care to avoid mistakenly adjusting for factors arising due to deficiencies in measured care unless there is use of person-time adjustments. Outcomes should occur soon enough after care to establish they are the result of that care. For example, measure developers may use renal failure as one of the comorbidities for risk adjustment of a hospital mortality measure. If poor care received at the hospital caused the patient to develop renal failure after admission, it would be inappropriate to adjust for renal failure for that patient.
If not using person-time adjustments, the evaluation of outcomes must also be based on a standardized period of assessment. If there is no standardization of the periods of the outcome assessments, such as the assessment of events during hospitalization, there may be bias in the evaluation because measured entities have different practice patterns (e.g., varying lengths of stay).
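As a rough illustration of these time-interval rules, the sketch below (with hypothetical file names, column names, and illustrative 24-hour and 30-day windows) restricts risk factors to those documented near admission and defines a standardized outcome window rather than relying on length of stay.

```python
# Sketch of time-interval checks; column names and the 24-hour / 30-day windows are
# illustrative assumptions, not prescribed values.
import pandas as pd

encounters = pd.read_csv("encounters.csv", parse_dates=["admit_time", "discharge_time"])
diagnoses = pd.read_csv("diagnoses.csv", parse_dates=["onset_time"])

merged = diagnoses.merge(encounters, on="encounter_id")

# Keep only risk factors documented within 24 hours of admission, so the model does not
# adjust for conditions (e.g., renal failure) that may have arisen from the care itself.
risk_factors = merged[
    merged["onset_time"] <= merged["admit_time"] + pd.Timedelta(hours=24)
]

# Evaluate the outcome over a standardized follow-up window rather than an
# entity-dependent period such as length of stay.
encounters["outcome_window_end"] = encounters["admit_time"] + pd.Timedelta(days=30)
```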
High Data Quality
The measure developer must ensure the data they use for risk adjustment are of the highest quality possible. Considerations in determining the quality of data include the following (a brief data-profiling sketch follows this list):
- There is reliable data collection. The method of collection must be reproducible with minimal variation between one collection and another if the same population is the source.
- Data are sufficiently valid for their purpose. Validation ultimately rests on the strength of the logical connection between the construct of interest and the results of operationalizing its measurement, recording, storage, and retrieval.
- Data are sufficiently comprehensive to limit the number of proxy measures required for the model. Obtaining the actual information is sometimes impossible, so using proxy measures might be necessary for certain projects.
- Data are as recent as possible. If the measure developer used 1990 data in a model designed for use tomorrow, many people would argue the health care system has changed so much since 1990 that the model may no longer be relevant.
- Data are as complete as possible. Data should contain as few missing values as possible. Missing values are difficult to interpret and lower the validity of the model.
- Documentation and full disclosure of data sources, including the dates of data collection, manner of data cleaning, data manipulation techniques (if applicable), and the data's assumed quality.
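The sketch below is a minimal data-profiling example (Python, with hypothetical file and column names) that touches on several of these considerations: completeness, recency, and a basic reliability spot check.

```python
# Simple data-profiling sketch; the file and column names are hypothetical.
import pandas as pd

df = pd.read_csv("analytic_file.csv", parse_dates=["service_date"])

# Completeness: share of missing values per candidate risk variable.
missing_rates = df.isna().mean().sort_values(ascending=False)
print(missing_rates)

# Recency: confirm the data collection window is current enough to be credible.
print(df["service_date"].min(), df["service_date"].max())

# Reliability spot check: duplicate records can signal collection or linkage problems.
print("duplicate rows:", df.duplicated().sum())
```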
Appropriate Variable Selection
The risk adjustment model variables should be clinically meaningful or related to clinically meaningful variables. When developing a risk-adjusted model, the clinical relevance of included variables should be apparent to subject matter experts (SMEs). Clearly clinically relevant variables serve two purposes: they contribute to the face validity of the model, and they increase the likelihood the model explains variation that health care professionals and/or the literature have identified as important to the outcome. Parsimonious models built from such clinically relevant variables are likely to have the highest face validity and be optimal for the intended use. The measure developer must determine which risk factors to retain in the risk adjustment model; for example, retained variables should be clinically relevant and statistically significantly associated with the outcome(s).
Occasionally, the measure developer may consider including proxy variables in the risk adjustment model based on prior research. This situation may arise when direct assessment of a relevant variable is not possible and a substitute or proxy variable is required. However, these substitute variables should be empirically appropriate for the clinical topic of interest. For example, medications taken might be useful as a proxy for illness severity or progression of a chronic illness, provided practice guidelines or prior studies clearly link the medication patterns to the illness severity or trajectory. Similarly, the measure developer should consider inclusion of variables previously shown to moderate the relationship between a risk adjustor and the quality measure. Moderating variables are interaction terms sometimes included in a model to understand complex information structures among variables (e.g., a prior mental health diagnosis may be only weakly associated with a measured outcome, but it may interact with another variable to strongly predict the outcome).
Moderating variables and interaction terms, when needed, require specialized data coding and interpretation.
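The sketch below illustrates one common way to code an interaction term in a formula-based logistic model; the variable names are hypothetical and the example does not represent any particular CMS measure.

```python
# Sketch of a moderating (interaction) effect in a risk model; variable names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("measure_data.csv")

# The '*' expands to both main effects plus their interaction, letting a prior
# mental health diagnosis modify the effect of, e.g., a substance use disorder flag.
model = smf.logit(
    "readmission ~ age + prior_mental_health_dx * substance_use_disorder", data=df
).fit(disp=0)
print(model.summary())
```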
Appropriate Analytic Approach
An appropriate statistical model is determined by many factors. Measure developers may use logistic regression or hierarchical logistic regression when the outcome is dichotomous. However, in certain instances, they may use the same data to develop a linear regression model, provided that doing so does not violate key statistical assumptions. This is not to suggest logistic regression is appropriate for modeling continuous manifest variables (i.e., available data). Measure developers do, however, use various forms of logistic regression to model latent traits (i.e., inferred variables modeled through related observations) assumed to be continuous when the available data are dichotomous, such as the probability of receiving a specified health care service. Selecting the correct statistical model is imperative because an incorrect model can lead to erroneous or misleading results. The analytic approach should also consider any multilevel and/or clustered organization of data, which is typically present when assessing institutions such as hospitals from widespread geographic areas.
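As one illustration of accounting for clustering with a dichotomous outcome, the sketch below fits a generalized estimating equations (GEE) model with patients clustered within hospitals; a hierarchical (random-intercept) logistic model is a common alternative. Variable and file names are hypothetical.

```python
# Sketch of a dichotomous-outcome model that accounts for clustering of patients
# within hospitals; the data file and variable names are hypothetical.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.read_csv("measure_data.csv")

model = smf.gee(
    "mortality ~ age + renal_failure + diabetes",
    groups="hospital_id",                      # patients clustered within hospitals
    data=df,
    family=sm.families.Binomial(),             # dichotomous outcome
    cov_struct=sm.cov_struct.Exchangeable(),   # within-hospital correlation structure
).fit()
print(model.summary())
```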
Risk factors retained in the model should account for substantive and significant variation in the outcome. Overall differences between adjusted and unadjusted outcomes should also be pragmatically and clinically meaningful. Moreover, risk factors should not be related to stratification factors. A statistician can guide the measure development team and recommend the most useful variable formats and appropriate models.
Complete Documentation
Transparency is one of the key design principles in the Blueprint Measure Lifecycle stages. When measure developers do not disclose the steps they used to create a risk adjustment model, others cannot understand or fully evaluate the model. The CMS CBE's Measure Evaluation Rubric promotes the full disclosure of all aspects of a risk adjustment model used in measure development.
For the sake of transparency, measure developers should fully describe the risk adjustment method used; the performance of the risk adjustment model, its components, and its algorithms; the sources of the data and the methods used to clean or manipulate the data; and the calculation code (e.g., SAS) along with documentation for how to run it. Documentation should be sufficient to enable others to reproduce the findings. The expectation is that the quality measure documentation will incorporate statistical and methodological recommendations from a knowledgeable statistician to explain the model and justify its selection. To promote a clear understanding of the model and the rationale for decisions made, the risk adjustment methodology documentation should
- Confirm the timeframes used in the model are
  - an important attribute of the model
  - clearly stated and appropriate
  - appropriate for implementation in the selected program
- Discuss the variables included in the model. If using social, geographic, and economic factors as risk adjustment variables, ensure they do not obscure gaps in care. CMS has a continued interest in identifying and mitigating gaps in care in clinical care areas/outcomes across patient demographics.
- Justify the analytic approach/types of models used.
- Confirm the appropriateness of techniques used to assess the model.
- Review the predictive validity, discriminant validity, and overall fit of the model (see the assessment sketch below).
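The sketch below illustrates two routine assessment checks on a hypothetical fitted model: discrimination via the C-statistic and a simple calibration comparison of observed versus predicted rates by risk decile. The data file and variable names are assumptions for illustration.

```python
# Sketch of common model-assessment checks; the data file and variable names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.metrics import roc_auc_score

df = pd.read_csv("measure_data.csv")
model = smf.logit("outcome ~ age + comorbidity_count", data=df).fit(disp=0)

predicted = pd.Series(model.predict(df), index=df.index)

# Discrimination: C-statistic (area under the ROC curve).
print("C-statistic:", round(roc_auc_score(df["outcome"], predicted), 3))

# Calibration: compare observed and predicted event rates within deciles of predicted risk.
df["predicted"] = predicted
df["risk_decile"] = pd.qcut(predicted, 10, labels=False, duplicates="drop")
print(df.groupby("risk_decile")[["outcome", "predicted"]].mean())
```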
CMS posts methodology reports for most risk-adjusted measures on the CMS Measure Methodology page or the eCQI Resource Center.