Considerations for Developing, Evaluating, and Maintaining Population Health Measures

Measure Conceptualization

Conceptualization of population health measures should identify opportunities for improvement at the population level, rather than only seeking to identify gaps or variations in clinical care. Similarly, information gathering and business case development should be at the population level to identify health differences among populations, including disparities among subpopulations. During measure conceptualization, measure developers should always consider whether to stratify and/or risk adjust the measure(s). Measure developers should estimate the potential for population level improvement as well as the potential benefits, burdens, and costs of achieving the population health goals.

Conceptualization of population health measures presents unique challenges for measure developers. CMS notes that the current health care delivery system lacks an incentive structure to support local problem-solving. For example, insurers have not traditionally paid measured entities and systems for efforts such as screening for social risk factors or coordinating with local community providers and governments beyond the clinical setting. Existing value-based purchasing programs have not rewarded coordinated community health improvement efforts. Although population health improvement is a priority goal, there are few incentives tied to improvement and few disincentives tied to worsening of population health.

CMS is working to change this while also addressing equity. For example, numerous CMS quality programs adopted Screening for Social Drivers of Health and Screen Positive Rate for Social Drivers of Health measures. The Accountable Care Organization and Merit-based Incentive Payment System (MIPS) added a health equity adjustment to the ACO and MIPS quality performance category score. CMS has released guidance to give states greater flexibility to address health disparities and health-related social needs among Medicaid beneficiaries in managed care. The next step is to evaluate the outcomes of these new measures and adjustments.

Although the focus of population health measures differs from clinical quality measures, measure development should address alignment of the population measures with existing or potential measures of clinical care and other drivers of population health improvement. These may include individual behaviors, prevention, and social drivers of health, e.g., housing, transportation, food security, economic stability, education, social and community context, access to health care, and neighborhood environment.

Environmental Scan

Where should measure developers go to find population health areas needing improvement? Measure developers need to expand their environmental scan search criteria beyond their usual sources for quality measures, e.g., CMS Measures Inventory Tool (CMIT) and Submission Tool and Repository (STAR) Measures Database. For example, the OECD has approximately 80 key indicators for population health and health system performance. These key indicators use data from official national statistics to compare countries in terms of health status and health-seeking behavior, access to and quality of health care, and the resources available for health (OECD, 2023). These key indicators may provide insights into population health areas needing further investigation and offer ideas for measure concepts. For information about conducting an environmental scan, see the Environmental Scans for Quality Measurement supplemental material.

  • Community Health Needs Assessment and Implementation Strategy
    The Patient Protection and Affordable Care Act of 2010 requires a community health needs assessment (CHNA) and implementation strategy every three years for all Section 501(c)(3) nonprofit hospitals (charitable hospitals) working with public health agencies and community members (Internal Revenue Service [IRS], n.d.). The IRS requires hospitals to submit their needs assessment and implementation strategy with their IRS Form 990 (or provide the URLs) and provide an annual description of how the hospital is addressing the needs identified in their CHNA and implementation strategy. The IRS also requires hospitals to make their CHNA and implementation strategy publicly available, which usually means only on the individual hospital’s website. There is no requirement for the CHNA to include measures. However, the implementation strategy must include an evaluation of the impact of any action taken since the last CHNA to address the significant health needs identified (IRS, 2014). There is no oversight of the quality of the content of these CHNAs, no public central repository (other than the IRS), no public state-based repository, and no widely available measures focused on the impact of the hospitals’ implementation strategy on the population of the community. Some state hospital associations may collect CHNAs, and the American Hospital Association’s Health Research & Educational Trust (HRET) has collected and analyzed 3,000 CHNAs from 2011-2016, which it makes available to its members. The HRET website also has articles related to CHNAs.

    These CHNAs and implementation strategies have the potential to provide ideas for improvement opportunities in population health. However, a review of the first CHNAs in Texas (Pennel et al., 2016) found that few included health improvement or program goals and objectives. Ruggles (2020) noted the CHNA requirements have “surprisingly little accountability for improving outcomes” (p. 1). Fos et al. (2019) assessed whether the CHNA requirement incentivized North Carolina tax-exempt hospitals to increase spending in community health programs. They found the hospitals continue to spend primarily on patient care financial assistance and little on community health programs. They suggest meeting the CHNA requirement is more about compliance than a tool to improve population health. Stoto et al. (2019) reviewed the CHNAs and implementation strategies from 10 exemplary hospital systems. They found variability in the format and content in implementation strategies compared with CHNAs. Some hospital systems developed models with population-level goals, objectives, and strategies with clear accountability and metrics. A few of the implementation strategies they reviewed were less developed. Stoto et al. (2019) recommend strengthening the CHNA regulations to require reporting of the evaluation measures the hospitals intend to monitor based on an established community health improvement model.

  • Research Social Drivers of Health and Social Risk Factors
    Research into the effects of social drivers of health (SDOH) and social risk factors may also provide measure developers with ideas for population health measure concepts. For example, a recent analysis of state and local government spending on non-health care services (e.g., education, social services, environment, and housing) noted an association between increased spending and lower infant mortality rates among certain high-risk populations (Goldstein et al., 2020).
  • Healthy People
    Healthy People 2020 began, and Healthy People 2030 continues, providing a set of broad population-level goals and objectives broken down into five categories: health conditions, health behaviors, populations, settings and systems, and social determinants of health (HHS, Office of Disease Prevention and Health Promotion, n.d.). These goals and objectives may also be concepts for population health measure developers to consider. Healthy People 2030 continues to address SDOH and identifies five SDOH domains, each with multiple objectives: economic stability, education access and quality, health care access and quality, neighborhood and built environment, and social and community context.
  • Existing Population Health Measures
    Measure developers should look for existing measures currently identified as population health measures, but these may not meet the CMS definition. In 2015, IOM identified several existing measures that are indicators for population health, for example, life expectancy, overweight and obesity rates, and teenage pregnancy rates. The usual databases of measures, e.g., CMIT and STAR, have historically not included these types of measures. CMIT and STAR can also help identify existing measures to use as proxies for population health, e.g., communication between measured entities and patients for patient-centered care.

A recent scan focusing on existing clinician-level measures applicable to population health identified 248 measures. The reviewers then categorized the measures according to level of analysis using clinician/clinician group, facility, health plan, integrated delivery system, and population. Some measures applied to more than one level of analysis. The table below shows the number of measures identified for each conceptual topic and the number of measures with population as the level of analysis (CMS, Health Services Advisory Group, 2021). For brevity, the table omits numbers for other levels of analysis.

Existing Quality Measures at Population Level of Analysis

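Conceptual Topic                            | Number of Quality Measures Identified | Number of Quality Measures with Population as Level of Analysis
--------------------------------------------|---------------------------------------|----------------------------------------------------------------
Access to care                              | 21                                    | 3
Clinical outcomes                           | 108                                   | 19
Coordination of care and community services | 39                                    | 13
Health behaviors                            | 26                                    | 5
Preventive care and screening               | 50                                    | 5
Utilization of health services              | 4                                     | 0

Measure Specification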

The general processes for specifying population health measures are no different from those for other types of measures. See the Measure Specification content. There are, however, some specification building blocks needing closer consideration. The measure developer must distinguish population health measures from clinical quality measures.

Target/Initial population

Population level specification – geography-based

The geographic level of specification may include zip code, county, city, state, national, or other geographically-based areas.

Example – all adults, 18 years and older, living in zip code 20500

Population level specification – patient panel-based

Denominator specifications may be beneficial for use in assessing population health improvement based on the work of health care delivery systems. When specifying at a patient panel level, it is critical to avoid limiting the measure denominator to only patients who receive specific services. In other words, the denominator should include all patients in the patient panel without regard to particular services rendered or patient encounters that occur. Otherwise, the measures no longer address the health of the entire patient panel population, but rather only those that receive certain services. As such, they become clinical quality (measured entity-focused) measures assessing quality of the services rendered rather than an assessment of the health of the population.

Example – all Medicare beneficiaries in Acme ACO
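
The effect of the denominator choice described above can be sketched in a few lines of Python. This is a minimal illustration only: the panel records, the field names (had_encounter, screened), and the five-member panel are hypothetical and do not come from any CMS specification.

```python
from dataclasses import dataclass

@dataclass
class PanelMember:
    member_id: str
    had_encounter: bool  # hypothetical flag: any visit in the measurement period
    screened: bool       # hypothetical outcome of interest

def panel_rate(panel):
    """Population health view: the denominator is every member of the panel."""
    if not panel:
        return float("nan")
    return sum(m.screened for m in panel) / len(panel)

def encounter_rate(panel):
    """Clinical quality view: the denominator is limited to members with an encounter."""
    seen = [m for m in panel if m.had_encounter]
    if not seen:
        return float("nan")
    return sum(m.screened for m in seen) / len(seen)

# Hypothetical five-member panel; two members had no encounter at all.
panel = [
    PanelMember("A", True, True),
    PanelMember("B", True, True),
    PanelMember("C", True, False),
    PanelMember("D", False, False),
    PanelMember("E", False, False),
]

print(f"panel-based rate:     {panel_rate(panel):.2f}")
print(f"encounter-based rate: {encounter_rate(panel):.2f}")
```

In this sketch the encounter-based rate is higher than the panel-based rate only because members without encounters drop out of the denominator, which is the shift from a population health measure to a clinical quality measure that the text warns against.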

Stratification

Most population health measures will need a stratification plan. As noted by the IOM (2015), factors outside of health care substantially shape the health of populations, e.g., social and environmental factors and individual behaviors. Stratification is necessary to provide actionable information to measured entities and policymakers, beyond the collection and sharing of data. Stratifying the data by race, ethnicity, language preference, sexual orientation, gender identity, age, disability, and SDOH, including those related to education and literacy, social and community context, economic stability, and neighborhood and built environment, can enable focused quality improvement activities.

As CMS moves toward population-based payments and shared risk, it is increasingly important for measured entities and payors to be able to quantify and address differences and disparities among the communities and populations served by CMS programs. Stratified data is the critical first step to improving the health of all individuals and families.
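
As a rough illustration of why stratified results are more actionable than a single overall rate, the following Python sketch computes a rate per stratum. The records, the "language" stratification variable, and the met_numerator field are hypothetical, chosen only for the example; a real stratification plan would follow the measure specification.

```python
from collections import defaultdict

def stratified_rates(records, stratum_key):
    """Return {stratum: (numerator, denominator, rate)} for one stratification variable."""
    counts = defaultdict(lambda: [0, 0])  # stratum -> [numerator, denominator]
    for rec in records:
        # Keep missing values visible as their own stratum instead of dropping them.
        stratum = rec.get(stratum_key) or "Unknown"
        counts[stratum][1] += 1
        if rec["met_numerator"]:
            counts[stratum][0] += 1
    return {s: (n, d, n / d) for s, (n, d) in counts.items()}

# Hypothetical records for six individuals.
records = [
    {"language": "English", "met_numerator": True},
    {"language": "English", "met_numerator": True},
    {"language": "English", "met_numerator": False},
    {"language": "Spanish", "met_numerator": True},
    {"language": "Spanish", "met_numerator": False},
    {"language": None, "met_numerator": False},
]

rates = stratified_rates(records, "language")
overall = sum(r["met_numerator"] for r in records) / len(records)

print(f"overall: {overall:.2f}")
for stratum, (num, den, rate) in sorted(rates.items()):
    print(f"{stratum}: {num}/{den} = {rate:.2f}")
```

Reporting an "Unknown" stratum is a design choice in this sketch: it makes incomplete stratification data visible rather than silently excluding those individuals.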

As part of the Reducing Provider and Patient Burden proposed rule, released December 10, 2020, CMS included a request for information on barriers to adopting standards, and opportunities to accelerate adoption of standards, related to social risk factors. CMS acknowledges health care “providers in value-based payment arrangements rely on comprehensive, high-quality data to identify opportunities to improve patient care and drive value.” The goal is to standardize and liberate these data for multiple reasons, such as to decrease patient reporting burden and increase the chances of connecting patients with appropriate community care and support.

Data Sources

As with clinical quality measures, measure developers must specify data sources for population health measures. Data sources may include clinical data (electronic health records [EHRs], registries, paper patient records), claims, surveys, patient assessments (e.g., Minimum Data Set), screening tools (e.g., the Accountable Health Communities Health-related Social Needs screening tool [Billioux et al., 2017]), and administrative data that can include census data, crime data, birth and death records, etc. The Gravity Project is working to change inconsistencies in existing SDOH data collection processes. The Gravity Project convenes interested parties across the country through an open and transparent collaborative process where they develop and test consensus-based standards to facilitate SDOH data capture and exchange across a variety of systems and settings of care and social services. The Gravity Project submitted SDOH-related data elements for inclusion in the United States Core Data for Interoperability (USCDI). The Office of the National Coordinator for Health Information Technology (ONC) incorporated these data elements into USCDI version 2.

However, most medical records and other clinical data are unable to describe population health for the total population group, and therefore, are insufficient. For example, although the measure developer may consider immunization status a measure of population health, if a measure denominator includes solely patients who receive certain clinical services or have documented patient encounters, the measure becomes one of clinical care quality. Immunization status for a population must consider the population. As such, public health data sources, such as those provided by CDC, or other repurposed data, e.g., crime statistics, number of grocery stores, community health assessments, and community health needs assessments, may be more valuable than the typical patient care data used for clinical quality measures. CDC’s National Center for Health Statistics administers and/or collects data from multiple surveys, e.g., National Health and Nutrition Examination Survey and the National Health Interview Survey, and vital statistics, which can provide population-level data for comparison. In the absence of appropriate data sources, survey development and implementation may be necessary.

Level of Analysis

The level of analysis for population health measures should be at the population level and not limited solely to patients who receive services. However, obtaining information on persons not receiving care is challenging. The measure developer must clearly define the population in the target/initial population.

Time Interval

What is the appropriate time interval for population health measures? Is one year long enough or do these measures require a longer time interval to determine significant changes? Measure developers should consider the appropriate time interval for measurement, bearing in mind a longer period of time may be necessary to identify significant changes at the population level.
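
One way to reason about the time interval is to look at how the precision of an estimated rate changes as more observation time is pooled. The Python sketch below uses the normal approximation for a proportion. The community size (2,000 observations per year) and the 10% rate are hypothetical, and pooling years this way assumes the observations are independent, which may not hold when the same people are measured repeatedly.

```python
import math

def ci_half_width(rate, n, z=1.96):
    """Half-width of a normal-approximation 95% confidence interval for a proportion."""
    return z * math.sqrt(rate * (1 - rate) / n)

annual_observations = 2000  # hypothetical number of observations per year
rate = 0.10                 # hypothetical population rate

for years in (1, 3, 5):
    n = annual_observations * years
    print(f"{years} year(s): n={n}, half-width={ci_half_width(rate, n):.4f}")
```

Under these assumptions the interval narrows with the square root of the pooled sample size, so small year-over-year changes in a small community may only become distinguishable from noise over a multi-year interval.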

Risk Adjustment

Outcome measures typically require risk adjustment where the purpose is to assess the quality of clinicians or facilities. The intent of population health measures, however, is typically to assess the actual outcomes experienced in the community regardless of the cause. If the intent is to narrow down the cause, then risk adjustment or stratification using factors not associated with the cause of interest may still be necessary. Either way, the measure developer should provide a justification for the choices made. See the Risk Adjustment content for more information.

Measure Testing

Measure testing may be challenging due to the potential use of multiple data sources in a single measure and a lack of data, especially SDOH data. Lack of interoperability among data sources is likely. Consider the (in)completeness of data sources and data elements (e.g., incomplete SDOH data). The measure developer needs to be creative with their testing plan and should partner with a variety of interested parties including data owners. See the Measure Testing content.

For the purposes of testing measures of population health, the nature of the quality construct (inferences about underlying processes or structures [Messick, 1987]) determines the testing approach (see the nature of quality construct table below). The first consideration is which system owns the quality construct and then whether the construct is a public good. A public good is one for which consumption of the good by one person does not preclude consumption of the same good by another person (e.g., a city park, clean air).

Until recently, the distinction in attribution was better defined. Hospitals and physicians provided clinical care; public health and social service agencies did not provide clinical services or they were very limited. With the increased emphasis on the importance of social risk factors, health care systems are more directly involved in addressing these social risk factors through the direct provision of or referral to services like housing or food security, and/or are formalizing collaborations with entities outside the health care system. For the purposes of defining the quality construct for measure testing, the measure developer needs to expand attribution for traditional measured entities to include the delivery of public health and social services, and/or to expand the measured entity to include both health care and non-health care providers.

The nature of the quality construct for population health measures

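System owning the quality construct | The quality construct is a public good | The quality construct is not a public good
------------------------------------|----------------------------------------|--------------------------------------------
Health care                         | Population health                      | Preventive health care
Non-health care                     | Public health                          | Social services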

Assuming the quality construct is not a public good, once the measure developer defines the quality construct, then measure testing would proceed as with any quality measure with the focus on importance, scientific acceptability (reliability and validity), feasibility, and usability.

Measure validation is critical. If the intervention is a public good, then the validation should demonstrate that. If the focus is on allocative efficiency, then there is no need for validation. If the focus is on some population characteristic like “cohesion,” then the focus is on that construct. The reliability focus would be whether there is detectable variation in the quality construct across measured entities (signal), or whether there is overwhelming variation in factors independent of the quality construct (noise). Because there is an extended pathway between, for example, food (in)security-to-clinical care-to-outcome, one might surmise a greater influence of factors independent of the quality construct (e.g., transportation availability), thereby reducing reliability. The measure developer might need to consider strategies to increase reliability, such as increasing the effective sample size, e.g., borrowing statistical strength of related process or outcome measures (NQF, 2015) or enhancing the information context (e.g., incorporating structure measures in reliability adjustment).
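
The signal and noise framing above can be made concrete with a small Python sketch. It uses one common signal-to-noise formulation, in which reliability is the between-entity variance divided by the sum of the between-entity variance and the entity's sampling variance. This formulation, the crude method-of-moments variance estimate, and the example counts are assumptions made for illustration; they are not the method prescribed by the source.

```python
from statistics import mean, pvariance

def entity_reliability(entities):
    """entities: list of (numerator, denominator) pairs, one per measured entity.

    Returns one signal-to-noise reliability value per entity.
    """
    rates = [n / d for n, d in entities]
    # Noise: binomial sampling variance of each entity's observed rate.
    noise = [p * (1 - p) / d for p, (_, d) in zip(rates, entities)]
    # Signal: crude method-of-moments estimate of between-entity variance,
    # floored at zero (an assumption of this sketch, not a validated estimator).
    signal = max(pvariance(rates) - mean(noise), 0.0)
    return [signal / (signal + nz) if (signal + nz) > 0 else 0.0 for nz in noise]

# Hypothetical entities with the same rates but different sample sizes.
small = [(10, 100), (30, 100), (20, 100)]
large = [(100, 1000), (300, 1000), (200, 1000)]

print([round(r, 3) for r in entity_reliability(small)])
print([round(r, 3) for r in entity_reliability(large)])
```

With identical underlying rates, the entities with larger denominators show higher reliability in this sketch, which is the intuition behind strategies that increase the effective sample size.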

Measure Implementation

Because population health measures are not setting specific, their adoption would primarily be into CMS programs such as the Medicare Shared Savings Program (MSSP), Marketplace Quality Rating System, and Medicare Advantage program. CMS is using population health measures in Merit-Based Incentive Payment System Value Pathways (MVPs). The Foundational Layer identifies requirements for all MVPs including promoting interoperability performance category and population health measures. MVPs are one of three reporting options for clinicians and topics cover a variety of specialties. There are two population health measures available: Hospital-Wide, 30-Day, All-Cause Unplanned Readmission (HWR) Rate for the Merit-Based Incentive System (MIPS) Groups (CBE ID# 3495), and Clinician-Group Risk-Standardized Acute Hospital Admission Rate for Patients with Multiple Chronic Conditions under the Merit-based Incentive Payment System (CBE ID# 3597). Quality Payment Program (QPP) MVP participants must select one of the measures. Neither measure requires additional data submission from the clinician as both are claims-based measures.

Commercial insurers could adopt population health measures for continuous quality improvement and serve as a comparison with other commercial insurers. Communities could adopt population health measures to assess the success of pertinent implemented community programs. Measure results could also serve as input to community resource and intervention planning and impact policy decisions at the local, state, and national levels. The ideal plan is to use population health measures broadly, report them at the community level, and share results among participating clinicians, public health, community, and other organizations.

Consistent with the notion of a “shared measurement system,” this arrangement increases the likelihood of multiple interested parties coalescing around addressing the measure concept.

See the Measure Implementation content.

Measure Use, Continuing Evaluation, and Maintenance 

Population health measures are subject to the same three types of measure maintenance reviews as other types of measures – annual, triennial comprehensive, and early maintenance – using the measure evaluation criteria and the Measure Maintenance Reviews content.

Interested Party Engagement

Given the broad nature of population health measures, it is critical to include community members and organizations early in the development process. These would include local community organizations and local governments that address social needs. Community organizations, e.g., soup kitchens and homeless shelters, can provide important input as to the gaps in population health, the drivers of improvement, and the benefits of improvement. Measure developers should adhere to the latest Person and Family Engagement (PFE) Toolkit at the earliest stages of developing population health measures. Also see the Interested Party Engagement content and the Person and Family Engagement in Quality Measurement and Technical Expert Panels supplemental materials.

Public Comment

Soliciting public comments for population health measures should occur frequently throughout the Measure Lifecycle. The timing in which the measure developer solicits comments in the Measure Lifecycle may be different than other quality measures. The measure developer may want to target representatives of the populations and/or communities for measurement. These groups are not the usual readers and responders to the Federal Register and websites requesting comments from the public, so concerted targeting may be necessary.

Technical Expert Panel (TEP)

The composition of a TEP for a population health measure may vary from other measure development TEPs. The TEP should include representatives of the proposed population, group, and/or community for measurement. Again, as with public comment, concerted outreach may be necessary.

Key Points

Most current population health measures summarize population outcomes at a geographic level. Generally, these encompass health outcomes based on mortality or life expectancy, and survey-based measures of subjective health status, psychological state, or ability to function (Parish, 2010). Measure developers may need to expand their interested party outreach, e.g., to community organizations. Multiple data sources, including non-health care sources, may be necessary.

There is much more to learn about population health and population health measurement. While the principle of rigor in measure development remains the same, learning the details will come by doing and iterating.
