MedPAC Wrestles With How to Fix Deep Flaws in Medicare Advantage Quality Metrics

— Comparing MA plans with fee-for-service quality called "a hornet's nest"

MedicalToday

The methods used to compare quality between one Medicare Advantage (MA) plan and another are so seriously flawed that the system needs an overhaul, members of the Medicare Payment Advisory Commission (MedPAC) suggested last week.

A commission review has determined the program "is costly and not a good basis for judging quality," MedPAC principal policy analyst Ledia Tabor, MPH, told the commission during a presentation Thursday. It does not promote the use of high-value care, nor provide beneficiaries with meaningful information about local plan quality, she said.

The stakes are high because the quality program -- based on a five-star ratings system -- pays higher-rated MA plans bonuses of some $15 billion a year from the Medicare trust funds. However, nearly every plan gets a high score. According to the Centers for Medicare & Medicaid Services (CMS), the average score for 2024 plans is stars, and only a few dozen MA plans received fewer than three stars. As of this year, 52% of beneficiaries are enrolled in MA plans.

The commissioners' discussion about how to change the system came after Tabor explained a major part of the problem. Medicare uses more than 100 metrics in those star ratings for each plan under contract. But those measures are evaluated at the contract level -- nationally -- even when a contract covers as many as 2.6 million enrollees nationwide.

The commission has repeatedly, in 2010, 2018, and , recommended that plans be evaluated at local market-area levels since there is so much regional variation, with different providers and different plan networks.

What a beneficiary wants to know "is the MA score for my doctor. I don't really care about the aggregate plan number. I want to know my doctor, my hospital, how does MA perform in my community?" said commissioner Lynn Barr, MPH.

Commissioner Brian Miller, MD, MBA, noted that "we probably shouldn't have a 'Lake Wobegon effect' where the average MA plan is 4.5 stars in many counties." He added that there also should be a way to rate quality of care for fee-for-service (FFS) beneficiaries to compare them with MA plans. Miller said he is thinking of ways to rescue the rating system, although he quipped that it "still may need to go out back and meet its final demise."

The commissioners made their remarks after Tabor's presentation of an alternative way of measuring MA quality of care: risk-adjusted rates of ambulatory care-sensitive (ACS) hospitalizations within each plan's market areas. These are hospitalizations for conditions that preventive strategies -- such as timely visits to a primary care provider or specialist, or certain screenings -- might have avoided.

The MedPAC analysis, based on 2021 MA encounter MedPAR (Medicare Provider Analysis and Review) data, showed wide variation in scores, with some market areas showing nearly twice the rate of avoidable ACS hospitalizations as the better-performing market areas: 41.7 versus 22.4 admissions per 1,000 enrollees.

"The considerable variation in risk-adjusted ACS hospitalization rates across market areas suggests some relatively high performers that could be rewarded, as well as opportunities to improve the quality of care in some markets," Tabor said.

Some commissioners had suggestions for tweaking a metric on ACS hospitalizations.

Commissioner Lawrence Casalino, MD, PhD, suggested that including ED visits in the ACS hospitalizations metric might be useful in comparing MA plans regionally.

Commissioner Cheryl Damberg, PhD, MPH, wondered how these rates vary by the availability of primary care in the plan, or the amount of plan spending on primary care. "The plans could vary substantially," she said.

Commissioner Stacie Dusetzina, PhD, said that in her view, an important metric for comparing MA quality in the context of ACS hospitalizations is how easily enrollees are able to access specialty care, a marker for network adequacy.

Bringing quality metrics down to a local level is incredibly relevant because it's what patients want, said commissioner Greg Poulsen, MBA. "The same program is variable across geography. And it's based on the providers that they work with," he said. "The way they pay those providers can vary geographically."

One commissioner raised questions about some quality measures that, she said, possibly shouldn't be used. "You may not agree with me," said commissioner Betty Rambur, PhD, RN, but the presentation under discussion "reminded me about how concerned I am about measurement-driven overscreening, screening-driven overtreatment, and the cascade of events that occur from that and which can cause people to become patients when they really shouldn't be."

She was referring to how improvements in treatment have diminished the mortality benefit from breast cancer screening, and how colonoscopy screening in much older people can cause dehydration or other harms.

MA vs Fee-for-Service 'Hornet's Nest'

The commissioners listened to a second analysis during the session: a review of literature comparing quality between MA plans and FFS, which has no star-rating system now.

But the task of that analysis was a tough one, said MedPAC senior analyst Katelyn Smalley, PhD, MSc. She said a review of literature in 2020 found some studies showed MA plans outperformed FFS on some metrics, while others reported better quality or patient experience in FFS than in MA, and still other studies found no difference at all.

So MedPAC staff looked for studies done since 2020. They found substantial variation in the populations studied, the metrics reviewed, the data sources used, and the beneficiary population subsets that were included. None of those studies could be extrapolated to a general population, Smalley said.

Even when studies used common metrics -- preventive care, readmissions, mortality, and surgical complications -- they differed on how they defined their outcomes. And all of the studies had methodological challenges that limited their reliability, she said.

Another problem Smalley noted is that MA plans tend to code more patient diagnoses than physicians treating FFS beneficiaries, so patients' health status in MA cannot be compared with that of patients in FFS.

She also said that "beneficiaries who choose to enroll in MA likely differ in meaningful ways from those who choose fee-for-service," which "does complicate comparisons between the programs when those differences are unobservable and are poorly understood."

"What a hornet's nest," said Barr.

"We're comparing apples and oranges.... I think it's incredibly important for this work for patients, taxpayers, and CMS to understand the quality of care they're getting with the choices they make. But I don't see a path forward in the current structure where we're actually going to get meaningful information. I honestly think we should abandon this work," Barr said. "I don't think you're ever going to get to an answer that says, 'This is fee-for-service versus MA,' because they are so different."

She noted that perhaps quality metrics in the Medicare Access and CHIP Reauthorization Act -- commonly known as MACRA -- could be synced with the same metrics in MA, "and instead start thinking about how we have one quality program for all of Medicare beneficiaries, and then we can start analyzing the differences."


    Cheryl Clark has been a medical & science journalist for more than three decades.