Racial bias in a medical algorithm favors white patients over sicker black patients

Sendhil Mullainathan, the Roman Family University Professor of Computation and Behavioral Science at Chicago Booth. Photo: Chicago Booth.

A widely used algorithm that predicts which patients will benefit from extra medical care dramatically underestimates the health needs of the sickest black patients, amplifying long-standing racial disparities in medicine, researchers have found.

The problem was caught in an algorithm sold by a leading health services company, called Optum, to guide health care decision-making for millions of people. But the same issue almost certainly exists in other tools used by other private companies, nonprofit health systems and government agencies to manage the health care of about 200 million people in the United States each year, the scientists reported in the journal Science.

Correcting the bias would more than double the number of black patients flagged as at risk of complicated medical needs within the health system the researchers studied, and they are already working with Optum on a fix. When the company replicated the analysis on a national data set of 3.7 million patients, it found that black patients whom the algorithm ranked as equally in need of extra care as white patients were much sicker: They collectively suffered from 48,772 additional chronic diseases.

“It’s truly inconceivable to me that anyone else’s algorithm doesn’t suffer from this,” said Sendhil Mullainathan, a professor of computation and behavioral science at the University of Chicago Booth School of Business, who oversaw the work. “I’m hopeful that this causes the entire industry to say, ‘Oh, my, we’ve got to fix this.’”

The algorithm wasn’t intentionally racist – in fact, it specifically excluded race. Instead, to identify patients who would benefit from more medical support, the algorithm used a seemingly race-blind measure: how much patients would cost the health care system in the future. But cost isn’t a race-neutral measure of health care need. Black patients incurred about $1,800 less in medical costs per year than white patients with the same number of chronic conditions; thus the algorithm scored white patients as equally at risk of future health problems as black patients who had many more diseases.
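The mechanism can be illustrated with a minimal sketch. It is not Optum's algorithm. The patients, the base cost and the cost per condition are hypothetical numbers invented for illustration; only the $1,800 gap comes from the study. Ranking by cost flags a white patient with four chronic conditions ahead of a black patient with five, while ranking by the number of conditions does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    group: str
    chronic_conditions: int
    annual_cost: float


# Hypothetical cost model (assumption, not from the study).
BASE_COST = 2000.0
COST_PER_CONDITION = 1500.0
# Gap reported in the article for patients with the same number of conditions.
COST_GAP = 1800.0


def make_patients():
    """Build one white and one black patient for each condition count 1..6."""
    patients = []
    for n in range(1, 7):
        cost = BASE_COST + COST_PER_CONDITION * n
        patients.append(Patient("white", n, cost))
        patients.append(Patient("black", n, cost - COST_GAP))
    return patients


def flag_top(patients, key, k):
    """Return the k patients ranked highest by the given key."""
    return sorted(patients, key=key, reverse=True)[:k]


patients = make_patients()

# Proxy label: rank by cost, as the article describes.
flagged_by_cost = flag_top(patients, lambda p: p.annual_cost, 4)
# Alternative label: rank directly by number of chronic conditions.
flagged_by_need = flag_top(patients, lambda p: p.chronic_conditions, 4)

black_by_cost = sum(p.group == "black" for p in flagged_by_cost)
black_by_need = sum(p.group == "black" for p in flagged_by_need)

print("Black patients flagged when ranking by cost:", black_by_cost)
print("Black patients flagged when ranking by conditions:", black_by_need)
```

In this toy population, race never appears in either ranking key. The disparity enters only through the cost figures themselves.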

Machines increasingly make decisions that affect human life, and big organizations – particularly in health care – are trying to leverage massive data sets to improve how they operate. They use data, such as health care costs, that may not appear to be racist or biased but may have been heavily influenced by long-standing social, cultural and institutional biases. As computer systems determine which job candidates should be interviewed, who should receive a loan or how to triage sick people, the proprietary algorithms that power them run the risk of automating racism or other human biases.

In medicine, there is a long history of black patients facing barriers to accessing care and receiving less effective health care. Studies have found black patients are less likely to receive pain treatment, potentially lifesaving lung cancer surgery or cholesterol-lowering drugs compared to white patients. Such disparities likely have complicated roots, including explicit racism, access problems, lack of insurance, mistrust of the medical system, cultural misunderstandings, or unconscious biases that doctors may not even know they have.

Mullainathan and his collaborators discovered that the algorithm they studied, which was designed to help health systems target patients who would have the greatest future health care needs, was actually predicting how likely people were to use a lot of health care and rack up high costs in the future. Since black patients generally use health care at lower rates, the algorithm was less likely to flag them as likely to use lots of health care in the future.

The algorithm would then deepen that disparity by flagging healthier white patients as in need of more intensive care management.

“Predictive algorithms that power these tools should be continually reviewed and refined, and supplemented by information such as socio-economic data, to help clinicians make the best-informed care decisions for each patient,” Optum spokesman Tyler Mason said. “As we advise our customers, these tools should never be viewed as a substitute for a doctor’s expertise and knowledge of their patients’ individual needs.”

Ruha Benjamin, an associate professor of African American studies at Princeton University, drew a parallel to the way Henrietta Lacks, a young African American mother with cervical cancer, was treated by the medical system. Lacks is well known now because her cancer cells, taken without her consent, are used throughout modern biomedical research. She was treated in the Negro wing of Johns Hopkins Hospital in an era when hospitals were segregated. Imagine if today, Benjamin wrote in an accompanying article, Lacks were “digitally triaged” with an algorithm that didn’t explicitly take into account her race, but underestimated her sickness because it was using data that reflected historical bias to project her future needs. Such racism, though not driven by a hateful ideology, could have the same result as earlier segregation and substandard care.

“I am struck by how many people still think that racism always has to be intentional and fueled by malice. They don’t want to admit the racist effects of technology unless they can pinpoint the bigoted boogeyman behind the screen,” Benjamin said.

The software used to predict patients’ need for more intensive medical support was an outgrowth of the Affordable Care Act, which created financial incentives for health systems to keep people well instead of waiting to treat them when they got sick. The idea was that it would be possible to simultaneously contain costs and keep people healthier by identifying those patients at greatest risk for becoming very sick and providing more resources to them. But because wealthy, white people tend to utilize more health care, such tools could also lead health systems to focus on them, missing an opportunity to help some of the sickest people.

Christine Vogeli, director of evaluation and research at the Center for Population Health at Partners HealthCare, a nonprofit health system in Massachusetts, said when her team first tested the algorithm, they mapped the highest scores in their patient population and found them concentrated in some of the most affluent suburbs of Boston. That led them to use the tool in a limited way, supplementing it with other information, rather than using it off the shelf.

“You’re going to have to make sure people are savvy about it . . . or you’re going to have an issue where you’re only serving the richest and most wealthy folks,” Vogeli said.

Such biases may seem obvious in hindsight, but algorithms are notoriously opaque because they are proprietary products that can cost hundreds of thousands of dollars. The researchers who conducted the new study had an unusual amount of access to the data that went into the algorithm and what it predicted.

They also found a relatively straightforward way to fix the problem. Instead of just predicting which patients would incur the highest costs and use the most health care in the future, they tweaked the algorithm to make predictions about their future health conditions.

Suchi Saria, a machine learning and health care expert at Johns Hopkins University, said the study was fascinating because it showed how, once a bias is detected, it can be corrected. Much of the scientific study of racial disparities in medicine provides evidence of inequity, but correcting those problems might require sweeping social and cultural changes, as well as individual behavior changes by thousands of providers. In contrast, once a flawed algorithm is identified, the bias can be removed.

“The cool thing is we could easily measure the bias that has historically existed, switch out the algorithm and correct the bias,” Saria said. The trickier part may be developing an oversight mechanism that will detect the biases in the first place.

Saria said one possibility is that data experts could test companies’ algorithms for bias, the same way security firms test whether a company’s cyber defenses are sufficient.
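One form such a test might take is the comparison the researchers ran: among patients given the same risk score, check whether one group is sicker than the other. The sketch below is an assumption about how an auditor could set that up, not a description of any existing tool. The function name, the record layout and the sample records are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def condition_gap_by_score(records):
    """For each risk score, return mean chronic conditions of black patients
    minus that of white patients. A positive gap means black patients with
    that score are sicker than white patients with the same score.

    records: iterable of (risk_score, group, chronic_conditions) tuples.
    """
    buckets = defaultdict(lambda: defaultdict(list))
    for score, group, conditions in records:
        buckets[score][group].append(conditions)

    gaps = {}
    for score, by_group in sorted(buckets.items()):
        # Only compare scores at which both groups are represented.
        if "black" in by_group and "white" in by_group:
            gaps[score] = mean(by_group["black"]) - mean(by_group["white"])
    return gaps


# Hypothetical audit sample, invented for illustration.
audit_records = [
    (90, "white", 4), (90, "white", 5), (90, "black", 6), (90, "black", 5),
    (50, "white", 2), (50, "black", 3), (50, "black", 3),
]

gaps = condition_gap_by_score(audit_records)
for score, gap in gaps.items():
    print(f"score {score}: black patients average {gap:+.1f} conditions vs. white")
```

A gap near zero at every score would suggest the score tracks health need equally across groups. Consistently positive gaps would be the warning sign the study describes.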


