mhuffman 5 days ago

"The model used in the new study, called CheXzero, was developed in 2022 by a team at Stanford University using a data set of almost 400,000 chest x-rays of people from Boston with conditions such as pulmonary edema, an accumulation of fluids in the lungs. Researchers fed their model the x-ray images without any of the associated radiologist reports, which contained information about diagnoses. "

... very interesting that the inputs to the model had nothing related to race or gender, yet somehow it still managed to misdiagnose Black and female patients? I am curious about the mechanism for this. Can it just tell which x-rays belong to Black or female patients and then use some latent racism or misogyny to change the diagnosis? I do remember when it came out that AI could predict race from medical images with no other information[1], so that part seems possible. But where would it get the idea to produce a worse diagnosis, even if it can determine this? Surely there is no medical literature that recommends this!

[1]https://news.mit.edu/2022/artificial-intelligence-predicts-p...
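
For the first half of that question, a rough way to check whether the demographic signal is recoverable at all is a linear probe on the model's image embeddings. A sketch, not from the article (file names and labels here are hypothetical; it assumes you can export an embedding per x-ray and have self-reported labels to probe against):

    # Sketch only: test whether a demographic attribute is linearly recoverable
    # from the image embeddings, even though it was never an input feature.
    # "chexzero_embeddings.npy" and "self_reported_labels.npy" are hypothetical exports.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    X = np.load("chexzero_embeddings.npy")    # shape (n_patients, embedding_dim)
    y = np.load("self_reported_labels.npy")   # 0/1 label per patient (e.g. Black vs. not)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("probe AUC:", roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1]))
    # An AUC well above 0.5 means the attribute is encoded in the embedding,
    # which is roughly what the MIT result found for race.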

FanaHOVA 5 days ago

The non-tinfoil-hat approach is to simply Google "Boston demographics" and think about how training data distribution impacts model performance.

> The data set used to train CheXzero included more men, more people between 40 and 80 years old, and more white patients, which Yang says underscores the need for larger, more diverse data sets.

I'm not a doctor, so I can't tell you how x-rays differ across genders and ethnicities, but these models aren't magic (especially computer vision ones, which are usually much smaller than the big language models). If there are meaningful differences and the models don't see those specific cases in the training data, they will always fail to recognize them at inference.
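
As a toy illustration of that mechanism, with purely synthetic data (nothing here is real x-ray data): if one group makes up only 10% of the training set and its "finding" presents on a different feature, per-group accuracy diverges even though the model never sees a group label:

    # Toy illustration: group B is 10% of training data and its "finding" shows
    # up on a different feature, so the model mostly learns group A's
    # presentation and misses more cases in group B.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)

    def make_group(n, signal_dim):
        y = rng.integers(0, 2, n)              # disease present / absent
        X = rng.normal(0.0, 1.0, (n, 5))
        X[:, signal_dim] += 1.5 * y            # group-specific presentation of the finding
        return X, y

    Xa, ya = make_group(9000, signal_dim=0)    # well-represented group
    Xb, yb = make_group(1000, signal_dim=1)    # under-represented group

    model = LogisticRegression(max_iter=1000).fit(np.vstack([Xa, Xb]), np.hstack([ya, yb]))

    Xa_test, ya_test = make_group(5000, signal_dim=0)
    Xb_test, yb_test = make_group(5000, signal_dim=1)
    print("group A accuracy:", accuracy_score(ya_test, model.predict(Xa_test)))
    print("group B accuracy:", accuracy_score(yb_test, model.predict(Xb_test)))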

h2zizzle 5 days ago

Non-technical suggestion: if AI represents an aspect of the collective unconscious, as it were, then a racist society would produce latently racist training data that manifests in racist output, without anyone at any step being overtly racist. Same as an image model having a preference for red apples (even though there are many colors of apple, and even red ones are not uniformly cherry red).

If the training data has a preponderance of examples where doctors missed a clear diagnosis because of their unconscious bias, then this outcome would be unsurprising.

An interesting test would be to see if a similar issue pops up for obese patients. A common complaint, IIUC, is that doctors will chalk a symptom up to the patient's obesity rather than investigating further for a more specific (perhaps pathological) cause.

protonbob 5 days ago

I'm going to hazard an uneducated guess. Black people are less likely to go to the doctor for both economic and historical reasons, so images from them are going to be underrepresented. So in some way I guess you could say that yes, latent racism caused people to go to the doctor less, which made them appear less in the data.

encipriano 5 days ago

Aren't Black people like 10% of the US population? You don't have to look further.

apical_dendrite 5 days ago

Where the data comes from also matters. Data is collected based on what's available to the researcher. Data from a particular city or time period may have a very different distribution than the general population.

ars 4 days ago

Men are also way less likely to go to the doctor than women are. Yet this claims a bias against women as well.

cratermoon 5 days ago

> Can it just tell which x-rays belong to Black or female patients and then use some latent racism or misogyny to change the diagnosis?

The opposite. The dataset fits the standard-model patient, a white male, and the generated diagnoses pattern-match on that. Because there's no gender or racial information, the model produces the statistically most likely result for a white male, a result less likely to be correct for a patient who doesn't fit the standard model.
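
Concretely, this is the kind of per-group false-negative ("missed disease") audit the gap shows up in. A rough sketch with hypothetical file and column names:

    # Sketch of the audit: per-group false-negative ("missed disease") rates
    # from model outputs. "predictions.csv" and its columns are hypothetical.
    import pandas as pd

    df = pd.read_csv("predictions.csv")        # columns: y_true, y_pred, sex, race
    diseased = df[df["y_true"] == 1]
    diseased = diseased.assign(missed=(diseased["y_pred"] == 0))

    for col in ["sex", "race"]:
        print(diseased.groupby(col)["missed"].mean())
    # A noticeably higher miss rate for women or Black patients is exactly the
    # "defaults to the most likely result for the standard patient" failure mode.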

XorNot 5 days ago

The better question is: "are you actually just selecting for symptom occurrence by socioeconomic group?"

Like, you could modify the question to ask "is the model better at diagnosing people who went to a certain school?" and, simplistically, the answer would likely seem to be yes.

searealist 4 days ago

Then why is the headline not "AI models miss disease in Asian patients" or even "AI models miss disease in Latino patients"?

It just so happens to align with what maximizes political capital in today's world.

daveguy 5 days ago

You really just have to understand one thing: AI is not intelligent. It's pattern matching without wisdom. If fewer people in the dataset are of a particular race or gender, it will do a shittier job predicting for them, and it won't even "understand" why, or that it has a bias, because it doesn't understand anything at a human level, or even a dog's level. At least most humans can learn to recognize their biases.

bilbo0s 5 days ago

Isn't it kind of clear that the data they chose would have to have been influenced by bias somehow?

Machines don't spontaneously do this stuff. But the humans that train the machines definitely do it all the time. Mostly without even thinking about it.

I'm positive the issue is in the data selection and vetting. I would have been shocked if it were anything else.