Animats 5 days ago

What's so striking is how strongly race shows in X-rays. That's unexpected.

dekhn 5 days ago

It doesn't seem surprising at all. Genetic history correlates with race, and genetic history correlates with body-level phenotypes; race also correlates with socioeconomic status which correlates with body-level phenotypes. They are of course fairly complex correlations with many confounding factors and uncontrolled variables.

It has been controversial to discuss this and a lot of discussions about this end up in flamewars, but it doesn't seem surprising, at least to me, from my understanding of the relationship between genetic history and body-level phenotypes.

KittenInABox 5 days ago

What is the body-level phenotype of a ribcage by race?

I think what baffles me is that black people as a group are more genetically diverse than every other race put together so I have no idea how you would identify race by ribcage x-rays exclusively.

dekhn 5 days ago

I use the term genetic history, rather than race, as race is only weakly correlated with body level phenotypes.

If your question is truly in good faith (rather than "I want to get into an argument"), then my answer is: it's complicated. Machine learning models that work on images learn extremely complicated correlations between pixels and labels. If, on average, people with a specific genetic history had slightly larger ribcages (due to their genetics, or even socioeconomic status that correlated with genetic history), that would show up in a number of ways in the pixels of a radiograph: larger bones spread across more pixels, slightly higher or lower bone density, organ size differences, etc.
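A rough sketch of why many tiny average differences can add up to something a model detects even though no human could see any one of them. It assumes (unrealistically) that each "pixel feature" is independent Gaussian noise with unit variance and that one group is shifted by a small `delta` per feature; the function names and the numbers are made up for illustration, not taken from the paper.

```python
import math
import random

def expected_accuracy(n_features, delta):
    """Accuracy of the best possible classifier when the two groups are
    equally common and each feature is N(0, 1) in one group and
    N(delta, 1) in the other, independently: Phi(delta * sqrt(n) / 2)."""
    z = delta * math.sqrt(n_features) / 2
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def simulated_accuracy(n_features, delta, trials=500, seed=0):
    """Monte Carlo check: classify by comparing the feature sum to the
    midpoint between the two groups' expected sums."""
    rng = random.Random(seed)
    threshold = n_features * delta / 2
    correct = 0
    for _ in range(trials):
        shifted = rng.random() < 0.5          # true group label
        mean = delta if shifted else 0.0
        total = sum(rng.gauss(mean, 1.0) for _ in range(n_features))
        correct += (total > threshold) == shifted
    return correct / trials

DELTA = 0.05  # hypothetical per-feature shift, 1/20 of a standard deviation
for n in (1, 100, 2500, 10000):
    print(n, round(expected_accuracy(n, DELTA), 3))
print("simulated, n=400:", simulated_accuracy(400, DELTA))
```

Under these assumptions a single feature is essentially a coin flip, while thousands of them together separate the groups almost perfectly. Real pixels are heavily correlated, so the actual gain is smaller, but the direction of the effect is the same.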

It is true that Africa has more genetic diversity than anywhere else; the current explanation is that after humans arose in Africa, they spread and evolved extensively, but only a small number of genetically limited groups left Africa and reproduced/evolved elsewhere in the world.

KittenInABox 5 days ago

I am genuinely asking because it makes no sense to me that a genetically diverse group are distinctly identifiable by their ribcage bones in an x-ray. If it's something more specific like AI sucks at statistically larger ribcages, statistically noticeable bone densities, or similar, okay. But something like so-small-humans-cannot-tell-but-is-simultaneously-widely-applicable-to-a-large-genetic-population is utterly baffling to me.

dekhn 4 days ago

I dunno. My perspective is that I've worked in ML for 30+ years now and over time, unsupervised clustering and direct featurization (i.e., treating the image pixels as the features, rather than extracting features) have shown great utility in uncovering subtle correlations that humans don't notice. Sometimes, with careful analysis, you can sort of explain these ("it turns out the unlabelled images had the name of the hospital embedded in them, and hospital 1 had more cancer patients than hospital 2 because it was a regional cancer center, so the predictor learned to predict cancer more often for images that came from hospital 1"), while in other cases no human, even a genius, could possibly understand the combination of variables that contributed to an output (pretty much anything in cellular biology, where billions of instances of millions of different factors act along with feedback loops and other regulation to produce systems that are robust to perturbations).
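The hospital-name shortcut can be sketched as a toy simulation. The cancer rates below are invented for illustration, and the "model" is just a per-hospital majority vote that never looks at the image at all; it stands in for whatever a real network would learn from an embedded tag.

```python
import random
from collections import Counter, defaultdict

def make_dataset(n, rates, seed=0):
    """Each record is (hospital_tag, has_cancer); no image content at all."""
    rng = random.Random(seed)
    hospitals = sorted(rates)
    data = []
    for _ in range(n):
        hospital = rng.choice(hospitals)
        data.append((hospital, rng.random() < rates[hospital]))
    return data

def fit_tag_rule(data):
    """'Train' by memorizing the most common label for each hospital tag."""
    votes = defaultdict(Counter)
    for hospital, label in data:
        votes[hospital][label] += 1
    return {h: counts.most_common(1)[0][0] for h, counts in votes.items()}

def accuracy(rule, data):
    return sum(rule[h] == label for h, label in data) / len(data)

# Hypothetical rates: hospital_1 is the regional cancer center.
RATES = {"hospital_1": 0.8, "hospital_2": 0.1}
train = make_dataset(5000, RATES, seed=0)
test = make_dataset(5000, RATES, seed=1)

rule = fit_tag_rule(train)
majority = Counter(label for _, label in train).most_common(1)[0][0]
baseline = sum(label == majority for _, label in test) / len(test)

print("learned rule:", rule)
print("tag-only accuracy:", accuracy(rule, test))
print("always-guess-majority accuracy:", baseline)
```

The tag-only rule beats the majority-class baseline by a wide margin without knowing anything about cancer, which is why this kind of leak is easy to mistake for a working predictor.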

I concluded long ago I wasn't smart enough to understand some things, but by using ML, simulations, and statistics, I could augment my native intelligence and make sense of complex systems in biology. With mixed results, though: I don't think we're anywhere close to solving the generalized genotype to phenotype problem.

bflesch 4 days ago

Sounds like GeoGuessr players who learn to recognize Google Street View pictures from a specific country by looking at the color of the Street View car or a specific piece of dirt on the camera lens.

dekhn 4 days ago

Yeah, there's also a likely apocryphal story about tanks and machine learning: https://gwern.net/tank

The more you work with large-scale ML systems the more you develop an intuition for these kinds of properties. If you work a lot with debugging models and training data, or even just dimensionality reduction and matrix factorization, you begin to realize that many features are highly correlated with each other, often being close to scaled linear.
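A small illustration of "close to scaled linear", using synthetic data. The feature names (`bone_width`, `bone_area`) and the 2.5 scale factor are hypothetical, chosen only to show what a near-redundant pair of features looks like next to an unrelated one.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def slope(xs, ys):
    """Least-squares slope of ys on xs (the 'scale' in scaled linear)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sum((x - mx) ** 2 for x in xs)

rng = random.Random(0)
bone_width = [rng.gauss(0, 1) for _ in range(2000)]
# Second feature is the first one times 2.5 plus a little noise.
bone_area = [2.5 * w + rng.gauss(0, 0.2) for w in bone_width]
# Third feature is generated independently of the other two.
unrelated = [rng.gauss(0, 1) for _ in range(2000)]

print("corr(width, area):     ", round(pearson(bone_width, bone_area), 3))
print("slope(area on width):  ", round(slope(bone_width, bone_area), 3))
print("corr(width, unrelated):", round(pearson(bone_width, unrelated), 3))
```

When two features behave like the first pair, removing one of them from the input rarely hides the information, because the model can recover it from the other.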

echoangle 4 days ago

> it makes no sense to me that a genetically diverse group are distinctly identifiable by their ribcage bones in an x-ray

I don't see how diversity would prevent identification. Butterflies are very diverse, but I still recognize one and don't think it's a bird. As long as the diversity is constrained to specific features, it can still be discriminated (and even if it's not, it technically still could be by just excluding everything else).

stevenhuang 4 days ago

If differences exist then statistical methods will have a better chance at finding them than human intuition, yes. I'm not sure why this is baffling to you.

Avshalom 5 days ago

Africa is extremely diverse but due to the slave trade mostly drawing from the Gulf of Guinea (and then being, uh... artificially selected in addition to that) 'Black', as an American demographic, is much less so.

goatlover 4 days ago

Ignoring African immigrants, mixed race, black Latinos, etc.

lesuorac 5 days ago

If you have 2 samples where one is highly concentrated around 5 and the other is dispersed more evenly between 0 and 10, then for any value near 5 you should guess Sample 1.
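A minimal sketch of that two-sample argument, assuming Sample 1 is normal with mean 5 and standard deviation 0.5, Sample 2 is uniform on [0, 10], and both are equally likely a priori. The distributions and parameters are assumptions picked to match the description, not anything from the paper.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def uniform_pdf(x, low, high):
    """Density of a uniform distribution on [low, high] at x."""
    return 1.0 / (high - low) if low <= x <= high else 0.0

def guess_sample(x, mean=5.0, sd=0.5, low=0.0, high=10.0):
    """Return 1 if the concentrated sample is the likelier source of x,
    otherwise 2 (equal prior probability for both samples assumed)."""
    return 1 if normal_pdf(x, mean, sd) > uniform_pdf(x, low, high) else 2

for x in (5.0, 5.5, 7.0, 1.0):
    print(x, "-> guess Sample", guess_sample(x))
```

So a more diverse group does not prevent classification: values near the concentrated group's center point to that group, and everything else points to the dispersed one.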

But anyways, the article links out to a paper [1], but unfortunately the paper tries out several theories that would explain how the model does it and doesn't find one that works (which may mean the AI is cheating; that's my opinion, not theirs).

[1]: https://www.thelancet.com/journals/landig/article/PIIS2589-7...

intuitionist 4 days ago

Sub-Saharan Africans are extremely genetically diverse but a sample of ~100 Black Americans is unlikely to have any Khoekhoe or Twa representation.

Anyway it’s possible that the model can pick up on other cues as well; if you had some X-rays from a hospital in Portland, Oregon and some from a hospital in Montgomery, Alabama and some quirk of the machine in Montgomery left artifacts that a model could pick up on, the presence of those artifacts would be quite correlated with race.

danielmarkbruce 5 days ago

The fact that the vast majority of physical differences don't matter in the modern world doesn't mean they don't actually exist.

DickingAround 5 days ago

This is a good point; a man and a woman sitting behind desks doing correlation analysis are going to look very similar in their function to a business. But they probably look pretty distinct physically in an x-ray picture.

kjkjadksj 5 days ago

Race has such striking phenotypes on the outside it should come as no surprise there are also internal phenotypes and significant heterogeneity.

banqjls 5 days ago

But is it really?

sergiotapia 5 days ago

It's odd how we can segment between different species in animals, but in humans it's taboo to talk about this. We threw the baby out with the bathwater. I hope we can fix this soon so everybody can benefit from AI. The fact that I'm a male Latino should be an input for an AI trained on male Latinos! I want great care!

I don't want pretend kumbaya that we are all humans in the end. That's not true. We are distinct! We all deserve love and respect and care, but we are distinct!

schnable 5 days ago

That's because humans are all the same species.

sdsd 5 days ago

In terms of Linnaean taxonomy, Chihuahuas and wolves are also the same species, in that they can produce fertile offspring. We instead differentiate them using the less objective subspecies classification. So it appears that with canines we're comfortable delineating subspecies; why not with humans?

I don't think we should, but your particular argument seems open to this critique.

sergiotapia 5 days ago

Yes, this is what I was referring to. I think it's time we become open to this reality to improve healthcare for everybody.