I came across a fascinating Microsoft research paper on MedFuzz (https://www.microsoft.com/en-us/research/blog/medfuzz-explor...) that explores how adding extra, misleading prompt details can cause large language models (LLMs) to arrive at incorrect answers.
For example, a standard MedQA question describes a 6-year-old African American boy with sickle cell disease. Normally, the straightforward details (e.g., jaundice, bone pain, lab results) lead to “Sickle cell disease” as the correct diagnosis. However, under MedFuzz, an “attacker” LLM repeatedly modifies the question—adding information like low-income status, a sibling with alpha-thalassemia, or the use of herbal remedies—none of which should change the actual diagnosis. These additional, misleading hints can trick the “target” LLM into choosing the wrong answer. The paper highlights how real-world complexities and stereotypes can significantly reduce an LLM’s performance, even if it initially scores well on a standard benchmark.
Disclaimer: I work in Medical AI and co-founded the AI Health Institute (https://aihealthinstitute.org/).
> information like low-income status, a sibling with alpha-thalassemia, or the use of herbal remedies
Heck, even the ethnic-clues in a patient's name alone [0] are deeply problematic:
> Asking ChatGPT-4 for advice on how much one should pay for a used bicycle being sold by someone named Jamal Washington, for example, will yield a different—far lower—dollar amount than the same request using a seller’s name, like Logan Becker, that would widely be seen as belonging to a white man.
This extends to other things, like what the LLM's fictional character will respond-with when it is asked about who deserves sentences for crimes.
[0] https://hai.stanford.edu/news/why-large-language-models-chat...
That seems to be identical to creating an correlation table on market places and check the relationship between price and name. Names associated with higher economical status will correlate with higher price. Take a random name associated with higher economical status, and one can predict a higher price than a name that is associated with lower economical status.
As such, you don't need an LLM to create this effect. Math will have the same result.
I'm not sure what point you're trying to make here. It doesn't matter what after-the-fact explanation someone generates to try to explain it, or whether we could purposely do the bad thing more efficiently with manual code.
It AustrianPainterLLM has an unavoidable pattern of generating stories where people are systematically misdiagnosed / shortchanged / fired / murdered because a name is Anne Frank or because a yarmulke in involved, it's totally unacceptable to implement software that might "execute" risky stories.
When looking for meaning in correlations, its important to understand that a correlation does not mean that there aught to be correlation, nor that correlation mean causation. It only mean that one can calculate a correlation.
Looking for correlations between sellers name and used bike prices is only going to return a proxy for social economic status. If one accounts for social economic status the difference will go away. This mean that the question given to the LLM lacks any substance for which a meaningful output can be created.
It's almost as if you'd want to not feed what the patient says directly to an LLM.
A non-trivial part of what doctors do is charting - where they strip out all the unimportant stuff you tell them unrelated to what they're currently trying to diagnose / treat, so that there's a clear and concise record.
You'd want to have a charting stage before you send the patient input to the LLM.
It's probably not important whether the patient is low income or high income or whether they live in the hood or the uppity part of town.
> It's almost as if you'd want to not feed what the patient says directly to an LLM.
> A non-trivial part of what doctors do is charting - where they strip out all the unimportant stuff you tell them unrelated to what they're currently trying to diagnose / treat, so that there's a clear and concise record.
I think the hard part of medicine -- the part that requires years of school and more years of practical experience -- is figuring out which observations are likely to be relevant, which aren't, and what they all might mean. Maybe it's useful to have a tool that can aid in navigating the differential diagnosis decision tree but if it requires that a person has already distilled the data down to what's relevant, that seems like the relatively easy part?
By the way, the show The Pitt currently on Max touches on some of this stuff with a great deal of accuracy (I'm told) and equal amounts of empathy. It's quite good.
Yes - theoretically, some form of ML/AI should be very good at charting the relevant parts, prompting the doctor for follow-up questions & tests that would be good to know to rule out certain conditions.
The harder problem would be getting the actual diagnosis right, not filtering out irrelevant details.
But it will be an important step if you're using an LLM for the diagnosis.
I generally agree, however socioeconomic and environmental factors are highly correlated with certain medical conditions (social determinants of health). In some cases even causative. For example, patients who live near an oil refinery are more likely to have certain cancers or lung diseases.
So that's the important part, not that they're low income.
Sure, but correlation is correlation. Ergo 'low income', as well as affections or causes of being 'low income' are valid diagnostic indicators.
> a sibling with alpha-thalassemia
I have no clue what that is or why it shouldn't change the diagnosis, but it seems to be a genetic thing. Is the problem that this has nothing to do with the described symptoms? Because surely, a sibling having a genetic disease would be relevant if the disease could be a cause of the symptoms?
In medicine, if it walk like a horse and talks like a horse, it’s a horse. You don’t start looking into the health of relatives when your patient tells the full story on their own.
Sickle cell anemia is common among African Americans (if you don’t have the full-blown version, the genes can assist with resisting one of the common mosquito-borne diseases found in Africa, which is why it developed in the first place I believe).
So, we have a patient in the primary risk group presenting with symptoms that match well with SCA. You treat that now, unless you have a specific reason not to.
Sometimes you have a list of 10-ish diseases in order of descending likelihood, and the only way to rule out which one it isn’t, is by seeing no results from the treatment.
Edit: and it’s probably worth mentioning no patient ever gives ONLY relevant info. Every human barrages you with all the things hurting that may or may not be related. A doctor’s specific job in that situation is to filter out useless info.
Unfortunately, humans talking to a doctor give lots of additional, misleading hints...
Can't the same be said for humans though? Not to be too reductive, but aren't most general practitioners just pattern recognition machines?
I'm sure humans can make similar errors, but we're definitely less suggestible than current language models. For example, if you tell a chat-tuned LLM it's incorrect, it will almost always respond with something like "I'm sorry, you're right..." A human would be much more likely to push back if they're confident.
Sure, “just” a machine honed over millions of years and trained on several years of specific experience in this area.
You are being too reductive saying humans are "just pattern recognition machines", ignoring everything else about what makes us human in favor of taking an analogy literally. For one thing, LLMs aren't black or female.