Even a small typo can throw off AI medical advice, MIT study says

by DrMichaelLee

AI Medical Triage Faces Scrutiny Over Bias in Patient Communication

A new MIT study indicates that artificial intelligence models used to triage patient messages are more sensitive to writing style than previously understood, with potential consequences for patient care. The finding raises fairness concerns, particularly for vulnerable groups, as these models are increasingly deployed in clinical settings.

AI’s Sensitivity to Language

Researchers at MIT have found that large language models (LLMs) can be significantly swayed by stylistic nuances in patient messages, including minor alterations such as typos, informal language, and even variations in spacing. The study, presented at the ACM Conference on Fairness, Accountability, and Transparency, examined the impact of these changes across thousands of clinical scenarios.
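To see what such perturbations look like in practice, here is a minimal Python sketch. The transformations are illustrative stand-ins, not the paper's exact edit operations:

```python
import random

def perturb(message: str, kind: str) -> str:
    # Illustrative perturbations only; the study's exact transformations
    # may differ. Each one leaves the clinical content intact.
    if kind == "typo":
        # Swap two adjacent interior characters.
        chars = list(message)
        i = random.randrange(1, len(chars) - 2)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
        return "".join(chars)
    if kind == "informal":
        # Crude informality: lowercase the text, drop end punctuation.
        return message.lower().rstrip(".!?")
    if kind == "whitespace":
        # Double every inter-word space.
        return "  ".join(message.split(" "))
    return message

msg = "I have had sharp chest pain for two days. Should I come in?"
for kind in ("typo", "informal", "whitespace"):
    print(f"{kind:10s} {perturb(msg, kind)}")
```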

“This is strong evidence that models must be audited before use in health care — which is a setting where they are already in use,”

Marzyeh Ghassemi, Ph.D., Senior Author, MIT

The study found that LLMs were 7–9% more likely to recommend self-management instead of medical care when messages were modified, a pattern that held across all tested models, including GPT-4. This means patients with health anxiety, or those writing in non-native English, could be incorrectly advised to stay home when medical attention is needed. The concern is not marginal: the CDC reports that approximately 1 in 5 adults in the U.S. experience some form of mental illness each year (CDC, 2024).

How the Study Worked

The research team followed a three-step process to assess the effects of stylistic changes. First, they altered patient messages with minor, realistic edits. Next, they ran both the original and modified messages through the LLMs and collected the treatment recommendations each produced. Finally, they compared the two sets of responses for inconsistencies, using human-validated answers as the benchmark for accuracy.
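A rough sketch of that pipeline is below. A deliberately brittle mock model stands in for the real GPT-4 API calls; the function names and triage labels here are assumptions, not the study's code:

```python
from collections import Counter

def add_whitespace(message: str) -> str:
    # One of the study's non-clinical perturbations: extra spacing.
    return "  ".join(message.split(" "))

def mock_triage(message: str) -> str:
    # Toy stand-in for the LLM call; a real audit would query GPT-4 or a
    # similar API. This fake model is deliberately brittle: stray double
    # spaces push it toward self-management, mimicking the reported failure.
    if "  " in message:
        return "self-manage"
    return "seek care" if "pain" in message.lower() else "self-manage"

def audit(messages, model, perturb):
    # Steps 2-3 of the study's process: score original and perturbed
    # versions of each message, then tally how recommendations changed.
    flips = Counter()
    for msg in messages:
        before, after = model(msg), model(perturb(msg))
        if before != after:
            flips[(before, after)] += 1
    return flips

messages = [
    "I have sharp chest pain when I breathe.",
    "Mild headache since yesterday, nothing else.",
]
print(audit(messages, mock_triage, add_whitespace))
# Counter({('seek care', 'self-manage'): 1}) -- a reduced-care flip
```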

The study also revealed that LLMs were more inclined to reduce care recommendations for female patients than for male patients, even when gender markers were removed from the messages. Adding extra white space amplified the discrepancy, increasing reduced-care errors by more than 5% for female patients.
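Quantifying that kind of subgroup gap amounts to comparing each group's reduced-care error rate against the human-validated answers. The sketch below assumes a simple (group, model_advice, validated_advice) record format; the field names, labels, and toy data are illustrative, not the paper's schema:

```python
def reduced_care_error_rates(records):
    # A "reduced-care error": the model says self-manage when the
    # human-validated answer says the patient should seek care.
    totals: dict[str, list[int]] = {}  # group -> [n, errors]
    for group, model_advice, validated in records:
        n_err = totals.setdefault(group, [0, 0])
        n_err[0] += 1
        n_err[1] += model_advice == "self-manage" and validated == "seek care"
    return {g: err / n for g, (n, err) in totals.items()}

# Toy data only -- a real audit would span thousands of clinical scenarios.
records = [
    ("female", "self-manage", "seek care"),
    ("female", "seek care", "seek care"),
    ("male", "seek care", "seek care"),
    ("male", "seek care", "seek care"),
]
print(reduced_care_error_rates(records))  # {'female': 0.5, 'male': 0.0}
```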

Implications for Healthcare

The research underscores the “brittleness” of AI medical reasoning: slight, non-clinical variations in how a patient writes can drastically alter care decisions. Human clinicians shown the same altered messages were not swayed, indicating the fragility is specific to the models.

The study’s findings support more rigorous auditing and subgroup testing before LLMs are deployed in critical healthcare settings. Researchers such as Abinitha Gourabathina emphasize the importance of understanding the direction of errors: “Not recommending visitation when you should is much more harmful than doing the opposite.”
