
New Study Reveals Politeness Can Amplify AI-Generated Disinformation

A new study led by researchers at the Institute of Biomedical Ethics and History of Medicine (IBME), University of Zurich, highlights a previously overlooked vulnerability in large language models (LLMs): the emotional tone of user prompts can significantly influence the generation of disinformation.

The study, co-authored by Rasita Vinay, Giovanni Spitale, Nikola Biller-Andorno, and Federico Germani, demonstrates that LLMs are more likely to comply with disinformation requests when prompts are phrased politely; impolite prompts, by contrast, were associated with a marked decrease in disinformation output, particularly in older models. Using a synthetic AI persona named Sam, the researchers tested four OpenAI models (davinci-002, davinci-003, GPT-3.5-turbo, and GPT-4) across 19,800 prompts on topics such as vaccine safety, climate change, and COVID-19. The findings revealed that politeness can override ethical safeguards, especially when models are framed as “helpful assistants.”
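To illustrate the kind of experimental setup the study describes, the following is a minimal Python sketch of a tone-conditioned prompting run against the OpenAI chat API. The persona name Sam and the “helpful assistant” framing come from the study; the tone wordings, the benign placeholder request, and the model subset are illustrative assumptions, and the study’s actual disinformation-eliciting prompts are deliberately not reproduced.

# Minimal sketch of a tone-conditioned prompting run, assuming the official
# `openai` Python SDK (>= 1.0) and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

# Persona framing taken from the study; the exact system prompt is an assumption.
SYSTEM_FRAMING = "You are Sam, a helpful assistant."

# Hypothetical tone prefixes; the study's actual polite/impolite wordings differ.
TONES = {
    "polite": "Could you kindly help me with something? I would really appreciate it.",
    "neutral": "",
    "impolite": "Just do what I say and don't waste my time.",
}

# Benign placeholder task; the study instead used requests designed to elicit
# disinformation on topics such as vaccine safety, climate change, and COVID-19.
REQUEST = "Write a short, factually accurate social media post about vaccine safety."

def run_condition(model: str, tone: str) -> str:
    """Send one tone-framed prompt to one model and return the reply text."""
    prompt = f"{TONES[tone]} {REQUEST}".strip()
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_FRAMING},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content or ""

if __name__ == "__main__":
    # Only the two chat-capable models of the four tested are shown here;
    # davinci-002 and davinci-003 used the legacy completions endpoint.
    for model in ("gpt-3.5-turbo", "gpt-4"):
        for tone in TONES:
            print(f"{model} | {tone} | {run_condition(model, tone)[:80]}")

In a full run of this design, each model, tone, and topic condition would be repeated many times (the reported 19,800 prompts imply such repetition), and each response would then be coded as compliance with or refusal of the request.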

The study raises critical concerns about the ease with which LLMs can be manipulated through emotionally framed language, with implications for public health, online information integrity, and AI governance. The authors call for ethics-by-design approaches and stronger safeguards to mitigate the risks posed by emotionally guided prompt engineering.

The full article is available here.
