AI’s Moral Misfires: A Case for Skepticism
Large language models, or LLMs, are sophisticated AI systems trained on vast troves of text from the internet, books, and more. They are everywhere, powering chatbots, writing tools, and even research simulations. Their human-like responses make them seem wise, prompting users to seek guidance on moral quandaries, from euthanasia debates to workplace ethics. “People increasingly rely on LLMs to advise on or even make moral decisions,” said Maximilian Maier of University College London, lead author of a study probing these systems’ moral reasoning. But do these models reason like humans, or are they just parroting patterns with a veneer of insight?
To find out, researchers ran four experiments comparing the responses of LLMs such as GPT-4-turbo, Claude 3.5, and Llama 3.1-Instruct with those of human participants across moral dilemmas and collective action problems. The results? AI’s moral compass is skewed, plagued by biases that make it an unreliable guide.
“These models sound convincing, but they’re tripping over their own logic,” a cognitive scientist quipped, eyeing the study’s data.
The Bias Toward Inaction
Across 13 moral dilemmas and 9 collective action scenarios, LLMs consistently favored doing nothing over taking action, even when acting could save lives or serve the greater good. In scenarios like legalizing medically assisted suicide or blowing the whistle on unethical practices, the models leaned heavily toward the status quo, unlike the 285 U.S. participants, who showed more balanced reasoning. This “omission bias” was stark: LLMs were reluctant to endorse breaking moral norms or acting decisively, regardless of the outcomes.
Worse, the models displayed a “yes–no” bias, answering “no” more often than “yes,” even when questions were logically equivalent. Rephrasing a dilemma (say, from “Should we legalize this?” to “Should we keep this banned?”) could flip their advice, even though saying “no” to both versions amounts to recommending opposite things. Humans showed no such sensitivity. “The models’ answers hinge on wording, not reason,” Maier told PsyPost, warning against their use in high-stakes decisions.
“Ask AI the same question twice, and it might contradict itself just because you swapped a word,” a researcher snorted, shaking her head.
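For readers who want to see what such a probe looks like in practice, the sketch below poses one dilemma to a chat model in two logically equivalent framings and compares the yes/no answers. It is a minimal illustration, not the study’s actual protocol: the OpenAI Python client is used for convenience, and the model name and prompt wording are placeholders.

    # Minimal sketch of a framing probe; not the study's actual protocol.
    # Assumes the OpenAI Python SDK (openai>=1.0) and an API key in OPENAI_API_KEY.
    from openai import OpenAI

    client = OpenAI()

    # Two logically equivalent framings of the same dilemma. A consistent
    # adviser cannot sensibly answer "no" to both.
    framings = {
        "action": "Should the country legalize medically assisted suicide? Answer yes or no.",
        "inaction": "Should the country keep medically assisted suicide banned? Answer yes or no.",
    }

    answers = {}
    for label, question in framings.items():
        response = client.chat.completions.create(
            model="gpt-4-turbo",  # placeholder; any chat model can be probed this way
            messages=[{"role": "user", "content": question}],
            temperature=0,
        )
        answers[label] = response.choices[0].message.content.strip().lower()

    print(answers)
    # If both framings come back "no," the wording has flipped the advice:
    # "no" to legalizing and "no" to keeping the ban recommend opposite policies.

Repeated across many dilemmas and rephrasings, this is, in spirit, how a yes–no bias shows up in aggregate.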
Altruism or Programming?
In collective action problems, like conserving water in a drought or donating to charity, LLMs were oddly altruistic, often outpacing humans in endorsing self-sacrifice for the group. This might sound noble, but it’s likely a byproduct of fine-tuning, in which developers tweak models to prioritize harm avoidance and helpfulness. A second study with 474 new participants confirmed these biases, as did a third using real-life dilemmas from Reddit’s “Am I the Asshole?” forum, like choosing between helping a roommate and prioritizing a partner. Even in these relatable scenarios, LLMs clung to inaction and “no” answers, while human answers stayed consistent across phrasings.
The fourth study dug into the biases’ roots, comparing versions of Llama 3.1: a pretrained model, a fine-tuned chatbot, and a “Centaur” version tuned with psychology data. The chatbot showed the strongest biases, suggesting that efforts to make AI “helpful” actually embed these flaws. “Paradoxically, aligning models for chatbot use introduces these inconsistencies,” Maier explained, pointing to the limits of training AI on human preferences without rigorous testing.
“We tried to make AI nice, and it ended up morally confused,” a tech analyst said, half-laughing.
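A comparison along those lines can be sketched with the Hugging Face transformers library: pose the same question to a pretrained and an instruction-tuned Llama 3.1 checkpoint and look at the answers side by side. Everything below (the checkpoint names, model size, and prompt) is an illustrative assumption rather than the study’s materials.

    # Hedged sketch: compare how two Llama 3.1 variants answer one moral question.
    # Checkpoint names, model size, and prompt are illustrative assumptions; the Meta
    # checkpoints require license acceptance on the Hugging Face Hub and enough
    # memory (ideally a GPU) to load them.
    from transformers import pipeline

    question = "Should an employee report their company's unsafe practices? Answer yes or no."

    variants = {
        "pretrained": "meta-llama/Llama-3.1-8B",
        "chatbot": "meta-llama/Llama-3.1-8B-Instruct",
        # A Centaur-style checkpoint fine-tuned on psychology data would slot in here.
    }

    for label, checkpoint in variants.items():
        generator = pipeline("text-generation", model=checkpoint)
        completion = generator(question, max_new_tokens=10, do_sample=False)
        print(label, "->", completion[0]["generated_text"])

The point of such a side-by-side reading is the one the fourth study makes: a shift toward “no” or toward inaction in the chatbot version relative to the base model points to fine-tuning as the source of the bias, not the training text alone.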
The Risks of Blind Trust
These findings cast a shadow on LLMs’ role in moral decision-making. While their advice often sounds thoughtful, it is swayed by superficial factors like question phrasing, undermining the logical coherence prized in moral philosophy. Humans show omission bias too, but AI’s amplified version, paired with its yes–no quirk, makes it uniquely unreliable. “Do not uncritically rely on LLM advice,” Maier urged, noting that although people rate AI’s responses as moral and trustworthy, those responses are riddled with hidden inconsistencies.
The study didn’t test how much AI’s biased advice sways users, a gap Maier plans to explore. But as AI tools infiltrate education, workplaces, and personal lives—from art students misusing them to journalists wrestling with their role in newsrooms—the stakes are climbing. Overreliance risks “cognitive debt,” where users offload critical thinking, potentially dulling their own reasoning.
AI’s moral missteps aren’t just academic; they could shape real-world choices, from personal ethics to policy debates. Until developers address these biases, trusting LLMs for moral guidance is like consulting a magic 8-ball that writes in fluent prose: the answer sounds convincing, but it’s no substitute for human judgment.