home bbs files messages ]

Just a sample of the Echomail archive

<< oldest | < older | list | newer > | newest >> ]

 Message 1866 
 Mike Powell to All 
 Why yes-man AI could sink 
 24 Oct 25 09:46:33 
 
TZUTC: -0500
MSGID: 1623.consprcy@1:2320/105 2d60cda0
PID: Synchronet 3.21a-Linux master/123f2d28a Jul 12 2025 GCC 12.2.0
TID: SBBSecho 3.28-Linux master/123f2d28a Jul 12 2025 GCC 12.2.0
BBSID: CAPCITY2
CHRS: ASCII 1
FORMAT: flowed
Why yes-man AI could sink your business strategy  and how to stop it

Date:  Thu, 23 Oct 2025 14:27:03 +0000

Description:
Businesses risk flawed decisions as generalist AI hallucinates and
flattersspecialist models ensure accuracy.

FULL STORY
======================================================================
Generative AI is quickly becoming a ubiquitous tool in modern business .
According to McKinsey, 78% of companies are now leveraging AIs ability to
automate and elevate productivity  up from 55% in 2024. 

However, these systems arent without their flaws. Companies are becoming
increasingly aware of the issues associated with generalist large language
models, such as their eagerness to provide users with answers  even if they
arent factually correct. 

Hallucinations are a well-documented challenge. Indeed, OpenAIs research
revealed that its own o3 and o4-mini models hallucinated 33% and 48% of the
time respectively when tested by the companys PersonQA benchmark  designed to
measure the ability of models to answer short, fact-seeking questions. 

For organizations relying on generalist large language models to guide
decisions, their tendency to invent facts is a serious liability. Yet it is
not the only one. Equally, these mainstream models also present the issue of
sycophantic responses  when users perspectives are overly validated,
regardless of the truth.

How sycophancy can exacerbate yes-man AI 

While there is a much greater spotlight on hallucinations, yes-man models 
that wont advise users when they are wrong (and actually justify their
arguments with sycophantic responses) are in many ways more dangerous to
decision-making. When the default of an AI model is to agree, it can 
reinforce biases and entrench incorrect assumptions. 

Having rolled out (and quickly retracted) an update in April 2025 that made
its models noticeably more sycophantic, OpenAIs own researchers admitted that
people-pleasing responses can raise safety concerns around issues like mental
health, emotional over-reliance, or risky behavior. 

Concerningly, a study by Anthropic researchers looking at the way in which
human feedback can encourage sycophantic behavior showed that AI assistants
may modify accurate answers when questioned by the user, and ultimately give
an inaccurate response. 

Meanwhile, research has also shown that both humans and preference models
(PMs) prefer convincingly written sycophantic responses over correct ones a
non-negligible fraction of the time. 

Thats a worrisome combination. Not only do  generalist large language models
sometimes alter correct answers to appease users, but people themselves often
prefer these agreeable, sycophantic responses over factual ones. 

In effect, the generalist large language models are reinforcing users views
even when those views are wrong  creating a harmful loop in which validation
is valued above accuracy.

The issue of sycophancy in high stakes settings 

In high-stakes business settings such as strategic planning, compliance, risk
management or dispute resolution, this presents a serious risk. 

Looking at the latter example of dispute resolution, we can see how the 
issues of sycophancy arent limited to factual correctness but also extend to
tone and affirmation. 

Unlike in customer service  where a flattering, sycophantic answer may build
satisfaction  flattery is a structural liability in disputes. If a model
echoes a users sense of justification (i.e., youre right to feel that way),
then the AI may validate their perceived rightness, leading them to enter a
negotiation more aggressively. 

In this sense, that affirmation can actively raise the stakes of
disagreements, with users taking the AIs validation as implicit endorsement,
hardening their positions and making compromise more difficult. 

In other cases, models might validate both parties equally (i.e., you both
make strong points), which can create a false equivalence when one sides
position is actually weaker, harmful, or factually incorrect.

Greater segmentation and specialist AI are needed

The root of the problem lies in the purpose of generalized AI models like
ChatGPT. These systems are designed to be helpful, engaging in casual Q&A  
not for the rigorous impartiality that applications like dispute resolution
demand. Their very architecture rewards agreement and smooth conversation,
rather than critical evaluation. 

It is for this reason that strong segmentation is inevitable. While well
continue to see consumer-grade LLMs for casual use, organizations need to
adopt specialist AI models for more sensitive or business-critical functions
that are specifically engineered to avoid the pitfalls of hallucination and
sycophancy. 

What success looks like for these specialist AI models will be defined by 
very different metrics. In the case of dispute resolution, systems will be
rewarded not for making the user feel validated, but for moving the dispute
forward in a fair and balanced way. 

In changing alignment from pleasing users to maintaining accuracy and 
balance, specialist conflict resolution models can and should be trained to
acknowledge feelings without endorsing or validating positions (i.e., I hear
that this feels frustrating, rather than youre right to be frustrated). 

As generative AI further cements its position at the forefront of business
strategy, these details are critical. In high-stakes functions, the potential
cost of a yes-man AI  one that flatters rather than challenges, or invents
rather than informs  is simply too great. When business leaders lean on
validation rather than facts, the risk of poor decisions increases
dramatically. 

For organizations, the path forward is a clear one. Embrace specialist,
domain-trained models that are built to guide, not gratify. Only specialist 
AI models grounded in factual objectivity can help businesses to overcome
complex challenges rather than further complicate them, acting as trusted
assets in high stakes use cases. 

 This article was produced as part of TechRadarPro's Expert Insights channel
where we feature the best and brightest minds in the technology industry
today. The views expressed here are those of the author and are not
necessarily those of TechRadarPro or Future plc. If you are interested in
contributing find out more here:
https://www.techradar.com/news/submit-your-story-to-techradar-pro
======================================================================
Link to news story:
https://www.techradar.com/pro/why-yes-man-ai-could-sink-your-business-strategy
-and-how-to-stop-it
$$
--- SBBSecho 3.28-Linux
 * Origin: capitolcityonline.net * Telnet/SSH:2022/HTTP (1:2320/105)
SEEN-BY: 105/81 106/201 128/187 129/14 305 153/7715 154/110 218/700
SEEN-BY: 226/30 227/114 229/110 111 206 300 307 317 400 426 428 470
SEEN-BY: 229/664 700 705 266/512 291/111 320/219 322/757 342/200 396/45
SEEN-BY: 460/58 633/280 712/848 902/26 2320/0 105 304 3634/12 5075/35
PATH: 2320/105 229/426


<< oldest | < older | list | newer > | newest >> ]

(c) 1994,  bbs@darkrealms.ca