
Just a sample of the Echomail archive


 Message 1332 
 Mike Powell to All 
 The AI That Cried AAAAAAH 
 26 Apr 25 10:47:00 
 
TZUTC: -0500
MSGID: 1065.consprcy@1:2320/105 2c72417d
PID: Synchronet 3.20a-Linux master/acc19483f Apr 26 202 GCC 12.2.0
TID: SBBSecho 3.20-Linux master/acc19483f Apr 26 2024 23:04 GCC 12.2.0
BBSID: CAPCITY2
CHRS: ASCII 1
The AI That Cried AAAAAAHHH!

Date:
Fri, 25 Apr 2025 21:30:00 +0000

Description:
Sesame's AI voice successfully mimics human speech almost perfectly.

FULL STORY
======================================================================

AI voices usually aim to be realistic in a friendly way, mimicking relaxed,
happy, helpful people. But a new open-source model named Dia is leaning into
the more emotional spectrum of voices, including some really intense
screaming. 

Dia's creators at Nari Labs are a tiny group, but have given AI voices the
option to sound like a somewhat melodramatic performer, capable of making
realistic laughing, coughing, throat-clearing, sniffing, and yes, yelling. 

You might not think that yelling is a big deal for AI at this point, but
screaming is hard to fake. It can't just be talking loudly; it's an entirely
different speech mode. 

Emotionally expressive speech is a gap in most AI voices. It's easy for a
voice model to read a bedtime story. However, it's much harder for it to sound
like it's trying to calm a friend down, or like it just saw something
shocking. Most commercial models avoid sounding robotic by smoothing the tone
of the voice, which doesn't leave room for the kind of audio asymmetry of
speaking emotionally. 

Dia treats nonverbal communication as part of the performance. It knows that
"(coughs)" isn't something to be ignored or read literally. It knows that a
scream isn't just a louder line. And it performs these things with a level of
timing, pitch modulation, and breath control that makes them feel more real. 

One enterprising user even used it to recreate a bit of the famous Leeroy
Jenkins sketch carried out on World of Warcraft.

That's not to say that OpenAI, ElevenLabs, Google, Sesame, and others
haven't produced amazing AI voice models. You can customize OpenAI's Advanced
Voice Mode to speak with different emotions, and ElevenLabs is good at
interpreting capitalization and punctuation to adjust speech, but that's not
the same as yelping in surprise or wheezing with laughter. 

Sesame is particularly good at sounding and reacting like a real person, but
even its models err towards cheerful and generally positive demeanors. 

Of course, realism is subjective, and you might work out pretty quickly that
Dia is an AI voice. Then again, fake screams and laughs are also pretty human
sounds to make in the right context.

What makes this a bigger story than just "AI voice learns a party trick" is
what it signals for the broader race in AI for emotional intelligence. 

We're rapidly entering an era where it won't be enough for your assistant to
say the right thing; it'll need to say it in the right way. Think customer
support bots that sound genuinely sorry, teachers that sound encouraging
instead of instructional, and in-game characters that convey sincerity. 

Of course, giving AI the power to emote convincingly makes it more persuasive
and thus potentially more manipulative. If emotional speech can be just
another AI tool, then more than a few people may feel like screaming
themselves. 

Still, I can imagine some fun writing a ghost story for Dia to not just read,
but perform, screams and all.

======================================================================
Link to news story:
https://www.techradar.com/computing/artificial-intelligence/the-ai-that-cried-aaaaaahhh

$$
--- SBBSecho 3.20-Linux
 * Origin: capitolcityonline.net * Telnet/SSH:2022/HTTP (1:2320/105)
SEEN-BY: 105/81 106/201 128/187 129/305 153/7715 154/110 218/700 226/30
SEEN-BY: 227/114 229/110 111 114 206 300 307 317 400 426 428 470 664
SEEN-BY: 229/700 705 266/512 291/111 320/219 322/757 342/200 396/45
SEEN-BY: 460/58 712/848 902/26 2320/0 105 3634/12 5075/35
PATH: 2320/105 229/426



(c) 1994,  bbs@darkrealms.ca