voice output


MildlyUpsetGrizzlyBear
Posts: 40
Joined: Tue Feb 09, 2010 12:51 pm UTC

voice output

Postby MildlyUpsetGrizzlyBear » Thu Sep 30, 2010 8:54 am UTC

Why are computers bad at talking?
Last edited by MildlyUpsetGrizzlyBear on Thu Oct 07, 2010 3:00 pm UTC, edited 1 time in total.

archeleus
Posts: 240
Joined: Wed Sep 29, 2010 1:49 pm UTC
Location: Valenvaryon

Re: voice output

Postby archeleus » Thu Sep 30, 2010 10:40 am UTC

To sound convincingly human you need emotion. Programming a computer to insert emotion is no easy deal. That's just what I think btw.
I write a blog rant here.

Moose Hole
Posts: 398
Joined: Fri Jul 09, 2010 1:34 pm UTC

Re: voice output

Postby Moose Hole » Thu Sep 30, 2010 1:39 pm UTC

I thought maybe later we could go see a movie.

DorkRawk
Posts: 51
Joined: Thu Mar 08, 2007 6:50 am UTC
Location: Chicago

Re: voice output

Postby DorkRawk » Thu Sep 30, 2010 5:03 pm UTC

It seems like the missing piece is inflection, which could just be another attribute of the speech model. Knowing when and where to put this inflection is probably the hard part (the one presumably connected to emotion). It would be very difficult in the general case (some sort of conversational system), but it seems more feasible in smaller systems (say, a GPS system).

Divinas
Posts: 57
Joined: Wed Aug 26, 2009 7:04 am UTC

Re: voice output

Postby Divinas » Thu Sep 30, 2010 5:47 pm UTC

Have you seen this? It's not perfect, but it's WAY better than MS Sam :D
http://www2.research.att.com/~ttsweb/tts/demo.php

Yakk
Poster with most posts but no title.
Posts: 11128
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: voice output

Postby Yakk » Fri Oct 01, 2010 1:18 pm UTC

Computers are great at speech. Every time you play a YouTube video, that is a computer speaking.

Of course, that is "just a recording". So you want a text-to-speech system? Text-to-speech is ridiculously hard to solve, because it involves the computer "understanding" what it is reading, and "understanding" text is one of the problems that some people think is "AI-hard" (i.e., a full AI could do it, and anything that could do it would be a full AI).

Now you could probably do a bit better by creating a huge library of sounds pronounced by a person, pairing it with a huge library of pronunciation keys for every English word, adding some ability to blend between multiple sounds, and having it learn how to pronounce any given word. Then you'd have to teach it how to "slur" its speech between adjacent words...
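
A rough sketch of that idea, for illustration only: the sound library here is faked with short sine tones instead of recordings of a person, the pronunciation keys cover just two words, and every name (`PRONUNCIATIONS`, `SOUND_LIBRARY`, `crossfade`, `synthesize`) is made up for this example. The "blend" is a plain linear crossfade, which is probably the simplest thing that could count as blending.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for illustration)

# Hypothetical pronunciation keys: word -> list of sound-unit names.
PRONUNCIATIONS = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

def fake_unit(freq_hz, duration_s=0.1):
    """Stand-in for a recorded sound: a short sine tone."""
    n = int(SAMPLE_RATE * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]

# Hypothetical sound library: unit name -> samples (one tone per unit).
unit_names = sorted({u for seq in PRONUNCIATIONS.values() for u in seq})
SOUND_LIBRARY = {name: fake_unit(200 + 40 * k) for k, name in enumerate(unit_names)}

def crossfade(a, b, overlap):
    """Join a and b, linearly blending the tail of a into the head of b."""
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        return a + b
    blended = [
        a[len(a) - overlap + i] * (1 - i / overlap) + b[i] * (i / overlap)
        for i in range(overlap)
    ]
    return a[:len(a) - overlap] + blended + b[overlap:]

def synthesize(text, overlap=80):
    """Look up each word's units and crossfade them into one sample list."""
    out = []
    for word in text.lower().split():
        for unit in PRONUNCIATIONS[word]:  # KeyError for unknown words
            out = crossfade(out, SOUND_LIBRARY[unit], overlap)
    return out

samples = synthesize("hello world")
```

Note that this blends across word boundaries exactly the same way it blends inside a word, and it has no idea which sounds should be stressed, which is where the hard part starts.
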
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Turtlewing
Posts: 236
Joined: Tue Nov 03, 2009 5:22 pm UTC

Re: voice output

Postby Turtlewing » Mon Oct 11, 2010 5:46 pm UTC

Mostly it's because humans are analog and computers are digital.

Yakk
Poster with most posts but no title.
Posts: 11128
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: voice output

Postby Yakk » Tue Oct 12, 2010 12:18 pm UTC

Turtlewing wrote: Mostly it's because humans are analog and computers are digital.
Yes, as evidenced by DVD audio being so much worse than VHS audio.

Wait, what?

SammyIAm
Posts: 37
Joined: Wed Oct 10, 2007 4:50 am UTC

Re: voice output

Postby SammyIAm » Thu Nov 18, 2010 1:44 am UTC

Divinas wrote: Have you seen this? It's not perfect, but it's WAY better than MS Sam :D
http://www2.research.att.com/~ttsweb/tts/demo.php


Apple's new voice "Alex" is pretty good too. The biggest change, and I think some of what's missing from computers being "good at talking", is phrasing. When we humans speak, there's a certain flow to the sentence that connects related words so that it's clearer what's being talked about. Just putting emphasis on different words can change "I ate the pie" from a discussion about what was done to the pie (stress on "ate") into a discussion about what was eaten (stress on "pie"). Since computers generally have no idea what they're talking about (they just reproduce each sound in each word), this phrasing is hard.
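
To make the "I ate the pie" point concrete, here is a minimal sketch of how the stress could be handed to a synthesizer as markup. It assumes an engine that accepts SSML-style `<speak>` and `<emphasis>` elements; the function name `mark_emphasis` is made up for this example, and the output is a simplified fragment rather than a complete SSML document. The hard part, deciding which word to stress, is still done by the human caller.

```python
from xml.sax.saxutils import escape

def mark_emphasis(words, stressed_index):
    """Wrap one word in an SSML-style emphasis tag (simplified fragment)."""
    parts = []
    for i, word in enumerate(words):
        text = escape(word)  # keep &, <, > from breaking the markup
        if i == stressed_index:
            text = f'<emphasis level="strong">{text}</emphasis>'
        parts.append(text)
    return "<speak>" + " ".join(parts) + "</speak>"

words = ["I", "ate", "the", "pie"]
about_the_action = mark_emphasis(words, 1)  # stress on "ate"
about_the_food = mark_emphasis(words, 3)    # stress on "pie"
```

Same four words, two different strings, and nothing in the plain text tells the computer which one was meant.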

Also pretty much everything is better than MS Sam. :wink:

