Learn a language with synthetic speech

April 11, 2006 by Oliver
Filed under: Languages, Programming, Technology 

just good enough.

The traditional way to generate speech with a computer is algorithmic: someone works out how to overlay tones of different pitches and wave shapes to form each sound. The newer way is to record each sound and play the recordings back one after the other.
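As a rough illustration of the two approaches, here is a minimal Python sketch. Everything in it is an assumption made for the example: the sample rate, the frequencies and the unit names are arbitrary, and the "recordings" are just generated tones standing in for real audio. It is not how any particular speech engine works.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an arbitrary choice for this sketch


def synthesize_tone(frequencies, duration_s):
    """Algorithmic approach: overlay sine waves of different pitches."""
    n = int(SAMPLE_RATE * duration_s)
    return [
        # Average the sines so every sample stays within [-1.0, 1.0]
        sum(math.sin(2 * math.pi * f * t / SAMPLE_RATE) for f in frequencies)
        / len(frequencies)
        for t in range(n)
    ]


def concatenate_units(unit_names, recordings):
    """Recorded approach: play stored units back one after the other."""
    samples = []
    for name in unit_names:
        samples.extend(recordings[name])
    return samples


# Hypothetical "recordings": generated tones standing in for real audio clips
recordings = {
    "a": synthesize_tone([700, 1200], 0.1),
    "m": synthesize_tone([250, 1000], 0.1),
}

word = concatenate_units(["m", "a", "m", "a"], recordings)
```

A real system would also smooth the joins between units, which this sketch ignores.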

There are more stages to it than that. Assuming you don’t want to write the speech phonetically (in IPA, for instance), there also needs to be a way of turning text into phonetic information. This is usually half dictionary-based, for common words and syllables, and half rule-based (to avoid needing a big dictionary and to cope with languages constantly expanding and evolving).
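The dictionary-plus-rules idea can be sketched in a few lines of Python. This is only a toy under stated assumptions: the `LEXICON` and `RULES` tables and the `to_phonemes` function are made up for the example, and the phoneme symbols are loosely ARPAbet-style rather than taken from any real engine.

```python
# Hypothetical exception dictionary for common, irregular words
LEXICON = {
    "one": "W AH N",
    "the": "DH AH",
}

# Hypothetical letter-to-sound rules; digraphs come first so they win
RULES = [
    ("sh", "SH"),
    ("ch", "CH"),
    ("th", "TH"),
    ("a", "AE"),
    ("e", "EH"),
    ("i", "IH"),
    ("o", "AA"),
    ("u", "AH"),
]


def to_phonemes(word):
    """Look the word up first; fall back to rules if it is not listed."""
    word = word.lower()
    if word in LEXICON:
        return LEXICON[word]

    phonemes = []
    i = 0
    while i < len(word):
        for graph, phone in RULES:
            if word.startswith(graph, i):
                phonemes.append(phone)
                i += len(graph)
                break
        else:
            # No rule matched: pass the letter through as its own symbol
            phonemes.append(word[i].upper())
            i += 1
    return " ".join(phonemes)
```

With these tables, `to_phonemes("the")` comes straight from the dictionary, while a word like "ship" is not listed and goes through the rules instead.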

So we now have technology (almost freely available) that can produce speech that is good enough, given the correct phonetic information; it’s the actual language processing that is problematic. Most of the work is done by American companies, and therefore most of it goes into processing English (American English at that).

This is not an insurmountable problem. The engine I’ve been playing with (available as an add-in to Internet Explorer and as standard on Windows Vista) works fairly well with foreign words transcribed in dodgy-phonetic English. For example, to get it to pronounce “Entschuldigung” (German) correctly you need to type “Enshooldicken”. This is workable for a semi-automated system: it could include a dictionary of sorts that replaces words with their English-phonetic versions.
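A dictionary like that could be as simple as the following Python sketch. The `RESPELLINGS` table and the `respell` function are hypothetical names for illustration; the single entry is the example from this post, and the resulting text would still need to be handed to an English speech engine, which is not shown here.

```python
import re

# Hypothetical table: foreign word (lowercase) -> English-phonetic respelling
RESPELLINGS = {
    "entschuldigung": "Enshooldicken",
}


def respell(text, respellings=RESPELLINGS):
    """Replace each known word with its respelling; leave the rest alone."""
    def swap(match):
        word = match.group(0)
        return respellings.get(word.lower(), word)

    # [^\W\d_]+ matches runs of letters, including ones like ä, ö, ü and ß
    return re.sub(r"[^\W\d_]+", swap, text)
```

For instance, `respell("Entschuldigung, bitte!")` swaps the first word and leaves "bitte" untouched because it has no entry yet. The "semi-automated" part would be a person adding respellings for words the engine gets wrong.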

I know the whole of this article is rather rambling – I’ll post something more readable later :P

TTS, language learning, text-to-speech, SASDK, speech synthesis, speech recognition, German

Comments

  1. An introduction to SALT. - OliverBrown.me.uk on Wed, 12th Apr 2006 1:07 pm

    [...] il 12th, 2006 by Oliver

    The speech engine that I was talking about in my last article about speech synthesis is an add-in for Internet Explorer [...]

  2. Alexandre Rafalovitch on Sat, 15th Apr 2006 8:23 pm

    It is not (any longer) the phonetics that is going to bite you. It is prosody.

    So, you can have a sentence in a neutral tone or even with a vague question/exclamation tone, but there has not been enough research into viable computer-generated intonation yet. Certainly not enough to produce a dialog.

    But good luck anyway. Maybe you will discover an alternative solution.

  3. Oliver on Sat, 15th Apr 2006 10:41 pm

    The amazing thing is that a week ago I didn’t even know what prosody was.