Learn a language with synthetic speech

Oliver Brown

I’m currently working out how feasible computer-based language-learning software using synthesised speech would be. Speech synthesis has improved a great deal recently, and although it’s still not as good as a real person (I believe it could be soon, at least for non-emotive situations), it could be just good enough.

The traditional way to generate speech with a computer is algorithmically. Essentially someone works out how to overlay tones with different pitches and wave-shapes to form each sound. The newer way is to actually record each sound manually and essentially play them back one after the other.

There are more stages to it than that - assuming you don’t want to write the speech phonetically (in IPA for instance) there also needs to be a way of turning text into phonetic information. This is usually half dictionary-based for common words and syllables and half rule-based (to avoid having a big dictionary and to cope with languages constantly expanding and evolving).
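To make the dictionary-plus-rules idea concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the `EXCEPTIONS` table, the `RULES` list and the phoneme spellings are assumptions, not taken from any real speech engine.

```python
# Hypothetical exception dictionary: common words whose pronunciation
# the rules below would get wrong.
EXCEPTIONS = {
    "the": "dh ax",
    "one": "w ah n",
    "two": "t uw",
}

# Toy letter-to-sound rules (assumed, far from complete).
RULES = [
    ("sh", "sh"),
    ("ch", "ch"),
    ("th", "th"),
    ("ee", "iy"),
    ("oo", "uw"),
]


def to_phonemes(word):
    """Look the word up in the dictionary first, then fall back to rules."""
    word = word.lower()
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]

    phonemes = []
    i = 0
    while i < len(word):
        for pattern, sound in RULES:
            if word.startswith(pattern, i):
                phonemes.append(sound)
                i += len(pattern)
                break
        else:
            # No rule matched: pass the letter through unchanged.
            phonemes.append(word[i])
            i += 1
    return " ".join(phonemes)
```

A real system would have a much larger dictionary and context-sensitive rules, but the shape (look up first, apply rules otherwise) is the same.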

So we now have technology (almost freely available) that can produce speech that is good enough given the correct phonetic information - it’s the actual language processing that is problematic. Most of the work is done by American companies and therefore most of it goes into processing English (American English at that).

This is not an insurmountable problem. The engine I’ve been playing with (available as an add-in to Internet Explorer and as standard on Windows Vista) works fairly well with foreign words transcribed in dodgy-phonetic English. For example, to get it to pronounce “Entschuldigung” (German) correctly you need to type “Enshooldicken”. This is workable for a semi-automated system - it could include a dictionary of sorts that replaces words with their English-phonetic versions.
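The "dictionary of sorts" could be as simple as the following Python sketch. The `RESPELLINGS` table and the `respell` function are hypothetical names, and the only entry is the example from this post; a real table would have to be built by ear, word by word.

```python
import re

# Hypothetical respelling table: foreign word (lowercase) mapped to the
# dodgy-phonetic English that an English-only engine pronounces acceptably.
RESPELLINGS = {
    "entschuldigung": "Enshooldicken",
}


def respell(text, table=RESPELLINGS):
    """Replace every known word with its English-phonetic version."""

    def swap(match):
        word = match.group(0)
        # Unknown words are passed through unchanged.
        return table.get(word.lower(), word)

    # Match runs of letters only, so punctuation and spacing survive.
    return re.sub(r"[^\W\d_]+", swap, text)
```

The output of `respell` would then be handed to the speech engine in place of the original text.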

I know the whole of this article is rather rambling - I’ll post something more readable later :P

Google and bad markup

Oliver Brown

I had a quick look at the source HTML of Google Analytics (specifically the first page: executive summary) and saw a piece of really bad markup: they had the whole page (HTML, HEAD, BODY, the lot) wrapped in a DIV tag… Naughty Google.

More stable server

Oliver Brown

The server should now be up far more often.

I can’t work out why MySQL and/or Apache keep crashing so often, so as an interim solution I’ve just written a PHP script, run by the cron daemon, that checks whether each of them is running as it should and restarts it if it isn’t. Hopefully this means that if the server goes down it shouldn’t be for more than an hour.
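For anyone wanting to do something similar, here is a rough sketch of the idea. It is written in Python rather than the PHP of the actual script, and it is not that script: the process names, the restart commands and the use of `pgrep` and `service` are all assumptions that will vary from system to system.

```python
import subprocess

# Assumed process names and restart commands; adjust for the real system.
SERVICES = {
    "mysqld": ["service", "mysql", "restart"],
    "apache2": ["service", "apache2", "restart"],
}


def is_running(process_name):
    """Return True if a process with exactly this name exists (uses pgrep)."""
    result = subprocess.run(
        ["pgrep", "-x", process_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def restart(command):
    """Run the restart command without raising if it fails."""
    subprocess.run(command, check=False)


def check_services(services=SERVICES, is_running=is_running, restart=restart):
    """Restart every service that is not running; return the names restarted."""
    restarted = []
    for process_name, command in services.items():
        if not is_running(process_name):
            restart(command)
            restarted.append(process_name)
    return restarted
```

Calling `check_services()` from a script that cron runs once an hour gives the behaviour described above: at worst, a crashed service stays down until the next run.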

Download Firefox

Oliver Brown

Download Firefox for a better browsing experience.

More Google Analytics

Oliver Brown

Well I now have to say that Google Analytics certainly looks impressive. It has all the stats that you would expect from anything and a heck of a load that you wouldn’t. The real gain is the way it is presented. All the stats can be quickly restricted to date ranges, you can compare two date ranges, and most of the details can be combined arbitrarily (just see visitors from Spain using Internet Explorer on Macs, for instance), as well as lots of other nifty things. It also has support for e-commerce tracking (including defining custom goals and ROI calculations) as well as download and outbound link tracking.

And it’s free.

Well if you have more than 5 million hits a month you need to get a Google AdWords account which (at least when I signed up) needed a $5 deposit. But if you get 5 million hits a month I’d hope you could afford it.

One quick detail I discovered (without really looking for it): I get most of my traffic from search engines; however, visitors who arrive via links from other sites view more pages per visit.

I’ll post more as and when I find something particularly interesting to post (for instance I can’t test the date features with just one day of data).

Google Analytics

Oliver Brown

Ooh, I’ve just been sent an invitation for Google Analytics. I can’t really say much about it yet since it takes 24 hours for data to appear.

No Bret Hart at Wrestlemania

Oliver Brown

Somewhat disappointingly, but not exactly surprisingly, Bret Hart was not at Wrestlemania 22. There was some vague announcement by Howard Finkel before the Hall of Fame inductees were presented about Bret being uncomfortable with the show.

Putting dating in context

Oliver Brown

It’s that day again, April 1st, and boy, the internet makes it more fun. And of course Google are at it again. Personally I don’t think they’ll ever beat Pigeon-Rank but they have to try.

The first thing I have to point out about Google is to be very careful around April 1st. They have launched a number of services on April 1st (including Gmail) that looked like jokes but turned out to be real. Nonetheless I’m fairly sure Google Romance is less than sincere…

And another thing, GameFAQs have decided it’s bad to cheat.

Patently not here

Oliver Brown

I was reading about DotGNU recently and their interesting idea for getting round software patents: outsource some calculation that requires patented software to a country that doesn’t recognise software patents and access it as a web service.

Well obviously that won’t work. If a government is serious about its patent law it will just make that sort of thing illegal (the same way Gary Glitter can be convicted of child sex offences here once he returns).

DotGNU and Freedom

Netscape for web developers

Oliver Brown

There is an article on MSDN about how to get round the ActiveX activation issues that will be introduced into IE shortly. That page mentioned something I didn’t know - the latest version of Netscape Browser (version 8) can use Internet Explorer’s rendering engine (Trident) instead of the Mozilla rendering engine, Gecko.

If you develop web sites these days you need to make sure you can support at least IE and Firefox, and preferably Safari. Testing Safari is often not possible if you primarily use Windows, but testing in IE and Firefox can now be done from the same browser - you can actually change rendering engine at any time with CTRL-SHIFT-E. It also supports all the cool developer features of Firefox (like the DOM Inspector, although if you are using the IE rendering engine you can’t just click an element to select it).