Web Programming

FreeNAS

Oliver Brown

In an effort to get more storage to share between the three computers at home (two Windows and one MythTV) I set up yet another machine running FreeNAS.

FreeNAS is a small (about 30MB) operating system based on FreeBSD designed just to be a NAS (Network Attached Storage). You add hard drives to it and it makes them (optionally) available in several different ways, including:

  • CIFS/Samba
  • NFS
  • rsync
  • HTTP
  • FTP

After a few minor problems setting it up (like a power cable breaking and installing from an old CD-ROM drive that didn’t work) it works great. Copying a large (~40GB) chunk of files to it at once took a while but writing to and reading from it at more sensible levels isn’t noticeably slower than using local files (on a gigabit network).

ClearType and IE 7

Oliver Brown

Internet Explorer 7, which will be pushed to users in an almost forced manner shortly, will use ClearType by default. ClearType essentially uses a clever technique called sub-pixel rendering to provide clearer text at low resolutions. Like anti-aliasing, another technology to achieve similar results, it can make things look blurry but on the whole it looks okay.

One of the side effects is that vertical lines get a coloured “halo” which can be a little off-putting. On the whole you get used to it, but after leaving it on for a while I noticed something: the problem is essentially a side effect of using “high readability” fonts like Verdana. Verdana was designed to be readable at small sizes and as such has a lot of straight lines (since curves either look dodgy or need anti-aliasing at small sizes). These straight lines (specifically vertical ones) are the ones that have the most pronounced halos. Use fancy curvy fonts (even if it’s just a little) and the problem is greatly diminished.

QED Wiki and the Zend Framework

Oliver Brown

IBM are working on an impressive looking product called QED Wiki, developed with the Zend Framework.

Fundamentally it’s a wiki like any other. But there is a cool layer on top of it that could be revolutionary (although like many Web 2.0 concepts will probably fall short and just be “cool” - we can hope). The interface allows you to create “situational applications” that can link different components together with the ease of a wiki. It doesn’t really make much sense just reading about it so go watch the video about it.

On a related note, you can now get snapshots of PHP 6.

How much fluff is needed?

Oliver Brown

I’ve been sorting out exactly what needs recording for the language app (which I finally have an idea for a name for) and I was trying to decide how much extra instructor speech is needed. Situations aren’t described for instance (no “Imagine an Englishman sitting next to a French woman”) and you aren’t asked to say things explicitly (“How do you ask someone if they speak English?”). Will this harm the process at all?

The best thing to do perhaps would be to avoid trying to be Pimsleur quite so exactly.

Multilingual pretty URLs

Oliver Brown

There is more and more emphasis on pretty URLs these days. With things like Ruby on Rails around to easily support them, and better knowledge and use of things like mod_rewrite, the days of horrible query strings are going away (excluding of course the most used websites - search engines). But how do you make your multilingual website have pretty URLs?

My language learning app uses the Zend Framework and so uses pretty URLs by default. I need the interface available in many languages, but then the URLs should be pretty in a localized way.

For example, starting a new Finnish lesson uses the following:

/lesson/new/fi

That would be the new action of the lesson controller with an extra language code parameter of fi.

In German this should be something like:

/lektion/neu/fi

By default this would access the neu action of the lektion controller.

The “simple” solution would be to write lots of controllers that just delegate to the real one. Which is silly. Instead an extra layer has to be added to the routing process: some sort of look-up table mapping localized URL fragments to “real” canonical ones. This should be fairly simple with Zend Framework (although I haven’t actually tried yet).
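
To make the look-up table idea concrete, here is a minimal sketch in Python rather than PHP, and not tied to the Zend Framework router at all. The table names, the canonicalize function and the way the interface language is passed in are all assumptions made up for illustration; only the two example URLs come from above.

```python
# Hypothetical look-up tables: interface language -> {localized fragment: canonical name}.
CONTROLLERS = {"de": {"lektion": "lesson"}}
ACTIONS = {"de": {"neu": "new"}}


def canonicalize(path, ui_lang):
    """Rewrite a localized pretty URL to its canonical controller/action form.

    Fragments with no entry in the tables are passed through unchanged,
    so canonical URLs keep working as they are.
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) >= 1:
        parts[0] = CONTROLLERS.get(ui_lang, {}).get(parts[0], parts[0])
    if len(parts) >= 2:
        parts[1] = ACTIONS.get(ui_lang, {}).get(parts[1], parts[1])
    # Any remaining parts (such as the lesson language code) are left alone.
    return "/" + "/".join(parts)


print(canonicalize("/lektion/neu/fi", "de"))
```

The rewritten path can then go to the normal router, so the real lesson controller never needs to know a localized URL was used.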

Just an important issue no-one seems to have brought up yet…

Amazon Mechanical Turk

Oliver Brown

It’s so crazy it just might work.

I heard about AMT a while ago and thought it looked cool. But not much was happening with it.

Well now it’s beginning to take off more and it might be usable in my language app.

It’s essentially a work marketplace wrapped in a web service API. Your application creates a job request (called a Human Intelligence Task) which someone then completes with the result being sent back to your application. So far it’s commonly used for processing lots of small tasks (for example there’s one about verifying info about some restaurants that only pays $0.03 but there are over two thousand individual tasks available), but it can be used for anything.

The relevance is that it might be possible to get people to record audio for the language app through it. Amazon Mechanical Turk.

Using XHTML, XSLT and XForms for Xemplorary performance

Oliver Brown

Alliteration and bad pun. Good start :)

One of the features the language app will need is some sort of module editor. Although the XML format of the scripts is straightforward to anyone used to hand editing HTML, a lot of other people will not have a clue. Therefore a WYSIWYG editor would be a cool addition. And lots of X’s may be the way to go.

Although XForms support in browsers isn’t exactly stellar, the fact that only script editors will require it means that needing a plug-in or extension isn’t such a big thing. And I get brownie points for being Web 2.0 as well.

I’m going to assume you know what XForms and XSLT are. If you don’t, then go find out. I’ll probably explain in a future post, but for now just accept them as “cool” :P

Basically a module is included directly into the XHTML source of the page. The only change is the addition of a namespace declaration (which is normally absent from the modules). XSLT is then used to add some nice formatting to the conversation along with XForms controls for editing (including adding/removing elements). This makes the server side code really easy since the whole XML of the module gets posted back to the server.

In theory the XSLT shouldn’t be needed since XForms can do repeating and stuff. The only problem is I don’t think it can handle recursion which is a bit of a limitation.

There is one bit of the XSL that I’m stuck on. I have the XML fragment in the head of the XHTML document. I need to be able to transform a copy of it and place it in the document body, but keep the original intact in the head. Does anyone have an XSL snippet to do that?

Almost ready for a public viewing

Oliver Brown

The still unnamed language learning app is almost ready for a first public viewing. I’m just trying to get some audio of someone other than myself. Firstly because I don’t really like hearing my own voice (and for this purpose my less than perfect pronunciation is too obvious) and secondly I need at least two people just for it not to be confusing.

In the meantime I thought I’d share an example of the script file I’m using: EntschuldigenSie.xml. It primarily contains English translations although one phrase is done in a few more languages. It does highlight one possible issue. I had to change the German ß to ss. Although Windows seems perfectly fine with Unicode file names (internally it stores them as Unicode, either UCS-2 or UTF-16, I’m not sure which), PHP refuses to open them (fopen, file and file_exists for instance just don’t work) and Apache 2 seems to have issues as well. For German there are workarounds but for other languages it will get fiddly. This might not even be a problem on Linux where it will ultimately reside, and it only affects file names, which only have to give you a rough idea of what’s inside. But still, it’s annoying…
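
As a rough illustration of the kind of workaround I mean, here is a sketch in Python (not the PHP the app is written in). The ascii_filename function and the GERMAN table are made up for this example. It handles German with the usual replacements and falls back to stripping accents for other Latin-based languages; scripts such as Cyrillic or Japanese would simply lose their characters, which is exactly the fiddly part.

```python
import unicodedata

# Hypothetical table of the conventional German ASCII replacements.
GERMAN = {"ß": "ss", "ä": "ae", "ö": "oe", "ü": "ue",
          "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}


def ascii_filename(name):
    """Return an ASCII-only approximation of a file name."""
    # Apply the German replacements first so that ü becomes "ue", not "u".
    replaced = "".join(GERMAN.get(ch, ch) for ch in name)
    # Split remaining accented letters into base letter + combining mark,
    # then drop everything that is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", replaced)
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(ascii_filename("Straße.xml"))
```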

Best bits of the language app are done

Oliver Brown

The most important bits of my cool language learning web app are done. Here’s a quick overview of how it works.

Everything is split into modules which are XML script files and accompanying audio files. Currently one type of script is supported, a “conversation”. This contains a short (less than 10 sentences) conversation with sub elements all marked up in XML. Sub elements are phrases, terms and notes. At the moment phrases and terms are handled almost identically. Notes are little explanations or possible stumbling points (for example the test script I have alerts the listener to the difference in the ending between “Ich verstehe” and “Sie verstehen” in German). Any element of a conversation that is to be repeated is named (literally - the XML tag is given a name attribute). The system keeps track of the number of times a named phrase/term is played to the user and when it was last played so the automatic repetition system can work.
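
To give a feel for the bookkeeping, here is a small sketch in Python (the app itself is PHP). The element names conversation, phrase, term and note, the sample content, and the PlayLog class are all invented for illustration and are not the real script format; the only detail taken from the real format is that repeatable elements carry a name attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical module script; the real element names may differ.
SAMPLE = """
<conversation>
  <phrase name="excuse-me">Entschuldigen Sie</phrase>
  <phrase>Guten Tag</phrase>
  <term name="understand">verstehen</term>
  <note>Compare "Ich verstehe" with "Sie verstehen".</note>
</conversation>
"""


def named_items(xml_text):
    """Return the names of all phrases/terms that carry a name attribute."""
    root = ET.fromstring(xml_text.strip())
    return [el.get("name") for el in root.iter()
            if el.tag in ("phrase", "term") and el.get("name")]


class PlayLog:
    """Tracks how often, and when last, each named item was played."""

    def __init__(self):
        self.stats = {}  # name -> (play count, time last played)

    def record(self, name, when):
        count, _ = self.stats.get(name, (0, None))
        self.stats[name] = (count + 1, when)


log = PlayLog()
for item in named_items(SAMPLE):
    log.record(item, when=0)
print(log.stats)
```

Unnamed phrases and notes are skipped, so only the items meant for repetition end up in the log.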

A lesson is currently very simple. A module is loaded and the conversation is played straight through. Then the named phrases/terms are played* with translations. Then any phrases/terms scheduled for repetition are played*. The repetitions are actually determined before the conversation is played however so that if too many are required then no new conversation is played.

* Played in this case means a specific format. First the native version is played, then a pause, then the translation is played twice.
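
The lesson flow can be sketched roughly like this, again in Python rather than PHP. The function names, the step labels and the max_repeats threshold are assumptions for illustration; I haven’t said above what “too many” repetitions actually is.

```python
def play(name):
    """Expand one item into the 'played' format: native, pause, translation twice."""
    return [("native", name), ("pause", name),
            ("translation", name), ("translation", name)]


def build_lesson(named, due, max_repeats=5):
    """Return the ordered list of steps for one lesson.

    named: named phrases/terms in the new module's conversation.
    due:   phrases/terms scheduled for repetition, decided up front.
    max_repeats: hypothetical limit above which no new conversation is played.
    """
    steps = []
    if len(due) <= max_repeats:
        # New material: the conversation straight through, then its named items.
        steps.append(("conversation", None))
        for name in named:
            steps.extend(play(name))
    # Scheduled repetitions are played either way.
    for name in due:
        steps.extend(play(name))
    return steps


for step in build_lesson(["excuse-me"], ["understand"]):
    print(step)
```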