I’ve been vaguely using Google Docs (specifically Spreadsheets) since it came out, but never to do anything actually important. Most of the time I just had a list that needed sorting, or if I was feeling sophisticated I’d use it to decide what was best value for money (how many £/GB a range of hard drives cost, for instance). Recently I started using it to plan lessons for the language learning app. The ability to use it from work (or any other computer I might be on - including viewing it on my Nokia 770) was useful, but in the end I was only really writing a list with it. Until now. I now have a nifty little C# app that generates modules directly from a Google Spreadsheet, which is definitely a Good Thing. I’ve been thinking of writing an app for module editing for a while, since writing them by hand is tiresome and error prone. Google Spreadsheets does half the work for me by providing the user interface for building a table, and then provides access to it as simple XML. Which brings me to the matter of actually accessing the data. Google provide a client library in C# for accessing quite a lot of their API. I tried using it but found it a little confusing. Luckily, since I only wanted to query data, I discovered that raw access was actually easier. You simply make a GET request to http://spreadsheets.google.com/feeds/worksheets/_key_/public/values (where _key_ is provided to you when you “publish” a spreadsheet - access to unpublished spreadsheets requires authorization, which is more complicated). This gives you an Atom feed of URLs to the individual worksheets, which then contain Atom feeds of either rows or columns (your choice). The query power of LINQ (along with XElement, XAttribute etc.) makes transforming the feeds into modules really easy. In fact the code that does the hard work (takes a spreadsheet key and generates the XML) is only 102 lines long, and that’s including unnecessary spacing to make the LINQ more readable (the main LINQ query is 35 lines).
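The actual code is C# with LINQ to XML and isn't reproduced here. As a rough illustration of the same flow, this is a minimal Python sketch that builds the worksheets feed URL from a key and pulls a title and link out of each Atom entry. The sample feed is hand-written, the function names are hypothetical, and treating the first link element of each entry as the worksheet URL is an assumption about the feed's layout rather than something confirmed above.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom feeds (RFC 4287).
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def worksheets_feed_url(key):
    """Build the public worksheets feed URL for a published spreadsheet key."""
    return "http://spreadsheets.google.com/feeds/worksheets/%s/public/values" % key


def worksheet_links(feed_xml):
    """Return (title, href) for every entry in an Atom feed.

    Assumption: the first <link> of each entry points at the worksheet's
    own feed. The real feed may use a more specific rel value.
    """
    root = ET.fromstring(feed_xml)
    links = []
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        link = entry.find("atom:link", ATOM)
        href = link.get("href") if link is not None else None
        links.append((title, href))
    return links


# Hand-written stand-in for a downloaded feed (not real Google output).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Lessons</title>
  <entry>
    <title>Greetings</title>
    <link rel="alternate" href="http://example.invalid/worksheet/1"/>
  </entry>
  <entry>
    <title>Numbers</title>
    <link rel="alternate" href="http://example.invalid/worksheet/2"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    for title, href in worksheet_links(SAMPLE_FEED):
        print(title, href)
```

In a real run the feed text would come from the GET request rather than a string literal; the parsing step stays the same.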
Thought I’d offer a quick status update regarding the language learning app. After a short break I’m back at it. Apart from enough Finnish content to generate ten 15-minute lessons, the biggest progress is outputting MP3 files. My original plan was just to output M3U playlists, but it seems iTunes and therefore iPods don’t support M3U files (as far as I can tell iTunes can only create playlists of files in its library - who wants hundreds of files in their library consisting of a few words each?). The sample MP3s should be available “soon”…
The secretly named language learning app has been revamped to use LINQ for most of the XML handling. For those that don’t know, LINQ is a new technology that provides querying functionality in the .NET world. In my case I’m using LINQ to XML and it has seriously cut down on the size of the heaviest methods. Also, the part of LINQ to XML that I found least interesting when I read about it is actually the part I’ve found the best - the new XDocument API. Anyway, LINQ combined with a new USB headset that provides some actually quite good audio means that the important fundamental features have been implemented and work. At the moment it can:
- Generate lessons based on vocabulary[1] modules.
- Generate lessons containing past content with the correct repetition timing.
- Actually play the lessons (but only on Windows[2]).
There are a few more things I want to add before I release any of it (like more audio for a start). But I thought I’d at least point out development is still happening :o)
[1] Instead of the Conversation > Phrase > Term style of Pimsleur I’ve decided to go for a more freeform approach to start with (inspired by me listening to Michel Thomas again). A vocabulary module just contains a list of words and phrases that are processed in order.
[2] I still need a cross-platform way to play audio. At the moment I use MCI, which is part of winmm.dll and is obviously Windows only. Wine has apparently implemented it almost completely, but I’m not sure how I’d go about making that help me.
Warning, this post is long and rambling. You have been warned! :P
Part of the design philosophy of my language learning app is to reuse as much as possible. This brings up an interesting issue regarding regional variations of languages (I’m talking mainly about somewhat standardised variations) and how much should be shared between them. For example in Belgium, French is an official language. This is almost the same as French as spoken in France but with a few important differences. Firstly there are minor vocabulary variations (Belgian French has specific words for 70 and 90 for instance). There is also a lot of Flemish and Walloon vocabulary used in addition to the French vocabulary. Finally there are pronunciation differences, but these seem no greater than differences in accent. So, a course on Belgian French should be almost identical to a course on Standard French. The question is how to notate that in the script files the language app uses.
There are basically three ways I’ve come up with to cope with the situation, and I think I’ll support all of them since they have different advantages in different situations.
- The first is to allow inline region-specific phrases. So for the numbers in Belgian French, the standard French files would be used but any Belgian French sections would take priority.
- The second is to have whole region-specific files. Extra Belgian phrases not appearing in standard French would be in these and be loaded in addition to the standard French files. This is really an extension of the first.
- The final case is no link at all. This would be needed for Chinese. The language code for Mandarin is “zh-guoyu” and the code for Cantonese is “zh-yue”. In this case however there is no spoken language with the code “zh” and therefore nothing to inherit from. This is a specific case of the first two where no parent language exists.
So far this has just been considering audio.
The app already supports text and will eventually support text-only lessons of some sort. The first method above could be used for spelling variations (when learning English, “color” and “colour” could use the same audio while appearing differently on the screen). As a more dramatic example, Serbian could be taught using either the Cyrillic alphabet or the Latin alphabet, with the codes “sr-cyrl” and “sr-latn” respectively. Or perhaps even both… The final point I want to make regards the actual audio files themselves. Although it is true that most of spoken French is almost the same in Belgium and France, the accents are different and generally identifiable to French speakers. Therefore region-specific audio is desirable where possible. Since the script files and the audio are kept separate, this is possible with the language app. If the Belgian French audio exists, that will be used; if not, the standard French is used. That means that if a standard French course is created, an adequate Belgian French course can then be created with little effort, but with the possibility of improving it later.
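The fallback rule described above (use the regional audio if it exists, otherwise the parent language's) can be sketched in a few lines. This is a minimal Python illustration rather than the app's own C# code; the function names, the phrase identifiers and the set-of-pairs representation of available audio are all hypothetical.

```python
def fallback_chain(code):
    """List a language code followed by its parents, most specific first.

    For example "fr-be" gives ["fr-be", "fr"].
    """
    parts = code.lower().split("-")
    return ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]


def resolve_audio(phrase_id, code, available):
    """Return the language code whose audio should be used, or None.

    `available` is a set of (language_code, phrase_id) pairs standing in
    for whatever audio files actually exist on disk (an assumption).
    """
    for candidate in fallback_chain(code):
        if (candidate, phrase_id) in available:
            return candidate
    return None


# Hypothetical recordings: Belgian French only overrides "seventy".
AVAILABLE = {
    ("fr", "hello"),
    ("fr", "seventy"),
    ("fr-be", "seventy"),
    ("zh-yue", "hello"),
}

if __name__ == "__main__":
    print(resolve_audio("seventy", "fr-be", AVAILABLE))
    print(resolve_audio("hello", "fr-be", AVAILABLE))
    print(resolve_audio("hello", "zh-guoyu", AVAILABLE))
```

The Chinese case needs no special handling: the chain still includes “zh”, but since nothing is recorded under that code there is simply nothing to inherit.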
Well, I’ve abandoned my plans to use Gtk# in the language app (which actually secretly has a name now). The main reason for changing is simplicity. I had a look at the TreeView control in Gtk and decided it was too much work. Although good MVC separation is sound in theory, the user interface is such a small, simple part of my app that it wasn’t worth it. The stuff I need from System.Windows.Forms should work in Mono (and .NET 1.1 and hopefully even the Compact Framework). I still prefer the way Gtk handles layout of controls in general, but I console myself with the Windows form designer in Visual C# Express…
The language learning app which I went on and on about a while ago is now under development again. When I say again, I mean I started again in a completely different way (at least from a technical implementation point of view - the user experience is intended to be the same). You see, I recently started a large project in C# at work (a desktop app by the way, not ASP.NET) after saying I was somewhat familiar with it and it should be easy to learn. The good news is that after two days I realise it actually is really easy to learn, providing you let it do the work for you. (To any programmers intending to learn it: you’ll spend most of your time at first not actually writing code but finding where in the huge class library the functionality already exists. Once you get used to it and get the hang of how it works it is surprisingly relaxing.) Despite all that I decided I still needed some practice in it, so I came up with the idea of doing the language learning app as a fully fledged desktop application - although at work I’m using System.Windows.Forms, here I’m using Gtk# so it can hopefully run on Mono (and therefore Linux, Mac OS etc). The biggest problem I have is actually playing the audio. A quick search for “C# MP3” comes up with a solution based on MCI, some clever thing embedded in a Windows DLL that obviously won’t be cross platform. My workaround at the moment is just to use an external program via the command line, with its window suppressed. If anybody knows of a better way that would work on .NET and Mono, let me know…
Wikipedia is great. You can find out almost anything. The only criticisms of Wikipedia are tenuous at best and tend to be either: a) It’s unreliable (you shouldn’t use a single source anyway - that’s why Wikipedia articles are supposed to cite references) or b) It’s somehow elitist or a “members only club” - a view often held by banned users. One of its oldest sister projects, Wiktionary, on the other hand is not so good. I think it’s a marvelous idea that should be done and should definitely continue, but at the moment it is frankly a mess. In case you don’t know what it is, it’s an attempt to create a free multilingual dictionary in every language. That is not a tautology - I’m emphasising the fact that it aims to translate from every language to every other language. That is, the English version will contain every single word in every language with definitions and details in English. The German version will do the same but with definitions and details in German. And so on for every other language. Of course for some languages there will never be enough editors (English probably has the most and that’s nowhere near complete). Ambitious. Possibly too ambitious. The number of editors doesn’t seem to be as high as Wikipedia’s and editing is far less fun - there is far more grunt work to do with laying out tables, sorting out headings and getting links pointing to the right places. There are quite a few bots which can automate some of it, but it’s still a large and largely dull undertaking. Why am I telling you this? I don’t know. Maybe just to encourage a couple more editors to jump on board :)
The following search box will return results emphasising language learning resources.
Google just launched a clever new custom search thing. The idea is for people to create their own “custom” search engines that automatically give weight to certain sites, restrict others and silently append search terms, thereby improving accuracy for niche topics. For example, imagine an ornithologist searching for “a pair of great tits”… So, I’ve created an engine for finding language learning resources. And it’s surprisingly good.