Internationalization and Localization of content.

William Overington

Monday 21 October 2002

Now that at least two countries, Finland and Germany, have DVB-MHP services running, I wonder if the issues of internationalization and localization of DVB-MHP content should come to the forefront amongst those of us interested in content authoring for use on the DVB-MHP platform in more than one country.

It seems to me that a good opportunity for a major step forward in relation to internationalization and localization of content for the DVB-MHP platform is the 2003 Eurovision Song Contest, which is due to be broadcast from Latvia next year.

Suppose that DVB-MHP services in Finland and Germany, and hopefully elsewhere by then, could carry some information sent from Latvia. The information could be sent from Latvia in an internationalized format and then localized for a Finnish-language audience in Finland and for a German-language audience in Germany.

Suppose that the same DVB-J application were running in Finnish DVB-MHP terminals and in German DVB-MHP terminals, with the Finnish terminals using a Finnish customization file and the German terminals using a German customization file. Internationalized content arriving from Latvia could then be translated into Finnish in Finland, into German in Germany, and into the local language in whichever other countries chose to take part in the experimental broadcast.

This need not necessarily be too hard to achieve if the internationalized information being sent from Latvia is sent using a specially devised "telegraphic code" similar to that used for many years in the old days on the railway systems of the world.

Certainly, lots of care would be needed with the choice of the phrases and how their parameters are organized, and in making the special encoding language able to carry forward knowledge of grammatical gender when a word such as "it" or "they" follows a previous sentence. Yet I feel that, with care, such an encoding can be put together.
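The point about carrying grammatical gender forward can be illustrated with a small sketch. It is written in Python for brevity rather than as a DVB-J application, and everything in it is an assumption made for illustration: the class name GermanLocalizer, the referent names, the two sentences and the German wording are not part of any existing encoding.

```python
# Hypothetical German customization data: each referent carries its
# grammatical gender, and each gender maps to the matching pronoun.
GERMAN_NOUNS = {
    "song": ("das Lied", "n"),
    "group": ("die Gruppe", "f"),
    "singer": ("der Sänger", "m"),
}
GERMAN_PRONOUNS = {"m": "Er", "f": "Sie", "n": "Es"}


class GermanLocalizer:
    """Remembers the gender of the last referent so a later "it" agrees."""

    def __init__(self):
        self.last_gender = None

    def mention(self, referent):
        """Localize a sentence naming a referent and remember its gender."""
        noun, gender = GERMAN_NOUNS[referent]
        self.last_gender = gender
        return f"Jetzt kommt {noun}."

    def it_has_points(self, points):
        """Localize "It has received P1 points." with the remembered gender."""
        if self.last_gender is None:
            raise ValueError("no earlier referent for the pronoun to agree with")
        pronoun = GERMAN_PRONOUNS[self.last_gender]
        return f"{pronoun} hat {points} Punkte erhalten."


localizer = GermanLocalizer()
print(localizer.mention("group"))
print(localizer.it_has_points(12))
```

The sender only says "it"; the receiving terminal decides which pronoun that becomes, because only the local customization data knows the gender of the noun in the local language.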

It would be a good learning experience as to what is and is not possible at the current state of the art in internationalization and localization of content on a real-time basis, responding to the way that the content and the voting go.

For example, such phrases as the following could be encoded.

The country now in the lead in the voting is P1.

There are P1 members in the group.

In each of the above phrases, P1 represents the parameter of the sentence. In the first example, P1 is a country (designated by an index number, the same as the country's international telephone dialling code) and in the second example P1 is an integer value.

The whole phrase would be designated by an integer.
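As a minimal sketch of how a terminal might turn a phrase number and its parameter into local text, consider the following. It is written in Python for brevity rather than as a DVB-J application. The phrase numbers 1 and 2, the table layout, the function name localize and the German and Finnish wordings are all assumptions made for illustration; only the two English phrases and the use of telephone country codes come from the text above.

```python
# Hypothetical phrase numbers; no numbering has been fixed.
LEADING_COUNTRY = 1  # "The country now in the lead in the voting is P1."
GROUP_SIZE = 2       # "There are P1 members in the group."

# Phrases whose P1 is a country index rather than a plain integer.
COUNTRY_PARAMETER = {LEADING_COUNTRY}

# Illustrative customization data, one table per language. Countries are
# keyed by international telephone country code.
CUSTOMIZATION = {
    "de": {
        "phrases": {
            LEADING_COUNTRY: "Das Land, das jetzt bei der Abstimmung führt, ist {p1}.",
            GROUP_SIZE: "Die Gruppe hat {p1} Mitglieder.",
        },
        "countries": {49: "Deutschland", 358: "Finnland", 371: "Lettland"},
    },
    "fi": {
        "phrases": {
            LEADING_COUNTRY: "Äänestystä johtaa nyt {p1}.",
            GROUP_SIZE: "Yhtyeessä on {p1} jäsentä.",
        },
        "countries": {49: "Saksa", 358: "Suomi", 371: "Latvia"},
    },
}


def localize(language, phrase, p1):
    """Render one encoded phrase in the terminal's own language."""
    table = CUSTOMIZATION[language]
    if phrase in COUNTRY_PARAMETER:
        # Replace the country index with the local name of the country.
        p1 = table["countries"][p1]
    return table["phrases"][phrase].format(p1=p1)


print(localize("de", LEADING_COUNTRY, 371))
print(localize("fi", LEADING_COUNTRY, 371))
print(localize("fi", GROUP_SIZE, 5))
```

The same pair of integers (1, 371) is sent to every country; each terminal supplies both the sentence and the name of the country from its own customization data.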

The presence of a phrase in a text stream could be designated using a key such as an otherwise highly unlikely sequence of Unicode characters.

I am already researching internationalization and localization using such a technique, for emails, web pages and DVB-MHP applications. I use a sequence of three Unicode characters as a key, namely a comet (U+2604), a combining circumflex accent (U+0302) and a combining enclosing keycap (U+20E3), so as to give a comet circumflex button when displayed using a standard all-Unicode font. This means that a software system can monitor a text stream, looking out for a comet circumflex button, without in any way disrupting or wrongly acting upon any other text that passes by it.
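Monitoring a text stream for the key might look something like the sketch below, written in Python for brevity rather than as a DVB-J application. The three characters of the key are as described above. What follows the key is purely an assumption for illustration: a decimal phrase number, zero or more comma-separated integer parameters, and a closing semicolon. The function name scan is likewise hypothetical.

```python
import re

# The key: COMET, COMBINING CIRCUMFLEX ACCENT, COMBINING ENCLOSING KEYCAP.
KEY = "\u2604\u0302\u20e3"

# Assumed wire format (hypothetical): key, phrase number, zero or more
# ",parameter" fields, then ";".
CODE = re.compile(re.escape(KEY) + r"(\d+)((?:,\d+)*);")


def scan(text):
    """Yield ("text", str) and ("code", phrase, params) items in stream order."""
    position = 0
    for match in CODE.finditer(text):
        if match.start() > position:
            # Ordinary text before the key passes through untouched.
            yield ("text", text[position:match.start()])
        phrase = int(match.group(1))
        params = [int(p) for p in match.group(2).split(",") if p]
        yield ("code", phrase, params)
        position = match.end()
    if position < len(text):
        yield ("text", text[position:])


stream = "Good evening. " + KEY + "1,371; More news follows."
for item in scan(stream):
    print(item)
```

Text that does not contain the key comes out of scan exactly as it went in, which is the property that lets the monitor sit on a stream without disturbing anything else.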

Internationalization and localization of information sent in real-time from the Eurovision Song Contest in this manner would, I feel, be a spectacular achievement for the DVB-MHP system and would go down in the history of broadcasting as an exciting first.

The comet_circumflex system.

