Using the U+FFFC character on the DVB-MHP platform.

William Overington

Copyright 2003 William Overington

Thursday 16 January 2003

In January 2003 I started a new thread, entitled "Using the U+FFFC character on the DVB-MHP platform.", in the discussion forum at http://forum.mhp.org.

I thought that it might be interesting to add a transcript of the posting into this sequence of documents.

The posting is the text that appeared, except that I have made the web addresses referenced into active links.

The transcript consists of the date and time recorded with the posting, followed by the posting itself.


2003/01/02 11:51


I write to put forward a suggestion for the way that the Unicode character U+FFFC OBJECT REPLACEMENT CHARACTER could be used on the DVB-MHP platform.

This is suggested as a programming convention which could be very useful.

My idea is a programming convention for pairs of Unicode plain text files. One file has a file name with the extension .uof (for Unicode object file). It accompanies another Unicode plain text file which has a file name extension such as .txt (or indeed any other choice except .uof or file name extensions used for non-text files). The .uof file has one file name on each line of text, in order: first the name of the other text file, then the names of the files which contain the objects to which the U+FFFC characters in that other text file provide the anchors.

For example, a file with a name such as story7.uof might have the following lines of text as its contents.

story7.txt
horse.png
dog.png
painting.png

The file story7.uof could thus be used with a file named story7.txt so as to indicate which objects were intended to be used for three uses of U+FFFC in the file story7.txt, in the order in which they are to be used. Thus the file story7.txt would be a Unicode plain text file rather than needing to be a mark-up format file.

With a generic DVB-MHP application which takes in a Unicode plain text file and produces a display, this .uof format file idea provides a convenient method to include pictures within the file. Such a generic application could be designed to be run with a file name as a single parameter.

If the file name extension of the file name stated as the single parameter is other than .uof then the named file could be used directly as the text file with no pictures being in it, whereas if the file name extension of the file name stated as the single parameter is .uof then the file name of the text file and the file names of the various graphics files could be found from the .uof file.
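As an illustration only, the convention could be sketched in Java along the following lines. This is a minimal sketch in standard Java, not a tested DVB-MHP application. The class name UofManifest and its method names are hypothetical, and several details are assumptions that the convention itself does not settle: blank lines are skipped, surrounding white space is trimmed, the .uof extension is matched without regard to case, and a U+FFFC character for which no file name remains is reported as null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class: all names here are illustrative, not part of any MHP API.
final class UofManifest {
    static final char OBJECT_REPLACEMENT = '\uFFFC';

    final String textFileName;          // first line of the .uof file
    final List<String> objectFileNames; // remaining lines, in anchor order

    UofManifest(String textFileName, List<String> objectFileNames) {
        this.textFileName = textFileName;
        this.objectFileNames = objectFileNames;
    }

    // True when the single parameter names a .uof file (case-insensitive by assumption).
    static boolean isUof(String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".uof");
    }

    // Builds a manifest from the lines of a .uof file.
    // Assumption: blank lines are ignored and each name is trimmed.
    static UofManifest parse(List<String> lines) {
        List<String> names = new ArrayList<String>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        if (names.isEmpty()) {
            throw new IllegalArgumentException("a .uof file must name a text file");
        }
        List<String> objects = new ArrayList<String>(names.subList(1, names.size()));
        return new UofManifest(names.get(0), objects);
    }

    // For each U+FFFC in the text, in order, returns the object file anchored there.
    // Assumption: null marks an anchor for which the .uof file lists no object.
    List<String> resolveAnchors(String text) {
        List<String> resolved = new ArrayList<String>();
        int next = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == OBJECT_REPLACEMENT) {
                resolved.add(next < objectFileNames.size() ? objectFileNames.get(next) : null);
                next++;
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        UofManifest manifest = parse(Arrays.asList(
                "story7.txt", "horse.png", "dog.png", "painting.png"));
        // Stand-in for the contents of story7.txt, with three anchors.
        String story = "A \uFFFC and a \uFFFC stood beside a \uFFFC.";
        System.out.println(manifest.textFileName);
        System.out.println(manifest.resolveAnchors(story));
    }
}
```

A generic application would first call isUof on its single parameter. If the result is false it would display the named file directly as text. Otherwise it would read the lines of the .uof file, parse them, load the text file named on the first line, and fetch each picture as it meets the corresponding U+FFFC character.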

I have used .png graphics files for my example, but the format could be left open so that any suitable file could be used as the object that is anchored within the document. It would be a matter for the design of the program which uses the .uof file as to which types of graphic file that program would accept. Clearly .png would often be a main choice. However, other file formats such as .jpg could be accepted if desired as long as the program has the ability to deal with those formats. Specialist file formats could be included if desired. For example, I am thinking of having .eug for eutocode graphics file, so that a .eug file could be included on a line of a .uof file for use by a program which was written so as to be able to use the information in a .eug file as a picture.
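One way in which a program might state which object formats it accepts is sketched below. Again this is only a sketch under assumptions: the class name ObjectFormatPolicy and its methods are hypothetical, and comparing file name extensions without regard to case is my assumption rather than part of the convention.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical policy object: each program chooses its own accepted extensions.
final class ObjectFormatPolicy {
    private final Set<String> accepted = new HashSet<String>();

    // Extensions are given without the leading dot, for example "png".
    ObjectFormatPolicy(String... extensions) {
        for (String extension : extensions) {
            accepted.add(extension.toLowerCase(Locale.ROOT));
        }
    }

    // Returns the part after the last dot in lower case, or "" when there is no dot.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // True when this program is prepared to render the named object file.
    boolean accepts(String fileName) {
        return accepted.contains(extensionOf(fileName));
    }

    public static void main(String[] args) {
        // A program that handles .png and .jpg but has no .eug renderer.
        ObjectFormatPolicy policy = new ObjectFormatPolicy("png", "jpg");
        System.out.println(policy.accepts("horse.png"));
        System.out.println(policy.accepts("diagram.eug"));
    }
}
```

A program with a .eug renderer would simply add "eug" to its list. What a program does with an object it does not accept, such as leaving a gap or showing a placeholder, is left to the design of that program.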

There is no obligation that the first part of the file name of the .uof file and of the .txt file should be the same, yet that would typically be a useful thing to do.

I can imagine that such a practice, if widely used, might be helpful in bridging the gap between being able to use a plain text file and having to use some expensive wordprocessing package.

I am not saying that this suggestion fully solves all of the possible implications of rendering and so forth. I am simply suggesting that having such a convention would be a useful facility. Such a convention, because it uses a special file extension, would not intrude upon the right of anybody to devise their own convention for other ways of using the U+FFFC character in documents, though it would clearly be helpful if anyone devising a different format were pleased to avoid using the characters .uof as the file name extension for that different format.

----

The suggestion has been before the Unicode Technical Committee, with the idea of use in a more general computing format, but consideration was declined. However, that decision should not be taken as an indication that such a format is unsuitable for consideration by a user community. Indeed, I feel that the formal declining by the Unicode Technical Committee means that a user community may make such a decision within its own sphere of activity without any conflict arising with the Unicode Technical Committee over who makes which decisions.

The reference to the minutes of the meeting concerned is as follows.

http://www.unicode.org/consortium/utc-minutes/UTC-092-200208.html

The minutes include the following statement.

quote

[92-C4] Consensus: The UTC declines to consider a Unicode Object File (".uof") text format, because such work is out of scope for the Consortium.
[Document ref:
http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0420.html]

end quote

The document reference is to my posting. Access to the archive is available to all, but one needs to use the user name and password specified in the introduction to the archive. It is therefore perhaps best to start at http://www.unicode.org and seek out the mail list archive from there. Then, when one uses the direct web address of the document and is asked for a user name and password, those two items are already known from having read the introduction to the archive, where they are both stated.

----

William Overington

2 January 2003


 

Astrolabe Channel

This file is accessible as follows.

http://www.users.globalnet.co.uk/~ngo/ast03200.htm