String manipulation in the 1456 object code system.

William Overington

Copyright 2000 William Overington

1456 object code (in speech, please say "fourteen fifty-six object code") provides a number of operation codes for manipulating strings. Strings are Java strings, that is, sequences of the 16-bit Unicode characters of Java.

There are two registers for strings, an a register and a b register, namely as1456 and bs1456.

There is also an array of memory locations for strings, ms1456, which has 20 elements, from ms1456[0] to ms1456[19].

String manipulation was added to the 1456 object code system after the original documents were prepared, and so there is an addition to what was written before about the storage of strings expressed using the STRING01 through to STRING09 parameters of an applet. Those strings, if used, should be processed so that any characters expressed using the 'u plus four characters format are resolved into their proper Unicode characters, and the resulting strings are then stored in the 1456 engine in ms1456[1] through to ms1456[9]. For the avoidance of doubt, if any of the STRINGnn parameters is not used, then the corresponding ms1456 string is set to a null string, using "", at start up of the 1456 engine. This change need not concern a user of 1456 software itself, it being a matter for the writer of 1456 engine software.

1456 software writers are, however, given the additional facility that action codes 60 to 69 in the eutodraw table are used with ms1456[10] to ms1456[19]. Action code 50 is used for ms1456[0]. These action codes give rise to control codes in the ranges 6000 to 6999 and 5000 to 5099 respectively. Please note that only 9 of the ms1456 strings are settable directly from the applet call. The other 11 are settable only from 1456 software.
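For the writer of 1456 engine software, the start up handling of the STRINGnn parameters might be sketched in Java as follows. This is only a sketch under stated assumptions: the class name, the method name and the two function parameters are hypothetical and are not part of the 1456 system, and the sketch assumes that every element of ms1456, including ms1456[0] and ms1456[10] to ms1456[19], starts as a null string, which the text above states only for ms1456[1] to ms1456[9].

```java
import java.util.Arrays;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical sketch, not the actual 1456 engine code.
final class StringParameterSketch {
    /**
     * Builds the initial ms1456 array.
     * getParameter: looks up an applet parameter by name, returning null if it is not used.
     * resolveEscapes: turns 'u plus four characters sequences into Unicode characters.
     */
    static String[] initialStrings(Function<String, String> getParameter,
                                   UnaryOperator<String> resolveEscapes) {
        String[] ms1456 = new String[20];
        Arrays.fill(ms1456, "");            // assumption: every element starts as ""
        for (int n = 1; n <= 9; n++) {
            String name = String.format("STRING%02d", n);   // STRING01 .. STRING09
            String raw = getParameter.apply(name);
            if (raw != null) {
                ms1456[n] = resolveEscapes.apply(raw);      // stored in ms1456[1] .. ms1456[9]
            }
        }
        return ms1456;
    }
}
```

The parameter lookup is passed in as a function so that the sketch does not depend on any particular applet class.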

The bs1456 register is not reset automatically after a 1456 operation has been carried out. The bs1456 register is not set directly by incoming numeric characters in the way that bi1456 and bd1456 are set.

There is an instruction provided to directly introduce a string from the 1456 object code into the bs1456 register. This is the [ command.

The [ command is followed by the characters of the string as desired. Most 7-bit ASCII printing characters, including the space character, may be entered directly. The string is terminated by a ] character. The string thus defined is loaded into the bs1456 register. Matched pairs of [ characters and ] characters can be included within the string: the system keeps count of the nesting level, and the string is regarded as completely entered only when a ] character is input at a nesting level of zero.

If an accented character is needed in a string, then this can be achieved by using a six character sequence commencing with 'u as with the inputting of individual characters.

Thus

[alpha]

is some 1456 software that loads the five character string

alpha

into the bs1456 register and

[This is example[23].]

is some 1456 software that loads the twenty character string

This is example[23].

into the bs1456 register. Please note that space characters count as characters.
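The nesting rule for the [ command can be illustrated with a short Java sketch. This is not the actual 1456 engine code: the class and method names are hypothetical, and the sketch deals only with the bracket counting, leaving out the handling of 'u sequences.

```java
// Hypothetical sketch of reading a [ ... ] string with nested brackets.
final class BracketStringSketch {
    /**
     * Reads the string that starts at code.charAt(start), which must be '['.
     * Returns the text between the outer brackets, as it would be loaded into bs1456.
     */
    static String readLiteral(String code, int start) {
        if (code.charAt(start) != '[') {
            throw new IllegalArgumentException("expected [ at index " + start);
        }
        StringBuilder text = new StringBuilder();
        int depth = 0;                       // nesting level inside the outer brackets
        for (int i = start + 1; i < code.length(); i++) {
            char c = code.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']') {
                if (depth == 0) {
                    return text.toString();  // a ] at nesting level zero ends the string
                }
                depth--;
            }
            text.append(c);                  // nested brackets are part of the string
        }
        throw new IllegalArgumentException("string not terminated by ]");
    }

    public static void main(String[] args) {
        String bs1456 = readLiteral("[This is example[23].]", 0);
        // prints: This is example[23]. (20 characters)
        System.out.println(bs1456 + " (" + bs1456.length() + " characters)");
    }
}
```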

As an example of the use of an accented character,

[caf'u00e9]

will produce in the bs1456 register a string consisting of the word

café

with an acute accent on the letter e.

Should one ever wish to enter the two character sequence 'u into a string, one may use the two unicode characters 'u0027 and 'u0075 in succession.
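The resolving of 'u sequences into Unicode characters might be sketched in Java as follows. The class and method names are hypothetical. The sketch assumes that the four characters after 'u are hexadecimal digits, as in the café example, and it throws an exception for a malformed sequence, which is a choice made for the sketch, as the behaviour of a 1456 engine in that case is not stated here.

```java
// Hypothetical sketch of resolving 'u plus four characters sequences.
final class UnicodeEscapeSketch {
    static String resolve(String raw) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < raw.length()) {
            if (raw.startsWith("'u", i)) {
                if (i + 6 > raw.length()) {
                    throw new IllegalArgumentException("incomplete 'u sequence");
                }
                int value = 0;
                for (int k = i + 2; k < i + 6; k++) {
                    int digit = Character.digit(raw.charAt(k), 16);
                    if (digit < 0) {
                        throw new IllegalArgumentException("bad digit in 'u sequence");
                    }
                    value = value * 16 + digit;   // build up the 16 bit character code
                }
                out.append((char) value);
                i += 6;                           // the whole six character sequence is used up
            } else {
                out.append(raw.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve("caf'u00e9").length());   // prints 4
        System.out.println(resolve("'u0027'u0075"));          // prints 'u
    }
}
```

Because each resolved character is appended to the output and never read again, the two characters produced by 'u0027 and 'u0075 are not themselves treated as the start of a new sequence.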

The string memory is accessed as follows.

Storage into memory is performed by giving the index number of the memory location followed by the store command, %>. For example 7%> will store whatever is in as1456 into ms1456[7].

Retrieval from memory is performed by giving the index number of the memory location followed by the get command, %<. For example 7%< will make a copy of whatever is in ms1456[7] and place it in the bs1456 register.

Please note that storage is from the a register and retrieval is to the b register.

Once a retrieved string is in the bs1456 register it can be loaded into the as1456 register or compared with whatever is already in the as1456 register.

In particular, please note that to place a string retrieved from memory in the as1456 register, the %w command must be used after the get command.

For example, 7%<%w needs to be used to get a string from string memory 7 and place the result in the as1456 register.
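The asymmetry between storing and retrieving can be seen in a small Java model of the registers and the string memory. The class and method names are hypothetical, and the model assumes that the memory elements start as null strings.

```java
import java.util.Arrays;

// Hypothetical model of the string registers and string memory.
final class StringMemorySketch {
    String as1456 = "";
    String bs1456 = "";
    final String[] ms1456 = new String[20];

    StringMemorySketch() {
        Arrays.fill(ms1456, "");   // assumption: memory starts as null strings
    }

    void store(int index) {        // n%>  stores from the a register
        ms1456[index] = as1456;
    }

    void get(int index) {          // n%<  retrieves to the b register
        bs1456 = ms1456[index];
    }

    void w() {                     // %w   loads the a register from the b register
        as1456 = bs1456;
    }

    public static void main(String[] args) {
        StringMemorySketch engine = new StringMemorySketch();
        engine.as1456 = "alpha";
        engine.store(7);           // 7%>
        engine.as1456 = "other";
        engine.get(7);             // 7%<  only bs1456 changes
        System.out.println(engine.as1456 + " " + engine.bs1456);   // prints: other alpha
        engine.w();                // %w
        System.out.println(engine.as1456);                          // prints: alpha
    }
}
```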

In order to place a character version of the string that is in as1456 into mc1456, the 'S command is used. The string is broken down into individual characters and these are placed in mc1456 starting at mc1456[0] and proceeding until all of the characters in the string have been copied out. A Unicode character 0000 is placed in the next element of mc1456 after that occupied by the last character from the string. It is possible to convert a null string.

In order to place a string version of the characters that are in mc1456 into as1456, the %C command is used. The string is formed from the characters in mc1456, starting at mc1456[0] and proceeding up to, but not including, the first Unicode character 0000 that is reached. It is possible to load a null string.
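The two conversions between as1456 and mc1456 might be sketched in Java as follows. The class and method names are hypothetical, and the size of mc1456 used here (256 elements) is an assumption made only so that the sketch is complete, as the size is not stated in this document.

```java
// Hypothetical sketch of the conversions between a string and mc1456.
final class CharMemorySketch {
    static final int MC_SIZE = 256;      // assumed size, not taken from the 1456 documents
    static final char TERMINATOR = 0;    // Unicode character 0000
    final char[] mc1456 = new char[MC_SIZE];

    /** String to characters: copies the string out and adds the 0000 terminator. */
    void stringToChars(String as1456) {
        if (as1456.length() >= MC_SIZE) {
            throw new IllegalArgumentException("string too long for mc1456");
        }
        for (int i = 0; i < as1456.length(); i++) {
            mc1456[i] = as1456.charAt(i);
        }
        mc1456[as1456.length()] = TERMINATOR;
    }

    /** Characters to string: reads up to, but not including, the first 0000. */
    String charsToString() {
        int n = 0;
        while (n < MC_SIZE && mc1456[n] != TERMINATOR) {
            n++;
        }
        return new String(mc1456, 0, n);
    }
}
```

With a null string, stringToChars writes only the terminator at mc1456[0], and charsToString then returns a null string.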

The linkflag1456 is set as either true or false by two string comparison instructions.

Comparison is always between the as1456 register and the bs1456 register. The values of the as1456 and bs1456 registers are not affected.

The test of whether as1456 equals bs1456 is coded by the %E command.

The test of whether as1456 is not equal to bs1456 is coded by %F, where F is the next letter after E.

Thus by setting linkflag1456 by means of a compare instruction and then using !J or !C or !R as appropriate, the order in which program instructions are obeyed can be controlled.
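The two comparison instructions might be modelled in Java as follows. The class and method names are hypothetical. The sketch assumes that equality means exact, case sensitive equality as given by the equals method of Java strings, and that linkflag1456 is set to true when the condition being tested holds.

```java
// Hypothetical sketch of the two string comparison instructions.
final class CompareSketch {
    String as1456 = "";
    String bs1456 = "";
    boolean linkflag1456;

    void ifEqual() {               // %E
        linkflag1456 = as1456.equals(bs1456);
    }

    void ifNotEqual() {            // %F
        linkflag1456 = !as1456.equals(bs1456);
    }

    public static void main(String[] args) {
        CompareSketch engine = new CompareSketch();
        engine.as1456 = "alpha";
        engine.bs1456 = "alpha";
        engine.ifEqual();
        System.out.println(engine.linkflag1456);   // prints true
        engine.ifNotEqual();
        System.out.println(engine.linkflag1456);   // prints false
    }
}
```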

The strings in as1456 and bs1456 can be concatenated, placing the result in the as1456 register, by means of the %+ operation. The contents of the bs1456 register are not affected. For example, the sequence

[alpha]%w[bet]%+

will cause the as1456 register to contain the word alphabet. The bs1456 register will contain bet as it is not altered by the %+ operation.
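The alphabet example can be traced step by step in a small Java model. The class and method names are hypothetical; each method stands for one operation of the 1456 object code, as described in this document.

```java
// Hypothetical trace of the sequence [alpha]%w[bet]%+
final class ConcatSketch {
    String as1456 = "";
    String bs1456 = "";

    void literal(String text) {    // [text]  loads the b register
        bs1456 = text;
    }

    void w() {                     // %w      loads the a register from the b register
        as1456 = bs1456;
    }

    void plus() {                  // %+      concatenates, result in the a register
        as1456 = as1456 + bs1456;
    }

    public static void main(String[] args) {
        ConcatSketch engine = new ConcatSketch();
        engine.literal("alpha");
        engine.w();
        engine.literal("bet");
        engine.plus();
        System.out.println(engine.as1456);   // prints alphabet
        System.out.println(engine.bs1456);   // prints bet
    }
}
```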

The 'a command allows one character to be copied from the string in as1456 and placed in the ac1456 register. The character chosen is identified by the contents of the bi1456 register. Please note that the first character of the string is extracted by using a value of 0 in bi1456, and the second character of the string is extracted by using a value of 1 in bi1456, the third character of the string is extracted by using a value of 2 in bi1456, and so on. For example, if the five character string consisting of the word

okapi

is in as1456, then 3'a will place a letter p into the ac1456 register. This may perhaps seem unusual to a 1456 programmer not used to either Java or JavaScript, but faced with the choice of continuing the practice used in the underlying charAt function of Java or of having 1456 object code differ from Java in this respect, I decided to keep compatibility with the underlying Java commands.

The &s command places into ai1456 the length of the string that is in the as1456 register. For the avoidance of doubt, doubt perhaps being possible because the 'a command counts characters from 0, it is stated that the length is the ordinary count of characters, so that if the string consisting of the eight letters

elephant

is in the as1456 register then the command &s will place the value 8 into ai1456.
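The 'a and &s commands correspond to the charAt and length methods of Java strings, as the following sketch shows. The class and method names are hypothetical. What a 1456 engine does when bi1456 is outside the string is not stated here; in plain Java, charAt throws a StringIndexOutOfBoundsException in that case.

```java
// Hypothetical sketch of character extraction and string length.
final class CharAtLengthSketch {
    /** Models 'a: the character of as1456 selected by bi1456, counting from 0. */
    static char extract(String as1456, int bi1456) {
        return as1456.charAt(bi1456);
    }

    /** Models &s: the number of characters in as1456. */
    static int length(String as1456) {
        return as1456.length();
    }

    public static void main(String[] args) {
        System.out.println(extract("okapi", 3));   // prints p
        System.out.println(length("elephant"));    // prints 8
    }
}
```

The largest value of bi1456 that selects a character is therefore one less than the value given by &s.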
