Internal streamlining of the software of the 1456 Engine


William Overington

The design of the 1456 object code system has proceeded gradually, with developments being added into the webspace from time to time. As part of this process a 1456 engine has been implemented and gradually updated. This implementation uses a program written in Java. The main goal has been to get the 1456 object code system implemented as an effective means of producing learning material for distance education that is free to the end user. However, the software in the implementation is somewhat inefficient, as is explained below, and an attempt is now to be made to improve its efficiency so that 1456 object code may run faster. This story has no direct connection to ordinary use of 1456 object code for programming and so may be left aside by people concerned only with programming using 1456 object code. It is mentioned here, however, that three new 1456 instructions, |b, |e and &|, are introduced in this document and that these three instructions, which may be used to produce a stopwatch, may also have application in other programs.

In implementing the 1456 engine, 1456 software is interpreted by taking a character, c, and using it in if statements of the form that follows, where X is some character whose possible occurrence is being checked, such as r or & or whatever.

if (c == 'X')
  {
    //some java software in here.
  }

In experimentally implementing the 1456 engine I used multiple sequential if statements, as "if ... if ... if ...", whereas I could have used them in an if and else if format, as "if ... else if ... else if ...", which would be more efficient because, in this sequence of detecting a particular character, all of the if statements are mutually exclusive. However, I quite deliberately did not use an else if construct, as I was unsure as to what level of nesting the system would tolerate and I felt it more important to get the engine implemented. With two character 1456 instructions, the first character is detected in one sequence of if statements and then the second character is detected in another sequence of if statements within an inner block. So, for example, adding all of the quaternion instructions starting with Q and u only slowed the engine down, for programs containing no quaternion instructions, by two if checks, one for Q and one for u, rather than by a check for each of the many quaternion instructions. Nevertheless, each time any instruction is obeyed, a check for Q and a check for u are made even if some other instruction has already been detected and obeyed. The engine has grown greatly and there are now a lot more instructions than I had first imagined. It is time to save a copy of the present engine, just in case the attempt at streamlining does not work, and to experiment to improve the speed of the engine.
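The difference between the two styles can be seen by counting the character tests that each performs. The following is a minimal sketch, not the code of the 1456 engine itself: the class name IfChainDemo, the helper is() and the counter are assumptions made for illustration, only four first characters are shown, and the instruction bodies are left empty.

```java
// Illustrative sketch only: the names here are assumptions and the
// instruction bodies are deliberately empty.
final class IfChainDemo {
    static int comparisons;

    // Counts every character test so that the two styles can be compared.
    static boolean is(char c, char x) {
        comparisons++;
        return c == x;
    }

    // Sequential style: every test runs, even after a match has been obeyed.
    static int countSequential(char c) {
        comparisons = 0;
        if (is(c, 'r')) { /* obey r */ }
        if (is(c, '&')) { /* inner block checks the second character */ }
        if (is(c, 'Q')) { /* inner block for quaternion instructions */ }
        if (is(c, 'u')) { /* inner block for quaternion instructions */ }
        return comparisons;
    }

    // Else if style: testing stops at the first match.
    static int countElseIf(char c) {
        comparisons = 0;
        if (is(c, 'r')) { /* obey r */ }
        else if (is(c, '&')) { /* inner block checks the second character */ }
        else if (is(c, 'Q')) { /* inner block for quaternion instructions */ }
        else if (is(c, 'u')) { /* inner block for quaternion instructions */ }
        return comparisons;
    }

    public static void main(String[] args) {
        System.out.println("sequential, r: " + countSequential('r'));
        System.out.println("else if, r:    " + countElseIf('r'));
        System.out.println("else if, u:    " + countElseIf('u'));
    }
}
```

With these four tests, an r costs four comparisons in the sequential style and one in the else if style, while a u costs four in both, so the saving depends on how early in the chain the character is found.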

Two methods of streamlining are immediately apparent as the experimental work starts. The first is to try to change the "if ... if ... if ..." sequences to "if ... else if ... else if ..." sequences, both in the first character detecting sequence and in each of the various second character detecting sequences. The second is to try to put the detection of the various 1456 commands into an order that takes account of the frequencies of their typical usages, so that more frequently used commands are detected more quickly. This is to some extent subjective. For example &w is used much more frequently than &H so w should be checked before H within the & inner detection block.
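The effect of the checking order can be estimated with a small cost model. The sketch below is an assumption-laden illustration rather than part of the engine: the class CheckOrderDemo and its methods are hypothetical, and the sample of nine w characters to one H is invented simply to stand for "w is used much more frequently than H".

```java
// Illustrative cost model only: the names and the sample are assumptions.
final class CheckOrderDemo {

    // Comparisons made by an else if chain for one second character, when the
    // tests are written in the order given by 'order'. A character that is
    // not in the chain costs one comparison per test.
    static int checksFor(String order, char second) {
        int position = order.indexOf(second);
        return position < 0 ? order.length() : position + 1;
    }

    // Total comparisons for a whole sample of second characters.
    static int totalChecks(String order, String sample) {
        int total = 0;
        for (int i = 0; i < sample.length(); i++) {
            total += checksFor(order, sample.charAt(i));
        }
        return total;
    }

    public static void main(String[] args) {
        String sample = "wwwwwwwwwH"; // invented usage: nine &w to one &H
        System.out.println("H before w: " + totalChecks("Hw", sample));
        System.out.println("w before H: " + totalChecks("wH", sample));
    }
}
```

For this invented sample, checking H first costs 19 comparisons and checking w first costs 11. With real programs the frequencies would have to be estimated, which is where the subjective element comes in.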

Various matters arise. The first is that the speed of running of 1456 object code will depend on various factors. These include the speed of the computer hardware upon which the program is running, so although I may get speed figures for the PC that I happen to be using, those figures will only apply to this particular PC or type of PC. They also include the particular 1456 engine being used. It is possible that someone may have already implemented a 1456 engine to my stated 1456 object code design and that it runs much faster due to a different and more efficient design of the engine, either using Java or some other method. There is also the possibility, one that I do not have available here, yet which someone reading this may have available, of compiling the Java bytecode from my implementation of the 1456 engine to native code and thus running 1456 object code in that manner.

So the absolute speed of running 1456 object code that I can achieve here is not the critical issue, as that speed is only relevant to my engine that implements 1456 object code running on this particular PC. However, absolute speed will be interesting, as will the comparison of two absolute speeds for doing the same task, one by a 1456 program run using this engine and one by a Java program coded directly in an ordinary applet. Such a task might be to add a million random numbers together using a loop, or some similar task. I wonder what the speed ratio might be. 1456 object code has many potential applications for programs preparing illustrations and performing computations that are not processor intensive and where a speed of one tenth or even one hundredth of a directly programmed applet might be acceptable. An example is a computation where a lot of time is spent by the processor idling while waiting for user input.

Part of the, shall we say, romance, of 1456 object code is to imagine an actual 1456 microprocessor carrying out the computations. Microprocessors have traditionally been characterized as having a speed measured in mips, that is "millions of instructions per second", so it is interesting to seek to measure the speed in mips of the 1456 engine, even though this is affected greatly by the hardware system being used. So a few 1456 benchmark programs that output a speed in mips for 1456 object code, using whatever engine is being used on whatever hardware is being used, may be interesting. Although 1456 object code is interpreted and although Java bytecode is usually interpreted, thus giving two levels of interpretation, it will be interesting to compare the mips speed obtained with the mips speed obtainable by the most powerful microprocessor hardware that was available just a few years ago.

In order to have fun with such tests three new instructions are added.

|b Start the stopwatch. The b is from begin.
|e Stop the stopwatch. The e is from end.
&| Place the time interval from the stopwatch into ai1456. The value is in milliseconds. This is only within the accuracy limits of the local hardware and software.

Thus a test can be performed using |b before entering a loop and |e after leaving a loop. The &| may be used to find the time taken. Manually counting the number of instructions in the loop and multiplying it by the number of cycles through the loop gives the number of instructions obeyed during the stated time. A test running for, say, ten to twenty seconds should give a good idea of the mips speed.
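The arithmetic can be sketched in Java as follows. This is only an illustration of the calculation: the class MipsDemo and its methods are hypothetical, System.currentTimeMillis() is assumed to play the role that |b and |e play in 1456 object code, and the figures used in main are example values.

```java
// Illustrative sketch only: names and example figures are assumptions.
final class MipsDemo {

    // Millions of instructions per second, from a manual instruction count
    // per pass, the number of passes and the elapsed time in milliseconds.
    static double mips(long instructionsPerPass, long passes, long elapsedMillis) {
        double totalInstructions = (double) instructionsPerPass * passes;
        double seconds = elapsedMillis / 1000.0;
        return totalInstructions / seconds / 1.0e6;
    }

    // Stopwatch around a task: start (as |b), stop (as |e), interval (as &|).
    static long timeMillis(Runnable task) {
        long start = System.currentTimeMillis();
        task.run();
        long stop = System.currentTimeMillis();
        return stop - start;
    }

    public static void main(String[] args) {
        // Example: 20 instructions per pass, one million passes, ten seconds.
        System.out.println("example mips: " + mips(20, 1000000, 10000));

        // Timing a short task; the result depends on the machine used.
        long elapsed = timeMillis(() -> {
            double r = 0.0;
            for (int i = 0; i < 1000000; i++) {
                r = Math.random();
            }
        });
        System.out.println("elapsed milliseconds: " + elapsed);
    }
}
```

A short interval may read as zero milliseconds on some systems, which is one reason for preferring a test that runs for several seconds.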

Having implemented the new instructions, I then used the program below as a measure to have a starting point for the streamlining process and to take a few measurements to find out where things stand at the start of this adventure.

bench1.htm

This program contains a loop that performs 20 million 1456 object code instructions. They are mostly the r command that generates a random number. I have been conservative in what I term a 1456 object code instruction for these purposes. I have counted one instruction for each sequence of 1456 object code commands that perform an operation that one might reasonably expect a microprocessor running some other program to perform using one instruction in its own instruction set. Thus, for example, the sequence 11&<&w is counted as one instruction as in a microprocessor it would be one instruction to load a number from memory. The program outputs a starting message then obtains the stopwatch start time, then computes the 20 million instructions and then obtains stopwatch stop time. The stopwatch time measurement is then computed and the time in milliseconds is displayed and the speed in mips for 1456 object code is computed.

A figure of just under one million 1456 object code instructions per second was achieved. I then had a good go at changing the "if ... if ... if ..." sequences to "if ... else if ... else if ..." sequences and changing the order in which the various instruction codes are checked. I managed to achieve a speed increase of about 50% by this internal streamlining of my implementation of the 1456 engine. I noticed that the time taken varied a little on each run.

Consideration was then given to comparing the speed of running of the 1456 object code test program to the speed of running of a direct Java program producing the same effect. Here is a program that does not use 1456 object code. The mips figure computed is the equivalent in terms of the number of 1456 object code instructions to carry out the same computation.

bench1direct.htm

The Java source is as follows.

import java.awt.*;
import java.awt.event.*;

public class Bench1Direct extends java.applet.Applet
                             
  {
    int obeycode = 1;

    long stopwatchstartvalue = 0;
    long stopwatchstopvalue = 0;
    long stopwatchtimemeasurement = 0;
    int a;
    double mipsequivalent;

    Font font1180;

    public void init()
      {
        setBackground(Color.red);
        setLayout(null);
        setBounds(0,0,500,300);
        font1180 = new Font("SansSerif", Font.PLAIN, 18);
      }

    public void paint(Graphics screen)
      {

// The 1456 code that is being run in bench1.htm is as follows.

//    <param name="SOFTWARE01" value="1:
//    0&w11&>
//    1180?
//    {start message}
//    [Starting the test]%w
//    $Z$P100&w$X100&w$Y3404&w$C$E
//    |b
//    11:rrrrrrrrrrrrrrr11&<&w1&+11&>1000000&L11!J
//    |e
//    &| 12&> I1000/12>
//    {print the milliseconds}
//    $Z$P100&w$X200&w$Y3204&w$C12&<&w$E
//    {compute and print the mips value}
//    20w12</ $Z$P100&w$X300&w$Y3104&w$C$E
//    H">

        int i;
        double r;
        i=0;
        screen.setFont(font1180);
        screen.setColor(Color.yellow);
        screen.drawString("Starting the test",100,100);
        stopwatchstartvalue=System.currentTimeMillis();
        do
          {
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            r=Math.random();
            i=i+1;
          } while (i < 1000000);

        stopwatchstopvalue=System.currentTimeMillis();
        stopwatchtimemeasurement = stopwatchstopvalue - stopwatchstartvalue;
        a=(int)stopwatchtimemeasurement;
        screen.drawString("" + a,100,200);
        // 20 million instructions in a milliseconds is 20000/a million instructions per second.
        mipsequivalent=20000/((double)a);
        screen.drawString("" + mipsequivalent,100,300);

      }

    public void update(Graphics screen)
      {
        paint(screen);
      }

  }

The direct program ran faster, taking well under half the time. It too gave different times on different occasions. In a comparison test of a first time from loading and then four runs from clicking the refresh button, the bench1.htm program gave times, in milliseconds, of 14160, 13890, 13680, 14060 and 13890; the bench1direct.htm program gave times, in milliseconds, of 5830, 5660, 6270, 6050 and 6040. This means that the 1456 object code version of the program is running at about 40% of the speed of the direct Java coded program, on a fast looping program. Running the same two programs on an older PC gave times of 163400 milliseconds and 60360 milliseconds respectively, so although the speeds are about a tenth of the speeds on the newer PC, the ratio of the times taken is about the same, though the 1456 program perhaps fares marginally worse on the older PC.
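The ratios can be checked from the times quoted above. The sketch below simply averages the five times for each program and divides; the class BenchRatioDemo and its method names are assumptions made for illustration.

```java
// Illustrative sketch only: names are assumptions; times are those quoted.
final class BenchRatioDemo {

    // Arithmetic mean of a set of times in milliseconds.
    static double mean(long[] times) {
        long sum = 0;
        for (long t : times) {
            sum += t;
        }
        return (double) sum / times.length;
    }

    // Speed of the slower program as a fraction of the faster one. Both do
    // the same work, so the speed ratio is the inverse of the time ratio.
    static double relativeSpeed(double slowerMillis, double fasterMillis) {
        return fasterMillis / slowerMillis;
    }

    public static void main(String[] args) {
        long[] engine = {14160, 13890, 13680, 14060, 13890}; // bench1.htm
        long[] direct = {5830, 5660, 6270, 6050, 6040};      // bench1direct.htm

        System.out.println("mean engine ms: " + mean(engine));
        System.out.println("mean direct ms: " + mean(direct));
        System.out.println("newer PC ratio: " + relativeSpeed(mean(engine), mean(direct)));
        System.out.println("older PC ratio: " + relativeSpeed(163400, 60360));
    }
}
```

The means are 13936 and 5970 milliseconds, giving a ratio of roughly 0.43 on the newer PC, and the single pair of times from the older PC gives roughly 0.37.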

An interesting point is that the speed of the 1456 program running on the newer PC is much faster than the speed of the direct Java program running on the older PC. In fact, about four times as fast. Bearing in mind that this is in a computation intensive program without any waiting for user input, I feel that this is quite a good result.

I am thinking about whether there are any techniques for improving the speed of the 1456 engine, yet am aware that there will always be some overhead compared to a direct Java program and that improved hardware speed is the key to faster speed. Even the 1456 object code program with the unstreamlined 1456 engine ran much faster on the newer PC than the direct Java program ran on the older PC.

I am aware that people who read this document and who have a Java enabled browser and who try the speed tests may well get very different results as they may well be using very different hardware and software systems. I would be pleased to receive details of any such test results, stating the times and the hardware and software configuration used and here is an email link.

Email link to the author

1456 object code
