Your Spectrum
Issue 5, July 1984 - ZIP compiler (part 3 of 4)

See also these other articles & letters in the series:
Issue 3, part 1: "Adding ZIP!"
Issue 4, part 2: "Adding ZIP 2!"
Issue 6, part 4: "ZIP to the Stars"
Issue 7, letter: "Unzipping Zapped ZIP"
Issue 10, letter: "ZIPping on a Pension"
This program is available on "ZIPi'T'ape",
which holds the files for all four parts of the article.

The editing mistakes in listings 4 & 5 have been corrected here,
as explained in "Unzipping Zapped ZIP" in the September issue.

Following on in our epic series, Simon Goodwin unlocks some of ZIP's darkest secrets, as well as detailing how to add a couple of high resolution graphics commands. And there's still more to come ...

Last month we listed the ZIP compiler, which converts programs in a subset of ZX Basic into fast machine code. This month we'll continue our discussion of the way ZIP works, and show how you can add high resolution graphics commands to the original program. The series ends next month with a fast-action game called Star Base, which shows just how effectively ZIP can speed up Basic programs.


We've explained that ZIP converts Basic into a form of 'intermediate code' which corresponds more closely with machine code, but we haven't given away many details of the translation process. That'll be put right in the next couple of pages, with a detailed example of the way ZIP synthesises machine code. (Skip this if you've already got a headache!)
If you've examined the listing published last month you may have been struck by the brevity of the 'code-generation' routine - about 30 lines from line 8000 onwards. In fact, only two lines - 8035 and 8055 - are needed most of the time. The rest handle special cases, messages to the user, and 'tweaks' to make the resultant code more efficient.
The ZIP code generator was deliberately kept short and simple since we decided at an early stage that it was, potentially, the 'messiest' part of the program. It's also the least portable - effort spent producing good Z80 code would be wasted when we came to convert the program for other machines, such as the QL. Machine code has lots of technicalities and special cases, and it's easy to get bogged down with code to handle all of these. ZIP uses a very simple protocol for code generation, to minimise the scope for confusion.
The subroutine from line 8000 onwards converts intermediate codes into machine code. It uses a library of 'templates': groups of machine code instructions which perform a useful action, such as adding two values, printing a string or fetching the value of a variable. ZIP uses 60 templates to compile a large subset of ZX Basic.
Obviously some templates are 'general purpose' - it would be silly to have a different template to fetch each variable (FETCH A, FETCH B and so on). In this case ZIP uses a general purpose template for all 26 variables and modifies it to handle the specific one required.
Listing 1: A pointless program in ZX Basic.
Only two values are passed into the code-generation routine: C, the template code (a value between one and 60) and T, an 'extra information' value used to identify a particular variable, a number, or the address of a string. Most templates ignore the value of T - they just use line 8035 to find the appropriate template, and line 8055 to copy it into place in the machine code generated by ZIP.
Line 8020 sifts out the templates which need extra information; line 8100 onwards is used to convert general purpose templates into specific ones. For instance, template 2 is the machine code to fetch a numeric value. In its raw form template 2 always fetches the value zero. Line 8110 modifies it to load the required value instead, by storing the value in T at 'pc-2' - over the 'zero' in the general purpose template. The exact part of each template altered will vary depending upon the machine code used.
Lines 8190 to 8200 handle the printing of strings. Template 26 uses the start address of the string - supplied in T, as usual - and works out the length of the string by searching for inverted commas. Once the end has been found, it plugs the start and length into the template and stores the text of the string immediately after the template.
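The copy-and-patch idea can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the name `generate`, the `TEMPLATES` table and the operand offset are invented for illustration, and the bytes shown for template 2 are a guess at its shape (LD HL,nn followed by PUSH HL), not a dump of ZIP's real library.

```python
# Illustrative model of ZIP's template copying; not the real library.
LD_HL_NN = 0x21   # Z80 opcode: LD HL,nn
PUSH_HL = 0xE5    # Z80 opcode: PUSH HL (229 decimal)

# Request number C -> (raw template bytes, offset of the 16-bit operand or None).
# Only template 2 is modelled; its layout and operand offset are assumptions.
TEMPLATES = {
    2: (bytes([LD_HL_NN, 0x00, 0x00, PUSH_HL]), 1),  # raw form fetches zero
}

def generate(code: bytearray, c: int, t: int = 0) -> None:
    """Append template c to code, patching T into it if the template needs it."""
    raw, operand_at = TEMPLATES[c]
    start = len(code)
    code.extend(raw)                                   # plain copy of the template
    if operand_at is not None:
        value = t & 0xFFFF                             # negatives wrap to 16 bits
        code[start + operand_at] = value & 0xFF        # low byte first (Z80 order)
        code[start + operand_at + 1] = value >> 8      # then the high byte

program = bytearray()
generate(program, 2, 166)   # FETCH 166
generate(program, 2, -2)    # FETCH -2
```

A template whose operand offset is None would simply be copied unchanged, which is the common case the article describes.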
Listing 1 is a short program which we will more-or-less compile by hand, following the rules used by ZIP. The first step is to convert the program into intermediate code. This involves breaking down and re-ordering the program, to compensate for the odd order in which we humans write things. Last month we explained this process in detail - in effect we shuffle items so that the values needed for an operation are made ready before the operation is performed. For example:

FOR Y=166 TO 162 STEP -2

becomes:

FETCH 166
FETCH 162
FETCH -2
FOR Y

The second version makes more sense to a computer, which needs three values (START, END and STEP) before it can do anything with a FOR statement. Similarly two values (X and Y coordinates) are needed by PLOT, which becomes FETCH X, FETCH Y, PLOT, instead of Basic's PLOT X,Y. Some operations take two values and leave one: for instance X+Y becomes FETCH X, FETCH Y, ADD. IF X=111 becomes FETCH X, FETCH 111, EQUALS (which gives a value zero or one depending upon whether or not X equalled 111).
Listing 2: The same pointless program in intermediate code.
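To see why 'values first, operation last' suits a machine, here is a small Python model of the intermediate code running on a stack. It is a sketch only: the function `run` and the tuple format are our own invention, and real ZIP code works on 16-bit values, which this model does not bother to wrap.

```python
def run(ops, variables):
    """Execute (name, operand) pairs on a value stack and return the stack."""
    stack = []
    for name, operand in ops:
        if name == "FETCH":
            # A string operand names a variable; anything else is a literal number.
            stack.append(variables[operand] if isinstance(operand, str) else operand)
        elif name == "STORE":
            variables[operand] = stack.pop()
        elif name == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif name == "EQUALS":
            right, left = stack.pop(), stack.pop()
            stack.append(1 if left == right else 0)
        else:
            raise ValueError(f"unknown operation {name}")
    return stack

variables = {"X": 110}
# LET X=X+1
run([("FETCH", "X"), ("FETCH", 1), ("ADD", None), ("STORE", "X")], variables)
# The test part of IF X=111
result = run([("FETCH", "X"), ("FETCH", 111), ("EQUALS", None)], variables)
```

Every operation finds its inputs already waiting on the stack, so the interpreter never has to look ahead.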
Listing 2 shows the same program written as intermediate code. In each line the name of the instruction (CLS, FETCH, STORE, and so on) corresponds to a template - a value of C. The rest of the line, be it a number, an address or a variable name, corresponds to T, the 'extra information'.
Notice how ZIP has added the instruction PRINT OPEN before the PLOT and PRINT STRING. This tells the Spectrum to route information to the top of the screen (rather than the printer or the EDIT area) and resets any temporary colours. You have to perform a PRINT OPEN before you put any item on the screen, or your efforts might end up using the wrong colours or get sent to the wrong device.
Listing 3: The program in Z80 code. You'd rather type in Listing 1 than Listing 3. That's why ZIP is useful!
In principle this 'OPEN' step is what allows the Spectrum to PRINT to the network, screen, printer, microdrive or anything - the PRINT OPEN command is normally performed automatically by Basic's PRINT and PLOT routines, but ZIP must make a point of carrying that step out. The sequence FETCH 13, PR-CHR$ prints the 'new line' character at the end of the message.
Listing 3 shows the final machine code produced by ZIP. The listing is in Z80 machine-code, with labels to indicate the start of each template.
A trick called 'peephole optimisation' has been used to remove redundant instructions from the program. ZIP templates use the HL registers to store temporary results. To avoid accidental loss of data, at the end of each template the result is PUSHed onto the stack so that the next template can retrieve it, if necessary, with POP HL. Quite often this leads to wasteful sequences like PUSH HL, POP HL - one template stores the value, only to have the next pull it straight back again. Without peephole improvement ZIP would produce this code for line 30, LET X=0:


ZIP avoids this waste of time and memory by using the variable PEEP, in lines 8050 and 8060. In line 8060 it sets PEEP if the last byte of a template is 229 - a PUSH HL instruction. If the next template starts with 225 (POP HL) and PEEP was set to indicate a PUSH at the end of the last template, line 8050 deletes the last byte stored and the first byte of the next template. In this way line 70, LET X=X+1, becomes:


rather than the unimproved version, which shows clearly how the templates fit together:


Notice that one PUSH HL is deliberately not removed (after the FETCH X). The next instruction, FETCH ONE, does not use the value so far - it doesn't start with POP HL.
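The PEEP mechanism is easy to model. The Python sketch below follows the rule described for lines 8050 and 8060, but the class `Emitter`, the template bytes and the address used for variable X are assumptions made up for the demonstration.

```python
PUSH_HL = 229   # last byte of a template that leaves its result on the stack
POP_HL = 225    # first byte of a template that wants that result back

class Emitter:
    def __init__(self):
        self.code = bytearray()
        self.peep = False   # True when the last byte stored was a closing PUSH HL

    def emit(self, template):
        body = bytes(template)
        if self.peep and body and body[0] == POP_HL:
            self.code.pop()     # delete the PUSH HL already stored
            body = body[1:]     # and skip the POP HL that would undo it
        self.code.extend(body)
        self.peep = bool(body) and body[-1] == PUSH_HL

# Hypothetical templates for LET X=0, with variable X assumed to live at 0x8000.
FETCH_ZERO = [0x21, 0x00, 0x00, PUSH_HL]   # LD HL,0 ; PUSH HL
STORE_X = [POP_HL, 0x22, 0x00, 0x80]       # POP HL ; LD (0x8000),HL

out = Emitter()
out.emit(FETCH_ZERO)
out.emit(STORE_X)
```

Here the eight bytes of the two raw templates shrink to six, because the PUSH HL and POP HL at the join cancel out.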


Many operations can be written directly in machine code, but there are some which are very complicated or involve lots of code to handle special cases. Rather than produce hundreds of bytes of in-line code, ZIP performs some functions with calls to routines in the library and the Spectrum ROM. In the example program ZIP uses the ROM routines CLS, OPEN, RST ERROR, and PR-STRING to clear the screen, select an output channel, issue reports and print a string. A ZIP library routine called PLOT is also used.
If need be, specific registers are loaded before the routine is called. OPEN requires a channel number in register A. PR-STRING requires the string address in DE and the length in BC; it returns with DE pointing just after the string. The byte following RST ERROR determines the report displayed.


Intricate templates are used to handle the statements IF, FOR and NEXT. IF is converted into a test which jumps to the next line - skipping the rest of the current one - unless the result of the condition is true. The effect is like writing:

20 rest of program

instead of:

20 rest of program

The first fragment would be impossible in normal Basic because the PRINT doesn't have a line-number, but it is easy to do in machine code, which just trundles through the statements as it finds them, unless given orders to the contrary. The GO TO destination can always be 'this line plus one', since ZX Basic doesn't require GO TOs to be exact.
FOR ... NEXT loops are more difficult to handle - in fact, many compilers handle them unreliably. As we will find, even ZIP FOR ... NEXT loops have their idiosyncrasies!


The FOR statement puts the START value into the appropriate variable, and makes a note of the address of the following statement (for the NEXT to go to). The END and STEP are calculated and stored for use by NEXT.
Note that when Basic encounters the FOR statement it immediately works out the START, END and STEP values. A step of one is assumed if none is given. These values are not worked out again, so a program such as:

10 LET C=5: FOR L=1 TO C: LET C=2: PRINT "!": NEXT L

will print five exclamation marks. The change in C is ignored because the END value is only calculated when the looping starts.
ZX Basic is unusual in that it checks for a start value beyond the end (greater or less, depending upon the direction of the STEP) and skips over the whole loop if such a case is found. This is a useful but rare feature, and ZIP, like other computer Basics, doesn't recognise that special case. ZIP always performs loops at least once, so you must use an IF test to jump over such loops if need be.

The NEXT statement retrieves the STEP, END and the address of the 'head' of the loop (the statement after the FOR). It adds the STEP (which may be negative) to the correct variable and then subtracts the END from that result. If the result is beyond the END (remember to handle negative steps again!), the program continues after the NEXT statement, otherwise it jumps back to the head of the loop.
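The NEXT test can be tried out on paper with the Python model below, which mirrors the instruction sequence shown for NEXT Y in Listing 3. The function names are our own, and the model assumes the loop range is small enough for the sign test to be meaningful in 16 bits.

```python
def to_signed(value):
    """Interpret a 16-bit value as a signed number."""
    return value - 0x10000 if value & 0x8000 else value

def zip_next(value, end, step):
    """Return (new value, loop again?) following the NEXT logic in Listing 3."""
    value = (value + step) & 0xFFFF            # ADD HL,BC
    diff = (value - end) & 0xFFFF              # OR A / SBC HL,DE
    if diff == 0:                              # exactly at END: one more pass
        return value, True
    step_high = (step & 0xFFFF) >> 8
    same_sign = (((diff >> 8) ^ step_high) & 0x80) == 0   # LD A,H / XOR B / AND 128
    return value, not same_sign                # same sign means we are beyond END

def zip_for(start, end, step=1):
    """List the values the loop variable takes; the body always runs once."""
    visited = []
    value = start & 0xFFFF
    while True:
        visited.append(to_signed(value))
        value, again = zip_next(value, end, step)
        if not again:
            return visited
```

With these definitions `zip_for(166, 162, -2)` visits 166, 164 and 162, as in the example program, while `zip_for(5, 1)` still visits 5 once. That second case is the one ZX Basic would skip entirely, which is why an IF test is needed in front of such loops.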


Some compilers cheat by storing the 'head' address as part of the code of the NEXT loop. This means that the program fails if more than one FOR is associated with a NEXT, since the NEXT is 'hot-wired' so that it can only jump to one FOR.
Other compilers use the stack to store the STEP, END and 'head' values. This brings problems if Basic programmers try to RETURN out of FOR ... NEXT loops, since the values on the stack get in the way of the return address.
Both of these cheats are allowable, in the sense that good Basic programmers won't use the same NEXT for more than one FOR, and they won't try to RETURN from the middle of a loop. Sadly, such messy programming is perfectly legal Basic, and it is a shame if compilers can't handle these special cases properly.
ZIP takes those special cases into account, but loops are still very fast - 110 times faster than Basic! The ZIP approach is to reserve space for an END, STEP and 'head' value for each variable-name from A to Z. This ties up 156 bytes which are only used during FOR ... NEXT loops, but it does allow programmers to RETURN and GO TO out of loops without fear of disaster. It also solves the problem of FORs outnumbering NEXTs, since each FOR stores its own 'head' value, and NEXT always uses the most recent one - just like interpreted Basic.
Using this technique, ZIP can even detect NEXT without FOR errors. CLEAR sets all the 'head' values to the address of the ROM error routine. If a NEXT is found before a FOR, the computer tries to jump back to what it thinks is the 'head' of the loop, and ends up in the ROM issuing a helpful report instead!
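The bookkeeping can be pictured as a table with one slot per variable. The Python sketch below is a model only: `LoopTable` and its methods are invented names, and `ROM_ERROR` is a placeholder value, since the article does not give the real address that CLEAR stores.

```python
from string import ascii_uppercase

ROM_ERROR = 0x0008   # placeholder for the ROM error routine's address (assumed)
SLOT_BYTES = 6       # END, STEP and 'head': three 16-bit values per variable

class LoopTable:
    def __init__(self):
        self.clear()

    def clear(self):
        # CLEAR points every 'head' at the error routine.
        self.slots = {v: {"end": 0, "step": 0, "head": ROM_ERROR}
                      for v in ascii_uppercase}

    def for_(self, var, end, step, head):
        # Each FOR overwrites the slot, so NEXT always sees the most recent FOR.
        self.slots[var] = {"end": end, "step": step, "head": head}

    def next_head(self, var):
        # Where NEXT would jump back to; ROM_ERROR means NEXT without FOR.
        return self.slots[var]["head"]

table = LoopTable()
size = len(table.slots) * SLOT_BYTES
before = table.next_head("Y")            # no FOR Y yet
table.for_("Y", 162, -2, 0x9000)         # first FOR Y (head address is made up)
table.for_("Y", 10, 1, 0x9100)           # a second FOR Y elsewhere in the program
after = table.next_head("Y")
```

Nothing is kept on the machine stack, so a RETURN or GO TO out of the loop leaves no debris behind.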
Armed with this information you should be able to understand Listing 3 - the example program in fluent, non-biological machine code. When you've digested that lot, we'll explain how we got a PLOT statement in there!


One of the main design aims of ZIP was that it should be easy to add extra features.
Listings 4 and 5 can be used to add the high-resolution graphics commands PLOT and DRAW. The commands can be used almost exactly as in ZX Basic - you can specify temporary colour items such as OVER, INK and BRIGHT just as for PRINT.
The only restriction is that ZIP does not allow the format 'DRAW x,y,angle'. We omitted this because the ROM routine used to produce curved lines is very slow, even when called directly from ZIP. It is one of the parts of the Spectrum ROM written in a floating-point version of the language Forth. The equivalent machine code program would be very long, although much faster.
There are two steps involved in adding new words to ZIP. The first is to alter the compiler program so that it recognises the new words and treats them correctly. The second step is to alter the machine code library so that ZIP can generate code for the new commands.
The DATA in line 5060 tells ZIP whether or not a word can be compiled. The first value corresponds to the word RND, then comes INKEY$ and so on through the character table on pages 186 to 188 of the Spectrum manual. The fourth and tenth from last entries correspond to the words DRAW and PLOT respectively (CODEs 252 and 246). We must replace the appropriate 'n's in line 5060 with 'y's, so that ZIP allows those words. Change the line so that it ends:


Instead of:


Now Pass 1 will allow the words PLOT and DRAW.
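If counting entries in a long DATA line makes you nervous, the positions can be checked with a little arithmetic. This Python sketch assumes the DATA holds exactly one entry for each keyword code from RND (165) to COPY (255); the function names are our own.

```python
FIRST_TOKEN = 165   # RND, the first entry in the DATA
LAST_TOKEN = 255    # COPY, the last keyword code in the character set
PLOT, DRAW = 246, 252

def position_from_start(code):
    """1-based position of a keyword's entry, counting from the RND entry."""
    return code - FIRST_TOKEN + 1

def position_from_end(code):
    """1-based position counting back from the last entry."""
    return LAST_TOKEN - code + 1

entries = LAST_TOKEN - FIRST_TOKEN + 1
```

Under that assumption there are 91 entries, with DRAW the fourth from last and PLOT the tenth from last, which agrees with the text.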
Next we alter the 'Syntax parser' used in Pass 2 so that ZIP can process PLOT and DRAW correctly. The new program lines to do this are shown in listing 4. Add these lines to the copy of ZIP with the modified DATA statement. The first line, 7250, checks for codes 246 and 252. If either is found, line 7252 stores the request number needed by the code generator in 'assmod'. Up till now we've used request numbers 1 to 58. PLOT and DRAW will use requests 59 and 60. But first we set C to 52 and call Z80, which generates the code to 'open channel 2', resetting the colour items to their 'permanent' values.
We can't generate the code for PLOT or DRAW yet since we haven't worked out the 'parameters' of the command - coordinates, INK, OVER, and so on. Line 7254 looks for colour items. It calls GETS to find the next symbol, and checks for symbols between INK and OVER. If one is found, it saves the symbol number and uses the 'MATHS' routine to work out the value or expression following the symbol. The calculation 'LET c=sep-189' converts the symbol into a request number. Then Z80 is called to generate the corresponding code.
The program loops at line 7254 until all of the colour items have been dealt with. Notice how MATHS is always entered with the first symbol of the expression in S. It returns with the first symbol after the expression (here, a semicolon) in S. The 'GETS' routine automatically skips over spaces and colour items in a program, so that it's a fairly trivial matter to process a list such as 'INK 3; PAPER 4; OVER 1;' without problems.
At last ZIP finds a character that is not a colour item, and execution continues on line 7258. This calls the 'MATHS' routine twice, to process an X and Y coordinate. Then it retrieves the original request number (PLOT or DRAW) and generates the code to plot a point or draw a line. Execution continues at ATCOLON, which issues the error message "CALCULATION NOT ALLOWED" if any symbol other than a colon or enter is found. This detects the case 'DRAW x,y,angle' - the error message isn't very clear but it'll do.
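The request numbers produced by Listing 4 can be worked through with the Python sketch below. It models only the order of the requests and the two calculations ('c=sep-189' and 'assmod=59+(s=252)'); the code that MATHS generates for each expression is left out, and the function name is our own.

```python
# Spectrum keyword codes for the colour items and the two new commands.
INK, PAPER, FLASH, BRIGHT, INVERSE, OVER = 217, 218, 219, 220, 221, 222
PLOT, DRAW = 246, 252
OPEN_CHANNEL_2 = 52   # request issued by line 7252

def requests_for(command, colour_items):
    """List the request numbers asked of the code generator, in order."""
    requests = [OPEN_CHANNEL_2]
    for item in colour_items:
        if not INK <= item <= OVER:
            raise ValueError("not a colour item")
        requests.append(item - 189)            # LET c=sep-189
    # In ZX Basic a true comparison counts as 1, just as True does here.
    requests.append(59 + (command == DRAW))    # LET assmod=59+(s=252)
    return requests
```

So 'PLOT INK 3; OVER 1; x,y' asks for requests 52, 28, 33 and 59, while a plain DRAW asks for 52 and then 60.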
ZIP now requests code for PLOT and DRAW, but the library still doesn't contain code to handle those commands. Listing 5 is a short program which adds extra code to the library. It should be entered and RUN while the modified version of ZIP is in memory. The FOR ... NEXT loop stores the new library routines. The two GO SUB 6300s store the vectors which ZIP uses to find the routines.
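Before typing Listing 5 into the Spectrum, the DATA can be desk-checked elsewhere. The Python sketch below repeats the loader's arithmetic on the values from lines 1000 to 1070; it is a checking aid of our own, not part of ZIP.

```python
# The DATA values from Listing 5, lines 1000 to 1070.
DATA = [
    225, 193, 69, 205, 229, 34, 225, 209,
    205, 155, 208, 217, 229, 217, 203, 124,
    40, 11, 124, 47, 103, 125, 47, 111,
    35, 38, 255, 24, 2, 38, 1, 203,
    122, 40, 11, 122, 47, 87, 123, 47,
    95, 19, 22, 255, 24, 2, 22, 1,
    69, 75, 90, 84, 205, 186, 36, 217,
    225, 217, 201, 1, 166, 211, 1, 190,
]

FIRST, LAST = 53392, 53455        # the range POKEd by the FOR ... NEXT loop
EXPECTED_TOTAL = 7364             # the value tested in line 125

count = LAST - FIRST + 1          # number of bytes the loop stores
total = sum(DATA)                 # what the loader accumulates in t
plot_length = 53398 - 53392       # gap between the two vector targets
```

The total comes to 7364 over 64 bytes, so if the Spectrum stops at line 135 the fault lies in the typing rather than in the listing.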
Finally line 8040 must be added to the 'code generation' routine. ZIP normally works out the length of each template by finding the address where the next one starts. The templates for PLOT and DRAW are out of order, so line 8040 bodges the 'length' value in J so that the correct length is used anyway. This is rather messy but it works, and saves us having to shuffle the entire library.
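The effect of line 8040 can be sketched in Python as follows. The start addresses for templates 59 and 60 are the vector values from Listing 5, but the addresses for templates 57 and 58 are made up for the demonstration, and `template_length` is our own name.

```python
# Start address of each template. 59 and 60 come from Listing 5;
# 57 and 58 are hypothetical values used only for illustration.
STARTS = {57: 50000, 58: 50010, 59: 53392, 60: 53398}

# Line 8040: IF c=58 OR c=60 THEN LET j=4+(c=60)
OVERRIDES = {58: 4, 60: 5}

def template_length(c):
    """Length of template c: normally the gap to the next template's start."""
    if c in OVERRIDES:
        return OVERRIDES[c]
    return STARTS[c + 1] - STARTS[c]

naive_58 = STARTS[59] - STARTS[58]   # what the usual rule would give for 58
```

Template 58 needs the fix because template 59 no longer follows it in memory, and template 60 needs it because nothing follows it at all. Template 59 is still measured in the usual way.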
Before you try out the new code, SAVE the new versions of the compiler and library on a fresh tape, just in case you've made a typing error. Delete the code of listing 5 (leaving line 8040 in place) and try compiling a simple line like:
10 PLOT 0,0: DRAW 100,100
If all is well, experiment with colour items. Should the computer produce odd results or crash, check that you modified the library correctly. If an error message appears, check the new lines which you have added to the ZIP Basic.
May the ZIP be with you.
I know I stated in the first part of this article that this was to be a mere three-part series. Well, tough. Having lived with this project for the best part of six months and seen it grow from a mere hint of an idea to what I consider to be a cracking piece of code, I'm not letting you off so lightly. In the fourth part of Adding Zip, I will be presenting a simple yet addictive game in ZX Basic. Yes, you've guessed it - an ideal program to ZIP. Catch me next month, and let the power of the ZIP speak for itself.
Simon Goodwin

From bended knees we at YS wish to make amends for the mess we made of last issue's Adding ZIP article. There's a list of the corrections in Frontlines, entitled 'THE DAY ZIP GOT ZAPPED', but if you'd like a photocopy of the article in the same condition as it left Simon's typewriter, send us an SAE and we'll bundle it off post haste.

Original text © 1984 Simon N Goodwin. Used with permission.
The ZIP Compiler offer is now closed ... but an updated printed manual is still available from Simon, with old and new programs.

Listing 1

10 CLS
20 PRINT "Hello everyone"
30 LET X=0
40 FOR Y=166 TO 162 STEP -2
50 PLOT X,Y
60 NEXT Y
70 LET X=X+1
80 IF X=111 THEN STOP
90 GO TO 40

Listing 2

LINE 10:    CLS
LINE 20:    PRINT OPEN
            PR-STRING   ("Hello everyone")
            FETCH       13
            PR-CHR$
LINE 30:    FETCH       0
            STORE IN    X
LINE 40:    FETCH       166
            FETCH       162
            FETCH       -2
            FOR         Y
LINE 50:    PRINT OPEN
            FETCH       X
            FETCH       Y
            PLOT
LINE 60:    NEXT        Y
LINE 70:    FETCH       X
            FETCH       1
            ADD
            STORE IN    X
LINE 80:    FETCH       X
            FETCH       111
            EQUALS
            IF          81
            STOP
LINE 90:    GO TO       40

Listing 3

LINE 10:
CLS:        LD         A,2
            CALL       ROM-PR-OPEN
            CALL       ROM-CLS
LINE 20:
PR-OPEN:    LD         A,2
            CALL       PR-OPEN
            LD         BC,14
            CALL       ROM-PR-STRING
            EX         DE,HL
            JP         (HL)
ZIP-TEXT:   DEFM       'Hello everyone'
FETCH NUM:  LD         HL,13
PR-CHR$:    LD         A,L
            RST        ROM-PRINT-A
LINE 30:
FETCH NUM:  LD         HL,0
LINE 40:
FETCH NUM:  LD         HL,166
            PUSH       HL
FETCH NUM:  LD         HL,162
            PUSH       HL
FETCH NUM:  LD         HL,-2
FOR Y:      LD         (STEP Y),HL
            POP        HL
            LD         (END Y),HL
            POP        HL
            LD         (VARIABLE Y),HL
            LD         HL,LINE 50
            LD         (HEAD Y),HL
LINE 50:
            PUSH       HL
PLOT:       POP        DE
            CALL       ZIP-PLOT
LINE 60:
NEXT Y:     LD         HL,(VARIABLE Y)
            LD         BC,(STEP Y)
            ADD        HL,BC
            LD         (VARIABLE Y),HL
            LD         DE,(END Y)
            OR         A
            SBC        HL,DE
            JR         Z,ZIP-LOOP
            LD         A,H
            XOR        B
            AND        128
            JR         Z,LINE 70
ZIP-LOOP:   LD         HL,(HEAD Y)
            JP         (HL)
LINE 70:
            PUSH       HL
FETCH NUM:  LD         HL,1
ADD:        POP        DE
            ADD        HL,DE
LINE 80:
            PUSH       HL
FETCH NUM:  LD         HL,111
EQUAL:      POP        DE
            OR         A
            SBC        HL,DE
            LD         HL,1
            JR         Z,IF
            DEC        HL
IF:         LD         A,H
            OR         L
            JP         Z,LINE 90
STOP:       RST        ERROR
            DEFB       STOP-REPORT
GO TO:      JP         LINE 40

Listing 4

7250 IF s<>246 THEN IF s<>252 THEN GO TO 7260
7252 LET assmod=59+(s=252): LET c=52: GO SUB Z80: REM Open #2
7254 GO SUB gets: IF s>=ink AND s<=over THEN LET sep=s: GO SUB gets: GO SUB maths:
     LET c=sep-189: GO SUB Z80: GO TO 7254
7258 GO SUB maths: GO SUB gets: GO SUB maths: LET c=assmod: GO SUB Z80: GO TO atcolon

Listing 5

  90 REM **** PLOT & DRAW loader
 100 LET t=0
 105 FOR i=53392 TO 53455
 110 READ d: LET t=t+d
 115 POKE i,d
 120 NEXT i
 125 IF t=7364 THEN GO TO 140
 135 STOP
 140 LET i=53365: LET t=53392: GO SUB 6300
 145 LET i=53367: LET t=53398: GO SUB 6300
 150 STOP
 200 REM
1000 DATA 225,193,69,205,229,34,225,209
1010 DATA 205,155,208,217,229,217,203,124
1020 DATA 40,11,124,47,103,125,47,111
1030 DATA 35,38,255,24,2,38,1,203
1040 DATA 122,40,11,122,47,87,123,47
1050 DATA 95,19,22,255,24,2,22,1
1060 DATA 69,75,90,84,205,186,36,217
1070 DATA 225,217,201,1,166,211,1,190
8040 IF c=58 OR c=60 THEN LET j=4+(c=60)
