Issue 3 - part 1: "Adding ZIP!"
Issue 4 - part 2: "Adding ZIP 2!"
Issue 6 - part 4: "ZIP to the Stars"
Issue 7 - letter: "Unzipping Zapped ZIP"
Issue 10 - letter: "ZIPping on a Pension"
This program is available on "ZIPi'T'ape", which holds the files for all four parts of the article.
Following on in our epic series, Simon Goodwin unlocks some of ZIP's darkest secrets, as well as detailing how to add a couple of high resolution graphics commands. And there's still more to come ...
Last month we listed the ZIP compiler, which converts programs in a subset of ZX Basic into fast machine code. This month we'll continue our discussion of the way ZIP works, and show how you can add high resolution graphics commands to the original program. The series ends next month with a fast-action game called Star Base, which shows just how effectively ZIP can speed up Basic programs.
GENERATION GAMES
We've explained that ZIP converts Basic into a form of 'intermediate code' which corresponds more closely with machine code, but we haven't given away many details of the translation process. That'll be put right in the next couple of pages, with a detailed example of the way ZIP synthesises machine code. (Skip this if you've already got a headache!)
If you've examined the listing published last month you may have been struck by the brevity of the 'code-generation' routine - about 30 lines from line 8000 onwards. In fact, only two lines - 8035 and 8055 - are needed most of the time. The rest handle special cases, messages to the user, and 'tweaks' to make the resultant code more efficient.
The ZIP code generator was deliberately kept short and simple since we decided at an early stage that it was, potentially, the 'messiest' part of the program. It's also the least portable - effort spent producing good Z80 code would be wasted when we came to convert the program for other machines, such as the QL. Machine code has lots of technicalities and special cases, and it's easy to get bogged down with code to handle all of these. ZIP uses a very simple protocol for code generation, to minimise the scope for confusion.
The subroutine from line 8000 onwards converts intermediate codes into machine code. It uses a library of 'templates': groups of machine code instructions which perform a useful action, such as adding two values, printing a string or fetching the value of a variable. ZIP uses 60 templates to compile a large subset of ZX Basic.
Obviously some templates are 'general purpose' - it would be silly to have a different template to fetch each variable (FETCH A, FETCH B and so on). In this case ZIP uses a general purpose template for all 26 variables.
Only two values are passed into the code-generation routine: C, the template code (a value between one and 60) and T, an 'extra information' value used to identify a particular variable, a number, or the address of a string. Most templates ignore the value of T - they just use line 8035 to find the appropriate template, and line 8055 to copy it into place in the machine code generated by ZIP.
Line 8020 sifts out the templates which need extra information; line 8100 onwards is used to convert general purpose templates into specific ones. For instance, template 2 is the machine code to fetch a numeric value. In its raw form template 2 always fetches the value zero. Line 8110 modifies it to load the required value instead, by storing the value in T at 'pc-2' - over the 'zero' in the general purpose template. The exact part of each template altered will vary depending upon the machine code used.
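The find, copy and patch protocol described above can be sketched in a few lines of Python. This is only an illustration and not ZIP's own table: the byte values are the standard Z80 encodings of the instructions named in the comments, but the table layout, the `operand_at` offset and the code number 7 given to ADD are assumptions made for the example.

```python
# Hypothetical template table: code number -> (bytes, offset of the 16-bit operand, or None).
TEMPLATES = {
    2: ([0x21, 0x00, 0x00, 0xE5], 1),     # FETCH NUM: LD HL,0 / PUSH HL (raw form fetches zero)
    7: ([0xE1, 0xD1, 0x19, 0xE5], None),  # ADD (code 7 is made up): POP HL / POP DE / ADD HL,DE / PUSH HL
}


def generate(code, c, t=0):
    """Copy template c onto the end of code, patching in t if the template needs it."""
    body, operand_at = TEMPLATES[c]
    start = len(code)
    code.extend(body)
    if operand_at is not None:
        # Overwrite the 'zero' in the general purpose template, low byte first.
        code[start + operand_at] = t & 0xFF
        code[start + operand_at + 1] = (t >> 8) & 0xFF
    return code


program = []
generate(program, 2, 166)   # FETCH 166
generate(program, 2, -2)    # FETCH -2, stored as a 16-bit two's complement value
generate(program, 7)        # ADD ignores T
```

Keeping the operand position in the table, rather than in the generator, is one way to honour the remark that the part of each template altered depends upon the machine code used.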
Lines 8190 to 8200 handle the printing of strings. Template 26 uses the start address of the string - supplied in T, as usual - and works out the length of the string by searching for inverted commas. Once the end has been found, it plugs the start and length into the template and stores the text of the string immediately after the template.
Listing 1 is a short program which we will more-or-less compile by hand, following the rules used by ZIP. The first step is to convert the program into intermediate code. This involves breaking down and re-ordering the program, to compensate for the odd order in which we humans write things. Last month we explained this process in detail - in effect we shuffle items so that the values needed for an operation come before the operation itself.
The second version makes more sense to a computer, which needs three values (START, END and STEP) before it can do anything with a FOR statement. Similarly two values (X and Y coordinates) are needed by PLOT, which becomes FETCH X, FETCH Y, PLOT, instead of Basic's PLOT X,Y. Some operations take two values and leave one: for instance X+Y becomes FETCH X, FETCH Y, ADD. IF X=111 becomes FETCH X, FETCH 111, EQUALS (which gives a value zero or one depending upon whether or not X equalled 111).
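As a rough illustration of that re-ordering, the Python sketch below builds intermediate code as a list of (instruction, extra information) pairs. The pair representation and the function names are assumptions made for this example; only the ordering rule comes from the text.

```python
def fetch(value):
    """One FETCH instruction; the value is the 'extra information'."""
    return ("FETCH", value)


def for_statement(var, start, end, step=1):
    """FOR needs START, END and STEP before it can do anything; STEP defaults to one."""
    return [fetch(start), fetch(end), fetch(step), ("FOR", var)]


def plot_statement(x, y):
    """PLOT X,Y becomes FETCH X, FETCH Y, PLOT."""
    return [fetch(x), fetch(y), ("PLOT", None)]


def binary(left, operation, right):
    """Operations which take two values and leave one, such as ADD or EQUALS."""
    return [fetch(left), fetch(right), (operation, None)]


line_40 = for_statement("Y", 166, 162, -2)   # FOR Y=166 TO 162 STEP -2
line_50 = plot_statement("X", "Y")           # PLOT X,Y
condition = binary("X", "EQUALS", 111)       # X=111
```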
Listing 2 shows the same program written as intermediate code. In each line the name of the instruction (CLS, FETCH, STORE, and so on) corresponds to a template - a value of C. The rest of the line, be it a number, an address or a variable name, corresponds to T, the 'extra information'.
Notice how ZIP has added the instruction PRINT OPEN before the PLOT and PRINT STRING. This tells the Spectrum to route information to the top of the screen (rather than the printer or the EDIT area) and resets any temporary colours. You have to perform a PRINT OPEN before you put any item on the screen, or your efforts might end up using the wrong colours or get sent to the wrong device.
Listing 3 shows the final machine code produced by ZIP. The listing is in Z80 machine-code, with labels to indicate the start of each template.
A trick called 'peephole optimisation' has been used to remove redundant instructions from the program. ZIP templates use the HL register pair to store temporary results. To avoid accidental loss of data, at the end of each template the result is PUSHed onto the stack so that the next template can retrieve it, if necessary, with POP HL. Quite often this leads to wasteful sequences like PUSH HL, POP HL - one template stores the value, only to have the next pull it straight back again. Without peephole improvement ZIP would produce this code for line 30, LET X=0:
ZIP avoids this waste of time and memory by using the variable PEEP, in lines 8050 and 8060. In line 8060 it sets PEEP if the last byte of a template is 229 - a PUSH HL instruction. If the next template starts with 225 (POP HL) and PEEP was set to indicate a PUSH at the end of the last template, line 8050 deletes the last byte stored and the first byte of the next template. In this way line 70, LET X=X+1, becomes:
rather than the unimproved version, which shows clearly how the templates fit together:
Notice that one PUSH HL is deliberately not removed (after the FETCH X). The next instruction, FETCH ONE, does not use the value so far - it doesn't start with POP HL.
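The PEEP mechanism can be modelled in Python as follows. This is a sketch under stated assumptions, not a transcription of lines 8050 and 8060: the unimproved template bytes are inferred from Listing 3 and the standard Z80 opcodes, and the address used for variable X is hypothetical.

```python
PUSH_HL = 229   # 0xE5
POP_HL = 225    # 0xE1
X_LO, X_HI = 0x00, 0x5B   # hypothetical address of variable X

# Assumed unimproved templates for line 70, LET X=X+1.
LINE_70 = [
    [0x2A, X_LO, X_HI, PUSH_HL],     # FETCH VAR: LD HL,(X) / PUSH HL
    [0x21, 0x01, 0x00, PUSH_HL],     # FETCH NUM: LD HL,1 / PUSH HL
    [POP_HL, 0xD1, 0x19, PUSH_HL],   # ADD: POP HL / POP DE / ADD HL,DE / PUSH HL
    [POP_HL, 0x22, X_LO, X_HI],      # STORE VAR: POP HL / LD (X),HL
]


def assemble(templates, peephole=True):
    """Join templates, cancelling a trailing PUSH HL against a leading POP HL."""
    code = []
    peep = False   # true when the last byte stored was a template's final PUSH HL
    for template in templates:
        body = list(template)
        if peephole and peep and body and body[0] == POP_HL:
            code.pop()        # delete the PUSH HL stored last
            body = body[1:]   # and skip the POP HL which would undo it
        code.extend(body)
        # Like the rule in the text, this only looks at the final byte, so an
        # operand byte of 229 at the end of a template would fool this sketch.
        peep = bool(body) and body[-1] == PUSH_HL
    return code


unimproved = assemble(LINE_70, peephole=False)
improved = assemble(LINE_70)
```

With these assumed templates the improved code is four bytes shorter, and the PUSH HL after FETCH VAR survives because FETCH NUM does not begin with POP HL.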
RIGHT AND ROM
Many operations can be written directly in machine code, but there are some which are very complicated or involve lots of code to handle special cases. Rather than produce hundreds of bytes of in-line code, ZIP performs some functions with calls to routines in the library and the Spectrum ROM. In the example program ZIP uses the ROM routines CLS, OPEN, RST ERROR, and PR-STRING to clear the screen, select an output channel, issue reports and print a string. A ZIP library routine called PLOT is also used.
If need be, specific registers are loaded before the routine is called. OPEN requires a channel number in register A. PR-STRING requires the string address in DE and the length in BC; it returns with DE pointing just after the string. The byte following RST ERROR determines the report displayed.
JUMPING AND LOOPING
Intricate templates are used to handle the statements IF, FOR and NEXT. IF is converted into a test which jumps to the next line - skipping the rest of the current one - unless the result of the condition is true. The effect is like writing:
The first fragment would be impossible in normal Basic because the PRINT doesn't have a line-number, but it is easy to do in machine code, which just trundles through the statements as it finds them, unless given orders to the contrary. The GO TO destination can always be 'this line plus one', since ZX Basic doesn't require GO TOs to be exact.
FOR ... NEXT loops are more difficult to handle - in fact, many compilers handle them unreliably. As we will find, even ZIP FOR ... NEXT loops have their idiosyncrasies!
WHAT'S IT FOR?
The FOR statement puts the START value into the appropriate variable, and makes a note of the address of the following statement (for the NEXT to go to). The END and STEP are calculated and stored for use by NEXT.
Note that when Basic encounters the FOR statement it immediately works out the START, END and STEP values. A step of one is assumed if need be. These values are not worked out again, so a program such as:
will print five exclamation marks. The change in C is ignored because the END value is only calculated when the looping starts.
ZX Basic is unusual in that it checks for a start value beyond the end (greater or less, depending upon the direction of the STEP) and skips over the whole loop if such a case is found. This is a useful but rare feature, and ZIP, like other computer Basics, doesn't recognise that special case. ZIP always performs loops at least once, so you must use an IF test to jump over such loops if need be.
The NEXT statement retrieves the STEP, END and the address of the 'head' of the loop (the statement after the FOR). It adds the STEP (which may be negative) to the correct variable and then subtracts the END from that result. If the result is beyond the END (remember to handle negative steps again!), the program continues after the NEXT statement, otherwise it jumps back to the head of the loop.
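The behaviour described in this section can be modelled with a small Python generator. It follows the NEXT code shown in Listing 3 (add STEP, subtract END, loop again if the result is zero, otherwise compare the sign of the result with the sign of STEP), but the function itself and its 16-bit masking are a sketch written for this explanation, not part of ZIP.

```python
def zip_for(start, end, step=1):
    """Yield the values a loop variable takes under the FOR ... NEXT rules described above.

    END and STEP are fixed when the loop starts. Values are treated as 16-bit
    words, so they are yielded in unsigned form.
    """
    var = start & 0xFFFF
    step16 = step & 0xFFFF
    while True:
        yield var                         # the body always runs at least once
        var = (var + step16) & 0xFFFF     # NEXT adds the STEP to the variable...
        diff = (var - end) & 0xFFFF       # ...then subtracts the END from the result
        if diff == 0:
            continue                      # exactly at END: one more pass
        if ((diff >> 8) ^ (step16 >> 8)) & 0x80 == 0:
            break                         # same sign as STEP: gone beyond END


down = list(zip_for(166, 162, -2))   # line 40 of Listing 1
up = list(zip_for(1, 5))             # STEP of one assumed
skipped_in_basic = list(zip_for(5, 1))
```

The last call shows the difference from ZX Basic noted above: with a start value already beyond the end, the body still runs once.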
CHEATING IS BAD!
Some compilers cheat by storing the 'head' address as part of the code of the NEXT loop. This means that the program fails if more than one FOR is associated with a NEXT, since the NEXT is 'hot-wired' so that it can only jump to one FOR.
Other compilers use the stack to store the STEP, END and 'head' values. This brings problems if Basic programmers try to RETURN out of FOR ... NEXT loops, since the values on the stack get in the way of the return address.
Both of these cheats are allowable, in the sense that good Basic programmers won't use the same NEXT for more than one FOR, and they won't try to RETURN from the middle of a loop. Sadly, such messy programming is perfectly legal Basic, and it is a shame if compilers can't handle these special cases properly.
ZIP takes those special cases into account, but loops are still very fast - 110 times faster than Basic! The ZIP approach is to reserve space for an END, STEP and 'head' value for each variable-name from A to Z. This ties up 156 bytes which are only used during FOR ... NEXT loops, but it does allow programmers to RETURN and GO TO out of loops without fear of disaster. It also solves the problem of FORs outnumbering NEXTs, since each FOR stores its own 'head' value, and NEXT always uses the most recent one - just like interpreted Basic.
Using this technique, ZIP can even detect NEXT without FOR errors. CLEAR sets all the 'head' values to the address of the ROM error routine. If a NEXT is found before a FOR, the computer tries to jump back to what it thinks is the 'head' of the loop, and ends up in the ROM issuing a helpful report instead!
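A minimal Python model of this bookkeeping is given below. The dictionary layout, the function names and the string standing in for the ROM error routine's address are all assumptions made for illustration; the figure of 156 bytes follows from 26 variables, three values each, at two bytes per value.

```python
import string

BYTES_PER_VALUE = 2
FIELDS = ("end", "step", "head")
ERROR_ROUTINE = "report: NEXT without FOR"   # stand-in for the ROM error routine's address


def clear():
    """Give every variable A to Z its own slot, with 'head' pointing at the error routine."""
    return {v: {"end": 0, "step": 0, "head": ERROR_ROUTINE}
            for v in string.ascii_uppercase}


def do_for(slots, var, end, step, head):
    """FOR overwrites the slot, so NEXT always sees the most recent FOR."""
    slots[var] = {"end": end, "step": step, "head": head}


def next_target(slots, var):
    """Where NEXT jumps back to when the loop has not finished."""
    return slots[var]["head"]


slots = clear()
table_size = len(slots) * len(FIELDS) * BYTES_PER_VALUE
before_any_for = next_target(slots, "Y")      # NEXT found before any FOR
do_for(slots, "Y", 162, -2, "LINE 50")
do_for(slots, "Y", 10, 1, "LINE 120")         # a second FOR sharing the same NEXT
after_two_fors = next_target(slots, "Y")
```

Because nothing is kept on the machine stack, a RETURN or GO TO out of the loop leaves no debris behind; the slot simply waits to be overwritten by the next FOR on that variable.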
Armed with this information you should be able to understand Listing 3 - the example program in fluent, non-biological machine code. When you've digested that lot, we'll explain how we got a PLOT statement in there!
EXTENDING ZIP
One of the main design aims of ZIP was that it should be easy to add extra features.
Listings 4 and 5 can be used to add the high-resolution graphics commands PLOT and DRAW. The commands can be used almost exactly as in ZX Basic - you can specify temporary colour items such as OVER, INK and BRIGHT just as for PRINT.
There are two steps involved in adding new words to ZIP. The first is to alter the compiler program so that it recognises the new words and treats them correctly. The second step is to alter the machine code library so that ZIP can generate code for the new commands.
The DATA in line 5060 tells ZIP whether or not a word can be compiled. The first value corresponds to the word RND, then comes INKEY$ and so on through the character table on pages 186 to 188 of the Spectrum manual. The fourth and tenth from last entries correspond to the words DRAW and PLOT respectively (CODEs 252 and 246). We must replace the appropriate 'n's in line 5060 with 'y's, so that ZIP allows those words. Change the line so that it ends:
Now Pass 1 will allow the words PLOT and DRAW.
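The positions quoted for line 5060 can be checked with a little arithmetic, sketched here in Python. It assumes the DATA holds exactly one entry for every character code from RND (165) to the last Spectrum keyword, COPY (255); the function names are invented for the example.

```python
FIRST_TOKEN = 165   # RND, the first entry in the DATA
LAST_TOKEN = 255    # COPY, assumed to be the last entry


def data_index(code):
    """Zero-based position of a keyword's entry, counting from RND."""
    return code - FIRST_TOKEN


def from_last(code):
    """Position counting back from the end, where the final entry is number one."""
    return LAST_TOKEN - code + 1


entries = LAST_TOKEN - FIRST_TOKEN + 1
draw_position = from_last(252)
plot_position = from_last(246)
```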
Next we alter the 'Syntax parser' used in Pass 2 so that ZIP can process PLOT and DRAW correctly. The new program lines to do this are shown in listing 4. Add these lines to the copy of ZIP with the modified DATA statement. The first line, 7250, checks for codes 246 and 252. If either is found, line 7252 stores the request number needed by the code generator in 'assmod'. Up till now we've used request numbers 1 to 58. PLOT and DRAW will use requests 59 and 60. But first we set C to 52 and call Z80, which generates the code to 'open channel 2', resetting the colour items to their 'permanent' values.
We can't generate the code for PLOT or DRAW yet since we haven't worked out the 'parameters' of the command - coordinates, INK, OVER, and so on. Line 7254 looks for colour items. It calls GETS to find the next symbol, and checks for symbols between INK and OVER. If one is found, it saves the symbol number and uses the 'MATHS' routine to work out the value or expression following the symbol. The calculation 'LET c=sep-189' converts the symbol into a request number. Then Z80 is called to generate the corresponding code.
The program loops at line 7254 until all of the colour items have been dealt with. Notice how MATHS is always entered with the first symbol of the expression in S. It returns with the first symbol after the expression (here, a semicolon) in S. The 'GETS' routine automatically skips over spaces and colour items in a program, so that it's a fairly trivial matter to process a list such as 'INK 3; PAPER 4; OVER 1;' without problems.
At last ZIP finds a character that is not a colour item, and execution continues on line 7258. This calls the 'MATHS' routine twice, to process an X and Y coordinate. Then it retrieves the original request number (PLOT or DRAW) and generates the code to plot a point or draw a line. Execution continues at ATCOLON, which issues the error message "CALCULATION NOT ALLOWED" if any symbol other than a colon or enter is found. This detects the case 'DRAW x,y,angle' - the error message isn't very clear but it'll do.
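The two conversions from symbol to request number used in Listing 4 can be restated in Python as a quick check. The character codes are the standard Spectrum values (INK is 217 and OVER is 222); the assumption is that ZIP's `ink` and `over` variables hold those codes, and the function names are made up for this sketch.

```python
PLOT, DRAW = 246, 252
INK, OVER = 217, 222   # colour items run INK, PAPER, FLASH, BRIGHT, INVERSE, OVER


def command_request(s):
    """Line 7252: LET assmod=59+(s=252), so PLOT gives 59 and DRAW gives 60."""
    return 59 + (s == DRAW)


def colour_request(sep):
    """Line 7254: LET c=sep-189, valid only for symbols between INK and OVER."""
    if not INK <= sep <= OVER:
        raise ValueError("not a colour item")
    return sep - 189


colour_requests = [colour_request(code) for code in range(INK, OVER + 1)]
```

Under that assumption the six colour items map onto six consecutive request numbers, well clear of 59 and 60.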
ZIP now requests code for PLOT and DRAW, but the library still doesn't contain code to handle those commands. Listing 5 is a short program which adds extra code to the library. It should be entered and RUN while the modified version of ZIP is in memory. The FOR ... NEXT loop stores the new library routines. The two GO SUB 6300s store the vectors which ZIP uses to find the routines.
saves us having to shuffle the entire library.
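If you have typed the DATA of Listing 5 into some other machine first, the same checksum test that lines 100 to 135 perform can be run there. The Python sketch below copies the DATA values and the expected total from the listing; the `verify` function is an illustration, not part of ZIP.

```python
DATA = [
    225, 193, 69, 205, 229, 34, 225, 209,
    205, 155, 208, 217, 229, 217, 203, 124,
    40, 11, 124, 47, 103, 125, 47, 111,
    35, 38, 255, 24, 2, 38, 1, 203,
    122, 40, 11, 122, 47, 87, 123, 47,
    95, 19, 22, 255, 24, 2, 22, 1,
    69, 75, 90, 84, 205, 186, 36, 217,
    225, 217, 201, 1, 166, 211, 1, 190,
]
FIRST, LAST = 53392, 53455   # the addresses POKEd by the FOR ... NEXT loop
EXPECTED_TOTAL = 7364        # the value tested in line 125


def verify(data):
    """True if there is one byte per address and the simple sum matches."""
    return len(data) == LAST - FIRST + 1 and sum(data) == EXPECTED_TOTAL


ok = verify(DATA)
```

A plain sum like this catches most mistyped digits, but it cannot notice two values entered in the wrong order, so it is still worth reading the DATA back against the listing.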
Before you try out the new code, SAVE the new versions of the compiler and library on a fresh tape, just in case you've made a typing error. Delete the code of listing 5 (leaving line 8040 in place) and try compiling a simple line like:
If all is well, experiment with colour items. Should the computer produce odd results or crash, check that you modified the library correctly. If an error message appears, check the new lines which you have added to the ZIP Basic.
May the ZIP be with you.
Original text © 1984 Simon N Goodwin. Used with permission.
The ZIP Compiler offer is now closed ... but an updated printed manual is still available from Simon, with old and new programs.
Listing 1
10 CLS
20 PRINT "Hello everyone"
30 LET X=0
40 FOR Y=166 TO 162 STEP -2
50 PLOT X,Y
60 NEXT Y
70 LET X=X+1
80 IF X=111 THEN STOP
90 GO TO 40
Listing 2
LINE 10: CLS
LINE 20: PR-OPEN
         PR-STRING ("Hello everyone")
         FETCH 13
         PR-CHR$
LINE 30: FETCH 0
         STORE IN X
LINE 40: FETCH 166
         FETCH 162
         FETCH -2
         FOR Y
LINE 50: FETCH X
         FETCH Y
         PLOT
LINE 60: NEXT Y
LINE 70: FETCH X
         FETCH 1
         ADD
         STORE IN X
LINE 80: FETCH X
         FETCH 111
         EQUAL
         IF 81
         STOP
LINE 90: GO TO 40
Listing 3
LINE 10:
CLS:        LD A,2
            CALL ROM-PR-OPEN
            CALL ROM-CLS
LINE 20:
PR-OPEN:    LD A,2
            CALL PR-OPEN
PR-STRING:  LD DE,ZIP-TEXT
            LD BC,14
            CALL ROM-PR-STRING
            EX DE,HL
            JP (HL)
ZIP-TEXT:   DEFM 'Hello everyone'
FETCH NUM:  LD HL,13
PR-CHR$:    LD A,L
            RST ROM-PRINT-A
LINE 30:
FETCH NUM:  LD HL,0
STORE VAR:  LD (VARIABLE X),HL
LINE 40:
FETCH NUM:  LD HL,166
            PUSH HL
FETCH NUM:  LD HL,162
            PUSH HL
FETCH NUM:  LD HL,-2
FOR Y:      LD (STEP Y),HL
            POP HL
            LD (END Y),HL
            POP HL
            LD (VARIABLE Y),HL
            LD HL,LINE 50
            LD (HEAD Y),HL
LINE 50:
FETCH VAR:  LD HL,(VARIABLE X)
            PUSH HL
FETCH VAR:  LD HL,(VARIABLE Y)
PLOT:       POP DE
            CALL ZIP-PLOT
LINE 60:
NEXT Y:     LD HL,(VARIABLE Y)
            LD BC,(STEP Y)
            ADD HL,BC
            LD (VARIABLE Y),HL
            LD DE,(END Y)
            OR A
            SBC HL,DE
            JR Z,ZIP-LOOP
            LD A,H
            XOR B
            AND 128
            JR Z,LINE 70
ZIP-LOOP:   LD HL,(HEAD Y)
            JP (HL)
LINE 70:
FETCH VAR:  LD HL,(VARIABLE X)
            PUSH HL
FETCH NUM:  LD HL,1
ADD:        POP DE
            ADD HL,DE
STORE VAR:  LD (VARIABLE X),HL
LINE 80:
FETCH VAR:  LD HL,(VARIABLE X)
            PUSH HL
FETCH NUM:  LD HL,111
EQUAL:      POP DE
            OR A
            SBC HL,DE
            LD HL,1
            JR Z,IF
            DEC HL
IF:         LD A,H
            OR L
            JP Z,LINE 90
STOP:       RST ERROR
            DEFB STOP-REPORT
LINE-90:
GO TO:      JP LINE 40
Listing 4
7250 IF s<>246 THEN IF s<>252 THEN GO TO 7260
7252 LET assmod=59+(s=252): LET c=52: GO SUB Z80: REM Open #2
7254 GO SUB gets: IF s>=ink AND s<=over THEN LET sep=s: GO SUB gets: GO SUB maths: LET c=sep-189: GO SUB Z80: GO TO 7254
7258 GO SUB maths: GO SUB gets: GO SUB maths: LET c=assmod: GO SUB Z80: GO TO atcolon
Listing 5
90 REM **** PLOT & DRAW loader
100 LET t=0
105 FOR i=53392 TO 53455
110 READ d: LET t=t+d
115 POKE i,d
120 NEXT i
125 IF t=7364 THEN GO TO 140
130 PRINT "ERROR IN DATA!"
135 STOP
140 LET i=53365: LET t=53392: GO SUB 6300
145 LET i=53367: LET t=53398: GO SUB 6300
150 STOP
200 REM
1000 DATA 225,193,69,205,229,34,225,209
1010 DATA 205,155,208,217,229,217,203,124
1020 DATA 40,11,124,47,103,125,47,111
1030 DATA 35,38,255,24,2,38,1,203
1040 DATA 122,40,11,122,47,87,123,47
1050 DATA 95,19,22,255,24,2,22,1
1060 DATA 69,75,90,84,205,186,36,217
1070 DATA 225,217,201,1,166,211,1,190
8040 IF c=58 OR c=60 THEN LET j=4+(c=60)