Your Spectrum
Issue 3, May 1984 - ZIP compiler (part 1 of 4)


See also these other articles & letters in the series:
Issue 4, part 2: "Adding ZIP 2!"
Issue 5, part 3: "Unzipping ZIP"
Issue 6, part 4: "ZIP to the Stars"
Issue 7, letter: "Unzipping Zapped ZIP"
Issue 10, letter: "ZIPping on a Pension"
This program is available on "ZIPi'T'ape",
which holds the files for all four parts of the article.



PART 1


Designing good software is all a matter of organisation. Simon Goodwin begins a three-part feature that not only shows you how best to construct your Basic programs - but which also gives you an excellent compiler program into the bargain!

Starting this month, we set out to illustrate the design thinking which goes into a complicated program, and to develop a useful software tool in the process. We'll list and explain ZIP, a powerful Basic compiler for the 48K Spectrum, saying how it was designed. The compiler produces very fast code - often 100 times faster than interpreted Basic. It's also very easy to use, with unusually good error-trapping, and yet it's a fairly short program - about 20K of commented Basic and 1.5K of machine code.
Basic programmers have to put up with a lot of criticism from academics and whizz-kids. They might not actually say that too much Basic will make you go blind, but they certainly imply it! This program shows that you can write intricate, powerful programs in Basic and there's no reason why they shouldn't be concise and readable too.

COMPILERS

A compiler is a program which converts Basic into machine code. Ideally the machine code performs in exactly the same way as the Basic - but much more quickly. If you've had a Spectrum for long you must have noticed that all of the fastest, flashiest programs are written in machine code rather than Basic.
At the heart of any micro is a fairly crude gadget - the processor. In essence this can do three things - it can move small values in memory, it can add and subtract them, and it can 'JUMP' to look at different areas depending upon the results of its arithmetic. The processor's saving grace is that it works very fast - at roughly half a million operations a second.
Even though the processor can't directly multiply or divide, make sounds or print messages (plus hundreds of other things), it is possible to perform those operations with combinations of the simple steps which the processor can handle. For instance, multiplication can be done by repeatedly adding, printing can be done by moving patterns to the display memory, and so on. When you turn on your Spectrum, the thousands of instructions in the 16K ROM read commands from the key switches and perform appropriate actions - step by tiny step.
The 16K ROM is called an interpreter, because (just like a human translator) it converts words in one language - Basic - into another language, machine code. This involves two operations: first the instruction must be recognised, then it must be acted upon. The snag is that recognising simple Basic commands often takes much longer than performing the required action. An interpreter never 'learns by its mistakes'. You can run this program:
10 FOR i=1 TO 1000
20 LET x=2+2
30 NEXT i

and the value of X will be worked out as slowly the thousandth time as it was the first. Each time, Basic looks through the line to make sure it isn't anything nonsensical (like LET 7="WALLY"), finds out where it keeps the value of 'X' (by searching a list), then finds the binary form of the number '2'. Computers use binary arithmetic, unlike the decimal which caught on among humans a couple of millennia ago. Values have to be converted accordingly before they can be accepted or displayed.
Next, Basic makes a note that it will need to do an 'add' (once it has two numbers to play with), it finds another '2', laboriously works out the binary form again, adds the two numbers using complicated instructions which are designed to handle all cases (the same code would perform 0.0002 + -99999) and puts the result away under the name 'X'.
The Spectrum has used hundreds of simple operations, where four (fetch, fetch, add, store) would have sufficed. A Basic compiler looks at the listing of a program - in the jumbled order which we humans can understand - and converts it into simple steps in the order favoured by the computer. The juggling about, testing and searching are almost eliminated and you end up with a machine code program which works just like the Basic, but faster.
The main problem is that it is hard to alter the machine code - each operation is dependent upon the ones before it. In practice you must re-translate (or 'compile') the entire program every time you want to make a change. This is a complicated (and hence, slow) task.
There are two points to be noticed here. Firstly, the compiler (the program which does the translation) doesn't have to be very fast, since it will only need to be used infrequently - you can test your programs in 'slow motion' with the interpreter and only compile them when they work. Secondly, a compiler doesn't have to recognise the whole language to be useful. So long as you know which commands are allowed (and they're sensibly chosen) you can tailor your program to suit. After all, it's sure to be easier to use even a cut-down Basic than pure machine-code! However, the compiler must be able to spot - and warn you of - instructions that it doesn't recognise, and make it easy for you to go back and change them.

WORDS ALLOWED BY ZIP COMPILER

ABS       GO SUB*   OVER
AND       GO TO*    PAPER
AT        IF        PAUSE
ATTR      IN        PEEK
BIN       INK       PRINT
BORDER    INPUT     POKE
BRIGHT    INT*      REM
CHR$      INVERSE   RETURN
CLEAR*    LET       RANDOMIZE
CLS       NEXT      SGN
DIM*      NOT       STOP
FLASH     OR        TAB
FOR       OUT       USR

(* = usage restricted by ZIP)

PROGRAMMER PROFILE

As Simon Goodwin seems to have nestled quite comfortably between the pages of Your Spectrum - and you're bound to have the pleasure of his scribblings for some time to come - we thought it might be nice to tell you just a little of the man behind the keyboard. He began his career in '79 on the original Apple, but soon progressed through the Video Genie, Dragon, Memotech MTX512, Atari 800, Commodore 64 and, of course, our own lovable Speccy. He was also software designer on Central TV's Magic Micro Mission, presenter of Radio Wyvern's Computer Club and author of DK'Tronics' Gold Mine. And if that's not impressive enough, he also had a stint as a computer-aided design programmer - and if you were watching Tomorrow's World the other night, you'll have seen some of his stuff in action.

[Photo: Simon Goodwin]

HOW GOOD IS ZIP?

The standard way of testing Basic is to run eight short programs, called Benchmarks. The longer these take to run, the less the Basic can do in a given time. Spectrum Basic is generally considered slow, but the figures in the Benchmark Table show that compilation makes it very much faster. The first column shows the time (in seconds) taken for the interpreter to perform a test, then comes the time taken by compiled Basic, then the ratio of the two. ZIP achieves some of its speed by only allowing whole numbers. Benchmark 8 uses complicated arithmetic which ZIP can't compile, so normal Basic beats ZIP hands down in that case!
The second table lists the Spectrum Basic words allowed by ZIP. If you use other words, the compiler will display an error message at the appropriate point in the listing and refuse to compile the program.
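The check on allowed words can be sketched in a few lines of Python (our choice for illustration - ZIP itself is written in Basic). The word list is copied from the Words Allowed table; the name `quick_scan`, and the idea that each line has already been split into its keywords, are assumptions made for the example. Words such as TO and THEN, which are not in the table but are plainly needed by FOR and IF, are left out of this sketch.

```python
ALLOWED = {
    "ABS", "AND", "AT", "ATTR", "BIN", "BORDER", "BRIGHT", "CHR$", "CLEAR",
    "CLS", "DIM", "FLASH", "FOR", "GO SUB", "GO TO", "IF", "IN", "INK",
    "INPUT", "INT", "INVERSE", "LET", "NEXT", "NOT", "OR", "OUT", "OVER",
    "PAPER", "PAUSE", "PEEK", "PRINT", "POKE", "REM", "RETURN", "RANDOMIZE",
    "SGN", "STOP", "TAB", "USR",
}


def quick_scan(lines):
    """lines: list of (line_number, [keywords used on that line]).

    Returns every problem found, rather than stopping at the first.
    """
    errors = []
    for line_number, keywords in lines:
        for word in keywords:
            if word not in ALLOWED:
                errors.append((line_number, word + " is not allowed by ZIP"))
    return errors


listing = [(10, ["LET"]), (20, ["PRINT", "SIN"]), (30, ["PLOT"])]
for line_number, message in quick_scan(listing):
    print(line_number, message)
```

A program would only go forward for compilation if the list of errors came back empty.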
In the interests of speed and simplicity some Basic words aren't recognised by ZIP. The remainder is still enough to write almost any program, given sufficient effort! In future issues we'll show you how to add features (including hi-res graphics!) by altering the compiler.
You may use 26 single letter numeric variables and 26 arrays. The usual arithmetic operators '+', '-', '*' and '/' are allowed, but ZIP only stores whole numbers between -32767 and 32767. For example, PRINT 99/10,5/6 displays 9 and 0. In fact you can get away with values between 32768 and 65535, so long as you don't try to multiply or divide them (they may be the result of a multiplication).
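To make the whole-number rule concrete, here is a minimal Python sketch of 16-bit integer arithmetic. It is not ZIP's own code: the helper names `wrap16` and `zip_div` are invented for illustration, and rounding towards zero for negative numbers is an assumption, since the only examples given are 99/10 and 5/6.

```python
def wrap16(value):
    """Keep the low 16 bits and read them as a signed number."""
    value &= 0xFFFF
    return value - 0x10000 if value >= 0x8000 else value


def zip_div(a, b):
    """Whole-number division, discarding any remainder.

    Assumption: the quotient is rounded towards zero.
    """
    quotient = abs(a) // abs(b)
    return -quotient if (a < 0) != (b < 0) else quotient


print(zip_div(99, 10), zip_div(5, 6))  # the example from the text
print(wrap16(40000))                   # same 16 bits, read as signed
```

In sixteen bits, 40000 and -25536 are the same pattern, which is presumably why large values survive addition and subtraction but not multiplication or division, where the sign matters.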
No decimal arithmetic is allowed, so SIN, COS, RND and so on are obviously not usable. Likewise strings and streams (PRINT # etc.) can't be used, although simple routines can be used to replace RND and INKEY$ for games. The words in the Words Allowed table are marked with an asterisk if their usage is restricted by ZIP.

RESTRICTIONS

The ZIP CLEAR command doesn't allow a following number. Since the compiled program is machine code it doesn't use the Basic memory areas which CLEAR <number> protects.
The DIM command differs in that all arrays must be dimensioned - but only once, each - and the DIM statement must be followed by a number, not an expression. The compiler uses DIM statements to tell how much memory to reserve for arrays. DIM is only recognised while compiling, so you can't clear an array by redimensioning it, as you can in Basic. All arrays are cleared when the program starts or a compiled CLEAR statement is executed.
ZIP speeds up GO TO and GO SUB statements by converting them into direct JUMPs to the instructions concerned. Whereas Basic has to search for lines by number, the compiler knows exactly where each line is stored. This is much quicker, especially in long programs where Basic would have to search through many lines. Even in a 17 line program, a compiled GO TO was found to be 1,400 times faster than the original Basic! GO SUB and RETURN are improved almost as much. The flaw of this approach is that ZIP must know exactly which line a GO TO or GO SUB is aiming at. Lines such as GO TO 100+X or GO TO L can't be compiled.

COMPARATIVE BENCHMARK TIMINGS

TEST  DESCRIPTION                 ZX Basic (s)  ZIP (s)  RATIO
BM1   FOR loop                            4.88    0.044  110 times
BM2   IF loop                             9.02    0.058  155 times
BM3   BM2 + variable arithmetic          21.93    0.770  28 times
BM4   BM2 + number arithmetic            20.68    0.640  32 times
BM5   BM4 + GO SUB / RETURN              25.22    0.660  38 times
BM6   BM5 + FOR loop                     62.80    0.910  69 times
BM7   BM6 + array storage                89.96    1.070  84 times
BM8   Maths functions                    25.07    CAN'T  0 times

INPUT reads numbers from the keyboard into arrays or other variables, but ZIP doesn't let you specify items to be printed at the bottom of the screen. If you put more than one variable in an input statement, they will be assigned from the keyboard one by one. The ZIP INPUT routine won't let you enter a number greater than 32767 or less than -32767. The only keys recognised are the digits, DELETE, ENTER and an optional minus sign at the start.
The INT function is ignored by the compiler, since ZIP always uses integer (whole number) arithmetic. However, it is useful to put INT functions in your programs so that they give the same results whether run in Basic or compiled. To be sure, always INT the value produced after a division.

FILLING YOUR 48K

The ZIP compiler places more demands on your RAM than most programs. The panel shows the memory map used - the way that ZIP shares out the 48K between BASIC, machine code, and the various tables used during compilation.
The memory map is one of the most important features of the compiler - it has a crucial bearing on the speed, size and power of the program. The map really consists of two independent sections. One is used by the Sinclair ROM, the other by the compiler. The dividing line is the point labelled 'SPECTRUM Basic AREA'. From that point downwards the map is essentially the normal SPECTRUM one. The only difference is the division of the program area into two sections.
The Basic of the ZIP compiler uses about 20K of memory. The listing which we will publish next month uses line numbers from 5000 upwards, leaving lines 1 to 4999 for the Basic to be compiled. These should be plenty - remember that ZIP bans computed line references, so it should be easy to renumber a program before compilation. We could have chosen to compile from tape, gaining memory space in the process, but we would then have lost the advantage of being able to RUN the original Basic or the compiled code at will, without loading or saving.

OVER THE TOP

The top of the Basic area is set, as usual, by a CLEAR statement. Depending upon the relative sizes of the Basic and machine code, the top can be adjusted to give more memory to Basic or ZIP. There is about 21K to be shared out between the two. Programs with lots of arrays and comments usually end up shorter in machine code than they were in BASIC. Compiled programs packed with comparisons, loops and arithmetic will probably be longer than BASIC.
It is useful to be able to save and run a compiled program on its own. ZIP automatically displays the required SAVE statement which should be used to store the machine code, library and variables on tape (notice how they are all stored one after another). User defined graphics can optionally be stored as part of the same block.
Any area of memory is best used when there are only two 'dynamic' (growing or shrinking) tables. If you have more than two dynamic areas it is difficult to know when you have run out of space, since some of them must grow in the same direction. Collisions have to be sorted out by moving an entire table out of the way. ZIP uses dynamic areas to store compiled code, and cross-references with the original BASIC. We can accommodate these neatly by making one (the code) grow 'upwards' from the bottom of memory, and growing the cross-references down from the top.
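The appeal of having just two tables can be seen in a small Python sketch. The class `CompilerArea`, its method names and the addresses used are all made up for illustration; the point is only that, with one table growing up and the other growing down, a single subtraction tells you how much room is left.

```python
class CompilerArea:
    """A block of memory shared by two tables growing towards each other."""

    def __init__(self, bottom, top):
        self.code_end = bottom   # next free address for compiled code
        self.xref_start = top    # cross-references occupy [xref_start, top)

    def free(self):
        return self.xref_start - self.code_end

    def add_code(self, size):
        """Reserve `size` bytes at the low end; return their address."""
        if size > self.free():
            raise MemoryError("code has run into the cross-references")
        address = self.code_end
        self.code_end += size
        return address

    def add_xref(self, size):
        """Reserve `size` bytes at the high end; return their address."""
        if size > self.free():
            raise MemoryError("cross-references have run into the code")
        self.xref_start -= size
        return self.xref_start


area = CompilerArea(bottom=0, top=100)   # made-up addresses
area.add_code(60)
area.add_xref(30)
print(area.free())   # bytes left between the two tables
```

Neither table ever has to be moved: space is exhausted only when the two meet.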
[Figures: compiler code example; compiler memory map]

ZIP INSTRUCTIONS

VAR FETCH       NUMBER FETCH   ARRAY FETCH   GO TO
GO SUB          RETURN         FOR           NEXT
INPUT*          CLS*           VAR STORE     POKE
ARRAY STORE     OUT            MULTIPLY*     ADD
SUBTRACT        DIVIDE*        BORDER*       INK*
PAPER*          FLASH*         BRIGHT*       INVERSE
OVER            PR-STRING      PR-NUMBER*    PR-INK
PR-PAPER        PR-FLASH       PR-BRIGHT     PR-INVERSE
PR-OVER         PR-AT          PR-TAB        PEEK
USR             IN             SGN           ABS
NOT             LESS           LESS/EQUAL    GREATER
GREATER/EQUAL   UNEQUAL        PR-CHR$*      IF
CLEAR*          ATTR*          EQUAL         PR-OPEN*
STOP*           OR             AND           NEGATE
RANDOMIZE       PAUSE*         (PLOT*)       (DRAW*)

Sinclair Basic uses a plethora of dynamic areas, which is one of the reasons it's rather slow. For a full explanation of the way the Spectrum uses memory, read chapter 24 of the manual.

PASSING THE BUCK

ZIP is a 'three pass' compiler, meaning that it scans three times through the program to be compiled. In theory you can compile Basic with two passes, one to generate code and the other to 'patch' GO TOs and GO SUBs which jump forward in the program. These forward jumps can't be turned into direct JUMPs as they are first met, since they refer to lines which have not yet been compiled (so the compiler doesn't know what their addresses will be).
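The generate-then-patch idea can be sketched in Python as follows. This is an illustration under stated assumptions, not ZIP's own method: the function name, the one-byte stand-in for 'other code' and the addresses counted from zero are all invented, and for simplicity the sketch patches every jump in the second loop rather than only the forward ones. The byte C3 is the Z80 opcode for an absolute JP.

```python
def compile_jumps(program):
    """program: list of (line_number, target_line_or_None), in listing order.

    A line with a target compiles to a three-byte jump (opcode plus 16-bit
    address); any other line compiles to a one-byte stand-in.
    """
    code = bytearray()
    line_address = {}    # line number -> address of its compiled code
    to_patch = []        # (position of address field, target line number)

    # Generate code, leaving the jump addresses blank.
    for line_number, target in program:
        line_address[line_number] = len(code)
        if target is None:
            code.append(0x00)                 # stand-in for other code
        else:
            code.append(0xC3)                 # Z80 JP opcode
            to_patch.append((len(code), target))
            code += b"\x00\x00"               # address not known yet

    # Patching pass: every line now has a known address.
    for position, target in to_patch:
        address = line_address[target]
        code[position] = address & 0xFF       # low byte first
        code[position + 1] = address >> 8
    return bytes(code), line_address


code, addresses = compile_jumps([(10, 30), (20, None), (30, 10)])
print(code.hex(), addresses)
```

Line 10 jumps forward to line 30, whose address is unknown until the whole listing has been seen; line 30 jumps back to line 10, which could have been filled in at once.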
ZIP uses an extra pass before any code is generated; a 'quick scan' through the program, checking for the most common and obvious mistakes. The program is listed onto the screen as it's processed, and helpful messages appear at the position of each error. The main idea of this pass is to quickly and accurately report simple errors - if you've made a trivial mistake at the end of the program, you don't have to wait for almost all of the code to be generated before you discover the problem.
Some, more complicated, errors are only found during pass 2, when the compiler examines your program in detail. Unlike other Spectrum compilers, ZIP doesn't give up at the first error it finds - it carries on and lists all of the errors before stopping.
By now you may have smelled a rat. It is useful to have some general information about a program, before code generation begins. It helps to know how many program lines there are, so you can make room for the table of line addresses. It's also handy to know the size of all the arrays, so you can allocate fixed space for them. ZIP works out both these figures during Pass 1, and builds two fixed-size entries at either end of the 'Compiler Dynamic Area' in the ZIP Compiler Memory Map Box.
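Gathering those two figures might look something like this in Python. The function `measure`, the pattern used to spot DIM statements and the figure of two bytes per array element (one 16-bit number) are assumptions for the sake of the sketch; a DIM followed by an expression simply fails to match here, where a real first pass would report it.

```python
import re

DIM_PATTERN = re.compile(r"DIM\s+([A-Za-z])\s*\(\s*(\d+)\s*\)")


def measure(lines):
    """lines: list of (line_number, text). Returns (line_count, array_bytes).

    Assumption: each array element takes two bytes.
    """
    array_bytes = 0
    seen = set()
    for line_number, text in lines:
        for name, size in DIM_PATTERN.findall(text):
            if name in seen:
                raise ValueError(f"line {line_number}: array {name} dimensioned twice")
            seen.add(name)
            array_bytes += 2 * int(size)
    return len(lines), array_bytes


listing = [(10, "DIM a(100)"), (20, "DIM b(10)"), (30, "LET a(1)=5")]
print(measure(listing))
```

With both totals in hand before any code is generated, the line-address table and the array space can each be given a fixed size.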
The snag of using a fixed array area is that you can't allow statements like 'DIM A(X+1)' - the size of each array must be known when the program is compiled. We decided that the increased speed of the resultant code made the limitation worthwhile - see Benchmark 7 in the Benchmark Table.

[Figure: library vector table]

The Library 'Vector' Table illustrates how the library can be modified without altering the Basic. Using simple offsets into the list (i.e. '4' for GO TO, '5' for GO SUB, etc.) code is copied from the address indicated in the list to the required template.

ALGORITHMS AND OTHER DISEASES

Programming guru Niklaus Wirth (inventor of the languages 'Pascal' and 'Modula') coined the phrase 'algorithms + data structures = programs'; 'algorithm' is a posh programming word for 'method'. In general terms we've explained how ZIP handles data - now we must look at the 'algorithms' used to convert Basic into machine code.
Compiling a program involves three main steps - reading the original, translating it and storing the results. ZIP shares these out as follows: Pass 1 and Pass 2 read the original. Pass 2 does the translation and stores most of the results. Pass 3 just finishes off storing the compiled code.
Micro compilers are often messy programs. They have to convert a rather ad hoc 'evolved' language - Basic - into what is - by larger computer standards - a horribly disorganised machine code. ZIP has been kept relatively simple by keeping Basic and machine code as far apart as possible. Totally separate routines handle each language. The internal 'translation' routine links the two, via just two numeric variables.
Rather than convert Basic directly into machine code, ZIP uses an intermediate language to link the two. Each intermediate instruction corresponds directly to a short group of machine code instructions, yet fewer than seventy instructions are enough to represent the entire subset of BASIC. The Z80 allows over 600 instructions, yet those don't include simple operations like reading the keyboard, multiplication or division. The listings show a program in BASIC, intermediate language and machine code.
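As a rough illustration of the first half of the job, the Python sketch below turns a simple LET statement into a list of intermediate instructions, borrowing the names VAR FETCH, NUMBER FETCH, ADD, SUBTRACT and VAR STORE from the ZIP Instructions panel. The function `translate_let` and the tuple layout are hypothetical, and only single-letter variables, whole numbers, '+' and '-' are handled - far less than ZIP itself accepts.

```python
import re


def translate_let(statement):
    """Translate 'LET v=a+b-c...' into a list of intermediate instructions."""
    match = re.fullmatch(r"LET\s+([a-z])\s*=\s*(.+)", statement.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError("not a simple LET statement")
    target, expression = match.groups()
    tokens = re.findall(r"\d+|[A-Za-z]|[+-]", expression)
    operands, operators = tokens[0::2], tokens[1::2]
    if (
        "".join(tokens) != expression.replace(" ", "")
        or len(tokens) % 2 == 0
        or any(t in "+-" for t in operands)
        or any(t not in "+-" for t in operators)
    ):
        raise ValueError("unsupported expression")

    def fetch(token):
        if token.isdigit():
            return ("NUMBER FETCH", int(token))
        return ("VAR FETCH", token.lower())

    output = [fetch(operands[0])]
    for operator, operand in zip(operators, operands[1:]):
        output.append(fetch(operand))
        output.append(("ADD",) if operator == "+" else ("SUBTRACT",))
    output.append(("VAR STORE", target.lower()))
    return output


for instruction in translate_let("LET x=2+2"):
    print(instruction)
```

For LET x=2+2 this yields the four steps mentioned earlier: fetch, fetch, add, store.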
Seasoned machine-coders may criticise the code produced. It could certainly be shortened and accelerated if it was re-written by hand. However, it works just as the Basic does - except it is 150 times faster and 14 bytes shorter!
If the compiler were more complicated it could look for special cases and produce more efficient code - but, frankly, who cares? It would be impossible to generate the best possible code in every instance, even if your compiler ran for months and used a million K. For most purposes a speed-up of 30-100 times, using about as much memory, is ample improvement.

HOW IT WORKS

Once you've decided to use an intermediate code in a compiler there are three different ways you can process it. The easiest is to write a simple interpreter which scans the intermediate codes one by one and performs appropriate actions. This is the approach used in the UCSD p-System, where the 'p' stands for pseudocode. It produces concise programs but they are slow since they still have to be interpreted, although the p-code is closer to machine code than the original language. The simple interpreter must be loaded or the code can't run.
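A 'simple interpreter' of this kind can be sketched in Python. The instruction names come from the ZIP Instructions panel, but the tuple layout, the function `run` and the use of a stack for temporary results are assumptions made for the example; this is the approach ZIP chose not to rely on.

```python
def run(program, variables):
    """Interpret a list of intermediate instructions using a stack."""
    stack = []
    for instruction in program:
        name = instruction[0]
        if name == "NUMBER FETCH":
            stack.append(instruction[1])
        elif name == "VAR FETCH":
            stack.append(variables[instruction[1]])
        elif name == "ADD":
            right = stack.pop()
            stack.append(stack.pop() + right)
        elif name == "SUBTRACT":
            right = stack.pop()
            stack.append(stack.pop() - right)
        elif name == "VAR STORE":
            variables[instruction[1]] = stack.pop()
        else:
            raise ValueError("unknown instruction: " + name)
    return variables


program = [("NUMBER FETCH", 2), ("NUMBER FETCH", 2), ("ADD",), ("VAR STORE", "x")]
print(run(program, {}))
```

Every instruction still has to be recognised before it is obeyed, which is the overhead that makes interpreted intermediate code slower than machine code.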
Another technique is to write subroutine calls for every operation - the intermediate code is converted into a mixture of data and GO SUB operations to perform each action. This is
THE ZIP LIBRARY

First, enter the Basic listing, the ZIP Library Hex Loader, and RUN it. On-screen you will be greeted with a message telling you which line address you are working on and another asking for the first byte (in this case, byte 0) of data (i.e. you should enter '6E'). Once you have filled all eight bytes, you will be asked to enter the checksum for that line (which for the first line of data is '43'). At this stage, if all is hunky dory, you will pass on to the next line - if you get it wrong, not to worry, you'll get a "CHECK ERROR" message and you can start over.
300 REM ZIP LIBRARY HEX LOADER
305 POKE 23658,8
310 FOR i=53247 TO 54466 STEP 8
315 CLS
320 LET check=0
330 PRINT "LINE ADDRESS ";i
340 FOR j=0 TO 7
350 INPUT "Enter byte ";(j);" ";d$
360 IF LEN d$<>2 THEN GO TO 350
370 LET d=FN d(d$(1 TO 1))*16+FN d(d$(2 TO 2))
380 IF d<0 OR d>255 THEN GO TO 350
382 PRINT AT 2,j*3;d$
385 LET check=check+d
387 POKE i+j,d
390 NEXT j
400 INPUT "Enter checksum ";d$
410 IF LEN d$<>2 THEN GO TO 400
420 LET c=FN d(d$(1 TO 1))*16+FN d(d$(2 TO 2))
430 IF c<0 OR c>255 THEN GO TO 400
440 IF c<>check-INT (check/256)*256 THEN PRINT "CHECK ERROR: Please retype line": GO TO 320
450 NEXT i
460 PRINT "Please prepare to save the code."
470 SAVE "ZIP LIB 1"CODE 53247,1300
480 GO TO 460
500 DEF FN d(t$)=CODE (t$)-7*(t$>="A")-48

53247:6E D4 D7 D0 DB D0 DF D0=43
53255:EB D0 EE D0 F1 D0 F2 D0=FC
53263:04 D1 22 D1 26 D1 2E D1=BE
53271:32 D1 36 D1 42 D1 46 D1=34
53279:4C D1 50 D1 57 D1 5D D1=94
53287:62 D1 66 D1 6A D1 6E D1=E4
53295:72 D1 83 D1 90 D1 9B D1=64
53303:9F D1 A8 D1 A8 D1 A8 D1=DB
53311:A8 D1 A8 D1 9F D1 A8 D1=DB
53319:AE D1 B3 D1 BA D1 C0 D1=1F
53327:D3 D1 E1 D1 EB D1 F9 D1=DC
53335:05 D2 11 D2 1F D2 2A D2=A7
53343:2D D2 33 D2 36 D2 3C D2=1A
53351:48 D2 4D D2 4F D2 58 D2=84
53359:5E D2 67 D2 6B D2 6F D2=E7
53367:4D D2 6F D2 16 D3 2D D3=49
53375:44 D3 62 D3 86 D3 95 D3=0D
53383:B5 D3 CD D3 12 D4 32 D4=14
53391:01 2D D2 01 33 D2 01 36=3D
53399:D2 01 3C D2 01 48 D2 01=FD
53407:4D D2 01 4F D2 01 58 D2=6C
53415:01 5E D2 01 67 D2 01 6B=D7
53423:D2 01 6F D2 01 4D D2 01=35
53431:6F D2 01 16 D3 01 2D D3=2C
53439:01 44 D3 01 62 D3 01 86=D5
53447:D3 01 95 D3 01 B5 D3 01=C6
53455:CD D3 01 12 D4 01 32 D4=8E
53463:2A 00 00 E5 21 00 00 E5=15
53471:E1 11 00 00 2B 29 19 5E=BD
53479:23 56 EB E5 C3 00 00 CD=D9
53487:00 00 C9 E1 22 00 00 E1=AD
53495:22 00 00 E1 22 00 00 21=46
53503:00 00 22 00 00 2A 00 00=4C
53511:ED 4B 00 00 09 22 00 00=63
53519:ED 5B 00 00 B7 ED 52 28=66
53527:06 7C A8 E6 80 28 04 2A=E6
53535:00 00 E9 CD 6F D2 E5 3E=1A
53543:02 CD 01 16 CD 6B 0D E1=0C
53551:22 00 00 E1 D1 EB 73 E1=13
53559:D1 EB 01 00 00 2B 29 09=1A
53567:73 23 72 E1 C1 ED 69 E1=E1
53575:D1 CD 12 D4 E5 E1 D1 19=34
53583:E5 E1 D1 EB B7 ED 52 E5=5D
53591:E1 D1 CD CD D3 E5 E1 7D=62
53599:CD 9B 22 E1 CD 44 D3 E1=30
53607:CD 62 D3 E1 CD 16 D3 E1=7A
53615:CD 2D D3 E1 7D B7 28 02=0C
53623:2E 08 3A 91 5C E6 F7 00=3A
53631:85 32 91 5C E1 3A 91 5C=AC
53639:E6 FD 85 85 00 00 32 91=B0
53647:5C 11 00 00 01 00 00 CD=3B
53655:3C 20 EB E9 E1 CD 32 D4=E4
53663:E1 3E 16 D7 D1 7B D7 7D=AC
53671:D7 E1 3E 17 D7 7D D7 E1=19
53679:6E 26 00 E5 E1 11 00 00=6B
53687:D5 E9 C5 C1 ED 68 26 00=BF
53695:E5 E1 7C B5 28 0D 7C E6=8E
53703:80 20 05 21 01 00 18 03=E2
53711:21 FF FF E5 E1 7C E6 80=C7
53719:28 07 7C 2F 67 7D 2F 6F=5C
53727:23 E5 E1 7C B5 21 00 00=3B
53735:20 01 23 E5 E1 D1 CD B2=5A
53743:D4 21 00 00 38 03 28 01=59
53751:23 E5 E1 D1 CD B2 D4 21=2E
53759:00 00 38 01 23 E5 E1 D1=F3
53767:CD B2 D4 21 00 00 30 01=A5
53775:23 E5 E1 D1 CD B2 D4 21=2E
53783:01 00 38 03 28 01 2B E5=75
53791:E1 D1 B7 ED 52 28 03 21=F4
53799:01 00 E5 E1 7D D7 E1 7C=78
53807:B5 CA 00 00 CD 95 D3 E1=95
53815:D1 CD 86 D3 E5 E1 D1 B7=45
53823:ED 52 21 01 00 28 01 2B=B5
53831:E5 3E 02 CD 01 16 CF 08=E0
53839:E1 7C B5 28 04 21 01 00=60
53847:E3 E1 7C B5 20 01 E3 E1=DA
53855:7D 2F 6F 7C 2F 67 23 E5=35
53863:E1 22 76 5C C1 CD 3D 1F=BF
53871:AF CD 01 16 CD 6E 0D 21=FC
53879:00 5B 0E 00 AF 32 08 5C=AE
53887:3E 8F D7 3E 08 D7 3A 08=03
53895:5C FE 0D 28 38 FE 0C 28=F9
53903:22 FE 30 38 10 FE 3A 30=00
53911:0C 47 3E 06 BD 28 DD 78=D1
53919:77 2C D7 18 D7 FE 2D 20=B4
53927:D3 7D B7 20 CF 0E FF 3E=41
53935:2D 18 ED 7D B7 28 C5 3E=91
53943:20 D7 3E 08 D7 3E 08 D7=31
53951:2D 28 B4 18 B7 77 11 00=60
53959:00 21 00 5B 79 B7 28 01=D5
53967:23 7E D6 30 30 0A AF B9=49
53975:28 01 3C BD 28 9E 18 2A=2A
53983:E5 EB 11 CC 0C B7 ED 52=AF
53991:38 10 20 04 FE 08 38 0A=B4
53999:EB E1 23 7E FE 0D 20 FA=92
54007:18 82 19 29 E5 29 29 D1=E4
54015:19 EB E1 83 5F 30 01 14=0C
54023:18 C6 79 B7 28 07 7A 2F=E6
54031:57 7B 2F 5F 13 EB C9 7D=A4
54039:FE 08 28 0D B7 20 05 FD=14
54047:CB 53 BE C9 FD CB 53 FE=BE
54055:C9 FD CB 54 FE C9 7D FE=27
54063:08 28 0D B7 20 05 FD CB=E1
54071:53 B6 C9 FD CB 53 F6 C9=AC
54079:FD CB 54 F6 C9 7D FE 08=5E
54087:28 0C 30 12 3E F8 FD A6=4F
54095:53 85 32 8D 5C C9 3A 8E=84
54103:5C F6 07 32 8E 5C FD CB=3D
54111:57 EE C9 7D FE 08 28 11=CA
54119:30 18 3E C7 FD A6 53 67=AA
54127:7D 07 07 07 B4 32 8D 5C=61
54135:C9 3A 8E 5C F6 38 32 8E=DB
54143:5C C9 FD CB 57 FE C9 EB=F6
54151:29 29 29 29 29 19 11 00=F7
54159:58 19 6E 26 00 C9 21 00=EF
54167:D6 E5 75 54 5D 13 01 CF=C4
54175:00 ED B0 01 D8 1D 11 07=AB
54183:00 E1 19 2B 3E 1A 71 23=11
54191:70 19 3D 20 F9 C9 21 38=01
54199:1F E5 3E CF ED 47 ED 5E=90
54207:18 0C 7F 31 39 38 34 53=CC
54215:4E 47 26 4A 41 53 7C B5=CA
54223:20 02 CF 05 0E 00 CB 7C=4B
54231:28 08 7C 2F 67 7D 2F 6F=5D
54239:23 0C CB 7A 28 08 7A 2F=4D
54247:57 7B 2F 5F 13 0C C5 EB=2F
54255:7C 4D 21 00 00 06 10 CB=CB
54263:21 17 ED 6A ED 52 30 03=01
54271:19 18 01 0C 10 F1 67 69=0F
54279:C1 0D 20 06 2F 67 7D 2F=36
54287:6F 23 C9 7C B7 20 0C 7D=37
54295:6C 06 08 29 17 30 01 19=04
54303:10 F9 C9 4D 06 10 21 00=56
54311:00 29 CB 11 17 30 01 19=66
54319:10 F7 C9 CB 7C 28 0A 7C=C5
54327:2F 67 7D 2F 6F 23 3E 2D=3F
54335:D7 7C 4D 21 01 00 06 04=CC
54343:E5 29 54 5D 29 29 19 10=3A
54351:F7 E5 67 69 AF D1 ED 52=6B
54359:38 03 3C 18 F9 19 4F B0=A0
54367:79 20 04 CB 43 28 04 C6=9D
54375:30 D7 04 1D 20 E6 C9 F5=EC
54383:3E FE DB FE 1F 38 07 3E=B1
54391:BF DB FE 1F 30 04 F1 FF=DB
54399:ED 4D 2A B2 5C 2B F9 2B=C1
54407:2B 22 3D 5C AF 32 71 5C=94
54415:3D 32 70 5C 3C CD 01 16=5B
54423:CD 6E 0D 21 3B 5C CB 9E=69
54431:23 CB EE 3E 15 11 91 13=E4
54439:CD 0A 0C FB C3 A9 12 00=5C
54447:00 00 00 B7 CB 7C 28 07=2D
54455:CB 7A 20 07 AF 37 C9 CB=E6
54463:7A 20 02 ED 52 C9 00 00=A4

(roughly) the approach used by the language FORTH. It also produces short code and it is faster than p-code (since addresses rather than code are used), but there are still lots of instructions which aren't really needed - an extra GO SUB or CALL for every code, plus operations to fetch and store data between routines (you can't just leave it in registers since temporary results would get overwritten). A library of subroutines must be present when the program is run.
The last technique is to produce a lump of 'in-line' machine code for each intermediate code. The result is a long, fast program. The lumps of code (which we'll call templates) are read from a table and then patched if need be - modified slightly so that they reference the correct data. It would be daft to have different templates for every value or variable-name, so we use generalised templates and POKE the copy so that it refers to the right data.
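Copying and patching a template can be sketched in Python. The four bytes used below, 2A 00 00 E5, appear in the library listing at address 53463 and are Z80 code for LD HL,(0000) followed by PUSH HL; treating them as a variable-fetching template, patching at offset 1 and the address 5B10 (hex) are our assumptions for the example, as is the function name `instantiate`.

```python
def instantiate(template, patch_offset, address):
    """Copy a template and write a 16-bit address into the copy.

    The Z80 stores addresses low byte first.
    """
    copy = bytearray(template)            # the table entry itself is untouched
    copy[patch_offset] = address & 0xFF
    copy[patch_offset + 1] = (address >> 8) & 0xFF
    return bytes(copy)


# LD HL,(0000) / PUSH HL, with the address left blank.
FETCH_TEMPLATE = bytes.fromhex("2A0000E5")

print(instantiate(FETCH_TEMPLATE, 1, 0x5B10).hex())
```

One generalised template therefore serves every variable: only the two address bytes differ from copy to copy.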
ZIP uses a mixture of these last two techniques. Where possible, a template is used for each operation; where the operation is slow or complicated (such as INPUT or CLS), a call to a library routine is used instead - though only where the overhead of calling makes very little difference to the execution time. The ZIP Instructions panel lists the ZIP intermediate code instructions in code order (so VARiable FETCH is code 1, RETURN is code 6 and so on). Instructions are marked by an '*' if a library call is used.
Notice that words such as INVERSE, OVER and so on occur in two places. The first is for their use as statements on their own, where they change the effect of all subsequent screen-use - the second is for their use within PRINT, PLOT or DRAW statements, where they only affect items drawn in the current statement. The special code PR-OPEN is used to mark the start of a new PRINT statement (so that the temporary colours can be discarded). The last two codes - PLOT and DRAW - will be added to the compiler in part 3 of this series, as an example of the way ZIP can be extended.

THE LIBRARY

One problem with using a library is that there are inevitably errors in it, and whenever you correct part of it the addresses of subsequent routines change. If the Basic of ZIP contained the addresses of templates and library routines this would be very annoying, since every bug-fix would require changes to a whole set of constants. ZIP doesn't work that way (if it did, we'd probably never have finished it!).
The library starts with a list of addresses, one after another. The first is the address of the 'VARiable FETCH' code, and so on. The list is produced by the assembler, as the 800-odd lines of library assembler are converted to machine code. To fix a bug in any template or routine you simply re-assemble the corrected code, and all of the addresses are put right. You can modify the library without altering the BASIC, since it just uses simple offsets into the list - '4' for GO TO, '5' for GO SUB, etc, and copies code from the address indicated in the list at position '4', '5' or whatever. The Library 'Vector' Table illustrates this.
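Looking up an entry in such a list takes only a couple of lines, as this Python sketch shows. The sixteen bytes are the first two rows of the published library data; that each entry is two bytes long, stored low byte first, and that entry n sits 2*n bytes from the start are assumptions, though the results fit the listing.

```python
# First sixteen bytes of the published library data (address 53247 onwards).
VECTOR_BYTES = bytes.fromhex("6ED4D7D0DBD0DFD0EBD0EED0F1D0F2D0")


def template_address(code):
    """Return the 16-bit address stored at entry `code` of the list."""
    low = VECTOR_BYTES[2 * code]
    high = VECTOR_BYTES[2 * code + 1]
    return low + 256 * high


print(hex(template_address(4)))   # entry used for GO TO
print(hex(template_address(5)))   # entry used for GO SUB
```

Entry 4 comes out as D0EB hex (53483 decimal), where the library data holds C3 00 00 - a Z80 JP with its address left blank, just what you would expect of a GO TO template.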
The program at the end of the listings can be used to enter the library data. The table contains checksums (the total of every eight bytes, modulo 256) so there's a check for typing errors after each eight values. The left-hand column is the address (in decimal) and should not be entered.
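The same check is easy to reproduce away from the Spectrum. The Python function below (the name `check_line` is ours) takes one row of the table exactly as printed, adds up the eight bytes and compares the total, modulo 256, with the checksum after the '=' sign.

```python
def check_line(line):
    """Check one 'address:b0 b1 ... b7=checksum' row from the table."""
    _address, rest = line.split(":")
    data, checksum = rest.split("=")
    total = sum(int(byte, 16) for byte in data.split())
    return total % 256 == int(checksum, 16)


print(check_line("53247:6E D4 D7 D0 DB D0 DF D0=43"))
```

A single mistyped digit changes the total, so the row fails the test, although two errors which happen to cancel out would slip through.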

AND NEXT ...

Next month, besides the Basic listing of ZIP, we'll discuss the process of compilation in detail, with help from programming guru Jon Smith. If compilers confuse you, machine code is a mystery and your Spectrum is sluggish, don't miss ZIP next issue!

Original text © 1984 Simon N Goodwin. Used with permission.
The ZIP Compiler offer is now closed ... but an updated printed manual is still available from Simon, with old and new programs.
