Your Spectrum
Issue 12, March 1985 - Multisearch

This program is available on "ZIPi'T'ape".

How many times have you laboriously gone through a ZX Basic program, replacing one item with another? Well, despair no more: Multisearch will quickly and automatically find and replace almost any selected item. This routine is easy to use and is only 225 bytes long. It'll run anywhere in memory (so it doesn't interfere with other utilities) and, what's more, turns out to have lots of useful and unexpected applications.


The possibilities of Multisearch aren't limited to changing one message for another. You can use it to edit long program lines, to replace keywords or to document programs (replacing line number references with names). Multisearch will also work the other way, replacing names with numbers - which is very useful if you intend to compile a Basic program into machine code.
Most interesting of all is the possibility of writing programs which edit themselves; Multisearch can easily be called while a program runs. In this article we will investigate the internal format of ZX Basic and show how you can use Multisearch to make programs faster, more concise, or to protect them against people who want to fiddle with them (Troubleshootin' Pete, please note).


After a brief sojourn writing commercial software, we welcome programming guru Simon Goodwin back to the pages of YS with his first major utility since ZIP! Multisearch might be somewhat smaller than its predecessor but, as a fully relocatable 'search and replace' utility in just 225 bytes, it too is dedicated to the art of speeding up your Basic programs. Don't limit yourself to any other utility - make more of Multisearch!

The idea of Multisearch came when YS reviewed a job lot of 'programmers' toolkits' a number of months ago (see Talking of Toolkits in issue 6). These are designed to make life easier for Basic programmers, but they all turn out to have a common flaw - they won't let you replace numbers in a program automatically.
Some of the toolkits had a 'search and replace' facility, but they all had annoying limitations - for example, Super Toolkit would only replace single keywords. The suggested use was to change LPRINT into PRINT or vice versa, but in fact that's pretty pointless because you can get the same effect on any Spectrum with a standard (but undocumented) command:

OPEN #2,"p"

This sends the output of PRINT statements to the printer until you cancel it with:

OPEN #2,"s"

If you want to work the other way, you can use:

OPEN #3,"s"

to send the results of every LPRINT statement to the screen. When you want to use the printer again, the command:

OPEN #3,"p"

will set things back to normal.
It's a bit more useful to be able to replace text in a program - perhaps you might want to Americanise the word 'colour' by replacing it with 'color', or enforce some similar indignity. But by far the most useful application baffles every single toolkit - the problem of changing numeric values within a program.


The accompanying figure shows the rather complicated way the Spectrum stores a simple Basic program:

10 PRINT 2+VAL "2"
20 GO TO 10
[Figure: memory diagram of the two-line program above]


Most of the data is ASCII code - for instance, 34 is the code of inverted commas and 236 is the code of the keyword GO TO. A full list of the keyword values is in Appendix A of the Spectrum manual. Now take a look at the strange way the Spectrum stores numbers.
Most numbers in a program are also stored in a hidden 'binary form' which takes up six extra bytes. This is meant to make programs run more quickly, by removing the need for the computer to convert numbers from text to binary whenever they are found. In practice, VAL "2323" can be handled almost as fast as the number 2323, and the first version uses three fewer bytes, because the string value doesn't have a hidden 'binary form'.
In the figure, you can see that VAL "2" needs three fewer bytes than '2' on its own. The number '2' is followed by a 'marker' byte (code 14) which tells the LIST routine to skip the next five bytes - the binary form of the number. When the program RUNs, the text is ignored and the binary form is used.
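To make the layout concrete, here is a minimal sketch in Python (a modern stand-in chosen purely for illustration, not something you can type into a Spectrum) that builds the bytes of line 10 as described above. The helper names hidden_int and basic_line are invented for this example, and the hidden form shown is the one used for whole numbers from 0 to 65535 only.

```python
# Illustrative model of how ZX Basic stores a program line (not Spectrum code).
PRINT, VAL = 245, 176    # keyword codes listed in Appendix A of the manual
NUMBER, ENTER = 14, 13   # hidden-number marker and end-of-line marker


def hidden_int(n):
    """Marker byte plus the five-byte binary form of a whole number 0..65535."""
    if not 0 <= n <= 65535:
        raise ValueError("this sketch only covers small whole numbers")
    return bytes([NUMBER, 0, 0, n % 256, n // 256, 0])


def basic_line(number, body):
    """Line number (high byte first), length (low byte first), text, ENTER."""
    text = body + bytes([ENTER])
    header = bytes([number // 256, number % 256,
                    len(text) % 256, len(text) // 256])
    return header + text


# 10 PRINT 2+VAL "2"
plain_two = b"2" + hidden_int(2)       # digit, marker, binary form
val_two = bytes([VAL]) + b'"2"'        # keyword, quote, digit, quote
line10 = basic_line(10, bytes([PRINT]) + plain_two + b"+" + val_two)

print(list(line10))
print(len(plain_two), "bytes for 2,", len(val_two), 'bytes for VAL "2"')
```

The plain 2 costs seven bytes against four for VAL "2", which is the three-byte difference mentioned above.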
The binary is in a rather odd format - one which is explained in Dr Ian Logan's excellent book, Understanding Your Spectrum (published by Melbourne House). Luckily, with the aid of Multisearch, you don't need to understand the format to manipulate it.

The assembler listing for Multisearch. Grab an assembler (or a Hex loader if you're going to enter the Hex code on the left of each column) and get typing!

The upshot is that numbers in ZX Basic programs need careful treatment, as they can gobble up memory at an alarming rate. Some expressions for numbers are even more concise than the 'VAL' version, because they use the keyword PI instead of a number. PI only occupies one byte in a program. The accompanying table lists a few common values and the expressions to replace them, along with the number of bytes saved ('n' represents any number).
You could use variables with preset values instead of numbers to get a similar saving in space, but beware - ZX Basic is rather slow at finding the value of variables; expressions like SGN PI may be worked out more quickly, especially if your code uses lots of variables anyway.
Interestingly, values expressed using the BIN function are also stored in two forms, so that BIN 1 soaks up eight bytes - one for the keyword, one for the digit, and an extra six for the genuine binary form.
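The savings quoted here can be checked with a little arithmetic. This Python sketch (an illustration only; the function names are made up for the purpose) counts one byte for each keyword or character and six bytes for every hidden binary form.

```python
# Byte counts for different ways of writing a constant (illustration only).
HIDDEN = 6   # marker byte (code 14) plus the five-byte binary form


def literal_cost(text):
    """A number typed directly: its characters plus the hidden form."""
    return len(text) + HIDDEN


def val_cost(text):
    """VAL "text": one keyword byte, two quotes and the characters."""
    return 1 + 2 + len(text)


def token_cost(tokens):
    """An expression made only of keywords and symbols, one byte each."""
    return len(tokens)


savings = {
    "-3 as -INT PI": literal_cost("-3") - token_cost(["-", "INT", "PI"]),
    "-1 as -SGN PI": literal_cost("-1") - token_cost(["-", "SGN", "PI"]),
    '2323 as VAL "2323"': literal_cost("2323") - val_cost("2323"),
}
bin_one = 1 + literal_cost("1")   # BIN keyword, digit, hidden form

for name, saved in savings.items():
    print(name, "saves", saved, "bytes")
print("BIN 1 occupies", bin_one, "bytes")
```

Because the VAL version always swaps six hidden bytes for three visible ones, its saving is three bytes however many digits the number has.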
The line numbers at the start of each line are stored in a more sensible 'packed' format - each number occupying just two bytes. They are converted into decimal by the LIST routine in the ROM. The two bytes after each line number hold the length of the line, so that Basic can skip quickly from one line to the next. An 'ENTER' character is at the end of every line. This format is briefly explained in the Spectrum manual, on page 166.

Value   Abbreviation   Saving (bytes)
-3      -INT PI        5
-1      -SGN PI        5
n       VAL "n"        3

The table above shows you just how many bytes you can save if you start using constant expressions.

The first program given is a simple loader which will store the machine code for Multisearch at address 30000. To use it, simply RUN the program and if you've made no typing mistakes, the correct code will be stored. If there's a mistake in the data, an appropriate message should appear. It's wise to SAVE the program as soon as it has apparently run correctly, just in case an error has slipped through. If you save the code you can then load it again - without the Basic - at any address.
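The loader spots typing mistakes with a running total: it starts the variable c at -26434 and adds every byte it POKEs, so c only finishes at zero if the data is right (the 225 DATA values in the listing do add up to 26434). The sketch below models the idea in Python, as an illustration only, using just the first DATA line; the function name load and the total of 726 for those eight values are specific to this example.

```python
# The loader's running-total check, modelled in Python (illustration only).
def load(data, expected_total):
    """Mimic lines 130-180: start c at minus the total, add every byte."""
    memory = []
    c = -expected_total
    for a in data:
        c += a
        memory.append(a)      # stands in for POKE i,a
    return memory, c


line_1000 = [42, 75, 92, 126, 254, 83, 40, 14]   # first DATA line of the loader
total = 726                                      # sum of these eight values only

memory, c = load(line_1000, total)
print("correct data leaves c =", c)

mistyped = line_1000.copy()
mistyped[3] = 125                                # one digit typed wrongly
_, c_bad = load(mistyped, total)
print("mistyped data leaves c =", c_bad)
```

A simple sum like this catches a single wrong value, but it can't notice two values swapped around, so it's still worth checking the listing by eye.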


                ;          "Find Search string S$"
7530 2A4B5C     FINDS LD   HL,(VARS)
7533 7E         NEXT1 LD   A,(HL)
7534 FE53             CP   "S"
7536 280E             JR   Z,GOT_S
7538 FE80             CP   T_END
753A 2806             JR   Z,ERROR
753C CDB819           CALL F_VAR
753F EB               EX   DE,HL
7540 18F1             JR   NEXT1
                ;          "Variable not found!"
7542 CF         ERROR RST  8
7543 01               DEFB 1
                ;          "Parameter error!"
                ;          "(Wrong string length)"
7544 CF         L_ERR RST  8
7545 19               DEFB 25
                ;          "HL points at name S$"
7546 23         GOT_S INC  HL
                ;          "Check length is >0"
7547 7E               LD   A,(HL)
7548 B7               OR   A
7549 28F9             JR   Z,L_ERR
754B 23               INC  HL
                ;          "Check length is <256"
754C 7E               LD   A,(HL)
754D B7               OR   A
754E 20F4             JR   NZ,L_ERR
7550 23               INC  HL
7551 E5               PUSH HL
                ;          "IX points at S$ text"
7552 DDE1             POP  IX
                ;          "Find replacement, R$"
7554 2A4B5C           LD   HL,(VARS)
7557 7E         NEXT2 LD   A,(HL)
7558 FE52             CP   "R"
755A 280A             JR   Z,GOT_R
755C FE80             CP   T_END
755E 28E2             JR   Z,ERROR
7560 CDB819           CALL F_VAR
7563 EB               EX   DE,HL
7564 18F1             JR   NEXT2
                ;          "HL points at name R$"
7566 23         GOT_R INC  HL
                ;          "R_LEN points at R$"
7567 22AE5C           LD   (R_LEN),HL
                ;          "Check length is <256"
756A 23               INC  HL
756B 7E               LD   A,(HL)
756C B7               OR   A
756D 20D5             JR   NZ,L_ERR
756F ED5B535C         LD   DE,(PROG)
7573 1B               DEC  DE
                ;          "**** MAIN SEARCH LOOP"
                ;          "Find length of line"
7574 13         LINE  INC  DE
7575 13               INC  DE
7576 13               INC  DE
7577 ED53AC5C         LD   (L_LEN),DE
757B 13               INC  DE
757C 13               INC  DE
757D D5         FIND  PUSH DE
                ;          "Get old data length &"
                ;          "point HL at old data"
757E DD46FE           LD   B,(IX-2)
7581 DDE5             PUSH IX
7583 E1               POP  HL
                ;          "Match B characters"
7584 1A         MATCH LD   A,(DE)
7585 BE               CP   (HL)
7586 2067             JR   NZ,GO_ON
7588 23               INC  HL
7589 13               INC  DE
758A 10F8             DJNZ MATCH
                ;          "Match found, work out"
                ;          "difference of lengths"
758C 2AAE5C           LD   HL,(R_LEN)
758F 7E               LD   A,(HL)
7590 DD96FE           SUB  (IX-2)
                ;          "A = extra bytes needed"
7593 2849             JR   Z,NO_OK
7595 302C             JR   NC,ADD_A
                ;          "Discard 256-A bytes"
7597 ED44             NEG
7599 4F               LD   C,A
                ;          "Line length=length-BC"
759A 2AAC5C           LD   HL,(L_LEN)
759D 5E               LD   E,(HL)
759E 23               INC  HL
759F 56               LD   D,(HL)

The routine is very easy to use, and all you need to do is load the code into any free area of memory. It's 225 bytes long, so if you've already got another machine code routine from address 53246 onwards, you might CLEAR 53020 and load the code at 53021. Multisearch will work happily on a 16K computer. If you're really pushed for space you could load it into the printer buffer at 23296, so long as you don't use the printer until you've finished with Multisearch.
Wherever it ends up, you call the routine by jumping to its start - with RANDOMIZE USR 53021, for example. But before you do this you must tell Multisearch the text you want to alter. You do this by setting the Basic variables S$ and R$.
Logically enough, S$ should contain the text you want to search for, and R$ should contain the replacement. This is the essence of the power of Multisearch - the text can be program-generated, so you're not just limited to what you can type in. You can enter keywords in strings by typing THEN (Symbol Shift 'G'), followed by the keyword, and then stepping back to scrub out the THEN before you press Enter.
If you load Multisearch into the printer buffer you could try it out with this simple program:


When you RUN the code and LIST it you'll find that S$ and R$ now refer to the same text. Of course, S$ and R$ don't have to be the same length. The only restrictions are that both strings must be less than 256 characters long, and S$ mustn't be empty (!). In either case, Multisearch detects the problem before it tries to alter anything, and reports a 'Parameter error'. If S$ or R$ is not set, you'll receive a 'Variable not found' message and the program will be unchanged.
Multisearch is very fast, but it can take a few seconds to make major changes to a long program. You can break into it while it's working by pressing the Space key. The routine stops once it's made the current change and spits out a 'Break into program' message. If the routine runs out of room to make changes it'll do as much as it can and then report 'Out of memory'.
It's important to realise that Multisearch doesn't check the syntax of lines as it alters them - this would make it slow and much less versatile. However it means that you can thoroughly mess up a program by, say, changing all the LET keywords into POKEs.
If you corrupt a program in this way you'll get a 'Nonsense in Basic' error when you try to RUN it. Be careful if you change the keywords back automatically - you could end up changing genuine POKEs into 'nonsense' LETs. The moral of the story is to be careful before you use Multisearch ... if in doubt, SAVE your Basic before you mangle it.

PROG    23635   Pointer to program
VARS    23627   Pointer to variables
R_LEN   23726   Pointer to replacement
L_LEN   23724   Pointer to line length
SHRNK   19E8H   Basic delete routine
XPAND   1655H   Basic insert routine
F_VAR   19B8H   Find next entry (ROM)
NUMBR   14      Hidden number marker
ENTER   13      Line end marker
T_END   128     Table end marker

You'll find these labels in the assembler listing; we've separated them for those of you having problems converting the assembly code for your particular assembler.


75A0 EB               EX   DE,HL
75A1 B7               OR   A
75A2 ED42             SBC  HL,BC
75A4 EB               EX   DE,HL
75A5 72               LD   (HL),D
75A6 2B               DEC  HL
75A7 73               LD   (HL),E
                ;          "Adjust R$, S$ pointers"
75A8 DDE5             PUSH IX
75AA E1               POP  HL
75AB ED42             SBC  HL,BC
75AD E5               PUSH HL
75AE DDE1             POP  IX
75B0 2AAE5C           LD   HL,(R_LEN)
75B3 ED42             SBC  HL,BC
75B5 22AE5C           LD   (R_LEN),HL
75B8 E1               POP  HL
                ;          "Shrink from start"
75B9 E5               PUSH HL
75BA CDE819           CALL SHRNK
75BD 181F             JR   NO_OK
                ;          "Extended jumps"
75BF 18BC       FINDX JR   FIND
75C1 18B1       LINEX JR   LINE
                ;          "Add A bytes"
75C3 4F         ADD_A LD   C,A
                ;          "Add BC to line length"
75C4 D5               PUSH DE
75C5 2AAC5C           LD   HL,(L_LEN)
75C8 5E               LD   E,(HL)
75C9 23               INC  HL
75CA 56               LD   D,(HL)
75CB EB               EX   DE,HL
75CC 09               ADD  HL,BC
75CD EB               EX   DE,HL
75CE 72               LD   (HL),D
75CF 2B               DEC  HL
75D0 73               LD   (HL),E
                ;          "Update S$, R$ pointers"
75D1 DD09             ADD  IX,BC
75D3 2AAE5C           LD   HL,(R_LEN)
75D6 09               ADD  HL,BC
75D7 22AE5C           LD   (R_LEN),HL
75DA E1               POP  HL
75DB CD5516           CALL XPAND
                ;          "Copy new data to prog"
75DE D1          NO_OK POP  DE
75DF 2AAE5C           LD   HL,(R_LEN)
75E2 0600             LD   B,0
75E4 4E               LD   C,(HL)
                ;          "Check R$ isn't empty"
75E5 79               LD   A,C
75E6 B7               OR   A
75E7 2808             JR   Z,NEXT
                ;          "Bounce HL past length"
75E9 23               INC  HL
75EA 23               INC  HL
75EB EDB0             LDIR
                ;          "Search on from (DE)"
75ED 1802             JR   NEXT
                ;          "Try the next position"
75EF D1         GO_ON POP  DE
75F0 13               INC  DE
                ;          "Check user isn't bored"
75F1 3E7F       NEXT  LD   A,127
75F3 DBFE             IN   A,(254)
75F5 1F               RRA
75F6 3802             JR   C,CONT
                ;          "Generate BREAK error!"
75F8 CF               RST  8
75F9 14               DEFB 20
                ;          "Locate end of program"
75FA 2A4B5C     CONT  LD   HL,(VARS)
75FD B7               OR   A
75FE ED52             SBC  HL,DE
                ;          "Return at end of prog"
7600 D8               RET  C
                ;          "Check for new line no."
7601 1A               LD   A,(DE)
7602 FE0D             CP   ENTER
7604 28BB             JR   Z,LINEX
                ;          "Don't scan hidden nums"
7606 FE0E             CP   NUMBR
7608 20B5             JR   NZ,FINDX
                ;          "Skip over the number"
760A 210600           LD   HL,6
760D 19               ADD  HL,DE
760E EB               EX   DE,HL
760F 18E9             JR   CONT

This business of using strings is all very well, but it doesn't help us replace numbers in program lines. We can't store a number in a string without putting it in quotes (or using STR$). LET A$="1" is OK, but LET A$=1 gives an error, and we've already discovered that numbers outside quotes have a special format. To illustrate this, try out the following program:

10 LET S$="40"
20 LET R$="60"
40 PRINT "Hello";
50 GO TO 40

When you RUN this program it'll replace the text '40' in line 50 with the text '60'. However, it won't replace the hidden binary form; the program still prints out 'Hello' over and over again, because ZX Basic uses the binary form of the line number (still 40), and ignores the text completely. You end up with a line that reads GO TO 60 and performs a GO TO 40!
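The effect is easy to reproduce on paper. The Python sketch below (an illustration only, not the machine code itself; replace_text is a made-up helper) swaps the visible digits in the body of line 50 while stepping over the hidden binary form, much as Multisearch does.

```python
# Changing the text of a number while leaving its hidden form alone.
GO_TO, NUMBER, ENTER = 236, 14, 13


def replace_text(body, search, replacement):
    """Replace visible text only, stepping over each hidden number."""
    out = bytearray()
    i = 0
    while i < len(body):
        if body[i] == NUMBER:
            out += body[i:i + 6]      # copy marker and five binary bytes as-is
            i += 6
        elif body.startswith(search, i):
            out += replacement
            i += len(search)
        else:
            out.append(body[i])
            i += 1
    return bytes(out)


# Body of line 50: GO TO 40, with the hidden form of 40 after the digits
line50 = bytes([GO_TO]) + b"40" + bytes([NUMBER, 0, 0, 40, 0, 0, ENTER])
changed = replace_text(line50, b"40", b"60")

visible = changed[1:3].decode()
hidden = changed[6] + 256 * changed[7]
print("LIST shows GO TO", visible, "but RUN jumps to", hidden)
```

The two digits change, but the bytes after the marker still hold 40, so the listing and the program's behaviour no longer agree.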
This is a very useful trick to discourage people from editing your programs - you can jumble up the text of the line numbers but the program will still work correctly because the binary forms are unchanged. The hidden binary is removed when a line is edited (to stop it getting in the way as you move along the line) and the binary is re-calculated from the text when you press Enter. This means that the jumbled values are taken literally after a line is edited, changing the way the program works and hence discouraging fiddlers.
You can save a little memory by replacing the text of each number by a single digit. However you can't dispense with the text altogether - there must be some numeric text between the GO TO and the CHR$ 14, or Basic will spot the subterfuge and give the game away with a 'Nonsense in Basic' error.


We still can't alter numbers properly. The routine so far will only change text within a program ... it can't replace the binary form of numbers. The solution is to distinguish between numbers and strings, and use a small Basic program to work out the binary form of a number. An appropriate routine is given, which should be MERGEd with your Basic program once the Multisearch code is loaded.
Rather than use a complicated routine to generate binary forms, this program 'cheats' by storing the required number in a variable and then PEEKing the contents of the variable area (which always contains binary values in the same form as that used within programs).
To use the program type GO TO 9990 and press 'T' or 'N' to indicate whether you want to search for text or a number. Then type the data required, exactly as it appears in the program. If you select 'N', the program adds the numeric form to S$. Next you specify the replacement, which may (once again) be text or a number. The program STOPs once the requested changes have been made.
This technique is not ideal, but it does allow numbers to be changed properly without denying you the ability to alter numeric text and leave binary forms unchanged. If you need to process a pattern which contains a number, you'll need to add other characters around the search or replacement string, using the normal Spectrum string handling commands.
You can use the 'binary form' program as a subroutine if you replace the STOP in line 9992 with a RETURN and get rid of the CLEAR statement in line 9990. However you must make sure that V is the first variable encountered when your program is RUN. The routine finds the binary form of a number by storing it in variable V, and then PEEKing the first entry in the variable table. If V isn't the first entry you'll get incorrect results.
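For whole numbers the pattern that the 'N' option builds can be worked out directly. The sketch below is a Python illustration under that assumption (number_pattern is an invented name); the Basic routine itself PEEKs the binary form from the variable V rather than calculating it, and fractions or larger values use a floating point form that isn't modelled here.

```python
# Search pattern for a number: its text, the marker, then the binary form.
NUMBER = 14   # hidden-number marker


def number_pattern(text):
    """Text of a whole number 0..65535 followed by marker and binary form."""
    n = int(text)
    if not 0 <= n <= 65535:
        raise ValueError("this sketch only covers small whole numbers")
    return text.encode("ascii") + bytes([NUMBER, 0, 0, n % 256, n // 256, 0])


s = number_pattern("40")   # what S$ would hold when looking for the number 40
r = number_pattern("60")   # what R$ would hold for the replacement 60
print(list(s))
print(list(r))
```

With both the text and the binary form in the strings, a replacement changes what the line says and what it does at the same time.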


Multisearch uses a number of interesting routines and could form the basis of a complete Basic toolkit. The assembly code of the routine, produced by the whizzo new Microdrive version of the Picturesque Editor Assembler, is a little more repetitious than it need be, since it's written in relocatable code. This means it'll run anywhere in memory without modification, but also that it can't use any internal subroutine calls, since the location of each subroutine is not fixed.
Broadly speaking, the program can be divided into two sections. The first part (up to the label LINE) is used to find the variables S$ and R$ and check that they contain correct values. The code to find S$ is duplicated to locate R$ - the only differences are the letter of the name and the extra check to make sure that S$ contains at least one character.
At FINDS, the program points HL into the variable area and then looks for a capital 'S'. This indicates the start of the storage allocated to S$, as explained on page 168 of the Spectrum manual. The ROM routine F_VAR is used to step from one entry to the next until the required letter is found, or the end of the table is reached - in which case a 'Variable not found' error is generated.
Strings stored in the variable area are preceded by their length, recorded in two bytes in normal Z80 fashion - low byte first. Multisearch can't cope with strings of more than 255 bytes (the code is kept simple!) so it generates a 'Parameter error' if the most significant byte of either string length is not zero. If all goes well IX is left pointing to the text of S$.
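The walk through the variable area can be sketched as follows. This is a Python model for illustration, not a translation of the ROM routine: entry_size and find_string are invented names, and the entry sizes follow the variable formats described in the Spectrum manual.

```python
# Stepping through a copy of the variable area to find a string (a model).
END_MARKER = 0x80


def entry_size(area, pos):
    """Size in bytes of the variable entry starting at pos."""
    kind = area[pos] & 0xE0
    if kind == 0x60:                    # one-letter numeric variable
        return 6
    if kind == 0xE0:                    # FOR-NEXT control variable
        return 19
    if kind == 0xA0:                    # numeric variable with a long name
        end = pos + 1
        while not area[end] & 0x80:     # last letter of the name has bit 7 set
            end += 1
        return (end - pos + 1) + 5
    # strings (0x40) and arrays (0x80, 0xC0) carry a two-byte length
    return 3 + area[pos + 1] + 256 * area[pos + 2]


def find_string(area, letter):
    """Return the text of the string variable with this one-letter name."""
    wanted = ord(letter.upper())        # S$ is stored under a capital 'S'
    pos = 0
    while area[pos] != END_MARKER:
        if area[pos] == wanted:
            length = area[pos + 1] + 256 * area[pos + 2]
            return area[pos + 3:pos + 3 + length]
        pos += entry_size(area, pos)
    raise KeyError("Variable not found")


# v=0, S$="40", R$="60", then the end marker
area = (bytes([0x76, 0, 0, 0, 0, 0])
        + b"S" + bytes([2, 0]) + b"40"
        + b"R" + bytes([2, 0]) + b"60"
        + bytes([END_MARKER]))
print(find_string(area, "s"), find_string(area, "r"))
```

Multisearch itself only needs to recognise the letter and leaves the stepping to F_VAR, which is why its own search loop is so short.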
120 CLEAR 29999
130 LET c=-26434
140 FOR i=30000 TO 30224
150 READ a
160 LET c=c+a
170 POKE i,a
180 NEXT i
200 SAVE "Megasearch" CODE 30000,225
210 SAVE "Megasearch"
1000 DATA 42,75,92,126,254,83,40,14
1010 DATA 254,128,40,6,205,184,25,235
1020 DATA 24,241,207,1,207,25,35,126
1030 DATA 183,40,249,35,126,183,32,244
1040 DATA 35,229,221,225,42,75,92,126
1050 DATA 254,82,40,10,254,128,40,226
1060 DATA 205,184,25,235,24,241,35,34
1070 DATA 174,92,35,126,183,32,213,237
1080 DATA 91,83,92,27,19,19,19,237
1090 DATA 83,172,92,19,19,213,221,70
1100 DATA 254,221,229,225,26,190,32,103
1110 DATA 35,19,16,248,42,174,92,126
1120 DATA 221,150,254,40,73,48,44,237
1130 DATA 68,79,42,172,92,94,35,86
1140 DATA 235,183,237,66,235,114,43,115
1150 DATA 221,229,225,237,66,229,221,225
1160 DATA 42,174,92,237,66,34,174,92
1170 DATA 225,229,205,232,25,24,31,24
1180 DATA 188,24,177,79,213,42,172,92
1190 DATA 94,35,86,235,9,235,114,43
1200 DATA 115,221,9,42,174,92,9,34
1210 DATA 174,92,225,205,85,22,209,42
1220 DATA 174,92,6,0,78,121,183,40
1230 DATA 8,35,35,237,176,24,2,209
1240 DATA 19,62,127,219,254,31,56,2
1250 DATA 207,20,42,75,92,183,237,82
1260 DATA 216,26,254,13,40,187,254,14
1270 DATA 32,181,33,6,0,25,235,24
1280 DATA 233

If you haven't got an assembler or Hex loader to hand, just type in the Basic listing of Multisearch given above and let the DATA statements work their magic.

9990 CLEAR: LET v=0: PRINT "Look for (N)umber or (T)ext?": GO SUB 9993: LET s$=a$
9991 PRINT "Replace with (N)umber or (T)ext?": GO SUB 9993: LET r$=a$
9992 RANDOMIZE USR 30000: STOP: REM 30000 is the CODE address
9993 PAUSE 0: LET b$=INKEY$: IF b$<>"N" AND b$<>"T" AND b$<>"n" AND b$<>"t" THEN GO TO 9993
9994 INPUT "Enter data ";a$: IF b$="T" OR b$="t" THEN RETURN
9995 LET v=VAL a$: LET a$=a$+CHR$ 14: LET i=PEEK 23627+256*PEEK 23628: FOR j=i+1 TO i+5: LET a$=a$+CHR$ PEEK j: NEXT j: RETURN

Once you've got Multisearch up and running, use this short routine to get the show on the road!

From NEXT2 onwards the routine looks for R$. The address of the string (a pointer to the length, in this case) is stored at R_LEN, at the end of a Basic work area called MEMBOT. DE is pointed just before the start of the Basic program (as if the Enter at the end of a previous line had just been reached) and the main loop through the program begins at LINE.
At LINE the routine expects the end of a line and the start of a new one. It skips over three bytes - the Enter and line number - and stores a pointer to the line length in L_LEN. We need to know where the line length is recorded since we may need to alter it if we add or delete characters in the line.
FIND is the point at which Multisearch tries to locate the search string. DE is saved, so that we know where the match did (or didn't) occur, and then the loop at MATCH is used to see if the characters from DE onwards match those from IX onwards. Register B contains the length of S$. If the comparison fails before B reaches zero, the program leaps off to GO_ON, but if all goes well, the length of R$ is fetched and compared with that of S$. If the two are the same, execution continues at NO_OK (pronounced 'number OK'!) - otherwise some characters must be inserted or deleted so that the replacement text fits in the line.
The job of adding or removing characters is not trivial, since any change in the program size also alters the location of variables, and other useful pieces of information. Luckily, ROM routines exist to adjust the program size and make sure that nothing gets lost. SHRNK and XPAND remove or add BC characters at the location pointed to by HL. XPAND produces an 'Out of memory' error if there's no room for the extra characters.
If S$ and R$ are different lengths then Multisearch must adjust the line length (as explained earlier) and alter the pointers to S$ and R$. Any movement of the program also sends the variables skidding around memory, since they're stored at the end of the program. This took a little while to puzzle out when we tested the machine code!
A couple of extra jumps are located between the Delete and Insert instructions - the main loop is too long to be traversed in a single relative jump (it can only cross 126 bytes at one mighty bound) so FINDX and LINEX are used as 'staging posts' on the way to FIND and LINE respectively.
Various paths meet at NO_OK. At this point a correct match has been found and the address on the stack points to the place where R$ must be stored. An LDIR is used to copy the new text into the program. This leaves DE pointing to the character after the new data, from whence the search can re-start. If S$ didn't match the program we have to advance DE and start again one byte further through the program. This step is performed at GO_ON.
Whether or not a match was found, we end up at NEXT, where the Break key is polled in case the user has decided to give up. The routine stops with a BREAK error if bit zero at port address 32766 (the Space key) is reset. At CONT the contents of the system variable VARS are compared with the address in DE.
If DE is pointing into the variable area we've finished, and the routine RETurns. Otherwise we must look further through the program, although before that we check for a couple of 'special cases'. If DE points to an 'ENTER' character we've reached the end of a line, so we should pick up the new line length by looping back to LINE.
If DE points at a number marker - CHR$ 14 - we must skip over the binary data since it could contain values which appear to be text or keywords, but aren't really. This doesn't stop us finding numbers, since those will always start with an ASCII character (probably a digit). If we've reached the CHR$ 14 we've gone too far.
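The whole search loop can be summarised in a short model. The Python below is a sketch of the logic described above rather than a translation of the machine code: it works on a copy of the program bytes, so there are no pointers to S$ and R$ to adjust, no Break key and no 'Out of memory' check. The names multisearch and line are invented, and it assumes the search text never contains an ENTER character.

```python
# A model of the Multisearch main loop over a tokenised program image.
NUMBER, ENTER = 14, 13


def multisearch(program, search, replacement):
    """Return a copy of program with search replaced by replacement."""
    if not 0 < len(search) < 256 or len(replacement) >= 256:
        raise ValueError("Parameter error")
    prog = bytearray(program)
    pos = 0
    while pos < len(prog):
        length_at = pos + 2             # the line length follows the line number
        pos += 4                        # first byte of the line's text
        while prog[pos] != ENTER:
            if prog[pos] == NUMBER:
                pos += 6                # skip marker and five-byte binary form
            elif prog.startswith(search, pos):
                prog[pos:pos + len(search)] = replacement
                pos += len(replacement)
                length = prog[length_at] + 256 * prog[length_at + 1]
                length += len(replacement) - len(search)
                prog[length_at] = length % 256
                prog[length_at + 1] = length // 256
            else:
                pos += 1
        pos += 1                        # step over ENTER to the next line
    return bytes(prog)


def line(number, body):
    """Build one stored line: number, length, text, ENTER."""
    text = body + bytes([ENTER])
    return bytes([number // 256, number % 256,
                  len(text) % 256, len(text) // 256]) + text


# 40 PRINT "Hello";  and  50 GO TO 40
program = (line(40, bytes([245]) + b'"Hello";')
           + line(50, bytes([236]) + b"40" + bytes([NUMBER, 0, 0, 40, 0, 0])))
shorter = multisearch(program, b"Hello", b"Hi")
print(len(program), "bytes before,", len(shorter), "bytes after")
```

Notice that the line length has to be rewritten after every change of size; this is the job L_LEN does in the machine code.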


There are lots of ways in which Multisearch could be improved, but the existing code works and it doesn't take long to type in! It might be useful to make it return a count of the number of replacements found, and perhaps a list of the lines in which changes were made. It would be convenient (but perhaps rather difficult) to re-code the 'binary form' program in machine code.
As it stands, Multisearch is a simple but very effective routine with a multiplicity of uses. There can't be many short routines which can be used to make ZX Basic edit-proof, faster, more concise, more readable, and more versatile. Do let me know what you make of Multisearch.