(The code in this article has been tested under FPC 3.56 by Jeff Fox)
As a rule the arrival of a new programming language is caused by the arrival of some new programming method or paradigm. Thus Algol-60 had marked the appearance of the structured programming approach and Pascal the advanced user-defined data types. Object-oriented programming was born coupled with Smalltalk, but C++ has marked it's coming to the "professional league". New programming paradigms demand new programming language. But Forth users need not change languages. Of course Forth is extensible and can easily adopt new paradigms. The subject of this paper demonstrates how Forth can adapt a parallel programming paradigm.
One of the "HOT" areas of Programming Science is parallel computing. C.A.R. Hoar, the author of "Communication Sequental Processes"-book is the first theoretical founder of Parallel Computing. The programming language OCCAM is based on Hoar's theory. It is considered to be one of a main parallel programming languages. There are two main features of this interesting language:
Another popular parallel programming model is LINDA by D. Gelernter.
LINDA is not a programming language, but a parallelizing extension to
a conventional programming language. There is C-LINDA, FORTRAN-LINDA
and FORTH-LINDA. LINDA the paradigm is based on the idea of active
and passive tuples. Active tuples are the processes started by some
other process and running with it in co-operation.
Passive tuples are the tools for information exchange. The passive tuples exchange information like notes on a bulletin board.
Paradigms of OCCAM and LINDA do not conflict with one another, but rather complement each other. It is hard to emulate LINDA in OCCAM and it is hard to emulate OCCAM in LINDA efficiently.
The subject of this paper is a FORTH Parallel Extension. What new has arrived with Parallel Forth? Very little.
MULTI Enable multiprocessing mode; SINGLE Disable multiprocessing mode; ||( ... ) Coupled words to add a new parallel branch (all words inside parenthesis) into the current mounted (not running!) processes' ring; EV( ... ) Coupled words to add a new parallel branch into the current running processes' ring. It's like the active tuple in LINDA and the syntax has come from FORTH-LINDA by Jeff Fox; PAR Stop the current process and start a "son" processes' ring mounted by the "parent" process by means of a ||( ... ) -construction. Wait until all "son" processes are done and then run the "parent" process again. PAUSE Deferred (for now!) word. In MULTI -mode - switch to the next process in the current ring. In SINGLE -mode - NOOP; STOP Stop a process and remove it from the current processes' ring; STOP-ALL Stop all processes in the current ring and resume the "parent" process. Very useful to emulate OCCAM ALT -processes; DATA-STACK-SIZE RETURN-STACK-SIZE USER-AREA-SIZE Variables to control USER -area's and stack's sizes. It's because different processes have different demands for stacks and the number of user-variables; BUFFER Generic word to define buffers. Use buffers for the most efficient data exchange between processes (without synchronisation); >B ( n b -- ) Output a 16-bit value to the buffer b; B> ( b -- n ) Input a 16-bit value from the buffer b; CHAN Generic word to define channels. A channel is the buffer of one-equal-length. A channel is useful for the exchange of information with synchronization. CHAN Generic word to define channels' area. >C ( n c -- ) Output a 16-bit value to the channel c; C> ( c -- n ) Input word a 16-bit value from the channel c; WORD Just like F-PC's ARRAY with the same aim.Several notes, about the source code:
1. All the user-variables of "son" that are defined in a program have the same value as its "parent's" ones before ||( ... ) or EV( ... ) -constructions. Thus it's possible to use user-variables to pass parameters from a "parent" to a "son" or to a co-operative "parent".
2. Make sure the USER-AREAs' and stacks' size for each process are big enough to run without hanging up. It's easy to write special words to fill these areas by some char and then to check how many of them were used.
3. If You've start to mount "son" processes' ring by ||( ... ) then you can't run co-operator by EV( ... ) until PAR starts this ring.
4. There is automatic memory allocation for ||(...) and EV(...) processes. But automatic memory deallocation is only for ||(...) -processes (and ALL processes inside it)! Can you guess why?
5. All processes start with empty stacks.
About the programs now. The source of the parallel FORTH extension is in Figure 1. I've used F-PC ver. 3.53. It's one of most complete FORTHs for the IBM-PC compatible and it's PUBLIC DOMAIN! The source doesn't contain CODE words, so you can transfer it easily to another FORTH platform with only minor changes. You can rewrite PAUSE in assembler to increase efficiency, but the best way is to implement Parallel Forth on some real multiprocessor hardware :-)
A parallel program example is shown in Figure 2. I've choosen a scalar vector multiplication as example of using the parallel wordset. The task is like the task X4 from chapter 4.3 of C.A.R.Hoare's "Communication Sequental Processes". I've included the equivalent OCCAM program code as comments. Sorry, I have no transputer so I have not checked out this OCCAM code, but I hope it's valid.
The second example program (Figure 3.) is less "multiprocessing" but more useful. It's an alarm clock. Type:
21 00 ALARM" It's time to go home!"
to start experimenting with Parallel FORTH!
\ ========================== Fig. 1 ============================ \ Dr. Michael B. Montvelishsky, Saransk Russia 1993 \ PARALLELISM WORDSET by Michael Montvelishsky, 25-Sep-93 ANEW PARALLEL CR .FREE \ ============================================================ \ \ It is just a simple troock :-) "TROOCK" is the Russian \ To replace the pause hook :-) pronounce of the "TRICK" \ PAUSE is CODE-word now. ' PAUSE \ OLD DEFER PAUSE ' NOOP IS PAUSE ' PAUSE \ OLD NEW 2DUP SWAP - \ OLD NEW SHIFT OVER 1+ +! \ OLD NEW ( ADJUSTED!) TUCK \ NEW OLD NEW HERE SWAP - \ NEW OLD SIZE CMOVE FORGET PAUSE \ PAUSE is DEFERed NOW! \ ============================================================ \ USER VARIABLE (NEST) \ Point to the last mounted proc VARIABLE (PARENT) \ Point to the parent proc VARIABLE (NEXT) \ Point to the next proc in ring VARIABLE RP \ To save VARIABLE SP \ stacks' value FORTH VARIABLE RETURN-STACK-SIZE \ To change return stack size VARIABLE DATA-STACK-SIZE \ -\\- data stack size VARIABLE USER-AREA-SIZE \ -\\- user area size \ GLOBAL memory allocation tool: DP CONSTANT (HEAP) : MALLOC (HEAP) +! ; : HEAP (HEAP) @ ; \ Initiate variables : UP @ (PARENT) ! UP @ (NEXT) ! 0 (NEST) ! 64 USER-AREA-SIZE ! 128 RETURN-STACK-SIZE ! 64 DATA-STACK-SIZE ! : PROC-SIZE ( -- current-proc-size ) USER-AREA-SIZE @ RETURN-STACK-SIZE @ + DATA-STACK-SIZE @ + ; : GO-NEST SP@ RP@ RP ! SP ! (NEST) @ UP ! RP @ RP! SP @ SP! ; : (PAUSE) SP@ RP@ RP ! SP ! (NEXT) @ UP ! RP @ RP! SP @ SP! ; \ Stop all current processes and go to parent process : STOP-ALL (PARENT) @ (NEXT) ! (PAUSE) ; \ Parallelism enable/disable : MULTI ['] (PAUSE) IS PAUSE ; : SINGLE ['] NOOP IS PAUSE ; \ Stop itself : STOP ( --- : Excludes proc from procs' ring ) UP @ DUP (NEXT) @ - IF \ Last in the ring? UP @ BEGIN (NEXT) 2DUP @ - WHILE @ UP ! REPEAT SWAP UP ! (NEXT) @ SWAP ! (PAUSE) ELSE STOP-ALL THEN ; \ Create NEW process with specified PFA PARRENT -process and \ process to be PREVious in processes' ring : MOUNT ( PFA PARRENT PREV --- : Create and mount new process) HEAP PROC-SIZE MALLOC \ Allocate proc space UP @ DUP >R OVER USER-AREA-SIZE @ CMOVE \ Copy parent's users OVER UP ! (NEXT) @ SWAP UP ! (NEXT) ! \ Adjusts new (NEXT) UP @ USER-AREA-SIZE @ + RETURN-STACK-SIZE @ + \ Adjust for the NEW : DUP DUP RP0 ! 4 - RP ! \ RP0 & RP DATA-STACK-SIZE @ + DUP SP0 ! SP ! \ SP0 & SP SWAP (PARENT) ! \ (PARENT) (NEST) 0! \ (NEST) -ROT RP @ 2! \ PFA to begin with UP @ SWAP UP ! (NEXT) ! \ Point PREV's (NEXT) R> UP ! \ to the NEW ; \ Run-time words for mounting and evaluating new process : (||() 2R@ 4 + UP @ (NEST) @ ?DUP 0= IF HEAP DUP DP ! THEN HEAP (NEST) ! MOUNT ; : (EV() (NEST) @ 0= IF 2R@ 4 + (PARENT) @ (NEXT) @ MOUNT THEN ; : ||( COMPILE (||() COMPILE BRANCH ?>MARK ; IMMEDIATE : EV( COMPILE (EV() COMPILE BRANCH ?>MARK ; IMMEDIATE : ) COMPILE STOP ?>RESOLVE ; IMMEDIATE : PAR (NEST) @ IF GO-NEST DP @ (HEAP) ! (NEST) 0! THEN ; \ Use chans for information exchanging & SYNCHRONISATION : CHAN 2VARIABLE ; : C> ( CHAN -- W : Get word from the channel ) >R BEGIN R@ @ 0= WHILE (PAUSE) REPEAT R@ 2+ @ R> 0! ; : >C ( W CHAN -- : put word to the channel ) >R BEGIN R@ @ WHILE (PAUSE) REPEAT R@ 2+ ! R> -1! ; \ Use buffers for EFFICIENT information exchanging \ BUFF-SIZE and MASK may be increased! 16 CONSTANT BUFF-SIZE 15 CONSTANT MASK : BUFF CREATE 0 , BUFF-SIZE ALLOT ; : .OUT 1+ ; ( IN +0 ) : .BUF 2+ ; : B> ( BUFF -- W : Get word from the buffer ) >R BEGIN R@ C@ R@ .OUT C@ = WHILE (PAUSE) REPEAT R@ .OUT C@ DUP R@ .BUF + @ SWAP 2+ MASK AND R> .OUT C! ; : >B ( W BUFF -- : Put word to the buffer ) >R R@ C@ 2+ MASK AND BEGIN DUP R@ .OUT C@ = WHILE (PAUSE) REPEAT SWAP R@ C@ R@ .BUF + ! R> C! ; : 4* 2* 2* ; \ Generic words for word and channel array : CHAN CREATE 4* HERE OVER ERASE ALLOT ?STACK DOES> SWAP 4* + ; \ IMHO WORD is more useful than F-PC's ARRAY : WORD CREATE 2* HERE OVER ERASE ALLOT ?STACK DOES> SWAP 2* + ; MULTI CR .FREE \ ================================================================ \ ========================== Fig. 2 ============================ \ Dr. Michael B. Montvelishsky, Saransk Russia 1993 fload figure1 30 CONSTANT SIZE \ DEF SIZE = 30 : USER VARIABLE U.I FORTH \ VAR U.I , SIZE WORD V1 \ V1[SIZE], SIZE WORD V2 \ V2[SIZE]: SIZE 1+ CHAN N \ CHAN N[SIZE+1], SIZE CHAN W \ W[SIZE] , SIZE CHAN E \ E[SIZE] : \ VAR T1,T2,T3,T4: \ : VECT-MULT \ PROC VECT.MULT = \ SEQ SIZE 0 DO \ U.I = [0 FOR SIZE] I 1+ I V1 ! \ V1[U.I] := U.I + 1 I 1+ I V2 ! \ V2[U.I] := U.I + 1 LOOP \ PAR SIZE 0 DO I U.I ! \ U.I = [0 FOR SIZE] I U.I ! \ PAR ||( U.I @ V1 @ U.I @ W >C ) \ W[U.I] ! V1[U.I] ||( U.I @ V2 @ U.I @ E >C ) \ E[U.I] ! V2[U.I] ||( \ SEQ U.I @ W C> \ W[U.I] ? T1 U.I @ E C> * \ E[U.I] ? T2 U.I @ N C> + \ N[U.I] ? T3 U.I @ 1+ N >C \ N[U.I+1] ! T1*T2+T3 ) \ LOOP \ ||( 0 0 N >C ) \ N ! 0 ||( SIZE N C> U. ) \ SEQ \ N[SIZE] ? T4 \ WRITE(T4) : PAR ; \ ================================================================ \ ========================== Fig. 3 ============================ \ Dr. Michael B. Montvelishsky, Saransk Russia 1993 fload figure1 \ Alarm clock ANEW ALARM HIDDEN ALSO EDITOR ALSO USER VARIABLE ALARM-HM VARIABLE ALARM-STRING FORTH : ALARM" ( HOURS MINUTES |Dr. Michael B. Montvelishksy:
" ) SWAP FLIP + 0 0. B>T DROP ALARM-HM ! \ Get time HERE ," ALARM-STRING ! \ Get alarming string EV( ALARM-HM @ BEGIN \ Wait ... 1000 FOR PAUSE NEXT \ Skipping for efficiency DUP GETTIME DROP U<= UNTIL DROP SINGLE \ Disable parallelism SAVESCR SAVECURSOR \ Save screen & cursor DKGRAY >BG WHITE >FG \ Select colours TRUE ALARM-STRING @ COUNT ?SOFTERROR \ Alarm !! RESTCURSOR RESTSCR MULTI \ Restore screen&cursor ) \ Enable parallelism ; ONLY FORTH ALSO DEFINITIONS \ ================================================================+
I'm from Saransk (republic Mordovia capital, 600kM to the South-East from the Moscow). I've get my Electronics engineer diploma in 1982 in Electronics department of Mordovia State University. I've came to Leningrad (St.Petersbourg now) University in 1982 to get the postgraduate. I dealt with Robot software and got my Ph.D degree on this matter in 1989. I've get start with Forth in 1988. It's my favourite so far, but I write programs on C and Pascal sometimes. There are many of project I've done (alone or in couple with Max) on Forth. Latest and greatest from its is the embedded system for CNC of Electroerosion machinetool with High Level Geometric (Forth-based) language and CAD-futures. I'm reader (docent) in Mordovia University now. I'm using Forth in my course of programming.
From: Jeff Fox
Dr. Montvelishsky is currently doing work for Ultra Technology. He has submitted several routines for the MuP21 and F21 microprocessors under development at Computer Cowboys by Charles Moore. Dr. Montvelishsky's CORDIC function executes 50 times faster on MuP21 than on a 387. Here at Ultra Technology Dr. Montvelishsky will be consulting on the design of parallel Forth extensions for the F21 Parallel Forth engine, robotics, scientific programming, and on using Forth to teach computer science.
Dr. Michael B. Montvelishsky
avenue Lenin 21 - 29
RUSSIA 430000 Saransk