\ XBENCH32.FTH - benchmark tests from a variety of sources 0 [IF] Change Notes ============ 20050106 MPE002 Modified Dhrystone code to use more macros and the return stack in PROC1. 20041106 MPE001 Adapted from the VFX Forth for Windows version. Introduction ============ These benchmarks have been collated from a variety of sources, and were originally used to test MPE's VFX code generator and optimiser. This version is for embedded system cross compiler code generator testing. The ANS cross compiler draft proposal and 32 bit targets are assumed. To paraphrase a comment by Marcel Hendrix about the Dhrystone benchmark: If you understand what both the C and Forth versions are doing, you'll never trust a benchmark again. Note that the Dhrystone code is a conversion of C code and tells you very little about performance of good Forth code. Except for legacy reasons, no attempt is made to maintain this code for non-optimising systems. This file is maintained by: Stephen Pelc MicroProcessor Engineering 133 Hill Lane Southampton SO15 5AF England tel: +44 (0)23 8063 1441 fax: +44 (0)23 8033 9691 email: stephen@mpeltd.demon.co.uk MicroProcessor Engineering claims no copyright to this code. You can do what you like with it. The code may be freely redistributed. If you modify this code, and want the changes to be incorporated in future releases, please send the changes to Stephen Pelc. The application tests have been separated from the primitive tests. Constants have been declared and modified so that the runtimes of the application tests (Sieve, Fibonacci, QuickSort) can be made similar on MPE's XVFX Forth for ARM. With the values SPECIFICS and HACKING set false, the benchmarks will test the system "out of the box". To set up the benchmark for a new Forth system, follow the harness approach supplied for the MPE Forth 6 systems. Do not modify anything in the main body of the code if you want the results to be comparable. The word COUNTER ( -- ms ) is used to return a timer tick value in milliseconds. The value SPECIFICS is provided to show the effect of particular techniques that are specific to an implementation. The value HACKING is provided to show the impact of general (but guru) code that one should not really have to write when using an optimising compiler. Most of this code makes assumptions about the nature of code generation. The QuickSort test has been refactored to reduce the effect of the array initialisation, and this is included in a separate test. Results for optimising compilers, no specifics, no hackery ========================================================== The first two results are for VFX Forth for Windows and show the impact of different memory layout and cache strategies. The results are times in milliseconds, lower=better. System (Compiler) Total Sieve Fib Sort Random LZ77 Dhry ============================================================================= P4 2.8GHz DDR266 (VFXW) 57 8 16 7 10 6 10 without +IDATA 103 9 16 7 17 44 10 Forth 6 for ARM, compiler build 883 LPC2106 60MHz ARM (VFX) 6060 1190 1090 990 910 1050 830 [THEN] DECIMAL \ ************************************************ \ Select system to be tested, set FORTHSYSTEM \ to value of selected target. \ Set SPECIFICS false to avoid system dependencies. \ Set SPECIFICS true to show off implementation tricks. \ Set HACKING false to use the base source code. \ Set HACKING true to optimise the source code. \ ************************************************ \ Hosted systems should use negative numbers. Some target \ code macros are defined differently for hosted and cross \ compiled systems. This allows us to test both a cross compiler \ and its standalone target. -1 CONSTANT VFXW \ nz for VFX Forth for Windows compatibility check 1 CONSTANT MPEv6 \ MPE Forth 6 compilers MPEv6 ( VFXW ) CONSTANT ForthSystem \ select system to test 0 CONSTANT specifics \ true to use system dependent code 0 CONSTANT hacking \ true to use "guru" level code that \ makes assumptions of an optimising compiler. : .specifics \ -- ; display trick state ." with" specifics 0= IF ." out" THEN ." extensions" ; : .hacking \ -- ; display hack state ." with" hacking 0= IF ." out" THEN ." hackery" ; : .testcond \ -- ; display test conditions .specifics ." and" .hacking ; \ ***************************** \ VFX Forth for Windows harness \ ***************************** VFXW ForthSystem = [IF] : interpreter \ -- ; ANS-XC compatibility ; : compiler \ -- ; ANS-XC compatibility ; : target \ -- ; ANS-XC compatibility ; : idata \ -- ; ANS-XC compatibility ; immediate : udata \ -- ; ANS-XC compatibility ; immediate : cdata \ -- ; ANS-XC compatibility ; immediate [defined] +idata [if] +idata \ turn on P4 optimisations variable zzz \ trigger IDATA allocation [then] [undefined] m*/ [if] [-sin : m*/ \ d1 n2 +n3 -- dquot \ *G The result dquot=(d1*n2)/n3. The intermediate value d1*n2 \ ** is triple-precision. In an ANS Forth standard program n3 \ ** can only be a positive signed number and a negative value \ ** for n3 generates an ambiguous condition, which may cause \ ** an error on some implementations. >r \ -- d1 n2 ; R: -- n3 s>d >r abs \ -- d1 |n2| ; R: -- n3 sign(n2) -rot \ -- |n2| d1 ; R: -- n3 sign(n2) s>d r> xor \ -- |n2| d1 d1h*sign(n2) ; R: -- n3 r> swap >r >r \ -- |n2| d1 ; R: -- d1h*sign(n2) n3 dabs rot \ -- |d1| |n2| ; R: -- d1h*sign(n2) n3 tuck um* 2swap um* \ -- d1h*n2 d1l*n2 ; R: -- d1h*sign(n2) n3 swap >r 0 d+ r> -rot \ -- t ; R: -- d1h*sign(n2) n3 r@ um/mod -rot r> um/mod nip swap r> IF dnegate THEN ; sin] [then] Extern: BOOL PASCAL QueryPerformanceCounter( void * int64 ); Extern: BOOL PASCAL QueryPerformanceFrequency( void * int64 ); : Counter \ -- ms \ *G Return a ticker count in milliseconds. \ *E seconds = count / freq \ ** ms = (count * 1000) / freq \ *P Note that we assume that frequency can be expressed as \ ** a positive 32 bit number. { | count[ 2 cells ] freq[ 2 cells ] -- ms } count[ QueryPerformanceCounter drop freq[ QueryPerformanceFrequency drop count[ 2@ swap \ count #1000 freq[ @ m*/ drop \ ms ; [undefined] >pos [if] : >pos \ n -- ; step to position n out @ - spaces ; [then] [THEN] \ ******************* \ MPE Forth 6 harness \ ******************* MPEv6 ForthSystem = [IF] \ #8 LoopAlignment \ ARM and 386 only : Counter \ -- ms \ *G Return a ticker count in milliseconds. Ticks ; [undefined] >pos [if] : >pos \ n -- ; step to position n out @ - spaces ; [then] : page \ -- cls ; [THEN] \ ************************************* \ Let's measure the generated code size \ ************************************* here value start-here \ ************************************ \ FORTH, Inc. 32 Bit Benchmark Source \ ************************************ CELL NEGATE CONSTANT -CELL CR .( Loading benchmark routines ) \ *********************** \ Benchmark support words \ *********************** \ column positions 40 constant time-pos 50 constant iter-pos 60 constant each-pos 70 constant extra-pos : .HEADER \ -- ; display test header cr ." Test time including overhead" time-pos 3 + >pos ." ms" iter-pos >pos ." times" each-pos >pos ." ns (each)" ; variable ms-elapsed \ elapsed time for one test variable ms-total \ cumulative time for a series of tests : TIMER ( ms iterations -- ) >r \ number of iterations counter swap - \ elapsed time in ms dup ms-elapsed ! \ save for later dup ms-total +! \ accumulate in series time-pos >pos dup 5 .r iter-pos >pos r@ . r@ 1 > if each-pos >pos 1000000 r> */ 5 .r else drop r> drop then ; : .ann \ -- ; banner announcment CR ; (( : [$ \ -- ms COUNTER ; )) : [$ \ -- ms COUNTER begin \ reduce timer granularity counter tuck <> \ by waiting for the next tick until ; : [[$$ \ -- ; initialises a set of tests 0 ms-total ! ; \ $] is the suffix to a testing word. It takes the fast ticks \ timer value and then calculates and sisplays the elapsed time. : $] ( ms n -- ) TIMER ; : $$]] \ iterations -- >r \ number of iterations ms-total @ time-pos >pos dup 5 .r iter-pos >pos r@ . r@ 1 > if each-pos >pos 1000000 r> */ 5 .r else drop r> drop then ; \ ****** \ Arrays \ ****** [UNDEFINED] CARRAY [IF] interpreter : CARRAY ( n -- ) \ CARRAY creates a byte size array. udata CREATE ALLOT cdata DOES> ( n -- a ) + ; target [THEN] [UNDEFINED] ARRAY [IF] interpreter : ARRAY ( n -- ) \ ARRAY creates a cell size array. udata CREATE CELLS ALLOT cdata DOES> ( n -- a ) SWAP CELLS + ; target [THEN] \ **************************** \ Basic FORTH, Inc. Benchmarks \ **************************** \ This series of tests analyses the Forth primitives. 100000 constant /prims \ -- #iterations; all of these words return the number of iterations : $DO$ .ann ." DO LOOP" [$ /prims DUP 0 DO I DROP LOOP $] ; : $*$ .ann ." *" [$ /prims DUP 0 DO I I * DROP LOOP $] ; : $+$ .ann ." +" [$ /prims DUP 1+ 1 DO 1000 I + DROP LOOP $] ; : $M*$ .ann ." M*" [$ /prims DUP 0 DO I I M* 2DROP LOOP $] ; : $/$ .ann ." /" [$ /prims DUP 1+ 1 DO 1000 I / DROP LOOP $] ; : $M/$ .ann ." M/" [$ /prims DUP 1+ 1 DO 1000 0 I M/ DROP LOOP $] ; : $M+$ .ann ." M+" [$ /prims DUP 1+ 1 DO 1000 0 I M+ 2DROP LOOP $] ; : $/MOD$ .ann ." /MOD" [$ /prims DUP 1+ 1 DO 1000 I /MOD 2DROP LOOP $] ; \ $*/$ tests the math primitive */ . This may or may not tell \ you how the other math primitives perform depending on \ how */ has been coded. : $*/$ .ann ." */" [$ /prims DUP 1+ 1 DO I I I */ DROP LOOP $] ; \ ********************* \ Shared results buffer \ ********************* \ All tests needing a results buffer share a common buffer \ to save RAM space. By default this is 2048 cells (8k bytes). 2048 constant /OpCells /OpCells cells constant /OpBuff /OpBuff BUFFER: Opbuff \ **************************************** \ Eratosthenes sieve benchmark program \ This is NOT the original BYTE benchmark. \ **************************************** #100 CONSTANT /primes \ -- n ; number of times to execute test 8190 CONSTANT SIZE \ -- n SIZE /OpBuff u> [IF] cr ." Sieve: Output buffer size exceeded" abort [THEN] OpBuff CONSTANT FLAGS \ Alias for buffer : DO-PRIME \ -- FLAGS SIZE -1 FILL 0 SIZE 0 DO I FLAGS + C@ IF I 2* 3 + DUP I + BEGIN DUP SIZE < WHILE DUP FLAGS + 0 SWAP C! OVER + REPEAT 2DROP 1+ THEN LOOP DROP ; : $SIEVE$ \ -- .ann ." Eratosthenes sieve " [$ /primes 0 DO DO-PRIME LOOP SIZE /primes * ." 1899 Primes" $] ; \ ******************* \ Fibonacci recursion \ ******************* 30 constant /fib : FIB ( n -- n' ) DUP 1 > IF DUP 1- RECURSE SWAP 2- RECURSE + THEN ; : $FIB$ .ann ." Fibonacci recursion ( " [$ /fib dup . ." -> " FIB dup . ." )" /fib - $] ; \ ********************************* \ QuickSort from Hoare & Wil Baden \ also contains the array fill test \ ********************************* 20 constant /fill \ -- n ; number of times to execute fill test 100 constant /sort \ -- n ; number of times to execute sort test /OpCells 2 - constant /array \ -- n ; number of elements to sort OpBuff constant PointerArray \ -- addr ; synonym for buffer 7 CELLS CONSTANT THRESHOLD compiler : Precedes ( n n - f ) u< ; : Exchange ( a1 a2 -- ) 2DUP @ SWAP @ ROT ! SWAP ! ; target : Both-Ends ( f l pivot - f l ) >R BEGIN OVER @ R@ precedes WHILE CELL 0 D+ REPEAT BEGIN R@ OVER @ precedes WHILE CELL - REPEAT R> DROP ; : Order3 ( f l - f l pivot) 2DUP OVER - 2/ -CELL AND + >R DUP @ R@ @ precedes IF DUP R@ Exchange THEN OVER @ R@ @ SWAP precedes IF OVER R@ Exchange DUP @ R@ @ precedes IF DUP R@ Exchange THEN THEN R> ; : Partition ( f l - f l' f' l) Order3 @ >R 2DUP CELL -CELL D+ BEGIN R@ Both-Ends 2DUP 1+ precedes IF 2DUP Exchange CELL -CELL D+ THEN 2DUP SWAP precedes UNTIL R> DROP SWAP ROT ; : Sink ( f key where - f) ROT >R BEGIN CELL - 2DUP @ precedes WHILE DUP @ OVER CELL + ! DUP R@ = IF ! R> EXIT THEN ( key where) REPEAT CELL + ! R> ; : Insertion ( f l) 2DUP precedes IF CELL + OVER CELL + DO I @ I Sink CELL +LOOP DROP ELSE ( f l) 2DROP THEN ; : Hoarify ( f l - ...) BEGIN 2DUP THRESHOLD 0 D+ precedes WHILE Partition 2DUP - >R 2OVER - R> > IF 2SWAP THEN REPEAT Insertion ; : QUICK ( f l) DEPTH >R BEGIN Hoarify DEPTH R@ < UNTIL R> DROP ; : SORT ( a n) DUP 0= ABORT" Nothing to sort " 1- CELLS OVER + QUICK ; : fillp \ -- ; fill sort array once /array 0 ?DO /array I - I cells PointerArray + ! LOOP ; : $FILL$ \ -- .ann ." ARRAY fill" [$ /fill 0 DO fillp LOOP /fill /array * $] ; : $SORT$ .ann ." Hoare's quick sort (reverse order) " [$ /sort 0 DO fillp PointerArray /array SORT LOOP /sort /array * $] ; \ ******************************* \ End of Forth Inc benchmark code \ ******************************* \ ********************************* \ "Random" Numbers \ ********************************* 15 constant /randtest \ -- n ; times to test /Opcells constant /random \ -- n OpBuff constant RandBuff variable ShiftRegister 1 ShiftRegister ! : RandBit \ -- 0..1 ; Generates a "random" bit. ShiftRegister @ dup dup 3 rshift xor 1 and if \ XOR of bits 31 and 28, where bit 31=lsb dup 1 RSHIFT $40000000 OR ShiftRegister ! else dup 1 RSHIFT ShiftRegister ! then \ Store new shift register value. 1 and \ return original bit 31 ; : RandBits \ n -- 0..2^(n-1) ; Generate an n-bit "random" number. 0 \ Result's start value. swap 0 do 2* RandBit or loop \ Generate next "random" bit. ; : (randtest) \ -- 1 ShiftRegister ! RandBuff /random 0 do 32 RandBits over i cells + ! loop drop ; : $RAND$ .ann ." Generate " /random . ." random number array" [$ /randtest 0 do (randtest) loop /random /randtest * $] ; \ ********************************* \ LZ77 compression \ ********************************* 3 constant /lz77test \ number of test iterations 0 Value lz77-buffer 0 Value lz77-Pos 0 Value lz77-BytesLeft /OpCells constant /lz77-size : init-test-buffer \ -- /lz77-size cells to lz77-BytesLeft OpBuff dup to lz77-buffer to lz77-Pos /lz77-size 0 do 32 randbits lz77-buffer i cells + ! loop ; : free-test-buffer \ -- ; : getnextchar \ -- char true | false lz77-BytesLeft dup if drop lz77-BytesLeft 1- to lz77-BytesLeft lz77-Pos dup 1+ to lz77-Pos c@ true then ; : lz77-read-file \ addr len fileid -- u2 ior drop 0 rot rot 0 do \ done addr -- getnextchar if over c! 1+ swap 1+ swap else leave then loop drop 0 ; : lz77-write-file \ addr len fileid -- ior drop 2drop 0 ; : closed \ fileid -- drop ; : checked \ flag -- ABORT" File Access Error. " ; : read-char \ file -- char drop getnextchar 0= if -1 then ; ( LZSS -- A Data Compression Program ) ( 89-04-06 Standard C by Haruhiko Okumura ) ( 94-12-09 Standard Forth by Wil Baden ) ( Use, distribute, and modify this program freely. ) 1024 CONSTANT NRB ( Size of Ring Buffer ) 18 CONSTANT F ( Upper Limit for match-length ) 2 CONSTANT Threshold ( Encode string into position & length ( if match-length is greater. ) NRB CONSTANT Nil ( Index for Binary Search Tree Root ) VARIABLE textsize ( Text Size Counter ) VARIABLE codesize ( Code Size Counter ) \ VARIABLE printcount ( Counter for Reporting Progress ) ( These are set by InsertNode procedure. ) VARIABLE match-position VARIABLE match-length NRB F + 1 - carray text-buf ( Ring buffer of size NRB, with extra ( F-1 bytes to facilitate string comparison. ) ( Left & Right Children and Parents -- Binary Search Trees ) NRB 1 + array lson NRB 257 + array rson NRB 1 + array dad ( Input & Output Files ) 0 VALUE infile 0 VALUE outfile \ For i = 0 to NRB - 1, rson[i] and lson[i] will be the right and \ left children of node i. These nodes need not be initialized. \ Also, dad[i] is the parent of node i. These are initialized to \ Nil = NRB, which stands for `not used.' \ For i = 0 to 255, rson[NRB + i + 1] is the root of the tree \ for strings that begin with character i. These are initialized \ to Nil. Note there are 256 trees. ( Initialize trees. ) : InitTree ( -- ) NRB 257 + NRB 1 + DO Nil I rson ! LOOP NRB 0 DO Nil I dad ! LOOP ; ( Insert string of length F, text_buf[r..r+F-1], into one of the ( trees of text_buf[r]'th tree and return the longest-match position ( and length via the global variables match-position and match-length. ( If match-length = F, then remove the old node in favor of the new ( one, because the old one will be deleted sooner. ( Note r plays double role, as tree node and position in buffer. ) : InsertNode ( r -- ) Nil OVER lson ! Nil OVER rson ! 0 match-length ! DUP text-buf C@ NRB + 1 + ( r p) 1 ( r p cmp) BEGIN ( r p cmp) 0< not IF ( r p) DUP rson @ Nil <> IF rson @ ELSE 2DUP rson ! SWAP dad ! ( ) EXIT THEN ELSE ( r p) DUP lson @ Nil <> IF lson @ ELSE 2DUP lson ! SWAP dad ! ( ) EXIT THEN THEN ( r p) 0 F DUP 1 DO ( r p 0 F) 3 PICK I + text-buf C@ ( r p 0 F c) 3 PICK I + text-buf C@ - ( r p 0 F diff) ?DUP IF NIP NIP I LEAVE THEN ( r p 0 F) LOOP ( r p cmp i) DUP match-length @ > IF 2 PICK match-position ! DUP match-length ! F < not ELSE DROP FALSE THEN ( r p cmp flag) UNTIL ( r p cmp) DROP ( r p) 2DUP dad @ SWAP dad ! 2DUP lson @ SWAP lson ! 2DUP rson @ SWAP rson ! 2DUP lson @ dad ! 2DUP rson @ dad ! DUP dad @ rson @ OVER = IF TUCK dad @ rson ! ELSE TUCK dad @ lson ! THEN ( p) dad Nil SWAP ! ( Remove p ) ( ) ; ( Deletes node p from tree. ) : DeleteNode ( p -- ) DUP dad @ Nil = IF DROP EXIT THEN ( Not in tree. ) ( CASE ) ( p) DUP rson @ Nil = IF DUP lson @ ELSE DUP lson @ Nil = IF DUP rson @ ELSE DUP lson @ ( p q) DUP rson @ Nil <> IF BEGIN rson @ DUP rson @ Nil = UNTIL DUP lson @ OVER dad @ rson ! DUP dad @ OVER lson @ dad ! OVER lson @ OVER lson ! OVER lson @ dad OVER SWAP ! THEN OVER rson @ OVER rson ! OVER rson @ dad OVER SWAP ! ( ESAC ) THEN THEN ( p q) OVER dad @ OVER dad ! OVER DUP dad @ rson @ = IF OVER dad @ rson ! ELSE OVER dad @ lson ! THEN ( p) dad Nil SWAP ! ( ) ; 17 carray code-buf VARIABLE len VARIABLE last-match-length VARIABLE code-buf-ptr VARIABLE mask : Encode ( -- ) 0 textsize ! 0 codesize ! InitTree ( Initialize trees. ) \ code_buf[1..16] saves eight units of code, and code_buf[0] \ works as eight flags, "1" representing that the unit is an \ unencoded letter in 1 byte, "0" a position-and-length pair \ in 2 bytes. Thus, eight units require at most 16 bytes \ of code. 0 0 code-buf C! 1 mask C! 1 code-buf-ptr ! 0 NRB F - ( s r) ( Clear the buffer with any character that will appear often. ) 0 text-buf NRB F - BL FILL ( Read F bytes into the last F bytes of the buffer. ) DUP text-buf F infile LZ77-READ-FILE checked ( s r count) DUP len ! DUP textsize ! 0= IF EXIT THEN ( s r) ( Insert the F strings, each of which begins with one or more ( `space' characters. Note the order in which these strings ( are inserted. This way, degenerate trees will be less ( likely to occur. ) F 1 + 1 DO ( s r) DUP I - InsertNode LOOP ( Finally, insert the whole string just read. The ( global variables match-length and match-position are set. ) DUP InsertNode BEGIN ( s r) \ key? drop \ del SFP001 ( match_length may be spuriously long at end of text. ) match-length @ len @ > IF len @ match-length ! THEN match-length @ Threshold > not IF ( Not long enough match. Send one byte. ) 1 match-length ! ( `send one byte' flag ) mask C@ 0 code-buf C@ OR 0 code-buf C! ( Send uncoded. ) DUP text-buf C@ code-buf-ptr @ code-buf C! 1 code-buf-ptr +! ELSE ( Send position and length pair. ( Note match-length > Threshold. ) match-position @ code-buf-ptr @ code-buf C! 1 code-buf-ptr +! match-position @ 8 RSHIFT 4 LSHIFT ( . . j) match-length @ Threshold - 1 - OR code-buf-ptr @ code-buf C! ( . .) 1 code-buf-ptr +! THEN ( Shift mask left one bit. ) ( . .) mask C@ 2* mask C! mask C@ 0= IF ( Send at most 8 units of code together. ) 0 code-buf code-buf-ptr @ ( . . a k) outfile LZ77-WRITE-FILE checked ( . .) code-buf-ptr @ codesize +! 0 0 code-buf C! 1 code-buf-ptr ! 1 mask C! THEN ( s r) match-length @ last-match-length ! last-match-length @ DUP 0 DO ( s r n) infile read-char ( s r n c) DUP 0< IF 2DROP I LEAVE THEN ( Delete old strings and read new bytes. ) 3 PICK DeleteNode DUP 3 1 + PICK text-buf C! ( If the position is near end of buffer, extend ( the buffer to make string comparison easier. ) 3 PICK F 1 - < IF ( s r n c) DUP 3 1 + PICK NRB + text-buf C! THEN DROP ( s r n) ( Since this is a ring buffer, increment the ( position modulo NRB. ) >R >R ( s) 1 + NRB 1 - AND R> ( s r) 1 + NRB 1 - AND R> ( s r n) ( Register the string in text_buf[r..r+F-1]. ) OVER InsertNode LOOP ( s r i) DUP textsize +! \ textsize @ printcount @ > IF \ ( Report progress each time the textsize exceeds \ ( multiples of 1024. ) \ textsize @ 12 .R \ 1024 printcount +! \ THEN ( After the end of text, no need to read, but ( buffer may not be empty. ) last-match-length @ SWAP ?DO ( s r) OVER DeleteNode >R 1 + NRB 1 - AND R> 1 + NRB 1 - AND -1 len +! len @ IF DUP InsertNode THEN LOOP len @ 0> not UNTIL 2DROP ( Send remaining code. ) code-buf-ptr @ 1 > IF 0 code-buf code-buf-ptr @ outfile LZ77-WRITE-FILE checked code-buf-ptr @ codesize +! THEN ; : code77 \ -- init-test-buffer encode free-test-buffer ; : $CODE77$ .ann ." LZ77 Comp. (" /lz77-size . ." cells Data Mem>Mem)" [$ /lz77test 0 do code77 loop /lz77test $] ; \ ********************************************* \ DHRYSTONE integer benchmark by Marcel Hendrix \ ********************************************* 0 [IF] "DHRYSTONE" Benchmark Program Version: Forth/1 Date: 05/03/86 Author: Reinhold P. Weicker, CACM Vol 27, No 10, 10/84 pg. 1013 C version translated from ADA by Rick Richardson. Every method to preserve ADA-likeness has been used, at the expense of C-ness. Modula-2 version translated from C by Kevin Northover. Again every attempt made to avoid distortions of the original. Forth version translated from Modula-2 by Marcel Hendrix. Distorting the original was inevitable, given the differences between a strongly typed and a user-extensible language. Moreover, there is serious doubt of the instruction mix being appropriate for Forth. The following program contains statements of a high-level programming language (Forth) in a distribution considered representative: statements 53% control statements 32% procedures, function calls 15% 100 statements are dynamically executed. The program is balanced with respect to the three aspects: - statement type - operand type (for simple data types) - operand access operand global, local parameter, or constant. The combination of these three aspects is balanced only approximately. The program does not compute anything meaningful, but it is syntactically and semantically correct. The source code was "pre-optimized" on a word-to-word basis with the programmer acting as a pre-processor to the compiler. Real Forth programmers would rather be found dead than write disgusting programs like this. If you understand what both the C and Forth versions are doing, you'll never trust a benchmark again. [THEN] DECIMAL \ -- Control human fatigue factor 30000 VALUE LOOPS \ -- Some types 1 CONSTANT Ident1 2 CONSTANT Ident2 3 CONSTANT Ident3 4 CONSTANT Ident4 5 CONSTANT Ident5 0 CONSTANT NIL CHAR A CONSTANT 'A' CHAR B CONSTANT 'B' CHAR C CONSTANT 'C' CHAR W CONSTANT 'W' CHAR X CONSTANT 'X' CHAR Z CONSTANT 'Z' 0 VALUE /bytes \ -- Some global variables 0 VALUE IntGlob 0 VALUE BoolGlob 0 VALUE Char1Glob 0 VALUE Char2Glob NIL VALUE PtrGlb NIL VALUE PtrGlbNext 50 CELLS buffer: Array1Glob 50 DUP * CELLS buffer: Array2Glob \ The following record declaration is only kept for hosted \ systems as most cross compilers cannot handle second order \ defining words. The draft ANS cross compiler recommendations \ also handle macros differently. ForthSystem 0< [IF] \ Hosted version : RECORD CREATE 0 TO /bytes HERE 0 , \ -- sys DOES> @ ALLOCATE THROW \ -- addr IMMEDIATE ; : END /bytes SWAP ! ; \ ( sys -- ) : SIMPLE-TYPE CREATE , \ ( fieldlength> -- ) DOES> @ CREATE IMMEDIATE /bytes , /bytes + TO /bytes DOES> @ \ ( 'record -- 'offset ) S" LITERAL + " EVALUATE ; 1 CELLS SIMPLE-TYPE RecordPtr 1 CELLS SIMPLE-TYPE Enumeration \ one of Ident1 .. Ident5 1 CELLS SIMPLE-TYPE OneToFifty 31 CHARS SIMPLE-TYPE String30 \ extra count byte RECORD RecordType \ offset RecordPtr PtrComp \ 0 Enumeration Discr \ 1 CELLS Enumeration EnumComp \ 2 CELLS OneToFifty IntComp \ 3 CELLS String30 StringComp \ 4 CELLS END : AllocStr \ -- addr addr1 31 allocate throw 31 allocate throw ; : AllocPtr \ -- RecordType TO PtrGlb \ constructor, allocates ! RecordType TO PtrGlbNext ; : FreeAllocs \ a1 a2 a3 a4 -- FREE THROW FREE THROW FREE THROW FREE THROW ; \ -- Some obvious macros : [] S" CELLS" EVALUATE ; IMMEDIATE : [][] S" 50 * + CELLS" EVALUATE ; IMMEDIATE : ADDRESS ; IMMEDIATE : Proc7 S" + 2 + " EVALUATE ; IMMEDIATE \ n1 n2 -- n3 : Func3 S" Ident3 = " EVALUATE ; IMMEDIATE : Proc4 S" 'B' TO Char2Glob " EVALUATE ; IMMEDIATE : Proc5 S" 'A' TO Char1Glob FALSE TO BoolGlob " EVALUATE ; IMMEDIATE : Func1 ( char1 char2 -- n ) S" = IF Ident2 ELSE Ident1 THEN " EVALUATE ; IMMEDIATE [ELSE] \ Cross compiler versions 0 \ name size offset dup constant cell + \ 0 dup constant cell + \ 1 CELLS dup constant cell + \ 2 CELLS dup constant cell + \ 3 CELLS dup constant 31 + \ 4 CELLS constant /RecordType compiler : PtrComp ( + ) ; : Discr + ; : EnumComp + ; : IntComp + ; : StringComp + ; target 31 buffer: 31 buffer: /RecordType buffer: /RecordType buffer: : AllocStr \ -- addr addr1 ; : AllocPtr \ -- TO PtrGlb TO PtrGlbNext ; : FreeAllocs \ a1 a2 a3 a4 -- 2drop 2drop ; \ -- Some obvious macros compiler : [] CELLS ; : [][] 50 * + CELLS ; : ADDRESS ; : Proc7 + 2 + ; \ n1 n2 -- n3 : Func3 Ident3 = ; : Func1 ( char1 char2 -- n ) = IF Ident2 ELSE Ident1 THEN ; : Proc4 'B' TO Char2Glob ; : Proc5 'A' TO Char1Glob FALSE TO BoolGlob ; target [THEN] : Proc3 PtrGlb IF PtrGlb PtrComp @ \ 'record -- SWAP ! ELSE DROP 100 TO IntGlob THEN 10 IntGlob Proc7 PtrGlb IntComp ! ; : Proc6 ( n1 n2 -- n ) OVER LOCALS| n n2 n1 | n1 Func3 0= IF Ident4 TO n THEN CASE n1 Ident1 OF Ident1 ENDOF Ident2 OF IntGlob 100 > IF Ident1 ELSE Ident4 THEN ENDOF Ident3 OF Ident2 ENDOF Ident4 OF n ENDOF Ident5 OF Ident3 ENDOF ABORT" Proc6: argument out of range" ENDCASE ; (( 0 VALUE p^ : Proc1 ( 'record -- ) TO p^ PtrGlb p^ PtrComp ! 5 p^ IntComp ! p^ IntComp @ p^ PtrComp @ IntComp ! p^ PtrComp @ p^ PtrComp @ PtrComp @ ! p^ PtrComp @ PtrComp @ Proc3 p^ PtrComp @ Discr @ Ident1 = IF 6 p^ PtrComp @ IntComp ! p^ PtrComp @ EnumComp p^ EnumComp @ OVER @ Proc6 SWAP ! PtrGlb PtrComp p^ PtrComp @ PtrComp ! p^ PtrComp @ IntComp DUP @ 10 Proc7 SWAP ! ELSE p^ PtrComp @ p^ ! THEN ; )) (( : Proc1 ( 'record -- ) LOCALS| p^ | PtrGlb p^ PtrComp ! 5 p^ IntComp ! p^ IntComp @ p^ PtrComp @ IntComp ! p^ PtrComp @ p^ PtrComp @ PtrComp @ ! p^ PtrComp @ PtrComp @ Proc3 p^ PtrComp @ Discr @ Ident1 = IF 6 p^ PtrComp @ IntComp ! p^ PtrComp @ EnumComp p^ EnumComp @ OVER @ Proc6 SWAP ! PtrGlb PtrComp p^ PtrComp @ PtrComp ! p^ PtrComp @ IntComp DUP @ 10 Proc7 SWAP ! ELSE p^ PtrComp @ p^ ! THEN ; )) : Proc1 ( 'record -- ) >r PtrGlb r@ PtrComp ! 5 r@ IntComp ! r@ IntComp @ r@ PtrComp @ IntComp ! r@ PtrComp @ r@ PtrComp @ PtrComp @ ! r@ PtrComp @ PtrComp @ Proc3 r@ PtrComp @ Discr @ Ident1 = IF 6 r@ PtrComp @ IntComp ! r@ PtrComp @ EnumComp r@ EnumComp @ OVER @ Proc6 SWAP ! PtrGlb PtrComp r@ PtrComp @ PtrComp ! r@ PtrComp @ IntComp DUP @ 10 Proc7 SWAP ! ELSE r@ PtrComp @ r@ ! THEN r> drop ; : Proc2 ( val -- val' ) DUP 10 + LOCALS| IntLoc | BEGIN Char1Glob 'A' = \ This one never ends WHILE IntLoc 1- TO IntLoc \ unless Char = 'A' ?? DROP IntLoc IntGlob - TRUE UNTIL THEN ; : Proc8 ( 'array1 'array2 n1 n2 -- ) SWAP 5 + LOCALS| IntLoc IntParI2 Array2Par Array1Par | IntLoc [] Array1Par + IntParI2 OVER ! ( addr) @ IntLoc 1+ [] Array1Par + ! IntLoc DUP 30 + [] Array1Par + ! IntLoc DUP DUP [][] Array2Par + ! IntLoc DUP DUP 1+ [][] Array2Par + ! 1 IntLoc DUP 1- [][] Array2Par + +! IntLoc [] Array1Par + @ IntLoc DUP 20 + SWAP [][] Array2Par + ! 5 TO IntGlob ; : Func2 ( '$1 '$2 -- bool ) 2 BL LOCALS| CharLoc IntLoc '$2 '$1 | BEGIN IntLoc 2 <= WHILE IntLoc 1+ '$1 + C@ IntLoc 2 + '$2 + C@ Func1 Ident1 = IF 'A' TO CharLoc IntLoc 1+ TO IntLoc THEN REPEAT CharLoc 'W' >= IF CharLoc 'Z' <= IF 7 TO IntLoc THEN \ dead code, IntLoc never used! THEN CharLoc 'X' = IF TRUE EXIT THEN '$1 COUNT '$2 COUNT COMPARE 0> IF 7 IntLoc + TO IntLoc TRUE \ dead code, IntLoc is local ELSE FALSE THEN ; : Proc0 \ -- AllocStr 0 0 0 0 0 0 \ The following must be on ONE line or Win32Forth will crash. LOCALS| CharIndex CharLoc EnumLoc IntLoc3 IntLoc2 IntLoc1 String2Loc String1Loc | AllocPtr PtrGlbNext PtrGlb PtrComp ! Ident1 PtrGlb Discr ! Ident3 PtrGlb EnumComp ! 40 PtrGlb IntComp ! C" DHRYSTONE PROGRAM, SOME STRING" DUP C@ 1+ PtrGlb StringComp SWAP CMOVE LOOPS 0 DO Proc5 Proc4 2 TO IntLoc1 3 TO IntLoc2 C" DHRYSTONE PROGRAM, 2'ND STRING" DUP C@ 1+ String2Loc SWAP CMOVE Ident2 TO EnumLoc String1Loc String2Loc Func2 INVERT TO BoolGlob BEGIN IntLoc1 IntLoc2 < WHILE IntLoc1 5 * IntLoc2 - TO IntLoc3 IntLoc1 IntLoc2 Proc7 TO IntLoc3 \ The Forth way IntLoc1 1+ TO IntLoc1 REPEAT ADDRESS Array1Glob ADDRESS Array2Glob IntLoc1 IntLoc2 Proc8 PtrGlb Proc1 'A' TO CharIndex BEGIN CharIndex Char2Glob <= WHILE CharIndex 'C' Func1 EnumLoc = IF Ident1 EnumLoc Proc6 TO EnumLoc THEN CharIndex 1+ TO CharIndex REPEAT IntLoc1 IntLoc2 * TO IntLoc3 IntLoc3 IntLoc1 / TO IntLoc2 IntLoc3 IntLoc2 - 7 * IntLoc1 - TO IntLoc2 IntLoc1 Proc2 TO IntLoc1 \ the Forth way LOOP PtrGlb PtrGlbNext String1Loc String2Loc FreeAllocs ; : $DHRY$ \ -- .ann ." Dhrystone (integer)" [$ proc0 loops $] extra-pos >pos ms-elapsed @ if LOOPS 1000 ms-elapsed @ */ . ." Dhrystones/sec" then ; \ ************************* \ The main benchmark driver \ ************************* 0 value /CodeSize : .CodeSize cr ." Benchmark code size" time-pos 2- >pos /CodeSize 7 .r ." bytes." ; : BENCHMARK page .CodeSize cr .ann ." Primitives" .testcond cr .header [[$$ $DO$ $+$ $M+$ $*$ $M*$ $/$ $M/$ $/MOD$ $*/$ $FILL$ CR ." Total:" 1 $$]] cr cr .ann ." Application performance" .testcond CR .header [[$$ $SIEVE$ $FIB$ $SORT$ $RAND$ $CODE77$ $DHRY$ CR ." Total:" 1 $$]] ; decimal here start-here - to /CodeSize here start-here - CR CR .( Code size = ) . .( bytes ) CR CR .( To run the benchmark program, type BENCHMARK )