   comp.lang.forth      Forth programmers eat a lot of Bratwurst      117,927 messages   

   Message 117,619 of 117,927   
   Hans Bezemer to Anton Ertl   
   Re: 3dup again (1/2)   
   05 Oct 25 11:29:41   
   
   From: the.beez.speaks@gmail.com   
      
   On 02-10-2025 22:44, Anton Ertl wrote:   
    > Locals-haters, come to Gforth, where locals are implemented   
    > inefficiently:-).  The code for 3DUP.2 is actually optimal for   
    > Gforth's calling convention.   
      
   I think I can beat you at that :)   
      
   1. This is the LOCAL definition   
      
      Addr| Opcode                        Operand   Argument   
      
        35| branch                             46   local   
        36| r>                                  0   
        37| swap                                0   
        38| dup                                 0   
        39| >r                                  0   
        40| @                                   0   
        41| >r                                  0   
        42| execute                             0   
        43| r>                                  0   
        44| r>                                  0   
        45| !                                   0   
        46| exit                                0   
      
   2. This is the 3DUP with locals   
      
      
      Addr| Opcode                        Operand   Argument   
      
        54| branch                             82   3dup.1   
        55| literal                             0   
        56| to                                  0   a   
        57| literal                             0   
        58| environ                             1   
        59| +                                   0   
        60| call                               35   local   
        61| literal                             0   
        62| to                                  1   b   
        63| literal                             1   
        64| environ                             1   
        65| +                                   0   
        66| call                               35   local   
        67| literal                             0   
        68| to                                  2   c   
        69| literal                             2   
        70| environ                             1   
        71| +                                   0   
        72| call                               35   local   
        73| to                                  2   c   
        74| to                                  1   b   
        75| to                                  0   a   
        76| value                               0   a   
        77| value                               1   b   
        78| value                               2   c   
        79| value                               0   a   
        80| value                               1   b   
        81| value                               2   c   
        82| exit                                0   
      
   3. This is 3DUP *without* locals   
      
      Addr| Opcode                        Operand   Argument   
      
        83| branch                             91   3dup.2   
        84| >r                                  0   
        85| over                                0   
        86| over                                0   
        87| r@                                  0   
        88| rot                                 0   
        89| rot                                 0   
        90| r>                                  0   
        91| exit                                0   
      
   And this is the source code:   
      
   include lib/anstools.4th   
   include 4pp/lib/alocals.4pp   
      
   : clear depth 0 ?do drop loop ;   
   : 3dup.1 {: a b c -- a b c a b c :} a b c a b c ;   
   : 3dup.2 >r 2dup r@ -rot r> ;   
      
   1 2 3 3dup.1 .s clear   
   4 5 6 3dup.2 .s clear   
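
For anyone tracing the return-stack version by hand, here is a minimal sketch of the stack picture after each word, alongside the PICK and 2OVER variants quoted further down. It assumes a standard Forth (Gforth, say) rather than 4tH. The names 3dup.r, 3dup.p and 3dup.o are hypothetical labels to keep the variants apart, and -rot is spelled rot rot because it is not a standard word.

```forth
\ Stack-only 3DUP variants, with the data stack shown after each word.
\ The names 3dup.r, 3dup.p and 3dup.o are hypothetical labels.

: 3dup.r ( a b c -- a b c a b c )
  >r        \ a b            R: c
  2dup      \ a b a b        R: c
  r@        \ a b a b c      R: c
  rot rot   \ a b c a b      R: c   (same effect as -rot)
  r> ;      \ a b c a b c

: 3dup.p ( a b c -- a b c a b c )
  2 pick    \ a b c a
  2 pick    \ a b c a b
  2 pick ;  \ a b c a b c

: 3dup.o ( a b c -- a b c a b c )
  dup       \ a b c c
  2over     \ a b c c a b
  rot ;     \ a b c a b c

: clear ( i*x -- ) depth 0 ?do drop loop ;

1 2 3 3dup.r .s clear
1 2 3 3dup.p .s clear
1 2 3 3dup.o .s clear
```

Each .s should list 1 2 3 1 2 3. The data flow is the same in all three, which is the point made in the quoted text about NTF/LXF.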
      
   Yeah, heavy use of the preprocessor. This is the expanded source:   
      
   : 3dup.1   
   [UNDEFINED] a [IF] 0 value a [THEN] ['] a >body local   
   [UNDEFINED] b [IF] 0 value b [THEN] ['] b >body local   
   [UNDEFINED] c [IF] 0 value c [THEN] ['] c >body local   
   to c to b to a a b c a b c ;   
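
The decompiled LOCAL in listing 1 reads as a save, run, restore sequence: stash the cell's address and current contents on the return stack, run the rest of the calling definition, then write the old contents back. Below is a rough, portable analogue of that pattern, not the alocals.4pp source. It takes an explicit xt where the decompiled code reuses the caller's return address, and the names preserving, v and clobber are made up for the demonstration.

```forth
\ Save/execute/restore, modelled on the decompiled LOCAL (an assumption
\ about its intent, not the library source).
variable v

: preserving ( xt addr -- )
  dup >r      \ remember where to restore
  @ >r        \ remember the current contents
  execute     \ run the code that may overwrite the cell
  r> r> ! ;   \ old contents back into the cell

: clobber ( -- ) 42 v ! ;

7 v !
' clobber v preserving
v @ .         \ the cell holds 7 again
```

That restore step is what lets a global VALUE stand in for a local: nested or recursive calls get the previous contents back on exit.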
      
   Maybe now the decompilation makes sense ;-)   
      
   Hans Bezemer   
      
   > dxf writes:   
   >> For 3DUP I believe this is the one to beat:   
   >>   
   >> : 3DUP ( a b c -- a b c a b c )  dup 2over rot ;   
   >>   
   >> With NTF/LXF the locals version will break even.   
   >   
   > As we already discussed in the thread including   
   > <2021Sep11.083507@mips.complang.tuwien.ac.at>, NTF/LXF produces the   
   > same (optimal for the calling convention used by NTF/LXF) code for   
   > 3DUP versions using the data stack, return stack, or locals.  That's   
   > because the actual data flow is always the same, and NTF/LXF can see   
   > this data flow in all three cases.   
   >   
   >> For others, well, it may   
   >> be better not to look.  For a straight-forward example of 'stack juggling',   
   >> locals handle it rather poorly.   
   >   
   > Other Forth systems implement locals poorly.  LXF/NTF demonstrates   
   > that this is not due to some natural law, however.   
   >   
   > There have been some improvements in Gforth since that time.  Let's   
   > see how the versions used in that thread look on today's gforth-fast.   
   > Here are the versions of 3DUP:   
   >   
   > : 3dup.1 ( a b c -- a b c a b c ) >r 2dup r@ -rot r> ;   
   > : 3dup.2 ( a b c -- a b c a b c ) 2 pick 2 pick 2 pick ;   
   > : 3dup.3 {: a b c :} a b c a b c ;   
   > : 3dup.4 ( a b c -- a b c a b c ) dup 2over rot ;   
   >   
   > And here's the gforth-fast code on AMD64:   
   >   
   > 3dup.1              3dup.2             3dup.3              3dup.4   
   > >r    1->0          third    1->2      >l >l 1->1          dup    1->1   
   >    mov -$08[r14],r13   mov r15,$10[r10] >l    1->1            mov [r10],r13   
   >    sub r14,$08       third    2->3        mov -$08[rbp],r13   sub r10,$08   
   > 2dup    0->2          mov r9,$08[r10]    mov rdx,$08[r10]  2over    1->3   
   >    mov r13,$10[r10]  third    3->1        mov rax,rbp         mov r15,$18[r10   
   >    mov r15,$08[r10]    mov [r10],r13      add r10,$10         mov r9,$10[r10]   
   > i    2->3             sub r10,$18        lea rbp,-$10[rbp] rot    3->1   
   >    mov r9,[r14]        mov $10[r10],r15   mov -$10[rax],rdx   mov [r10],r15   
   > -rot    3->2          mov $08[r10],r9    mov r13,[r10]       sub r10,$10   
   >    mov [r10],r9      ;s    1->1         >l @local0 1->1       mov $08[r10],r9   
   >    sub r10,$08         mov rbx,[r14]    @local0    1->1     ;s    1->1   
   > r>    2->1            add r14,$08        mov rax,rbp         mov rbx,[r14]   
   >    mov -$08[r10],r15   mov rax,[rbx]      lea rbp,-$08[rbp]   add r14,$08   
   >    sub r10,$10         jmp eax            mov -$08[rax],r13   mov rax,[rbx]   
   >    mov $10[r10],r13                     @local1    1->2       jmp eax   
   >    mov r13,[r14]                          mov r15,$08[rbp]   
   >    add r14,$08                          @local2    2->1   
   > ;s    1->1                               mov -$08[r10],r15   
   >    mov rbx,[r14]                          sub r10,$10   
   >    add r14,$08                            mov $10[r10],r13   
   >    mov rax,[rbx]                          mov r13,$10[rbp]   
   >    jmp eax                              @local0    1->2   
   >                                           mov r15,$00[rbp]   
   >                                         @local1    2->3   
   >                                           mov r9,$08[rbp]   
   >                                         @local2    3->1   
   >                                           mov -$10[r10],r9   
   >                                           sub r10,$18   
      
   [continued in next message]   
      