   comp.lang.forth      Forth programmers eat a lot of Bratwurst      117,927 messages   

   Message 117,604 of 117,927   
   Anton Ertl to dxf   
   3dup again (was: Generating a random seq   
   02 Oct 25 20:44:40   
   
   From: anton@mips.complang.tuwien.ac.at   
      
   dxf  writes:   
   >For 3DUP I believe this is the one to beat:   
   >   
   >: 3DUP ( a b c -- a b c a b c )  dup 2over rot ;   
   >   
   >With NTF/LXF the locals version will break even.   
      
   As we already discussed in the thread including   
   <2021Sep11.083507@mips.complang.tuwien.ac.at>, NTF/LXF produces the   
   same (optimal for the calling convention used by NTF/LXF) code for   
   3DUP versions using the data stack, return stack, or locals.  That's   
   because the actual data flow is always the same, and NTF/LXF can see   
   this data flow in all three cases.   
      
   >For others, well, it may   
   >be better not to look.  For a straight-forward example of 'stack juggling',   
   >locals handle it rather poorly.   
      
   Other Forth systems implement locals poorly.  LXF/NTF demonstrates   
   that this is not due to some natural law, however.   
      
   There have been some improvements in Gforth since that time.  Let's   
   see how the versions used in that thread look on today's gforth-fast.   
   Here are the versions of 3DUP:   
      
   : 3dup.1 ( a b c -- a b c a b c ) >r 2dup r@ -rot r> ;   
   : 3dup.2 ( a b c -- a b c a b c ) 2 pick 2 pick 2 pick ;   
   : 3dup.3 {: a b c :} a b c a b c ;   
   : 3dup.4 ( a b c -- a b c a b c ) dup 2over rot ;   
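      
   As a minimal sketch for checking that all four versions really have   
   the same stack effect: CHECK-3DUP is a hypothetical helper, not a   
   word from the thread, and the definitions are repeated so that the   
   block loads on its own.  It assumes a system with Forth-2012 locals   
   ({: ... :}), such as Gforth.   

```forth
\ The four versions, repeated so this block loads on its own.
: 3dup.1 ( a b c -- a b c a b c ) >r 2dup r@ -rot r> ;
: 3dup.2 ( a b c -- a b c a b c ) 2 pick 2 pick 2 pick ;
: 3dup.3 {: a b c :} a b c a b c ;
: 3dup.4 ( a b c -- a b c a b c ) dup 2over rot ;

\ Hypothetical helper: run a 3DUP candidate on 1 2 3 and
\ abort unless it leaves 1 2 3 1 2 3 (checked top-down).
: check-3dup ( xt -- )
  >r 1 2 3 r> execute
  3 <> abort" copy of c wrong"
  2 <> abort" copy of b wrong"
  1 <> abort" copy of a wrong"
  3 <> abort" original c wrong"
  2 <> abort" original b wrong"
  1 <> abort" original a wrong" ;

' 3dup.1 check-3dup
' 3dup.2 check-3dup
' 3dup.3 check-3dup
' 3dup.4 check-3dup
```

   This only checks the results for one set of inputs; it says nothing   
   about the quality of the generated code.   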
      
   And here's the gforth-fast code on AMD64:   
      
   3dup.1              3dup.2             3dup.3              3dup.4   
   >r    1->0          third    1->2      >l >l 1->1          dup    1->1   
     mov -$08[r14],r13   mov r15,$10[r10] >l    1->1            mov [r10],r13   
     sub r14,$08       third    2->3        mov -$08[rbp],r13   sub r10,$08   
   2dup    0->2          mov r9,$08[r10]    mov rdx,$08[r10]  2over    1->3   
     mov r13,$10[r10]  third    3->1        mov rax,rbp         mov r15,$18[r10]
     mov r15,$08[r10]    mov [r10],r13      add r10,$10         mov r9,$10[r10]   
   i    2->3             sub r10,$18        lea rbp,-$10[rbp] rot    3->1   
     mov r9,[r14]        mov $10[r10],r15   mov -$10[rax],rdx   mov [r10],r15   
   -rot    3->2          mov $08[r10],r9    mov r13,[r10]       sub r10,$10   
     mov [r10],r9      ;s    1->1         >l @local0 1->1       mov $08[r10],r9   
     sub r10,$08         mov rbx,[r14]    @local0    1->1     ;s    1->1   
   r>    2->1            add r14,$08        mov rax,rbp         mov rbx,[r14]   
     mov -$08[r10],r15   mov rax,[rbx]      lea rbp,-$08[rbp]   add r14,$08   
     sub r10,$10         jmp eax            mov -$08[rax],r13   mov rax,[rbx]   
     mov $10[r10],r13                     @local1    1->2       jmp eax   
     mov r13,[r14]                          mov r15,$08[rbp]   
     add r14,$08                          @local2    2->1   
   ;s    1->1                              mov -$08[r10],r15   
     mov rbx,[r14]                          sub r10,$10   
     add r14,$08                            mov $10[r10],r13   
     mov rax,[rbx]                          mov r13,$10[rbp]   
     jmp eax                              @local0    1->2   
                                            mov r15,$00[rbp]   
                                          @local1    2->3   
                                            mov r9,$08[rbp]   
                                          @local2    3->1   
                                            mov -$10[r10],r9   
                                            sub r10,$18   
                                            mov $10[r10],r15   
                                            mov $18[r10],r13   
                                            mov r13,$10[rbp]   
                                          lit    1->2   
                                          #24   
                                            mov r15,$50[rbx]   
                                          lp+!    2->1   
                                            add rbp,r15   
                                          ;s    1->1   
                                            mov rbx,[r14]   
                                            add r14,$08   
                                            mov rax,[rbx]   
                                            jmp eax   
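      
   One way to look at such code on your own machine, as a sketch that   
   assumes a Gforth build providing the Gforth-specific word SEE-CODE   
   together with a working disassembler: define a version and ask for   
   its code.  SEE-CODE prints one word at a time, so a side-by-side   
   layout like the one above has to be arranged by hand, and the output   
   depends on the Gforth version, engine, and CPU.   

```forth
\ Run inside gforth-fast. SEE-CODE is Gforth-specific, not standard Forth.
: 3dup.2 ( a b c -- a b c a b c ) 2 pick 2 pick 2 pick ;

\ Show the primitives of 3dup.2 together with their native code.
see-code 3dup.2
```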
      
   Locals-haters, come to Gforth, where locals are implemented   
   inefficiently:-).  The code for 3DUP.2 is actually optimal for   
   Gforth's calling convention.   
      
   - anton   
   --   
   M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html   
   comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html   
    New standard: https://forth-standard.org/   
   EuroForth 2025 CFP: http://www.euroforth.org/ef25/cfp.html   
   EuroForth 2025 registration: https://euro.theforth.net/   
      



(c) 1994,  bbs@darkrealms.ca