

   comp.lang.forth      Forth programmers eat a lot of Bratwurst      117,927 messages   


   Message 117,446 of 117,927   
   dxf to Anton Ertl   
   Re: Parsing timestamps?   
   10 Jul 25 21:09:21   
   
   From: dxforth@gmail.com   
      
   On 10/07/2025 6:35 pm, Anton Ertl wrote:   
   > dxf  writes:   
   >> The catch with SSE is there's nothing like FCHS or FABS   
   >> so depending on how one implements them, results vary across implementations.   
   >   
   > You can see in Gforth how to implement FNEGATE and FABS with SSE2:   
   >   
   > see fnegate   
   > Code fnegate   
   >    0x000055e6a78a8274:   add    $0x8,%rbx   
   >    0x000055e6a78a8278:   xorpd  0x24d8f(%rip),%xmm15        # 0x55e6a78cd010   
   >    0x000055e6a78a8281:   mov    %r15,%r9   
   >    0x000055e6a78a8284:   mov    (%rbx),%rax   
   >    0x000055e6a78a8287:   jmp    *%rax   
   > end-code   
   >  ok   
   > 0x55e6a78cd010 16 dump   
   > 55E6A78CD010: 00 00 00 00  00 00 00 80 - 00 00 00 00  00 00 00 00   
   >  ok   
   > see fabs   
   > Code fabs   
   >    0x000055e6a78a84fe:   add    $0x8,%rbx   
   >    0x000055e6a78a8502:   andpd  0x24b15(%rip),%xmm15        # 0x55e6a78cd020   
   >    0x000055e6a78a850b:   mov    %r15,%r9   
   >    0x000055e6a78a850e:   mov    (%rbx),%rax   
   >    0x000055e6a78a8511:   jmp    *%rax   
   > end-code   
   >  ok   
   > 0x55e6a78cd020 16 dump   
   > 55E6A78CD020: FF FF FF FF  FF FF FF 7F - 00 00 00 00  00 00 00 00   
   >   
   > The actual implementation is in the xorpd instruction for FNEGATE, and in   
   > the andpd instruction for FABS.  The memory locations contain masks:   
   > for FNEGATE only the sign bit is set, for FABS everything but the sign   
   > bit is set.   
   >   
   > Sure you can implement FNEGATE and FABS in more complicated ways, but   
   > you can also implement them in more complicated ways if you use the   
   > 387 instruction set.  Here's an example of more complicated   
   > implementations:   
   >   
   > see fnegate   
   > FNEGATE   
   > ( 004C4010    4833C0 )                XOR     RAX, RAX   
   > ( 004C4013    F34D0F7EC8 )            MOVQ    XMM9, XMM8   
   > ( 004C4018    664C0F6EC0 )            MOVQ    XMM8, RAX   
   > ( 004C401D    F2450F5CC1 )            SUBSD   XMM8, XMM9   
   > ( 004C4022    C3 )                    RET/NEXT   
   > ( 19 bytes, 5 instructions )   
   >  ok   
   > see fabs   
   > FABS   
   > ( 004C40B0    E8FBEFFFFF )            CALL    004C30B0  FS@   
   > ( 004C40B5    4885DB )                TEST    RBX, RBX   
   > ( 004C40B8    488B5D00 )              MOV     RBX, [RBP]   
   > ( 004C40BC    488D6D08 )              LEA     RBP, [RBP+08]   
   > ( 004C40C0    0F8D05000000 )          JNL/GE  004C40CB   
   > ( 004C40C6    E845FFFFFF )            CALL    004C4010  FNEGATE   
   > ( 004C40CB    C3 )                    RET/NEXT   
   > ( 28 bytes, 7 instructions )   
      
   The latter were basically what existed in the implementation.  As they   
   don't handle negative zero (or NaNs), I swapped them out for the former   
   ones you mention.   
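
For anyone following along, here is a rough C sketch of the difference. It is an illustration only, not code from either Forth system: the helper names (bits_of, double_of, fnegate_mask, fabs_mask, fnegate_sub) are made up for this post, and the masks are simply the two 64-bit values visible in the dumps above. fnegate_sub models only the "subtract from zero" FNEGATE of the second listing, assuming the default round-to-nearest mode; the FABS of that listing is not modelled because I am not reproducing what FS@ does here.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Masks as seen in the two memory dumps (low 64 bits of each constant). */
#define SIGN_MASK UINT64_C(0x8000000000000000) /* FNEGATE: only the sign bit set */
#define ABS_MASK  UINT64_C(0x7FFFFFFFFFFFFFFF) /* FABS: everything but the sign bit */

/* Reinterpret a double as its 64-bit pattern and back (hypothetical helpers). */
static uint64_t bits_of(double x)     { uint64_t u; memcpy(&u, &x, sizeof u); return u; }
static double   double_of(uint64_t u) { double x;   memcpy(&x, &u, sizeof x); return x; }

/* Mask-based versions: the bitwise effect of xorpd / andpd on one double. */
static double fnegate_mask(double x) { return double_of(bits_of(x) ^ SIGN_MASK); }
static double fabs_mask(double x)    { return double_of(bits_of(x) & ABS_MASK); }

/* Arithmetic FNEGATE: 0.0 - x, as the SUBSD sequence computes it. */
static double fnegate_sub(double x)  { return 0.0 - x; }

/* A few checks showing where the two approaches agree and where they differ. */
void demo(void)
{
    /* Ordinary values: both approaches agree. */
    assert(fnegate_mask(1.5) == -1.5);
    assert(fnegate_sub(1.5)  == -1.5);
    assert(fabs_mask(-2.0)   ==  2.0);

    /* Zero: flipping the sign bit yields -0.0, but 0.0 - 0.0 is +0.0. */
    assert(signbit(fnegate_mask(0.0)) != 0);
    assert(signbit(fnegate_sub(0.0))  == 0);

    /* NaN: the masks act on the sign bit and leave the value a NaN. */
    assert(isnan(fabs_mask(NAN)) && signbit(fabs_mask(NAN)) == 0);
    assert(signbit(fnegate_mask(fabs_mask(NAN))) != 0);
}
```

Calling demo() should pass silently on an IEEE 754 platform. The sign a NaN ends up with after 0.0 - x is not something C promises, so the sketch makes no claim about it.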
      


