comp.arch
EricP to Anton Ertl
Re: Compilers and flags
05 Sep 25 11:00:55
   From: ThatWouldBeTelling@thevillage.com   
      
   Anton Ertl wrote:   
   > EricP writes:   
   >> That shows about 12% instructions are conditional branch and 9% CMP.   
   >> That says to me that almost all Bcc are paired with a CMP,   
   >> and very few use the flags set as a side effect of ALU ops.   
   >>   
   >> I would expect those two numbers to be closer as even today compilers don't   
   >> know about those side effect flags and will always emit a CMP or TST first.   
   >   
   > Compilers certainly have problems with single flag registers, as they   
   > run contrary to the base assumption of register allocation.  But you   
   > don't need full-blown tracking of flags in order to make use of flags   
   > side effects in compilers.  Plain peephole optimization can be good   
   > enough.  E.g., if you have   
   >   
   > if (a+b<0) ...   
   >   
   > the compiler may naively translate this to   
   >   
   > add tmp = a, b   
   > tst tmp   
   > bge cont   
   >   
   > The peephole optimizer can have a rule that says that this is   
   > equivalent to   
   >   
   > add tmp = a, b   
   > bge cont   
   >   
   > When I compile   
   >   
   > long foo(long a, long b)   
   > {   
   >   if (a+b<0)   
   >     return a-b;   
   >   else   
   >     return a*b;   
   > }   
   >   
   > with gcc-12.2.0 -O -c on AMD64, I get   
   >   
   > 0000000000000000 <foo>:   
   >    0:   48 89 f8                mov    %rdi,%rax   
   >    3:   48 89 fa                mov    %rdi,%rdx   
   >    6:   48 01 f2                add    %rsi,%rdx   
   >    9:   78 05                   js     10 <foo+0x10>   
   >    b:   48 0f af c6             imul   %rsi,%rax   
   >    f:   c3                      ret   
   >   10:   48 29 f0                sub    %rsi,%rax   
   >   13:   c3                      ret   
   >   
   > Look, Ma, no tst.   
   >   
   > - anton   
      
   This could be one MOV shorter.   
   GCC didn't need MOV %rdi,%rdx because it had already copied %rdi to %rax.   
   It could just ADD %rsi,%rdi (neither path reads %rdi afterwards)   
   and use the %rax copy from then on.   
      
   For that optimization { ADD CMP/TST Bcc } => { ADD Bcc }   
   to work, those three instructions must be adjacent.   
   In this case it wouldn't make a difference, but in general   
   I think a compiler would want the freedom to move code about without   
   binding the ADD to the Bcc too early. So this would have to be one of   
   the very last optimizations, so that it doesn't interfere with code motion.   
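      
   As a rough illustration of the adjacency requirement, here is a minimal   
   sketch of such a rule over a made-up instruction list. The opcode names,   
   struct insn and peephole_drop_tst are all invented for this example and   
   are not taken from any real compiler. It assumes ADD sets the sign flag   
   from its result exactly as a TST of that result would, and it only fires   
   for branches that read nothing but the sign flag.   

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction set; every name here is hypothetical. */
enum opcode { OP_ADD, OP_TST, OP_MOV, OP_BMI, OP_BPL };

struct insn {
    enum opcode op;
    int dst;        /* register written by ADD/MOV, or tested by TST */
    int src1, src2; /* source registers; unused fields are left as 0 */
};

/* Branches that depend only on the sign flag (assumption of this sketch). */
static int reads_sign_only(enum opcode op)
{
    return op == OP_BMI || op == OP_BPL;
}

/*
 * Drop "TST r" when it sits directly between "ADD r = ..." and a
 * sign-only branch.  The match is purely positional: any instruction
 * in between, even a harmless one, defeats the rule.
 * Compacts the array in place and returns the new length.
 */
size_t peephole_drop_tst(struct insn *code, size_t n)
{
    size_t out = 0;

    assert(code != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int redundant =
            code[i].op == OP_TST &&
            i > 0 && code[i - 1].op == OP_ADD &&
            code[i - 1].dst == code[i].dst &&
            i + 1 < n && reads_sign_only(code[i + 1].op);

        if (!redundant)
            code[out++] = code[i];
    }
    return out;
}
```

   Given { ADD r2 ; TST r2 ; BPL } the pass returns 2 and the TST is gone.   
   If an unrelated MOV has been scheduled between the ADD and the TST, the   
   pattern no longer matches and all four instructions remain, which is   
   why a rule like this wants to run after code motion is finished.   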
      
   The Microsoft compiler uses LEA to do the add, and LEA doesn't change   
   the flags, so even if it has a flags optimization there is nothing   
   for it to detect here:   
      
   long foo(long,long) PROC                                  ; foo, COMDAT   
            lea     eax, DWORD PTR [rcx+rdx]   
            test    eax, eax   
            jns     SHORT $LN2@foo   
            sub     ecx, edx   
            mov     eax, ecx   
            ret     0   
   $LN2@foo:   
            imul    ecx, edx   
            mov     eax, ecx   
            ret     0   
      
   Also, if MS had moved ecx to eax first, as GCC does, then the function   
   result could land directly in eax, eliminating the two final MOV eax,ecx.   
      