

   comp.lang.asm.x86      Ahh, the lost art of x86 assembly      4,675 messages   


   Message 3,802 of 4,675   
   James Harris to wolfgang kern   
   Re: Fast conversion to a boolean of 0 or   
   07 Mar 19 09:00:16   
   
   From: james.harris.1@nospicedham.gmail.com   
      
   On 06/03/2019 22:19, wolfgang kern wrote:   
   > On 06.03.2019 20:37, James Harris wrote:   
   >> A small programming challenge if you are interested, just for fun.   
   >>   
   >> Say we want any value other than 0 in EAX to be reduced to 1 - e.g. so   
   >> that if non-zero means true but we want true to be 1.   
   >>   
   >> Is there a faster solution than the naive   
   >>   
   >>       cmp eax, 0   
   >>       je done   
   >>       mov eax, 1   
   >>     done:   
   >>   
   >> Naturally, the speed of that will depend on how predictable the input   
   >> values are. Taking into account the usual suspect characteristics of x86   
   >> CPUs is there a generally faster solution?   
   >>   
   >> I can think of three bit-twiddling ways but I'm not sure they would be   
   >> faster than the above. In fact, I think they might be slower. In any   
   >> case I'd be interested to see what others think. I suspect some of you   
   >> may already have a preferred solution.   
   >>   
   >> So, any suggestions?   
   >   
   > it will depend on CPU in use,   
   > newer types fusion "TEST reg,any" with an immediate following "Jcc" so   
   > this two act like one single instruction (but it's still a cc-branch):   
   >   
   >      test eax,-1  ;short with imm Sext byte 0xff -> 0xffffffff   
   >      jz done   
   >      mov eax,1   
   >   
   > some CPUs are faster with CMOV because it can save on the branch.   
      
   True. I have to say I once tested a CMOV against the simple compare+jump
   on what I thought was random data. To my surprise the compare+jump was
   faster. That was on just one CPU, but it was an interesting finding
   nonetheless.
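
   For reference, a minimal sketch of the kind of CMOV sequence meant here.
   The choice of EDX as a scratch register is an assumption for illustration
   only; any free register would do. CMOVcc needs a P6-class or later CPU.

   ```nasm
       mov    edx, 1       ; candidate result for the non-zero case (EDX is an assumed free register)
       test   eax, eax     ; ZF = 1 if and only if EAX == 0
       cmovnz eax, edx     ; EAX = 1 if it was non-zero, otherwise EAX stays 0
   ```

   This trades the conditional branch for a data dependency on the flags,
   which is likely why it can lose to compare+jump when the branch predicts
   well.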
      
   >   
   > my solution would be shorter (perhaps not faster):   
   >   
   >      or eax,eax   ;just two byte now   
   >      setnz AL     ;and I'd globally ignore the upper 24 bits   
   >      ;and eax,1    ;only needed if your TRUE must be _that_ BIG   
      
   I had a similar one:
      
      test eax, eax  ;doesn't imply a register update   
      setnz al   
      and eax, 1   
      
   My concern with it was whether the AND would be hit with a
   partial-register stall, since it reads all of EAX just after SETNZ has
   written only AL. I take your point about maybe not needing the full
   register, or at least not needing it immediately.
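
   Two sketches of ways to sidestep the partial-register question, offered
   as illustrations rather than as measured winners. The first assumes EDX
   is free as a scratch register and leaves the result there; the second
   works in place but overwrites the flags and depends on the carry chain.
   Neither is necessarily one of the three ways mentioned in the original
   challenge.

   ```nasm
       ; Variant A: zero a scratch register first, then set its low byte.
       ; The XOR must come before the TEST because XOR itself changes the flags.
       xor    edx, edx     ; EDX = 0 (assumed free register)
       test   eax, eax     ; ZF = 1 if and only if EAX == 0
       setnz  dl           ; DL = 1 if EAX != 0; upper 24 bits of EDX are already 0
                           ; the 0/1 result is now in EDX

       ; Variant B: carry-flag arithmetic, in place, with no SETcc and no branch.
       neg    eax          ; CF = 1 if and only if EAX was non-zero
       sbb    eax, eax     ; EAX = 0 - CF, i.e. 0 or -1
       neg    eax          ; EAX = 0 or 1
   ```

   Variant A never reads a full register after a byte write to it, at the
   cost of an extra register. Variant B is three dependent instructions, so
   its latency may well be worse than TEST/SETNZ even where no stall occurs.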
      
      
   --   
   James Harris   
      



(c) 1994,  bbs@darkrealms.ca