
   comp.lang.asm.x86      Ahh, the lost art of x86 assembly      4,675 messages   


   Message 4,138 of 4,675   
   David Brown to R.Wieser   
   Re: Bit Swizzling   
   08 Sep 20 22:00:49   
   
   XPost: comp.lang.c, comp.arch.fpga   
   From: david.brown@nospicedham.hesbynett.no   
      
   On 08/09/2020 20:26, R.Wieser wrote:   
   > (Too many xpost groups, had to remove one)   
   >   
   > Rick,   
   >   
   >>         o |= (((v & (1 << s)) >> s) << d);   
   >   
   > If you reverse the two tables (having the output bits in order from high to
   > low) you could left-shift the output by one and then OR the output with the
   > right-shifted input masked with 1.   In this specific case (all eight bits
   > swizzled) you do not even need to clear the output.
   >   
   > My C(++) isn't worth anything, but I imagine it could look something like   
   > this :   
   >   
   > o <<= 1   
   > o |=  (v >> s) & 1   
   >   
   > That takes, at least on an x86, 4 machine instructions per bit.
   >
   > The thing with writing C(++) that should work /everywhere/ is that you can't
   > use optimisations for a specific processor.
   >   
   > If you would use the x86, which has instructions that use the Carry flag as   
   > the ninth bit, you would only need 2 machine instructions per bit (rotate   
   > desired source bit into carry, rotate carry bit into target).   
      
   A good compiler can sometimes (not always) do that kind of thing in the
   generated code, if it recognises the pattern.  Often, however, these
   kinds of small instructions are effectively free on x86 processors while
   you wait for memory (of course, that depends very heavily on the rest of
   the code and how you are using this function).
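      
   As a rough sketch of the shift-and-OR loop quoted above, written as
   portable C: the table name `src_bit` and its contents are assumptions
   made for illustration (here it encodes a plain bit reversal), not the
   tables from the original post.  Entries are listed from the highest
   output bit down to the lowest, as suggested.

```c
#include <stdint.h>

/* Hypothetical table: src_bit[i] names the input bit that ends up in
   output bit (7 - i), so outputs are listed from high to low.
   These particular values reverse the bit order of a byte. */
static const uint8_t src_bit[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };

static inline uint8_t swizzle(uint8_t v)
{
    /* All eight bits get shifted in, so the starting value is pushed out
       entirely; it is still initialised because reading an uninitialised
       local is not valid C. */
    uint8_t o = 0;

    for (int i = 0; i < 8; i++) {
        o = (uint8_t)(o << 1);                   /* make room at bit 0   */
        o |= (uint8_t)((v >> src_bit[i]) & 1u);  /* drop in the next bit */
    }
    return o;
}
```

   With this table, swizzle(0x01) gives 0x80 and swizzle(0x12) gives 0x48.
   Whether a given compiler turns the loop into carry-flag rotates is not
   guaranteed; checking the generated assembly is the only way to know.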
      
   >   
   > And a remark : you've made that "swizzle" a function.  Which means you will
   > probably have a relatively large overhead, possibly doubling if not tripling
   > the number of instructions executed for each bit extraction and insertion
   > (starting with the "call" and "return").   Rewriting it as a macro would
   > probably be a good idea.
      
   Such a function would be (or should be) inlined by the compiler - using   
   an inline function is almost always a better choice than a macro if you   
   don't /need/ it to be a macro.   
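      
   One reason for that preference can be sketched as follows.  The names
   `MOVE_BIT_MACRO`, `move_bit` and `next_source` are made up for this
   illustration; the expression itself is the one quoted at the top of the
   post.  A function-like macro pastes its argument text in wherever the
   parameter appears, so an argument with a side effect is evaluated once
   per appearance, while an inline function evaluates it exactly once.

```c
#include <stdint.h>

static unsigned calls;               /* counts evaluations of next_source() */

/* Hypothetical helper with a side effect, standing in for any argument
   expression that is not a plain constant or variable. */
static unsigned next_source(void)
{
    calls++;
    return 3;
}

/* Macro form: the parameter s appears twice in the expansion. */
#define MOVE_BIT_MACRO(v, s, d)  ((((v) & (1u << (s))) >> (s)) << (d))

/* Inline function form: same expression, arguments evaluated once and
   type-checked; the compiler is free to inline it at the call site. */
static inline unsigned move_bit(unsigned v, unsigned s, unsigned d)
{
    return ((v & (1u << s)) >> s) << d;
}

/* Number of times next_source() runs for a single macro use. */
static unsigned macro_evaluations(void)
{
    calls = 0;
    (void)MOVE_BIT_MACRO(0x08u, next_source(), 0u);
    return calls;
}

/* Number of times next_source() runs for a single function call. */
static unsigned inline_evaluations(void)
{
    calls = 0;
    (void)move_bit(0x08u, next_source(), 0u);
    return calls;
}
```

   Here macro_evaluations() returns 2 and inline_evaluations() returns 1,
   while both forms compute the same value for side-effect-free arguments.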
      
   >   
   > On an x86, some smart-ass usage of its nine-bit rotate instructions
   > means that a full 8-bit swizzle uses only 16 instructions (or 17 if
   > you want the source to be unchanged afterwards) - both code /and/
   > execution ...
   >   
   > Regards,   
   > Rudy Wieser   
   >   
   >   
   >   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   


