|    comp.lang.asm.x86    |    Ahh, the lost art of x86 assembly    |    4,675 messages    |
Message 4,138 of 4,675
David Brown to R.Wieser
Re: Bit Swizzling
08 Sep 20 22:00:49
XPost: comp.lang.c, comp.arch.fpga
From: david.brown@nospicedham.hesbynett.no

On 08/09/2020 20:26, R.Wieser wrote:
> (Too many xpost groups, had to remove one)
>
> Rick,
>
>> o |= (((v & (1 << s)) >> s) << d);
>
> If you reverse the two tables (having the output bits in order from high to
> low) you could left-shift the output by one and then OR the output with the
> right-shifted input masked with 1. In this specific case (all eight bits
> swizzled) you do not even need to clear the output.
>
> My C(++) isn't worth anything, but I imagine it could look something like
> this :
>
> o <<= 1
> o |= (v >> s) & 1
>
> That takes, at least on an x86, 4 machine instructions per bit.
>
> The thing with writing C(++) that should work /everywhere/ is that you can't
> use optimisations for a specific processor.
>
> If you would use the x86, which has instructions that use the Carry flag as
> the ninth bit, you would only need 2 machine instructions per bit (rotate
> desired source bit into carry, rotate carry bit into target).

A good compiler can sometimes (not always) do that kind of thing in the
generated code, if it recognises the pattern. Often, however, these
kinds of small instructions are effectively free on x86 processors as you
wait for memory (of course that depends very heavily on the rest of the
code and how you are using this function).

> And a remark : you've made that "swizzle" a function. Which means you will
> probably have a relatively large overhead, possibly doubling if not tripling
> the amount of instructions executed for each bit extraction and insertion
> (starting with the "call" and "return"). Rewriting it as a macro would
> probably be a good idea.

Such a function would be (or should be) inlined by the compiler - using
an inline function is almost always a better choice than a macro if you
don't /need/ it to be a macro.

> In the case of an x86, some smart-ass usage of its nine-bit rotate
> instructions means that a full 8 bit swizzle uses only 16 instructions (or
> 17 if you want the source to be unchanged afterwards) - both code /and/
> execution ...
>
> Regards,
> Rudy Wieser
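
For anyone following along, here is a minimal sketch of the two forms discussed above, written as static inline functions rather than macros. It is only an illustration: the names swizzle8, swizzle8_ref and src_bit are made up for this example, and the table (a plain bit reversal) is an assumed permutation, not the one from Rick's original code. The table is ordered from the highest output bit down, as Rudy suggested, so the shift-and-OR form needs no destination index.

```c
#include <stdint.h>

/* Hypothetical permutation table (an assumption for this sketch):
   src_bit[i] is the input bit that feeds output bit 7 - i, so the
   entries run from the highest output bit down to the lowest.
   This particular table reverses the bit order of a byte. */
static const uint8_t src_bit[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };

/* Shift-and-OR form: shift the output left by one, then OR in the
   selected source bit. Because all eight output bits are produced,
   the initial value of o is shifted out completely. */
static inline uint8_t swizzle8(uint8_t v)
{
    uint8_t o = 0;
    for (int i = 0; i < 8; i++) {
        o = (uint8_t)(o << 1);
        o |= (uint8_t)((v >> src_bit[i]) & 1u);
    }
    return o;
}

/* Original form from the thread, with explicit source (s) and
   destination (d) bit numbers, kept here for comparison. */
static inline uint8_t swizzle8_ref(uint8_t v)
{
    uint8_t o = 0;
    for (int i = 0; i < 8; i++) {
        unsigned s = src_bit[i];
        unsigned d = 7u - (unsigned)i;
        o |= (uint8_t)(((v & (1u << s)) >> s) << d);
    }
    return o;
}
```

With the table above, swizzle8(0x01) gives 0x80 and both functions agree for every input byte. Whether a given compiler actually inlines the call, unrolls the loop, or turns it into carry-flag rotates depends on the compiler and the optimisation level, so check the generated assembly if it matters.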
(c) 1994, bbs@darkrealms.ca