From: anton@nospicedham.mips.complang.tuwien.ac.at   
      
   Terje Mathisen writes:   
   >As all these measurements have shown, the "speed of light" latency of
   >all the various branchless variants is identical; the measured jitter
   >between them depends on microarchitectural quirks of various CPU models,
   >including some machines where CMOV takes a cycle more than on others.
   >   
   >The JZ versions, however, depend almost completely on the hit rate of the
   >branch predictor: if the branch is both regularly executed (otherwise
   >timing doesn't matter, right?) and well predicted (90%+), then it is
   >effectively impossible to beat it with branchless code.
      
   For throughput, I am not so sure.   
      
   But for latency, a perfectly predicted branching version is great,
   because it breaks the dependency chain: the result no longer waits
   for the condition. In microbenchmark terms, a perfectly predicted
   branching version runs as fast in a latency benchmark (each result
   feeds the next operation) as in a throughput benchmark (independent
   operations).
      
   OTOH, for worst-case branch prediction, the branching version sucks.   
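
   To make the dependency-chain point concrete, here is a minimal sketch
   in C. It is not the code that was measured in this thread; the function
   names and the select-on-zero loop are made up for illustration. In
   both loops the condition depends on the previous result. In the
   branchless loop that makes the new x data-dependent on the old x in
   every iteration. In the branching loop the new x comes from a[i] or
   b[i] alone, so with perfect prediction the old x is only needed to
   verify the branch. Note that a compiler is free to turn either form
   into the other (if-conversion), so check the generated assembly
   before drawing conclusions from timings.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration: the select is computed as data, so the new x
   depends on the old x (through the mask) in every iteration. */
uint64_t chain_branchless(const uint64_t *a, const uint64_t *b,
                          size_t n, uint64_t x)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t mask = -(uint64_t)(x == 0); /* all ones if x == 0, else 0 */
        x = (a[i] & mask) | (b[i] & ~mask);
    }
    return x;
}

/* Same result, written with a branch. If the compiler keeps the branch and
   it is predicted correctly, the new x is just a load from a[i] or b[i];
   the old x is only used to check the prediction. */
uint64_t chain_branchy(const uint64_t *a, const uint64_t *b,
                       size_t n, uint64_t x)
{
    for (size_t i = 0; i < n; i++) {
        if (x == 0)
            x = a[i];
        else
            x = b[i];
    }
    return x;
}
```

   Both functions compute the same value for the same inputs; only the
   shape of the dependency graph differs. With an unpredictable sequence
   of zeros in the data, the branching loop pays for a misprediction on
   a large fraction of iterations.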
      
   - anton   
   --   
   M. Anton Ertl Some things have to be seen to be believed   
   anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen   
   http://www.complang.tuwien.ac.at/anton/home.html   
      