|    comp.lang.asm.x86    |    Ahh, the lost art of x86 assembly    |    4,675 messages    |
|    Message 3,077 of 4,675    |
|    Andrew Cooper to aen@nospicedham.spamtrap.com    |
|    Re: more cycles    |
|    23 Nov 17 10:00:29    |
From: amc96@nospicedham.cam.ac.uk

On 23/11/2017 00:56, aen@nospicedham.spamtrap.com wrote:
> Hi!
>
> Here is another timing exercise: This one tells me that to create one
> permutation from 13! takes an average of 7 cycles. The algorithm is
> again from Don Knuth's TAOCP Vol. 2.
>
> Any disagreements?
>
> .intel_syntax noprefix
> # as -o posting.o posting.asm
> # gcc -static -o posting posting.o
> # ./posting gives the output: 7.070950 44030949650 6227020800
> # time ./posting gives: real 0m12,992s user 0m12,956s sys 0m,000s
> # bc -l
> # 12.956*3400000000 gives: 44050400000
> .macro TSCStart
> rdtsc
> shl rdx,32
> or rax,rdx
> push rax
> .endm # TSCStart
>
> .macro TSCEnd
> rdtsc
> shl rdx,32
> or rax,rdx
> sub rax,[rsp]
> add rsp,8
> .endm # TSCEnd

This use of rdtsc isn't accurate. It will be reordered in the pipeline
with the content you are trying to time.

If you have rdtscp available, use that. It specifically has fencing
properties to prevent reordering.

If not, use lfence;rdtsc for Intel and mfence;rdtsc for AMD hardware to
explicitly serialise the instruction stream before reading the TSC.

~Andrew

--- SoupGate-Win32 v1.05
 * Origin: you cannot sedate... all the things you hate (1:229/2)
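For anyone following along in C rather than assembly, the advice in the reply can be sketched roughly as below. This is only an illustration, assuming GCC or Clang on x86-64 and the intrinsics from <x86intrin.h>. The helper names tsc_fence, tsc_read_fenced and tsc_read_rdtscp, and the TSC_FENCE_MFENCE switch, are made up for this sketch and do not come from the thread.

```c
#include <stdint.h>
#include <x86intrin.h>  /* GCC/Clang, x86 only: __rdtsc, __rdtscp, _mm_lfence, _mm_mfence */

/* Hypothetical compile-time switch (not from the post): build with
 * -DTSC_FENCE_MFENCE to use mfence, which the reply suggests for AMD
 * hardware; the default is lfence, which it suggests for Intel. */
static inline void tsc_fence(void)
{
#ifdef TSC_FENCE_MFENCE
    _mm_mfence();
#else
    _mm_lfence();
#endif
}

/* Fallback path from the reply: fence first, then read the TSC, so the
 * read is not hoisted above the instructions that precede it. */
static inline uint64_t tsc_read_fenced(void)
{
    tsc_fence();
    return (uint64_t)__rdtsc();
}

/* Preferred path from the reply, assuming the CPU supports rdtscp:
 * the instruction waits for earlier instructions to execute before
 * it reads the counter. */
static inline uint64_t tsc_read_rdtscp(void)
{
    unsigned int aux;  /* receives IA32_TSC_AUX; ignored in this sketch */
    return (uint64_t)__rdtscp(&aux);
}
```

A timed region would then look like t0 = tsc_read_fenced(); ...work...; t1 = tsc_read_rdtscp(); with t1 - t0 as the elapsed TSC ticks. The intrinsics already return the full 64-bit counter, so the shl/or step from the quoted macros is not needed here.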