home bbs files messages ]

Forums before death by AOL, social media and spammers... "We can't have nice things"

   comp.lang.asm.x86      Ahh, the lost art of x86 assembly      4,675 messages   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]

   Message 3,791 of 4,675   
   Terje Mathisen to aen@nospicedham.spamtrap.com   
   Re: Primality test with one instruction   
   18 Feb 19 16:34:57   
   
   From: terje.mathisen@nospicedham.tmsw.no   
      
   aen@nospicedham.spamtrap.com wrote:   
   > On Mon, 18 Feb 2019 01:41:25 -0600, Robert Wessel   
   >  wrote:   
   >>   
   >> My question is, what exactly is the point?  It's trivial, but you're   
   >> stuck needing 512MB of disk and address space, plus the I/O for that   
   >> every time the program loads.   
   >>   
   > I think there isn't much I/O going on at the start.  The way I   
   > understand it, the mapping function only stores the address and length   
   > of the bitarray and only loads a page when there is an access to that   
   > memory and the address space used is only that page.   
   >   
   >> I'd call that two instructions, but that's just me.   
   >>   
   > the jnc or jc is a reaction to the test.   
   >   
   >> If you don't mind making the test four instructions, you could store   
   >> only a bitmap of the odd numbers.   
   >>   
   > I thought of that, and will try it some later time.   
   >   
   >> Considering the cost of a page I/O, even to SSD, you might well be   
   >> better off just testing against the 6542 primes (which, if my   
   >> calculations are correct, is the number of primes below sqrt(2**32)).   
   >> That would only take a 13kb table to store.  And if there's a fair   
   >> fraction of non-primes being tested, that will usually fail after only   
   >> the first few tests anyway.   
   >>   
   > I did some timings, and a run through all 4G numbers took ~10 seconds.   
   >   
   >   
   > The test was:   
   >      mov rcx,4294967295   
   > L0: bt  [rbx],rcx   
   >      dec rcx   
   >      jnz L0   
   >   
   > A sequential run through all the numbers is IMO the worst case test,   
   > because all pages need to be loaded.   
      
   Not at all! You are not using the results of the BT operation in order   
   to branch or do anything else, i.e. it is discarded at once.   
      
   >   
   > In cycles one test takes ~8 cycles, which seems not bad to me, and the   
      
   8 cycles for a perfectly predicted loop like this means that the BT   
   itself is taking this long.   
      
   Please try two more tests:   
      
   a) Starting at 2 and incrementing until wraparound, i.e. the same JNZ L0   
   as the last instruction. Count the number of primes found like this:   
      
         xor eax,eax   
         mov ecx,2   
   L0:  bt  [rbx],rcx   
         adc eax,0   
         inc rcx   
         jnz L0   
   ; EAX has prime count for verification   
      
   b) Do the same with the compressed 2:1 (odd) table   
      
         mov eax,1 ; Counting 2 as a prime   
         mov ecx,1 ; 3 SHR 1   
   L0:  bt  [rbx],rcx   
         adc eax,0   
         inc rcx   
         jns L0	; Stops at 2^31   
      
      
   > test for even numbers could of course be eliminated with a test rcx,1   
   > instead of a prime test.   
      
   Terje   
      
   --   
   -    
   "almost all programming can be viewed as an exercise in caching"   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]


(c) 1994,  bbs@darkrealms.ca