

   comp.lang.asm.x86      Ahh, the lost art of x86 assembly      4,675 messages   


   Message 3,301 of 4,675   
   Terje Mathisen to Robert Prins   
   Re: Online generation of constants for "   
   12 Mar 18 20:50:30   
   
   From: terje.mathisen@nospicedham.tmsw.no   
      
   Robert Prins wrote:   
   > On 2018-03-12 09:05, Terje Mathisen wrote:   
   >> The reason for this extra stuff is that you need n+1 bits of actual   
   >> precision   
   >> in order to get exact results when emulating an n-bit division.   
   >   
   > Which is what Agner Fog attributes to you in his manuals.   
   >   
   > So many things are "too trivial" to require online tools, what's the   
   > harm in one more?   
   >   
   > Anyway, am I correct in finding that for limited range dividends, you   
   > can get away with, as if it matters, smaller shifts?   
   >   
   > I use 301036/8 for division by 3652425, 1881437/4 for division by 36525,   
   > 14035840/0 for division by 306, and 429496729/0 for division by 10, and   
   > for the JDN's I'm working with (1980-06-16 to now+) that works.   
      
   The reciprocal must carry more bits of precision than either the divisor   
   or the dividend, so dividing over the full 32-bit range always requires   
   what is effectively a 33-bit reciprocal.   
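As a rough illustration of the 33-bit point, here is a minimal Python sketch. It assumes the usual round-up construction m = ceil(2^(32+l) / d) with l = ceil(log2(d)); the helper names `reciprocal` and `div_by_mul` are made up for this example and are not taken from the thread.

```python
def reciprocal(d, n=32):
    """Return (m, shift) so that (x * m) >> shift == x // d for 0 <= x < 2**n.

    Assumed construction: shift = n + ceil(log2(d)), m = ceil(2**shift / d).
    """
    shift = n + (d - 1).bit_length()   # (d - 1).bit_length() == ceil(log2(d))
    m = -(-(1 << shift) // d)          # ceiling division
    return m, shift


def div_by_mul(x, d, n=32):
    """Divide x by d using the multiply-and-shift reciprocal."""
    m, shift = reciprocal(d, n)
    return (x * m) >> shift


m7, s7 = reciprocal(7)
print(hex(m7), s7, m7.bit_length())
```

For d = 7 the multiplier needs 33 bits, so it does not fit a 32-bit register without an extra fix-up step. For some divisors, such as 10, the multiplier is even and can be reduced to 32 bits by shifting out the trailing zeros.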
      
   For smaller values, like those I needed for calendar operations (i.e.   
   calculating the century number from a (Julian) day number known to lie   
   inside a 400-year period), I could approximate j/100 with j*41/4096.   
   This works because that fraction just happens to be very close to the   
   exact (1/100) reciprocal. :-)   
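How far the 41/4096 shortcut holds can be checked by brute force. The sketch below is only such a check for a generic unsigned input x; the function names are invented for the example, and it says nothing about how the original calendar code was structured.

```python
def div100_approx(x):
    """Approximate x // 100 as (x * 41) >> 12, i.e. x * 41 / 4096."""
    return (x * 41) >> 12


def first_mismatch(limit):
    """Return the smallest x in [0, limit) where the approximation is wrong."""
    for x in range(limit):
        if div100_approx(x) != x // 100:
            return x
    return None


print(first_mismatch(400))    # None: exact for every x in 0..399
print(first_mismatch(2000))   # 1099: first input where the result is off by one
```

Because 41/4096 is slightly larger than 1/100, the error only ever rounds upward, and it first shows at x = 1099. The shortcut is therefore safe only for inputs known to stay below that bound.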
      
   Multiplication by 41 can be done in several ways:   
      
      imul eax,eax,41     ;; *41   
      
      lea edx,[eax+eax*4] ;; *5   
      lea eax,[eax+edx*8] ;; *41   
      
      lea edx,[eax+eax*8] 	;; *9   
      shl eax,5		;; *32   
      add eax,edx   
      
   On a CPU where LEA takes two cycles, the last version can run in three   
   cycles, because its LEA and SHL do not depend on each other, while the   
   dual-LEA version would take four.   
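The two shift-and-add sequences and their cycle counts can be modeled in a few lines of Python. This is a sketch under stated assumptions: registers are 32 bits wide, LEA has a latency of 2 cycles, SHL and ADD have a latency of 1 cycle, and independent instructions can issue in the same cycle. The function and table names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit register wrap-around


def mul41_two_lea(eax):
    edx = (eax + eax * 4) & MASK32    # lea edx,[eax+eax*4]  -> 5*eax
    return (eax + edx * 8) & MASK32   # lea eax,[eax+edx*8]  -> 41*eax


def mul41_lea_shl_add(eax):
    edx = (eax + eax * 8) & MASK32    # lea edx,[eax+eax*8]  -> 9*eax
    eax = (eax << 5) & MASK32         # shl eax,5            -> 32*eax
    return (eax + edx) & MASK32       # add eax,edx          -> 41*eax


def critical_path(ops):
    """ops: list of (latency, [indices of earlier ops this one depends on])."""
    finish = []
    for latency, deps in ops:
        start = max((finish[i] for i in deps), default=0)
        finish.append(start + latency)
    return max(finish)


# Assumed latencies: LEA = 2, SHL = 1, ADD = 1.
TWO_LEA = [(2, []), (2, [0])]                  # second LEA waits for the first
LEA_SHL_ADD = [(2, []), (1, []), (1, [0, 1])]  # LEA and SHL run in parallel

print(critical_path(TWO_LEA), critical_path(LEA_SHL_ADD))
```

Under those assumptions the dependent LEA pair takes 2 + 2 = 4 cycles, while the LEA/SHL/ADD version takes max(2, 1) + 1 = 3 cycles.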
      
   Terje   
      
   --   
   -    
   "almost all programming can be viewed as an exercise in caching"   
      


