comp.arch
BGB to Michael S
Re: Tonights Tradeoff
05 Nov 25 10:15:00
   From: cr88192@gmail.com   
      
   On 11/5/2025 3:21 AM, Michael S wrote:   
   > On Tue, 04 Nov 2025 22:51:28 GMT   
   > MitchAlsup  wrote:   
   >   
   >> Thomas Koenig  posted:   
   >>   
   >>> Terje Mathisen  schrieb:   
   >>>   
   >>>> I still think the IBM DFP people did an impressively good job   
   >>>> packing that much data into a decimal representation. :-)   
   >>>   
   >>> Yes, that modulo 1000 packing is quite clever.  It is relatively   
   >>> cheap to implement in hardware (which is the point, of course).   
   >>> Not sure how easy it would be in software.   
   >>   
   >> Brain dead easy: 1 table of 1024 entries each 12-bits wide,   
   >>                   1 table of 4096 entries each 10-bits wide,   
   >> isolate the 10-bit field, LD the converted value.   
   >> isolate the 12-bit field, LD the converted value.   
   >>   
   >> Other than "crap loads" of {deMorganizing and gate optimization}   
   >> that is essentially what HW actually does.   
   >>   
   >> You still need to build 12-bit decimal ALUs to string together   
   >   
   > Are you talking about hardware or software?   
   >   
      
   I had interpreted it as being about software with BCD helper ops.   
      
   Otherwise, would probably go a different route.   
      
   One other tradeoff is whether to go for Decimal128 in DPD or BID.   
      
   Stuff online says BID is better for a software implementation, but I
   have my doubts. DPD could well make more sense in both cases, though in
   the absence of BCD helpers it would probably make sense to map each
   10-bit DPD group to a linear 10-bit value (0..999).
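
A minimal sketch of that mapping, in the table-lookup style described in the quoted text: decode each 10-bit DPD group once with plain bit tests, then use tables in both directions. The names `dpd_decode`, `dpd_init_tables`, `dpd_to_bin` and `bin_to_dpd` are made up for illustration, and the table layout is an assumption rather than anything from the thread.

```c
#include <stdint.h>

/* Decode one 10-bit DPD group into a linear value 0..999. */
static unsigned dpd_decode(unsigned dpd)
{
    unsigned hi  = (dpd >> 7) & 7;          /* b9..b7 */
    unsigned mid = (dpd >> 4) & 7;          /* b6..b4 */
    unsigned lo  = dpd & 7;                 /* b2..b0 */
    unsigned b0  = dpd & 1;
    unsigned b4  = (dpd >> 4) & 1;
    unsigned b7  = (dpd >> 7) & 1;
    unsigned b65 = ((dpd >> 4) & 6) | b0;   /* b6 b5 b0 as a digit 0..7 */
    unsigned b98 = ((dpd >> 7) & 6) | b0;   /* b9 b8 b0 as a digit 0..7 */
    unsigned d2, d1, d0;

    if (!(dpd & 8)) {                       /* b3 == 0: all digits are 0..7 */
        d2 = hi; d1 = mid; d0 = lo;
    } else switch ((dpd >> 1) & 3) {        /* b2 b1: which digit is 8..9 */
    case 0:  d2 = hi;     d1 = mid;    d0 = 8 + b0; break;
    case 1:  d2 = hi;     d1 = 8 + b4; d0 = b65;    break;
    case 2:  d2 = 8 + b7; d1 = mid;    d0 = b98;    break;
    default:
        switch ((dpd >> 5) & 3) {           /* b6 b5: two or three large digits */
        case 0:  d2 = 8 + b7; d1 = 8 + b4; d0 = b98; break;
        case 1:  d2 = 8 + b7; d1 = ((dpd >> 7) & 6) | b4; d0 = 8 + b0; break;
        case 2:  d2 = hi;     d1 = 8 + b4; d0 = 8 + b0; break;
        default: d2 = 8 + b7; d1 = 8 + b4; d0 = 8 + b0; break;
        }
        break;
    }
    return d2 * 100 + d1 * 10 + d0;
}

static uint16_t dpd_to_bin[1024];   /* DPD group -> 0..999 */
static uint16_t bin_to_dpd[1000];   /* 0..999 -> canonical DPD group */

static void dpd_init_tables(void)
{
    /* Walk downward so the lowest code wins for the few values that
       have redundant encodings (the lowest one is the canonical form). */
    for (int code = 1023; code >= 0; code--) {
        unsigned v = dpd_decode((unsigned)code);
        dpd_to_bin[code] = (uint16_t)v;
        bin_to_dpd[v] = (uint16_t)code;
    }
}
```

After `dpd_init_tables()`, a conversion in either direction is a mask, a shift and one load per group.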
      
   While BID could make sense, its drawback is that it assumes some way of
   quickly performing power-of-10 multiplies on large integer values. On a
   CPU where the fastest way to perform a generic 128-bit multiply is to
   break it down into 32-bit multiplies and/or use shift-and-add, it is not
   a particularly attractive option.
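
To make the cost concrete, here is a rough sketch of scaling a 128-bit BID-style coefficient by a small power of ten using only 32x32->64 multiplies. The type `u128_limbs` and the function `u128_mul_small` are hypothetical names; the factor is assumed to fit in 32 bits (so at most 10^9 per call).

```c
#include <stdint.h>

/* 128-bit unsigned value as four 32-bit limbs, least significant first. */
typedef struct { uint32_t w[4]; } u128_limbs;

/* Multiply v in place by a 32-bit factor m (e.g. 10^k with k <= 9).
   Returns whatever carried out above bit 127. */
static uint32_t u128_mul_small(u128_limbs *v, uint32_t m)
{
    uint64_t carry = 0;
    for (int i = 0; i < 4; i++) {
        /* 32x32 -> 64 multiply plus incoming carry; cannot overflow 64 bits */
        uint64_t t = (uint64_t)v->w[i] * m + carry;
        v->w[i] = (uint32_t)t;
        carry = t >> 32;
    }
    return (uint32_t)carry;
}
```

Each call costs four multiplies plus carry handling, and a scale by more than 10^9 needs several calls, which is the kind of overhead the paragraph above is worried about.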
      
   In contrast, working with 16-bit chunks holding 10-bit values is likely
   to work out cheaper.
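
As an illustration of the chunked approach, a sketch of adding two coefficients stored as base-1000 chunks, one per `uint16_t`. The chunk count of 11 matches the 110-bit trailing significand of Decimal128; the name `chunks_add` and the least-significant-first ordering are assumptions made for the example.

```c
#include <stdint.h>

#define NCHUNKS 11   /* 11 x 3 digits = 33 trailing digits of Decimal128 */

/* r = a + b, where each element holds a value 0..999 (least significant
   chunk first). Returns the carry out of the top chunk (0 or 1). */
static unsigned chunks_add(uint16_t r[NCHUNKS],
                           const uint16_t a[NCHUNKS],
                           const uint16_t b[NCHUNKS])
{
    unsigned carry = 0;
    for (int i = 0; i < NCHUNKS; i++) {
        unsigned t = (unsigned)a[i] + b[i] + carry;   /* at most 1999 */
        carry = (t >= 1000);
        if (carry)
            t -= 1000;
        r[i] = (uint16_t)t;
    }
    return carry;
}
```

Aligning operands by a multiple of three digits is then just an index shift; only shifts by one or two digits need a multiply or divide by 10 or 100 per chunk.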
      
   Although BID is conceptually closer to Binary128, the two differ in
   that Binary128 would only need large-integer multiplies sparingly
   (namely, for multiply operations), whereas BID would also need
   power-of-10 scaling for things like aligning operands in add/subtract.
      
      
      
   Though, the likely fastest option would be to map the DPD values to
   30-bit linear values (three DPD groups, 0..999999999, per value), use
   these internally, and convert back to DPD at the end. How well this
   performs is likely to depend on the operation.
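
A small sketch of what the 30-bit linear form could look like, assuming the three inputs have already been converted from DPD to plain 0..999 values. The helper names (`base1000_to_linear30`, `linear30_to_base1000`, `linear30_add`) are invented for the example.

```c
#include <stdint.h>

/* Combine three base-1000 values (each 0..999) into one 9-digit value,
   0..999999999, which fits in 30 bits. */
static uint32_t base1000_to_linear30(unsigned hi, unsigned mid, unsigned lo)
{
    return (uint32_t)hi * 1000000u + (uint32_t)mid * 1000u + lo;
}

/* Split a 9-digit value back into three base-1000 values,
   out[0] being the least significant. */
static void linear30_to_base1000(uint32_t v, unsigned out[3])
{
    out[2] = v / 1000000u;
    v %= 1000000u;
    out[1] = v / 1000u;
    out[0] = v % 1000u;
}

/* Add two 9-digit values with carry in/out at 10^9.
   The sum is at most 1999999999, so it fits in 32 bits. */
static uint32_t linear30_add(uint32_t a, uint32_t b, unsigned *carry)
{
    uint32_t t = a + b + *carry;
    *carry = (t >= 1000000000u);
    if (*carry)
        t -= 1000000000u;
    return t;
}
```

Compared with 10-bit chunks this needs roughly a third as many add/carry steps, at the price of the divides (or reciprocal multiplies) when converting back.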
      
   A non-standard variant, representing the value as packed 30 bit fields,   
   could likely be the fastest option. Could use the same basic layout as   
   the existing Decimal128 format.   
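
One way such a variant might be laid out, as a sketch only: keep the top 18 bits (sign plus combination field, which carries the leading digit and exponent) untouched and reinterpret the 110-bit trailing field as 30+30+30+20 linear bits. The post gives only the field widths, so the ordering (20-bit group on top), the `hi`/`lo` split and all names here are assumptions.

```c
#include <stdint.h>

typedef struct { uint64_t hi, lo; } dec128_bits;   /* raw 128-bit image */

/* g[0..2]: 9 digits each (0..999999999), g[3]: 6 digits (0..999999);
   g[0] is least significant. */
typedef struct { uint32_t g[4]; } dec128_groups;

static dec128_groups unpack_groups(dec128_bits x)
{
    dec128_groups r;
    r.g[0] = (uint32_t)( x.lo        & 0x3FFFFFFFu);               /* bits 0..29   */
    r.g[1] = (uint32_t)((x.lo >> 30) & 0x3FFFFFFFu);               /* bits 30..59  */
    r.g[2] = (uint32_t)(((x.lo >> 60) | (x.hi << 4)) & 0x3FFFFFFFu); /* bits 60..89 */
    r.g[3] = (uint32_t)((x.hi >> 26) & 0xFFFFFu);                  /* bits 90..109 */
    return r;
}

/* Replace the 110-bit trailing field of x, keeping its top 18 bits. */
static dec128_bits pack_groups(dec128_bits x, dec128_groups g)
{
    x.hi &= ~((UINT64_C(1) << 46) - 1);
    x.lo  = (uint64_t)g.g[0]
          | ((uint64_t)g.g[1] << 30)
          | ((uint64_t)g.g[2] << 60);
    x.hi |= ((uint64_t)g.g[2] >> 4) | ((uint64_t)g.g[3] << 26);
    return x;
}
```

With this layout, getting at the coefficient is shifts and masks only, with no table lookups on entry or exit.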
      
      
   So, my guess for a performance ranking, fast to slow, being:   
      1: Non-standard packed variant, 30-bit linear fields (30+30+30+20 bits, plus the leading digit)   
      2: DPD   
      3: BID   
      
      
   As for whether or not to support Decimal128 (in either form), dunno.   
      
   Closest I have to a use-case is that well, technically there is a   
   _Decimal128 type in C, and it might make sense for it to be usable.   
      
   But, then one needs to decide on which possible format to use here,
   and whether to aim for performance or compatibility.
      
      
   ...   
      