

   comp.arch      Apparently more than just beeps & boops      131,241 messages   


   Message 130,310 of 131,241   
   Robert Finch to Anton Ertl   
   Re: Multi-precision addition and archite   
   17 Nov 25 08:17:20   
   
   From: robfi680@gmail.com   
      
   On 2025-11-17 3:33 a.m., Anton Ertl wrote:   
   > Robert Finch  writes:   
   >> Finding it too difficult to support 128-bit operations using high, low   
   >> register pairs. Getting the reservation stations to pair up the   
   >> registers seems a bit scary. It would be much simpler to just have   
   >> 128-bit registers and it appears as if it may not be any more logic.   
   >   
   > If you want to support 128-bit operations, using 128-bit registers   
   > certainly is the way to go.  Note how AMD used to split 128-bit SSE   
   > operations into 64-bit parts on 64-bit registers in the K8, split   
   > 256-bit AVX operations into 128-bit parts on 128-bit registers in Zen,   
   > but they went away from that: In Zen4 512-bit operations are performed   
   > in 256-bit-pieces, but the registers are 512 bits wide.   
   >   
   > However, the point of carry bits or Mitch Alsup's CARRY is not 128-bit   
   > operations, but multi-precision, which can be 256-bit for some crypto,   
   > 4096 bits for other crypto, or billions of bits for the stuff that   
   > Alexander Yee is doing.   
   >   
   >> Sparc v9 died?   
   >   
   > Oracle has discontinued SPARC development in 2017, Fujitsu has   
   > announced in 2016 that they switch to ARM A64.  Both Oracle and   
   > Fujitsu released their last new SPARC CPU in 2017.  Fujitsu has   
   > released the ARM A64-based A64FX in 2019.  The Leon4 (2017 according   
   > to ) and Leon5   
   > (2019) implement SPARC v8, not v9.   
   >   
   > The MCST-R2000 (2018) implements SPARC v9, but will it have a   
   > successor?  And even if it has a successor, will it be available in   
   > relevant numbers?  MCST is not married to SPARC, despite their name;   
   > they have worked on Elbrus 2000 implementations as well; Elbrus 2000   
   > supports Elbrus VLIW and "Intel x86" instruction sets, and new models   
   > were released in 2018, 2021, and 2025, so MCST now seems to focus on   
   > that.   
   >   
   > - anton   
      
   Skimming through the SPARC architecture manual I am wondering how they   
   handle register renaming with a windowed register file. If the register   
   window file is deep there must be a ginormous number of registers for   
   renaming. Would it need to keep track of the renames for all the   
   registers? How does it dump the rename state to memory?   
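
   To get a feel for the size of the problem, here is a rough sketch of
   how a (window, register) pair could be flattened into one architectural
   register index. It assumes the SPARC V9 overlap rule (the outs of
   window w are the ins of window w+1) and ignores the alternate global
   sets. The function names and NWINDOWS = 8 are illustrative only and
   are not taken from any real implementation.

```python
NWINDOWS = 8  # assumption for illustration; V9 permits 3..32 windows


def flat_index(cwp: int, r: int, nwindows: int = NWINDOWS) -> int:
    """Map register r (0..31) as seen from window cwp to a flat index."""
    if not 0 <= r < 32:
        raise ValueError("register number must be 0..31")
    if r < 8:
        # %g0-%g7: globals, shared by every window
        return r
    if r < 16:
        # %o0-%o7: same storage as the ins of window cwp+1
        return 8 + ((cwp + 1) % nwindows) * 16 + (r - 8)
    if r < 24:
        # %l0-%l7: private to this window
        return 8 + (cwp % nwindows) * 16 + 8 + (r - 16)
    # %i0-%i7: this window's ins
    return 8 + (cwp % nwindows) * 16 + (r - 24)


def total_architectural_regs(nwindows: int = NWINDOWS) -> int:
    """8 globals plus 16 registers (ins + locals) owned by each window."""
    return 8 + 16 * nwindows
```

   With 8 windows this comes to 136 architectural registers. Whether a
   real design renames all of them at once or only the current window is
   not something this sketch answers.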
      
   I tried to find some information on Elbrus, but got "page not found" a
   couple of times. Other than that it is a VLIW machine, I do not know
   much about it.
      
   *****   
      
   I would like a machine able to process 128-bit values directly, but it
   takes up too many resources. It is easier to make the register file deep
   than wide. A BRAM port has a maximum width of 64 bits; a wider port takes
   more BRAMs. I tried a 128-bit-wide register file, but it used about 200
   BRAMs. Too many.
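
   The deep-versus-wide trade-off can be sketched as a back-of-the-envelope
   count. This assumes a multi-ported register file built by replicating
   BRAMs once per read port and once per write port, which is one common
   FPGA approach but not necessarily what Qupls does. The port counts in
   the usage note are made up, and bram_count is a hypothetical helper.

```python
import math

BRAM_WIDTH = 64    # bits per port, as stated in the post
BRAM_DEPTH = 512   # entries per block at that width, as stated in the post


def bram_count(reg_width: int, depth: int,
               read_ports: int, write_ports: int) -> int:
    """Estimate BRAMs for a replicated multi-port register file."""
    # Each 64-bit slice of the register needs its own BRAM.
    slices = math.ceil(reg_width / BRAM_WIDTH)
    # Going deeper than one block also costs extra BRAMs.
    banks = math.ceil(depth / BRAM_DEPTH)
    # Assumption: one copy per (read port, write port) combination.
    return slices * banks * read_ports * write_ports
```

   For any fixed port count, going from 64 to 128 bits doubles the result,
   for example bram_count(128, 512, 12, 4) against bram_count(64, 512, 12, 4),
   while any depth up to 512 entries costs nothing extra.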
      
   There are now 128 logical registers available in Qupls. It turns out
   that the BRAM setup is 512 registers deep no matter whether there are
   32, 64, or 128 registers, so I may as well make them all available.
      
   Qupls reservation stations were set up with support for eight operands
   (four for each 64-bit half of a 128-bit register). The resulting logic
   was about 25,000 LUTs for just one RS, compared to about 5,000 LUTs when
   there were just four operands. What gets implemented is considerably
   less, as most functional units do not need all the operands.
      
   It may be more resource-efficient to use multiple reservation stations
   than to put more operands in a single station. But then the operands
   need to be linked together between stations. It may be possible to do
   that using a hash of the PC value and ROB entry number.
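
   A minimal software model of that linking idea, purely as a sketch: the
   tag width, the XOR-folding hash, and the names link_tag and
   pair_stations are all assumptions for illustration, not the Qupls
   design.

```python
def link_tag(pc: int, rob_entry: int, tag_bits: int = 8) -> int:
    """Fold the PC down to tag_bits and mix in the ROB entry number."""
    mask = (1 << tag_bits) - 1
    folded = pc ^ (pc >> tag_bits) ^ (pc >> (2 * tag_bits))
    return (folded ^ rob_entry) & mask


def pair_stations(low_half: list, high_half: list) -> list:
    """Match entries from two stations that carry the same link tag."""
    by_tag = {entry["tag"]: entry for entry in high_half}
    return [(lo, by_tag[lo["tag"]])
            for lo in low_half if lo["tag"] in by_tag]


# Usage: both halves of one instruction compute the same tag.
tag = link_tag(pc=0x4000_1230, rob_entry=17)
low = [{"tag": tag, "operands": ("r4", "r6")}]
high = [{"tag": tag, "operands": ("r5", "r7")}]
pairs = pair_stations(low, high)
```

   Two different in-flight instructions can hash to the same tag, so a
   real design would need either a collision check or enough tag bits to
   make the ROB entry number alone unambiguous.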
      
   Once again, the Qupls implementation seems to be four or five times the
   size of the FPGA. Back to the drawing board.
      


