   comp.arch      Apparently more than just beeps & boops      131,241 messages   

   Message 130,636 of 131,241   
   BGB to Stefan Monnier   
   Re: Linus Torvalds on bad architectural    
   28 Dec 25 16:09:11   
   
   From: cr88192@gmail.com   
      
   On 12/28/2025 12:55 PM, Stefan Monnier wrote:   
   > MitchAlsup [2025-12-28 17:43:25] wrote:   
   >> In order to stop the BE::LE war, one could always do a Middle Endian   
   >> bit/Byte order. You start in the middle and each step goes right-then-left.   
   >   
   > I thought the solution was to follow the Cray 1's lead, where memory is   
   > only ever accessed in units of the same size (a "word").   
   >   
      
   Apparently early DEC Alpha did something close to this:   
      Nothing smaller than 32 bits in HW (until BWX added byte/word ops).   
      
   So, you want byte-oriented memory access or similar? Implement it yourself.   
      
      
      
   Well, my recent goings on in ISA space:   
   Ended up adding a J52I prefix to my jumbo-prefix extension in RISC-V.   
      
   J52I is a 64-bit prefix that glues 52 bits onto the immediate, making it   
   possible to encode 64-bit immediate and displacement values.   
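   As a rough sketch of the arithmetic (52 prefix bits plus the usual 12-bit
   I-type immediate gives 64), something like the following. The bit
   placement here is an assumption for illustration: the prefix payload is
   taken as bits 63..12 and the base instruction's Imm12 as bits 11..0. The
   actual J52I field layout is not spelled out above, and the function names
   are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (hypothetical): prefix payload -> bits 63..12,
 * base instruction's 12-bit immediate -> bits 11..0.
 * No sign extension is needed, since the prefix fills all upper bits. */
static uint64_t j52i_compose_imm64(uint64_t prefix52, uint32_t imm12)
{
    assert(prefix52 < (UINT64_C(1) << 52));  /* payload must fit in 52 bits */
    return (prefix52 << 12) | (uint64_t)(imm12 & 0xFFFu);
}

/* Inverse split, roughly what an assembler would do when emitting
 * a prefix plus base instruction for a 64-bit constant. */
static void j52i_split_imm64(uint64_t imm64,
                             uint64_t *prefix52, uint32_t *imm12)
{
    *prefix52 = imm64 >> 12;
    *imm12    = (uint32_t)(imm64 & 0xFFFu);
}
```

   With this split, an assembler-side "ADDI Xd, X0, Imm64" just becomes
   prefix payload plus a plain ADDI carrying the low 12 bits.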
      
   I changed the interpretation such that: [Reg+Disp64] is instead   
   understood as [Abs64+Reg]. It is now possible to encode Abs64 branches   
   via J52I+JALR.   
      
   Potentially something similar could be defined for XG2 and XG3 as well.   
   It wouldn't require any changes or additions encoding-wise, but would   
   define something new in terms of decoding behavior in the case of Abs64   
   (currently, Disp64 is not allowed; this would make it allowed, just   
   understood as Abs64).   
      
   Though, unlike RISC-V, where [Rb+Disp64] and [Abs64+Rb] are conceptually   
   equivalent, would need to decide the specifics in the XG2/XG3 case:   
      Do the same thing as what I did for RV, meaning the displacement   
        register is unscaled;   
      Or break symmetry, and make it so that it is [Abs64+Rb*Scale].   
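   The two options above, sketched as effective-address calculations. This
   is only illustrative: the function names are made up, and the scale is
   assumed to be a log2 element size of 0..3, which is not stated above.

```c
#include <assert.h>
#include <stdint.h>

/* RV-style option: [Abs64+Rb], register unscaled.
 * Unsigned 64-bit addition wraps modulo 2^64, which is why this is
 * equivalent to [Rb+Disp64] with a full-width displacement. */
static uint64_t ea_abs64_unscaled(uint64_t abs64, uint64_t rb)
{
    return abs64 + rb;
}

/* Symmetry-breaking option: [Abs64+Rb*Scale].
 * log2_scale is a hypothetical parameter (0 = byte .. 3 = 8 bytes). */
static uint64_t ea_abs64_scaled(uint64_t abs64, uint64_t rb,
                                unsigned log2_scale)
{
    assert(log2_scale <= 3);
    return abs64 + (rb << log2_scale);
}
```

   With a scale of 1 byte both forms give the same address; they only
   differ once the register is treated as an element index.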
      
      
   In XG3, it could in theory just use the RV-J52I encodings as well for a   
   lot of the cases if needed.   
      
   For XG2, there is a 64-bit encoding for an Abs48 branch (special case).   
   Would need to debate whether or not Abs64 memory ops are needed. But,   
   still niche, as it more often applies to thunks and similar than normal   
   code generation.   
      
      
      
   It was added partly as I started to realize I had some non-zero use   
   cases for Imm64 and Abs64 addressing in RV Mode.   
      
   At first, partly designed a new encoding scheme for Imm64 instructions,   
   but then realized it was possible to devise a J52I prefix which could be   
   done more cheaply within my implementation.   
      
   Then ended up battling with timing failures (this stuff was "the straw   
   that broke the camel's back" in terms of timing constraints). Have ended   
   up partially restructuring some parts of the decoder, partly reducing   
   clutter and improving timing some (so back to passing timing again).   
      
      
   Formally, the J52I prefix will likely fully replace the use of   
   J22+J22+LUI and similar. It can also express the same behavior (via   
   "ADDI Xd, X0, Imm64") and with slightly less hair. Internally, the J52I   
   prefixes also better leverage J21I decoding (as effectively both   
   "halves" of the J52I prefix are decoded in ways more consistent with the   
   handling of J21I; and for the low 32 bits, the immediate is decoded   
   as-if it had been given a J21I prefix).   
      
      
   Then after this ended up tweaking things in BGBCC and my PEL loaders   
   such that the base-reloc previously used to encode tripwires now also   
   can encode the location of stack canary values; allowing the loader to   
   essentially randomize the stack canary values each time a program is loaded.   
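   In loader terms, the idea is roughly as below. This is a sketch under
   assumptions, not the actual PEL reloc format: the reloc_entry struct, the
   RELOC_KIND_CANARY tag, and the 32-bit canary field width are all
   hypothetical. The canary is passed in so that every site gets the same
   per-load value (prologue and epilogue need to agree).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { RELOC_KIND_CANARY = 1 };   /* hypothetical reloc tag */

struct reloc_entry {              /* hypothetical record layout */
    uint32_t kind;                /* what sort of fixup this is */
    uint32_t offset;              /* byte offset into the loaded image */
};

/* Write the per-load canary value at every canary site named by the
 * reloc table. Sites that would run past the image are skipped.
 * Returns the number of sites patched. */
static size_t patch_canaries(uint8_t *image, size_t image_size,
                             const struct reloc_entry *relocs, size_t count,
                             uint32_t canary)
{
    size_t patched = 0;
    for (size_t i = 0; i < count; i++) {
        if (relocs[i].kind != RELOC_KIND_CANARY)
            continue;
        if (image_size < sizeof canary ||
            relocs[i].offset > image_size - sizeof canary)
            continue;             /* out of bounds: leave untouched */
        memcpy(image + relocs[i].offset, &canary, sizeof canary);
        patched++;
    }
    return patched;
}
```

   The caller would pick a fresh random value per program load and pass it
   as the canary; where that randomness comes from is left out here.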
      
   Mostly works, though it seemingly fails for some reason on the CPU core   
   when the boot ROM is built in RISC-V mode. At first I thought it was a   
   cache-coherence issue (in the RISC-V case the relevant cache-related   
   functions were NOPs). Now, though, it appears as if the code was somehow   
   disrupting the application of base relocs (in a way that doesn't happen   
   in the emulator).   
      
   So, it is possible that bugs remain in the RV support.   
      
      
   Also in the process got around to re-enabling basic ASLR for the kernel   
   in the Boot ROM (the original issue impacting the symbol listing in the   
   emulator is no longer as relevant).   
      
   Well, and implementing some of the RV CBO instructions and similar.   
   But, still doesn't fully address needs nor is a particularly close match   
   to how my CPU does things. Also FENCE.I uses too much encoding space,   
   and I ended up handling it in a similar way to CBO.   
      
   Had ended up adding a few non-standard 0R encodings for things like TLB   
   flushing and similar.   
      
   As-is, FENCE.I can't fully implement the standard semantics, which in my   
   case would need to also be able to flush the L1 D$.   
      FENCE: Effectively needs full cache flush.   
        Strategy: Trap and emulate.   
      FENCE.I:   
        Proper semantics needs full cache flush (both D$ and I$).   
        Strategy: Trap and emulate proper version,   
          allow CBO-like handling of a variant case.   
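   The strategy above could be sketched as a decode-time classifier. The
   opcode and funct3 values are the standard RISC-V MISC-MEM ones (FENCE =
   funct3 0, FENCE.I = funct3 1, CBO.* = funct3 2). How the "variant case"
   of FENCE.I is distinguished is not specified above, so the rule used
   here (non-zero rs1 field) is purely an assumption, as are the enum and
   function names.

```c
#include <stdint.h>

enum cache_action {
    ACT_OTHER,          /* not handled by this classifier */
    ACT_TRAP_EMULATE,   /* trap, do the full flush in software */
    ACT_CBO_LIKE        /* handle in HW like a cache-block op */
};

static enum cache_action classify_misc_mem(uint32_t insn)
{
    uint32_t opcode = insn & 0x7Fu;
    uint32_t funct3 = (insn >> 12) & 0x7u;
    uint32_t rs1    = (insn >> 15) & 0x1Fu;

    if (opcode != 0x0Fu)            /* MISC-MEM major opcode */
        return ACT_OTHER;

    switch (funct3) {
    case 0:                         /* FENCE: needs a full cache flush */
        return ACT_TRAP_EMULATE;
    case 1:                         /* FENCE.I */
        /* Assumption: rs1 != 0 marks the non-standard variant that
         * gets CBO-like handling; the standard form traps. */
        return rs1 ? ACT_CBO_LIKE : ACT_TRAP_EMULATE;
    case 2:                         /* CBO.INVAL / CLEAN / FLUSH / ZERO */
        return ACT_CBO_LIKE;
    default:
        return ACT_OTHER;
    }
}
```

   The standard FENCE.I encoding (0x0000100F) leaves rd, rs1 and the
   12-bit immediate unused, which is where the "too much encoding space"
   complaint comes from.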
      
   Some wonk exists in the CBO spec: it is as if someone didn't quite get   
   the purpose of cache-line invalidation instructions and was trying to   
   work them in under the assumption of a fully coherent cache (rather   
   than, say, one using explicit cache flushing because the HW uses a weak   
   coherence model; explicit flushes don't really make sense if one has   
   coherent caches).   
      
   ...   
      
      
   >   
   >          Stefan   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   


(c) 1994,  bbs@darkrealms.ca