|    comp.arch    |    Apparently more than just beeps & boops    |    131,241 messages    |
|    Message 130,636 of 131,241    |
|    BGB to Stefan Monnier    |
|    Re: Linus Torvalds on bad architectural     |
|    28 Dec 25 16:09:11    |
From: cr88192@gmail.com

On 12/28/2025 12:55 PM, Stefan Monnier wrote:
> MitchAlsup [2025-12-28 17:43:25] wrote:
>> In order to stop the BE::LE war, one could always do a Middle Endian
>> bit/Byte order. You start in the middle and each step goes right-then-left.
>
> I thought the solution was to follow the Cray 1's lead, where memory is
> only ever accessed in units of the same size (a "word").
>

Apparently DEC Alpha did something like this:
  Nothing smaller than 32 bits in HW (at least originally; byte and
  word accesses only came later with the BWX extension).

So, you want byte-oriented memory access or similar? Implement it yourself.


Well, my recent goings on in ISA space:
Ended up adding a J52I prefix to my jumbo-prefix extension in RISC-V.

J52I is a 64-bit prefix that glues 52 bits onto the immediate, making it
possible to encode 64-bit immediate and displacement values.

I changed the interpretation such that [Reg+Disp64] is instead
understood as [Abs64+Reg]. It is now possible to encode Abs64 branches
via J52I+JALR.

Potentially something similar could be defined for XG2 and XG3 as well.
Wouldn't require any new changes or additions encoding-wise, but would
define something new in terms of decoding behavior in the case of Abs64
(currently, Disp64 is not allowed; this would make it allowed, just
understood as Abs64).

Though, unlike RISC-V, where [Rb+Disp64] and [Abs64+Rb] are conceptually
equivalent, would need to decide the specifics in the XG2/XG3 case:
  Do the same thing as what I did for RV, meaning the displacement
  register is unscaled;
  Break symmetry, and make it so that it is [Abs64+Rb*Scale].


In XG3, it could in theory just use the RV-J52I encodings as well for a
lot of the cases if needed.

For XG2, there is a 64-bit encoding for an Abs48 branch (special case).
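
As a rough illustration of the J52I arithmetic described above (not the
actual encoding): 52 prefix bits plus the 12-bit immediate of a normal
I-type op gives 64 bits. The sketch below assumes the prefix payload
simply forms the upper 52 bits and the base immediate the low 12, which
is a simplification; the real field layout is not given here, and the
function names are made up for illustration. JALR clearing bit 0 of the
target is standard RISC-V behavior.

```python
MASK64 = (1 << 64) - 1


def compose_imm64(prefix52: int, imm12: int) -> int:
    """Glue a 52-bit prefix payload above a 12-bit base immediate.

    Assumed layout (hypothetical): prefix bits are imm[63:12],
    base-instruction bits are imm[11:0], with no sign extension.
    """
    if not 0 <= prefix52 < (1 << 52):
        raise ValueError("prefix payload must fit in 52 bits")
    if not 0 <= imm12 < (1 << 12):
        raise ValueError("base immediate must fit in 12 bits")
    return (prefix52 << 12) | imm12


def abs64_plus_reg(abs64: int, reg: int) -> int:
    """[Abs64+Reg]: absolute address plus an unscaled register value,
    wrapping at 64 bits."""
    return (abs64 + reg) & MASK64


def jalr_target(abs64: int, rs1: int) -> int:
    """Branch target for J52I+JALR under the assumptions above.

    JALR clears the low bit of the computed address.
    """
    return abs64_plus_reg(abs64, rs1) & ~1 & MASK64
```

With X0 (always zero) as the register, jalr_target(abs64, 0) is just an
absolute branch to abs64; arithmetically, [Reg+Disp64] and [Abs64+Reg]
give the same sum, so only the reading of the operands changes.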
Would need to debate whether or not Abs64 memory ops are needed. But,
still niche, as it more often applies to thunks and similar than normal
code generation.


It was added partly as I started to realize I had some non-zero use
cases for Imm64 and Abs64 addressing in RV Mode.

At first, partly designed a new encoding scheme for Imm64 instructions,
but then realized it was possible to devise a J52I prefix which could be
done more cheaply within my implementation.

Then ended up battling with timing failures (this stuff was "the straw
that broke the camel's back" in terms of timing constraints). Have ended
up partially restructuring some parts of the decoder, partly reducing
clutter and improving timing some (so back to passing timing again).


Formally, the J52I prefix will likely fully replace the use of
J22+J22+LUI and similar. It can also express the same behavior (via
"ADDI Xd, X0, Imm64") and with slightly less hair. Internally, the J52I
prefixes also better leverage J21I decoding (as effectively both
"halves" of the J52I prefix are decoded in ways more consistent with the
handling of J21I; and for the low 32 bits, the immediate is decoded
as-if it had been given a J21I prefix).


Then after this ended up tweaking things in BGBCC and my PEL loaders
such that the base-reloc previously used to encode tripwires now also
can encode the location of stack canary values; allowing the loader to
essentially randomize the stack canary values each time a program is loaded.

Mostly works, though seemingly fails for some reason on the CPU core
when the boot ROM is built in RISC-V mode. At first I thought it was a
cache-coherence issue (in the RISC-V case the relevant cache-related
functions were NOPs).
Now it appears though as-if the code was somehow
disrupting the application of base relocs (in a way that doesn't happen
in the emulator).

So, it is possible that bugs remain in the RV support.


Also in the process got around to re-enabling basic ASLR for the kernel
in the Boot ROM (the original issue impacting the symbol listing in the
emulator is no longer as relevant).

Well, and implementing some of the RV CBO instructions and similar.
But, still doesn't fully address needs nor is a particularly close match
to how my CPU does things. Also FENCE.I uses too much encoding space,
and I ended up handling it in a similar way to CBO.

Had ended up adding a few non-standard 0R encodings for things like TLB
flushing and similar.

As-is, FENCE.I can't fully implement the standard semantics, which in my
case would need to also be able to flush the L1 D$.
  FENCE: Effectively needs full cache flush.
    Strategy: Trap and emulate.
  FENCE.I:
    Proper semantics needs full cache flush (both D$ and I$).
    Strategy: Trap and emulate proper version,
      allow CBO-like handling of a variant case.

Some wonk exists in the CBO spec; it is like someone didn't quite get
the purpose in why one would want cache-line invalidation instructions
and was trying to work this in with assumptions of a fully coherent
cache (rather than, say, one using explicit cache flushing because the
HW uses a weak coherence model; and where using explicit flushes doesn't
really make sense if one has coherent caches).

...


>
> Stefan