From: cr88192@gmail.com   
      
   On 5/23/2023 3:17 PM, Dan Cross wrote:   
   > In article , BGB wrote:   
   >> On 5/23/2023 1:26 PM, Dan Cross wrote:   
   >> [snip]   
   >>>> On a 50 MHz core, only about 0.2% of the CPU time is going into handling   
   >>>> TLB misses.   
   >>>   
   >>> That's not the issue.   
   >>>   
   >>> The hypervisor has to invoke the guest's   
   >>> TLB miss handler, which will have to fault _again_ once it tries   
   >>> to write to the TLB to insert an entry; this can lead to several   
   >>> round-trips, bouncing between the host and guest several times.   
   >>> With nested VMs, this gets significantly worse.   
   >>>   
   >>   
   >> So?...   
   >   
   > I wonder: have you looked into why essentially every modern   
   > architecture in common use today uses hardware page tables?   
   > The hardware engineers working on them are not stupid, and they are   
   > perfectly well aware of everything you said about e.g. larger   
   > TLBs. Yet there is a reason they chose to implement things   
   > the way essentially every extant modern architecture has.   
   > Perhaps they are aware of something you would find illuminating.   
   >   
   > The issues I'm talking about very much exist and very much   
   > affect real-world designs. I'll take the slightly larger cost   
   > in transistors over the disadvantages, including forcing   
   > pipeline flushes, thrashing the icache to handle TLB fault   
   > misses, and significantly more complex virtualization.   
   >   
      
   I think a lot of this is making a big fuss over nothing, FWIW.   
      
   But, in any case, SuperH (along with PA-RISC, MIPS, SPARC, etc) got   
   along reasonably well with software-managed TLB.   
      
   Their downfall wasn't related to them spending an extra fraction of a   
   percent of CPU time on handling TLB Miss ISRs.   
      
   Similarly, this also wasn't what caused Itanium to fail (nor was it due   
   to it being VLIW based, etc).   
      
   And, likewise, the IBM POWER ISA is still around, ...   
      
   ...   
      
      
      
   > Besides....what do you do if a guest decides it wants to insert   
   > a mapping covering part of the hypervisor itself into the TLB?   
   >   
      
   There is no reason for the guest to be able to put something   
   into the TLB which would somehow circumvent the host; since anything the   
   guest tries to load into the TLB will need to first get translated   
   through the host.   
      
   This is like asking why a program running in virtual memory can't just   
   create a pointer into memory inside the kernel:   
   The application doesn't have access to the kernel's address space to   
   begin with.   
      
   Or, stated another way, the entire "physical address" space for the   
   guest would itself be a virtual memory space running in user-mode.   
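
   As a rough sketch of the idea (Python, purely illustrative; the class
   name, the dictionaries, and the GuestFault exception are all made up for
   the example and do not correspond to any real hypervisor or ISA): the
   guest's "physical" page numbers are only keys into a map owned by the
   host, so a guest TLB insert can only ever resolve to pages the host has
   chosen to give it.

```python
class GuestFault(Exception):
    """Raised when the guest names a 'physical' page it was never given."""


class Hypervisor:
    def __init__(self, guest_phys_to_host_phys):
        # Hypothetical map: guest-physical page number -> host-physical page number.
        # Pages holding the hypervisor itself simply never appear as values here.
        self.g2h = dict(guest_phys_to_host_phys)
        # Stand-in for the real TLB: guest-virtual page -> host-physical page.
        self.hw_tlb = {}

    def guest_tlb_insert(self, guest_vpn, guest_ppn):
        """Handle a trapped guest attempt to load a TLB entry."""
        host_ppn = self.g2h.get(guest_ppn)
        if host_ppn is None:
            # The guest asked for a page outside its own "physical" space.
            raise GuestFault(f"guest page {guest_ppn:#x} is not mapped")
        self.hw_tlb[guest_vpn] = host_ppn
        return host_ppn
```

   With a guest that owns guest-physical pages 0 and 1 (backed by host
   pages 100 and 101), asking for guest page 1 yields host page 101, while
   asking for "page 100" fails: that number is meaningless in the guest's
   address space, regardless of what it happens to mean to the host.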
      
      
      
   >> [snip]   
   >>>> One could also have the guest OS use page-tables FWIW.   
   >>>   
   >>> How does the hypervisor know the format of the guest's page   
   >>> tables, in general?   
   >>>   
   >>   
   >> They have designated registers and the tree formats are documented as   
   >> part of the ISA/ABI specs...   
   >   
   > The point of a hypervisor is to provide a faithful emulation   
   > of the _hardware_: it's up to the guest to decide what ABI it   
   > uses. The hypervisor can't really force that onto the guest,   
   > and so there's no "ABI" as such in a non-paravirtualized   
   > hypervisor. The whole point is that unmodified guests can run   
   > without change and think that they're running directly on the   
   > bare metal.   
   >   
   > It's unclear what the point of an ISA-mandated page table format   
   > would be in a system that doesn't use them. What prevents a   
   > guest from just ignoring them and doing its own thing?   
   >   
      
   You can have either accurate hardware level emulation, or slightly   
   better performance, and make a tradeoff there.   
      
   If the OS wants its own page-table format, it can specify that it is   
   using its own encoding easily enough via the tag bits in the TTB   
   register or similar.   
      
   And if it claims to be using a standard table format, but is doing   
   something different and crashes as a result, well, that is its problem;   
   and/or one adds a flag or similar to the emulator to disable any   
   "faster" page translation.   
      
      
   Not like it is likely to matter all that much.   
   Hence, why I was using B-Trees for the 96-bit mode...   
      
      
   One can note that fetching something from a B-Tree is not exactly a fast   
   operation, but still roughly 3 orders of magnitude faster than swapping   
   a page in from the pagefile.   
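
   For reference, a B-Tree lookup looks roughly like the following (Python
   sketch; the Node layout and the convention that each separator key equals
   the smallest key of its right-hand child are assumptions for the example,
   not the actual in-memory format used for the 96-bit mode). The cost is
   one node visit per tree level plus a search within each node.

```python
from bisect import bisect_right


class Node:
    def __init__(self, keys, children=None, values=None):
        self.keys = keys          # sorted keys (separators in interior nodes)
        self.children = children  # list of child nodes, or None for a leaf
        self.values = values      # leaf payloads, parallel to keys


def btree_lookup(node, key):
    """Return (value or None, number of interior levels descended)."""
    levels = 0
    while node.children is not None:
        # Keys >= a separator live in the child to its right.
        node = node.children[bisect_right(node.keys, key)]
        levels += 1
    i = bisect_right(node.keys, key) - 1
    if i >= 0 and node.keys[i] == key:
        return node.values[i], levels
    return None, levels
```

   Python integers are arbitrary precision, so a 96-bit page address can be
   used directly as the key here.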
      
   ...   
      
      
   > - Dan C.   
   >   
      