From: cross@spitfire.i.gajendra.net   
      
   In article ,   
   Scott Lurndal wrote:   
   >cross@spitfire.i.gajendra.net (Dan Cross) writes:   
   >[snip]   
   >>It's still not entirely clear to me how the BSP/BSC is supposed
   >>to boot, however. If the world starts in 64-bit mode, and that   
   >>still requires paging to be enabled, then who sets up the page   
   >>tables that the BSP starts up on?   
   >   
   >I haven't dug into it, but perhaps they come up in some funky   
   >identity mode when the PT root pointer (CR3?) hasn't been programmed.   
      
   Now that would genuinely be a useful change.   
      
   >>>>I don't see how virtio can give a user-application pass-through   
   >>>>access to programmed IO, but I appreciate an argument that says   
   >>>>that there can be a uioring sort of thing to communicate IO   
   >>>>requests from userspace to the kernel without a trap.   
   >>>   
   >>>We do that all the time on our processors. Applications like DPDK   
   >>>and Open Data Plane (ODP) rely on user-mode access to the   
   >>>device MMIO (often using SR-IOV virtual functions) space and direct   
   >>>DMA (facilitated by an IOMMU/SMMU) initiated by usermode code.   
   >>   
   >>Ok, sure, but that's not PIO.   
   >   
   >By PIO are you referring to 'in' and 'out' instructions that have   
   >been obsolete for three decades except for a few legacy devices   
   >like the UART   
      
   Well, yes. (The context was the removal of both ring 3 port   
   access instructions, as well as the IOPL from TSS.)   
      
   >(and access to pci config space, although PCI   
   >express defines the memory mapped ECAM as an alternative which   
   >is used on non-intel/amd systems)?   
      
   I try to blot that out of my mind.   
      
   I believe that PCI express deprecates the port-based access   
   method to config space; MMIO _must_ be supported and in   
   particular, is the only way to get access to the extended   
   capability space. So from that perspective we're not losing   
   anything. Certainly, I've used memory-mapped IO for dealing   
   with PCI config space on x86_64. The port-based access method   
   is really only for compatibility with legacy systems at this   
   point.   
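   
   To make the difference concrete, here is a minimal sketch (mine, not
   taken from any particular kernel) of how the two mechanisms encode a
   bus/device/function/offset tuple. The function names are made up for
   illustration, and any ECAM base passed in is an assumption: on a real
   machine it comes from the ACPI MCFG table. The sketch only computes
   addresses; it does not touch hardware.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Legacy mechanism: the value written to the CONFIG_ADDRESS port (0xCF8).
 * The data is then moved through the CONFIG_DATA port (0xCFC).
 * The register field is 8 bits wide, so only offsets 0..255 are reachable.
 */
uint32_t pci_cf8_address(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t off)
{
    assert(dev < 32 && fn < 8 && off < 256);
    return UINT32_C(0x80000000)        /* enable bit */
         | ((uint32_t)bus << 16)
         | ((uint32_t)dev << 11)
         | ((uint32_t)fn << 8)
         | (off & 0xFCu);              /* dword-aligned register number */
}

/*
 * ECAM: every function gets its own 4 KiB window in physical memory,
 * so the full 0..4095 range (including the extended capability space
 * that starts at 0x100) is addressable.
 * ecam_base is hypothetical here; firmware normally reports it.
 */
uint64_t pci_ecam_address(uint64_t ecam_base, uint8_t bus, uint8_t dev,
                          uint8_t fn, uint16_t off)
{
    assert(dev < 32 && fn < 8 && off < 4096);
    return ecam_base
         + ((uint64_t)bus << 20)
         + ((uint64_t)dev << 15)
         + ((uint64_t)fn << 12)
         + off;
}
```

   For example, pci_ecam_address(0xE0000000, 1, 2, 3, 0x100) yields
   0xE0113100, an offset that the 0xCF8 encoding cannot express at all,
   since its register field stops at 0xFF.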
      
   >> Unprivileged access to the PIO   
   >>space seems like it's just going away. I think that's probably   
   >>fine as almost all high-speed devices are memory-mapped anyway,   
   >>so we're just left with legacy things like the UART or PS/2   
   >>keyboard controller or whatever.   
   >   
   >Plus, with PCI, a "io space" bar can be programmed to sit anywhere   
   >in the physical address space. With most modern devices either   
   >being PCI or providing PCI configuration space semantics, one can   
   >still use PIO even on ARM processors via IO BAR. Not that there really are   
   >any modern PCI/PCIe devices that use anything other than "memory space"   
   >bars.   
      
   Yup. It really seems like the only devices that demand access   
   via port IO are the legacy "PC" devices; if the 8259A is going
   away, what's left? The RTC, UART and keyboard controller? Is   
   the PIT dual-wired to an IOAPIC for interrupt generation?   
      
   >>>Interrupts are still mediated by the OS (virt-io provides these   
   >>>capabilities), although DPDK/ODP generally poll completion rings   
   >>>rather than use interrupts.   
   >>   
   >>Really? Even with SR-IOV and the interrupt remapping tables in   
   >>the IOMMU? Are you running in VMX non-root mode? Why not use   
   >>posted interrupts?   
   >   
   >Hmm. I do seem to recall some mechanisms for interrupt virtualization   
   >in the IOMMU, but I've been, as noted above, in the ARMv8 world for a while
   >now.
      
   Is this the point where I express my jealousy? :-D   
      
   But yes: the IOMMU can be used to deliver interrupts directly to   
   a VCPU (provided you're using APIC virtualization) by writing to   
   a posted-interrupt vector. The resulting interrupt will be   
   generated and delivered in the guest without intervention from   
   the hypervisor.   
      
   >Speaking for ARM systems, the guts of the interrupt controller   
   >(including the interrupt acknowledge registers) are privileged. There   
   >is no way to segregate user-mode-visible interrupts from all others   
   >which is needed to ensure that a user-mode program can't royally screw   
   >up the system; the kernel must accept and end the interrupt.
      
   I'm not sure I understand; I thought the GIC was memory mapped,   
   including for the banked per-CPU registers? Is the issue that   
   you don't want to expose the entire mapping (I presume this has   
   to be on some page granularity) to userspace?   
      
   >The   
   >ARM GICv3 is actually much more sophisticated than the local and I/O   
   >APICs on Intel and the GICv4 adds some level of interrupt virtualization
   >to support delivery directly to the guest without intervention from   
   >the hypervisor. IIRC, the Intel IOMMU interrupt remapping tables   
   >were to support that type of usage, not direct user mode access   
   >(which would require user-mode access to the local APIC to end the   
   >interrupt).   
      
   That is correct; perhaps I'm misinterpreting what you meant
   earlier: I think I gather now that you're talking about   
   overloading functionality meant for virtualization to provide   
   unprivileged access to devices in a host. That is, allocate a   
   virtual function and pass that through to a userspace process,   
   but don't enter a virtualized CPU context?
      
    - Dan C.   
      