comp.lang.asm.x86
Ahh, the lost art of x86 assembly
4,675 messages
Message 3,218 of 4,675
Andrew Cooper to Terje Mathisen
Re: Speculative data leaks in all supers
06 Jan 18 02:15:00
   From: amc96@nospicedham.cam.ac.uk   
      
   On 05/01/2018 19:17, Terje Mathisen wrote:   
   > Thank you Andrew!   
      
   No problem at all.  Frankly, it was quite cathartic finally being able   
   to talk about this in public.  (According to git, my earliest patch   
   towards fixing this in Xen is Wed, 16 Aug 2017 17:06:59 +0000, which is   
   now upstream.)   
      
   The current media frenzy, and the fact that Variant 1 has unfixable
   cases, is a very different story (combined with zero research into what
   kinds of dependent memory reads can in practice be coerced into being
   speculatively leaky).
      
   > I suggested over in comp.arch that the obvious way to isolate processes   
   > properly would be to _never_ allow any externally-visible state to be   
   > written back before the instruction either actually retires or at least   
   > is known to run and not fault.   
   >   
   > This would then force all cache controllers, BTB, TLB etc to have a few   
   > local buffers, one for each possible level of speculation, to hold this   
   > type of data and then commit it when the instruction retires.
   >
   > Mitch Alsup told us that they actually implemented something similar to   
   > this way back in 1991. :-)   
      
   Modern processors from both vendors already have this to a certain   
   extent.  E.g. while the Return Stack Buffer/Return Address Stack is   
   architecturally 32 entries, it is apparently micro-architecturally   
   larger to deal with the fact that the pipeline can speculate ~200 uops   
   ahead of the retire buffer in well-optimised code.   
      
   (If there is anything I've learnt in practice, it's that the phrase   
   "It's complicated" doesn't begin to describe things.)   
      
   I'm not entirely convinced it is safe as described, in cases where you   
   have nested speculation windows ("It's complicated") where an inner   
   window gets restarted while an outer one is still pending.  OTOH, this   
   is based on a pathological distrust of double-fetch scenarios, rather   
   than a sensible period of time to consider the proposal.   
      
   ~Andrew   
      
   >   
   > Terje   
   >   
   > Andrew Cooper wrote:   
   >> On 03/01/2018 18:38, Rod Pemberton wrote:   
   >>>   
   >>> Apparently, Intel processors over the past decade have been
   >>> allowing speculative execution of code without any privilege
   >>> checks.  The exact specifics of the flaw are apparently still   
   >>> secret.   
   >>   
   >> The embargo broke 5h ago.  tl;dr everything is broken, although   
   >> Intel processors do have a failure mode which is worse than the   
   >> others.   
   >>   
   >> All the attack strategies rely on the fact that you can recover the   
   >> results of calculations during speculative execution via cache   
   >> timing attacks, combined with the fact that an attacker can   
   >> deliberately poison branch prediction logic to cause speculation of   
   >> chosen code.   
   >>   
   >> SP1, a.k.a. Bounds-check Bypass:   
   >>   
   >> In this case, you are limited to executing basic blocks that you can   
   >> locate in the victim context.  As an attacker, you control the   
   >> taken/not-taken prediction state, and can deliberately cause the   
   >> processor to speculate into the wrong basic block when it encounters   
   >> a conditional branch.  This can be (ab)used to deliberately cause a   
   >> speculative read off the end of an array.   
   >>   
   >> In JIT-able cases (BPF filters in the kernel, JavaScript in a
   >> webpage, many other examples), an attacker has some control over the
   >> eventual layout of basic blocks in the victim context.
   >>   
   >> This case is the hardest to deal with, because short of inhibiting
   >> speculation before every memory read that has any
   >> attacker-controlled component, it can't be fixed.
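
(Illustration, not from the original post.) The bounds-check bypass described above can be sketched in C. The names `table`, `probe`, `lookup_unsafe`, `lookup_masked` and `index_mask` are hypothetical, and the masking trick is only one commonly described way of limiting the speculative read. A compiler is free to rewrite it, so real kernels typically rely on architecture-specific assembly or a speculation barrier instead.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_LEN 16u              /* hypothetical array length */

static uint8_t table[TABLE_LEN];   /* the array the victim intends to read */
static uint8_t probe[256 * 64];    /* 64-byte stride: one slot per byte value */

/*
 * Vulnerable shape: if the branch is mispredicted as "in bounds", the CPU
 * may speculatively read table[idx] for an out-of-range idx and then touch
 * a probe[] slot selected by that byte before the misprediction is
 * discovered. Which slot was touched is what a cache timing attack recovers.
 */
uint8_t lookup_unsafe(size_t idx)
{
    if (idx < TABLE_LEN)
        return probe[table[idx] * 64];
    return 0;
}

/*
 * Branch-free mask: all ones when idx < len, zero otherwise.
 * The top bit of (idx | (len - 1 - idx)) is set exactly when idx >= len
 * (assuming len is at most half of the size_t range).
 */
size_t index_mask(size_t idx, size_t len)
{
    size_t in_range = ~(idx | (len - 1 - idx)) >> (sizeof(size_t) * CHAR_BIT - 1);
    return (size_t)0 - in_range;
}

/*
 * Same lookup, but the index is forced to 0 by data flow whenever it is
 * out of range, so a mispredicted branch cannot steer the dependent read.
 */
uint8_t lookup_masked(size_t idx)
{
    if (idx < TABLE_LEN) {
        idx &= index_mask(idx, TABLE_LEN);
        return probe[table[idx] * 64];
    }
    return 0;
}
```

The point of the masked variant is that the clamp is a data dependency rather than a control dependency, so it holds even while the branch outcome is still being speculated.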
   >>   
   >> SP2, a.k.a. Branch Target Injection:   
   >>   
   >> Indirect jump and call instructions (call/jmp *%reg/mem) typically   
   >> don't have a single destination during the lifetime of the program,   
   >> and are predicted using the Branch Target Buffer, which is based on   
   >> the branch history.  An attacker can poison the BTB and cause   
   >> speculation to go to an arbitrary destination.   
   >>   
   >> Therefore, an attacker which poisons the BTB can cause the victim   
   >> indirect branch to speculate to an arbitrary location, and is not   
   >> restricted to the victim basic blocks in their allotted order.  On   
   >> hardware without the SMEP feature active, speculation can be   
   >> redirected back into user code, so the attacker can provide a custom   
   >> basic block to be speculated over - See SP3.   
   >>   
   >> ret instructions are also indirect branches, but are predicted   
   >> (along with call instructions) via the Return Stack Buffer.  An RSB   
   >> prediction is always followed if valid, so an attacker can poison the   
   >> RSB and find a victim codepath which executes more ret instructions   
   >> than call instructions, at which point the attacker takes control of   
   >> speculation in the same way.  longjmp() and/or context switch into a   
   >> deeper call tree than the one you are currently in is the most common   
   >> way of executing more ret instructions than call instructions in   
   >> otherwise well-formed code.   
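
(Illustration, not from the original post.) The "more ret than call" condition can be pictured with a toy software model of a return predictor. This is a sketch under loud assumptions: the 32-entry depth is taken from the figure mentioned elsewhere in this thread, the circular-stack behaviour and the absence of any validity tracking are simplifications, and `struct rsb`, `rsb_call` and `rsb_ret` are hypothetical names. Real hardware differs between vendors and generations.

```c
#include <stdint.h>

#define RSB_DEPTH 32u   /* assumption: depth varies between CPUs */

/* Toy return-predictor stack; stale entries are never invalidated. */
struct rsb {
    uintptr_t entry[RSB_DEPTH];
    unsigned  top;      /* index of the next free slot, wraps around */
};

/* A call records the address it expects the matching ret to return to. */
void rsb_call(struct rsb *r, uintptr_t ret_addr)
{
    r->entry[r->top % RSB_DEPTH] = ret_addr;
    r->top++;
}

/*
 * A ret pops the most recent entry and uses it as the predicted target.
 * If the current context executes a ret without having made the matching
 * call, the prediction comes from whoever filled that slot earlier.
 */
uintptr_t rsb_ret(struct rsb *r)
{
    r->top--;
    return r->entry[r->top % RSB_DEPTH];
}
```

In this model, if one context makes two calls with return addresses of its choosing and a different context then runs two ret instructions with no calls of its own (for example after a switch into a deeper call tree), both predictions are the first context's addresses.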
   >>   
   >> Mitigating this is far harder.  To do it effectively and efficiently,
   >> you need new compilers which can transform indirect branches into   
   >> safer alternatives (e.g. the RETPOLINE thunk), and new microcode   
   >> which implements additional facilities to the kernel.  Despite this,   
   >> the performance hit is substantial.   
   >>   
   >> SP3, a.k.a. Rogue Data Load:   
   >>   
   >> This issue is specific to Intel processors (and some ARM processors,   
   >> but that is OT), and occurs because permission checks for reads of   
   >> pages which are already present in the TLB are deferred until the   
   >> instruction is retired.   
   >>   
   >> This means that, entirely in userspace, with no   
   >> modeswitches/traps/system calls/etc, speculative execution can read   
   >> supervisor mappings and recover the content via cache timing   
   >> attacks.   
   >>   
   >> All mitigations for this revolve around breaking the TLB-hit which is   
   >> a necessary prerequisite.  For native operating systems, this means   
   >> isolating the user and kernel execution, and Linux KPTI is the   
   >> prominent example.  For hardware with virt extensions, moving the
   >> workload into a VM also mitigates the issue, as the TLB tagging
   >> prohibits a hit.
   >>   
   >>   
   >> ~Andrew   
   >>   
   >> (P.S. All in all, it's been a long few months.  If you want the
   >> rather more gory details of how to mitigate SP2 in reality, see   
   >> https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg00110.html)   
   >>   
   >>   
   >>   
      