From: redelm@ev1.net.invalid   
      
   Robert Prins wrote in part:   
   > When it's compiled with an {$undef xmm}, i.e. using the   
   > MMX registers it runs flawlessly. However, when compiled   
   > with a {$define xmm} it falls over a bit later with a
   > zero-divide error, and that is, so my debugging code that   
   > uses the XSAVE instruction to dump the CPU state shows,   
   > caused by the intervening call to the "System._MemNew"   
   > routine. However, and that's what I have been unable to   
   > figure out, it only happens on two occasions. Every other   
   > call to "System._MemNew" leaves XMM0 unchanged.   
   >   
   > And yes, I'm moving 16 bytes, but the final vmovdqu in the   
   > code above is followed by another vmovdqu that fills in 8   
   > of those "overwritten" bytes.   
   >   
   > So the questions are,   
   >   
   > 1) how do I figure out where XMM0 is clobbered up, and 2)   
   > how can it be that it's only clobbered up in two (out of   
   > several 100) cases   
      
   Repeatably? That sounds like an algorithmic error.   
   Random would be something like task switch clobber.   
      
   > System is W7 Pro-64, and it happens on two systems, an AMD   
   > FX8150 and an Intel 4710MQ, which would exclude, almost   
   > certainly, a hardware problem, and mentioning hardware,   
   > I don't think there is a way to actually trap access to   
   > registers?   
      
   Yes, there is: set breakpoints and run under the debugger,
   essentially single-stepped (slug-slow).
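
   A minimal sketch of that single-step hunt, in Python against a
   toy CPU. ToyCpu, find_clobber and the four-instruction program
   are all made up for illustration; a real debugger script would
   supply its own "execute one instruction" and "read XMM0" calls
   in place of step() and read_xmm0().

```python
from typing import Callable, List, Optional, Tuple


class ToyCpu:
    """Hypothetical stand-in for a debuggee: a program counter and one 16-byte register."""

    def __init__(self, program: List[Callable[["ToyCpu"], None]]) -> None:
        self.program = program
        self.pc = 0
        self.xmm0 = bytes(16)

    def step(self) -> bool:
        """Execute one instruction; return False once the program has ended."""
        if self.pc >= len(self.program):
            return False
        self.program[self.pc](self)
        self.pc += 1
        return True

    def read_xmm0(self) -> bytes:
        return self.xmm0


def find_clobber(
    step: Callable[[], bool],
    read_reg: Callable[[], bytes],
    max_steps: int = 1_000_000,
) -> Optional[Tuple[int, bytes, bytes]]:
    """Single-step until the watched register changes.

    Returns (step index, old value, new value) for the first step that
    altered the register, or None if it never changed.
    """
    before = read_reg()
    for index in range(max_steps):
        if not step():
            return None  # program ended without touching the register
        after = read_reg()
        if after != before:
            return index, before, after
    return None  # gave up after max_steps


def nop(cpu: ToyCpu) -> None:
    """Instruction that leaves the register alone."""


def clobber(cpu: ToyCpu) -> None:
    """Instruction that overwrites the register."""
    cpu.xmm0 = b"\xff" * 16


cpu = ToyCpu([nop, nop, clobber, nop])
hit = find_clobber(cpu.step, cpu.read_xmm0)
print(hit)
```

   The step index that comes back identifies the offending
   instruction. Comparing after every single step is what makes
   this slug-slow, but it needs no hardware support for watching
   a register.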
      
   I have some confidence that even MS W7-64 preserves XMM
   registers across syscalls & task swaps. However, your hot
   silicon also has YMM, and may need VEX-prefixed instructions
   to handle XMM correctly (overflow wrap).
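
   One concrete difference between the two encodings, sketched in
   Python under the assumption that this is the YMM/XMM interaction
   in question: a legacy SSE write to XMMn leaves the upper 128
   bits of YMMn untouched, while a VEX-encoded 128-bit write zeroes
   them. The function names are invented here and a 256-bit integer
   stands in for the register.

```python
MASK128 = (1 << 128) - 1  # low 128 bits, i.e. the XMM part of a YMM register


def legacy_sse_write(ymm: int, value128: int) -> int:
    """Model a legacy (non-VEX) SSE write: upper 128 bits of YMM are preserved."""
    return (ymm & ~MASK128) | (value128 & MASK128)


def vex128_write(ymm: int, value128: int) -> int:
    """Model a VEX-encoded 128-bit write: upper 128 bits of YMM are zeroed."""
    return value128 & MASK128


ymm0 = (0xAB << 128) | 0x11  # upper half 0xAB, lower half 0x11

after_legacy = legacy_sse_write(ymm0, 0x22)
after_vex = vex128_write(ymm0, 0x22)

print(hex(after_legacy >> 128), hex(after_legacy & MASK128))
print(hex(after_vex >> 128), hex(after_vex & MASK128))
```

   Either way the low 128 bits end up the same, so this alone
   would not explain a changed XMM0.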
      
   https://en.wikipedia.org/wiki/Advanced_Vector_Extensions   
      
   -- Robert R   
      