From: robert@nospicedham.prino.org   
      
   On 2018-09-16 20:36, Robert Redelmeier wrote:   
   > Robert Prins wrote in part:   
   >> On 2018-09-15 18:09, Robert Redelmeier wrote:   
   >>> I have some confidence that even MS w7-64 preserves XMM   
   >>> registers across syscalls & task swaps. However, your hot   
   >>> silicon also has YMM and may need a VEX prefix to correctly   
   >>> do XMM (overflow wrap).   
   >>   
   >> W7(-64) supports AVX since SP1, so that's not the problem,   
   >> and all my code is either using legacy MMX registers or   
   >> VEX encoded instructions using the XMM/YMM registers.   
   >>   
   >> What is "overflow wrap", and how would I detect it,   
   >> and would this not also affect MMX instructions?   
   >   
   > Every transition [upwards] in wordsize presents the machine   
   > designer decisions of what to do with the new extra bits.   
   > Essentially to use them for greater precision or parallel-wise.   
   >
   > Your YMM registers may not always behave exactly as XMM,   
   > especially since they are wider. Do they wrap or saturate on   
   > overflow, or go into higher bits? There may be a control word   
   > to set the desired behaviour. Or use a prefix to distinguish   
   > XMM from YMM . MMX is deep legacy and will be coded as such.   
      
   I'm using XMM registers with VEX-coded instructions, which set the upper
   halves of the corresponding YMM registers to zero, so there should be no
   spillage from XMM into YMM.
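
   To make the difference between the two encodings concrete, here is a toy
   Python model. It is only a sketch and does not touch real registers: one
   YMM register is modelled as eight 32-bit lanes, and the function names
   (add_lanes, legacy_sse_paddd, vex128_vpaddd) are made up for illustration.
   It assumes a packed 32-bit integer add as the example instruction.

```python
# Toy model of one 256-bit YMM register as eight 32-bit lanes.
# Lanes 0-3 are the XMM (low) half, lanes 4-7 the upper half.
# All function names here are hypothetical, for illustration only.

MASK32 = 0xFFFFFFFF


def add_lanes(a, b):
    """Packed 32-bit add (PADDD-style): each lane wraps modulo 2**32
    and no carry moves into the neighbouring lane."""
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def legacy_sse_paddd(ymm, src):
    """Legacy (non-VEX) encoding: writes lanes 0-3, leaves lanes 4-7 unchanged."""
    return add_lanes(ymm[:4], src[:4]) + ymm[4:]


def vex128_vpaddd(ymm, src):
    """VEX.128 encoding: writes lanes 0-3 and zeroes lanes 4-7."""
    return add_lanes(ymm[:4], src[:4]) + [0, 0, 0, 0]


# Upper half starts out holding stale data; lane 0 is about to overflow.
ymm0 = [0xFFFFFFFF, 1, 2, 3, 0xAAAAAAAA, 0xBBBBBBBB, 0xCCCCCCCC, 0xDDDDDDDD]
one = [1, 1, 1, 1, 0, 0, 0, 0]

after_legacy = legacy_sse_paddd(ymm0, one)
after_vex = vex128_vpaddd(ymm0, one)
```

   In this model lane 0 wraps to zero in both cases and nothing carries into
   lane 1; the only difference is that the legacy form keeps the stale upper
   lanes while the VEX.128 form clears them.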
      
   Meanwhile, the mystery has deepened: if I run the program on its own with
   the data that causes the zero-divide (when it is part of the full input
   file), the program happily processes it without any problems!
      
   So, I decided to comment out the very first trip in the input file, and
   lo and behold, I get another zero-divide, but this time on data that appears
   before the data that caused the original one. Comment out the second trip as
   well, and the zero-divide abend appears even earlier in the process.
      
   I've run the program with the VP-supplied HEAPCHK unit, and other than the
   known allocated memory that isn't freed, there is nothing abnormal showing up.
      
   I am clueless, totally clueless. And without a way to actually set a breakpoint   
   on access to XMM0, I don't know what more I can do. (Actually, I'll give it one   
   last try and maybe rebuild the RTL with traps before and after the standard   
   _MemNew routine goes to Windows.)
      
   Robert   
   --   
   Robert AH Prins   
   robert(a)prino(d)org   
      