

   comp.lang.asm.x86      Ahh, the lost art of x86 assembly      4,675 messages   


   Message 3,537 of 4,675   
   Robert Redelmeier to Robert Prins   
   Re: Converting MMX to XMM [was: XSAVE ar   
   15 Sep 18 18:09:15   
   
   From: redelm@ev1.net.invalid   
      
   Robert Prins wrote in part:
   > When it's compiled with an {$undef xmm}, i.e. using the   
   > MMX registers it runs flawlessly. However, when compiled   
   > with a {$define xmm} it falls over a bit later with a
   > zero-divide error, and that is, so my debugging code that   
   > uses the XSAVE instruction to dump the CPU state shows,   
   > caused by the intervening call to the "System._MemNew"   
   > routine. However, and that's what I have been unable to   
   > figure out, it only happens on two occasions. Every other   
   > call to "System._MemNew" leaves XMM0 unchanged.   
   >   
   > And yes, I'm moving 16 bytes, but the final vmovdqu in the   
   > code above is followed by another vmovdqu that fills in 8   
   > of those "overwritten" bytes.   
   >   
   > So the questions are,   
   >   
   > 1) how do I figure out where XMM0 is clobbered, and 2)
   > how can it be that it's only clobbered in two (out of
   > several hundred) cases?
      
   Repeatably?  That sounds like an algorithmic error.
   A random failure would point to something like a
   task-switch clobber.
      
   > System is W7 Pro-64, and it happens on two systems, an AMD   
   > FX8150 and an Intel 4710MQ, which would exclude, almost   
   > certainly, a hardware problem, and mentioning hardware,   
   > I don't think there is a way to actually trap access to   
   > registers?   
      
   Yes, by setting breakpoints and having the debugger run
   essentially single-stepped (slug-slow).
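
   A rough sketch of that single-step-and-compare idea, as a
   toy model in C rather than a real debugger: snapshot the
   XMM0 image before each step and report the first step after
   which it differs. RegState, Step, first_clobbering_step and
   the two example steps are hypothetical names made up for
   this illustration; a real run would take the image from an
   XSAVE-style dump instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: a 16-byte XMM0 image, as a register dump would expose it. */
typedef struct { unsigned char xmm0[16]; } RegState;

/* One "step" of the code under test; it may or may not touch the state. */
typedef void (*Step)(RegState *);

/* Run the steps one at a time, comparing XMM0 against a snapshot taken
   just before each step. Returns the index of the first step that
   changed it, or -1 if no step did. */
int first_clobbering_step(RegState *st, const Step *steps, size_t n)
{
    assert(st != NULL && steps != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned char before[16];
        memcpy(before, st->xmm0, sizeof before);
        steps[i](st);
        if (memcmp(before, st->xmm0, sizeof before) != 0)
            return (int)i;
    }
    return -1;
}

/* Hypothetical stand-ins for real code paths. */
void harmless_step(RegState *st) { (void)st; }
void clobbering_step(RegState *st) { memset(st->xmm0, 0, sizeof st->xmm0); }
```

   Note that a step which overwrites XMM0 with the value it
   already holds goes unnoticed by this comparison, which is
   one way a clobber could show up in only a few calls.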
      
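   I have some confidence that even MS W7-64 preserves XMM
   registers across syscalls & task swaps.  However, your hot
   silicon also has YMM, and mixing encodings matters there:
   VEX-encoded 128-bit instructions zero the upper half of the
   matching YMM register, while legacy SSE encodings leave it
   untouched.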
      
   https://en.wikipedia.org/wiki/Advanced_Vector_Extensions   
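
   On the VEX point, a toy model in C (plain byte arrays, not
   real instructions) of how the two encodings treat the upper
   half of a YMM register when writing its XMM part. The type
   Ymm and both function names are invented for this sketch.

```c
#include <assert.h>
#include <string.h>

/* Toy model of one 256-bit YMM register; bytes 0..15 alias the XMM part. */
typedef struct { unsigned char b[32]; } Ymm;

/* Legacy SSE-encoded 128-bit write (e.g. movdqu):
   the upper 128 bits of the YMM register are preserved. */
void legacy_sse_write128(Ymm *r, const unsigned char src[16])
{
    assert(r != NULL && src != NULL);
    memcpy(r->b, src, 16);
}

/* VEX-encoded 128-bit write (e.g. vmovdqu with an xmm destination):
   the upper 128 bits of the YMM register are zeroed. */
void vex_write128(Ymm *r, const unsigned char src[16])
{
    assert(r != NULL && src != NULL);
    memcpy(r->b, src, 16);
    memset(r->b + 16, 0, 16);
}
```

   In both cases the low 16 bytes end up identical; only the
   upper half differs, so this alone would not explain a
   changed XMM0.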
      
   -- Robert R   
      



(c) 1994,  bbs@darkrealms.ca