|    comp.lang.asm.x86    |    Ahh, the lost art of x86 assembly    |    4,675 messages    |
|    Message 3,538 of 4,675    |
|    Robert Prins to Robert Prins    |
|    Converting MMX to XMM [was: XSAVE area -    |
|    13 Sep 18 21:19:33    |
   
   From: robert@nospicedham.prino.org   
      
   On 2018-09-02 13:22, Robert Prins wrote:   
   > Hi,   
   >   
   > Is anyone willing to share some ready-to-use assembler code, or even just a   
   > structure, to access the ***non-compacted*** (extended) XSAVE area. Yes, it's   
   > in the latest Intel manual (Intel 325462-sdm-vol-1-2abcd-3abcd.pdf, Chapter   
   > 13, paragraph 13.4, PDF page 316), and I can count everything myself, but I   
   > don't like re-inventing the wheel. (I'm a lazy git...)   
      
   Lazy git has done it himself, and it has not brought me any closer to
   figuring out why some code converted from MMX to XMM fails on a few occasions.
      
   Here's the code:   
      
   @03:   
   {$ifdef xmm }   
   { vpxor xmm0, xmm0, xmm0 } db $c5,$f9,$ef,$c0   
   {$else }   
   { pxor mm0, mm0 } db $0f,$ef,$c0   
   {$endif }   
      
    cmp _ieps, -1   
    je @04   
      
    mov eax, _jc.y   
    cmp eax, _year   
    je @06   
      
    mov _year, eax   
    jmp @05   
      
   @04:   
    mov eax, [ebx + offset split_list.trip]   
    cmp eax, _trip   
    je @06   
      
    mov _trip, eax   
      
   @05:   
    mov _24h_max, 0   
      
   @06:   
    mov split_1st, ebx   
      
    imul ecx, [ebx + offset split_list.jdn], 1440   
    add ecx, [ebx + offset split_list.dtime]   
    mov _djdn, ecx   
      
   @07:   
    mov eax, _ajdn   
    mov _sjdn, eax   
      
    imul ecx, [ebx + offset split_list.jdn], 1440   
    add ecx, [ebx + offset split_list.atime]   
    mov _ajdn, ecx   
      
    mov eax, _ajdn   
    sub eax, _djdn   
    cmp eax, 1440   
    jg @08   
      
   {$ifdef xmm }   
   { vpaddd xmm0, xmm0, [ebx + offset split_list.km] } db $c5,$f9,$fe,$43,offset split_list.km
   {$else }   
   { paddd mm0, [ebx + offset split_list.km] } db $0f,$fe,$43,offset split_list.km
   {$endif }   
      
    mov split_last, ebx   
    mov ebx, [ebx + offset split_list.split_nxt]   
    test ebx, ebx   
    jnz @07   
      
   @08:   
    test ebx, ebx   
    jnz @09   
      
    mov eax, _ajdn   
    mov _sjdn, eax   
      
   @09:   
   {$ifdef xmm }   
   { vmovd ecx, xmm0 } db $c5,$f9,$7e,$c1   
   {$else }   
   { movd ecx, mm0 } db $0f,$7e,$c1   
   {$endif }   
    cmp ecx, _24h_max   
    jl @10   
      
    mov _24h_max, ecx   
      
    push type p24h_list   
    call System._MemNew   
    mov p24h_ptr, eax   
    push eax   
      
    push offset p24h_ptr   
    call update_list_pointers //> hhcommon.pas   
      
    pop eax   
      
    and dword ptr [eax + offset p24h_list.p24h_nxt], 0   
      
    mov edx, _ieps   
    mov [eax + offset p24h_list.split_id], edx   
      
    mov edx, _sjdn   
    sub edx, _djdn   
    mov [eax + offset p24h_list.elaps], edx   
      
   {$ifdef xmm }   
   { vmovdqu [eax + offset p24h_list.km], xmm0 } db $c5,$fa,$7f,$40,offset p24h_list.km
   {$else }   
   { movq [eax + offset p24h_list.km], mm0 } db $0f,$7f,$40,offset p24h_list.km
   {$endif }   
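
For anyone who would rather not read the BASM, here is a rough Python model of what the loop at @07 accumulates. It is only a sketch: the field names follow the listing, the sample data is made up, and it tracks only the low dword of the km field, which is what the movd at @09 extracts.

```python
MINUTES_PER_DAY = 1440


def to_minutes(jdn, time_of_day):
    """Mirror `imul ecx, [..jdn], 1440` followed by `add ecx, [..time]`."""
    return jdn * MINUTES_PER_DAY + time_of_day


def km_within_24h(splits, start):
    """Sum km from splits[start] onward while the arrival is at most
    1440 minutes after the first departure (the `jg @08` test)."""
    first = splits[start]
    departure = to_minutes(first["jdn"], first["dtime"])      # _djdn
    total_km = 0                                              # low dword of mm0 / xmm0
    for split in splits[start:]:
        arrival = to_minutes(split["jdn"], split["atime"])    # _ajdn
        if arrival - departure > MINUTES_PER_DAY:             # jg @08
            break
        total_km += split["km"]                               # paddd / vpaddd
    return total_km


# Hypothetical data, only to show the call.
splits = [
    {"jdn": 100, "dtime": 600, "atime": 700, "km": 50},
    {"jdn": 100, "dtime": 800, "atime": 1000, "km": 70},
    {"jdn": 101, "dtime": 500, "atime": 700, "km": 30},
]
print(km_within_24h(splits, 0))
```

The result of this sum is what gets compared with _24h_max, so a clobbered XMM0 between the loop and the store would go straight into the new p24h_list record.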
      
   When it's compiled with {$undef xmm}, i.e. using the MMX registers, it runs
   flawlessly. However, when compiled with {$define xmm} it falls over a bit
   later with a zero-divide error. My debugging code, which uses the XSAVE
   instruction to dump the CPU state, shows that this is caused by the
   intervening call to the "System._MemNew" routine. However, and that's what
   I have been unable to figure out, it only happens on two occasions. Every
   other call to "System._MemNew" leaves XMM0 unchanged.
      
   And yes, I'm moving 16 bytes, but the final vmovdqu in the code above is   
   followed by another vmovdqu that fills in 8 of those "overwritten" bytes.   
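
To make the width difference concrete, here is a small Python model of the two variants. It is a sketch under assumptions: the helper names are mine, the record layout and values are invented, and I assume the second vmovdqu lands 8 bytes after the km field. It only models the lane arithmetic and the store widths, not the real instructions.

```python
import struct


def packed_add_dwords(acc, mem, lanes):
    """Model paddd (lanes=2) or vpaddd xmm (lanes=4): each little-endian
    dword is added independently and wraps at 32 bits."""
    a = struct.unpack(f"<{lanes}I", acc)
    b = struct.unpack(f"<{lanes}I", mem[:4 * lanes])
    sums = ((x + y) & 0xFFFFFFFF for x, y in zip(a, b))
    return struct.pack(f"<{lanes}I", *sums)


def low_dword(reg):
    """Model movd / vmovd ecx, reg."""
    return struct.unpack_from("<I", reg)[0]


def store(buf, offset, reg):
    """Model movq (8-byte reg) or vmovdqu (16-byte reg) to memory."""
    buf[offset:offset + len(reg)] = reg


# Hypothetical split_list memory: the 8-byte km field plus two neighbouring dwords.
mem = struct.pack("<4I", 120, 7, 0xAAAAAAAA, 0xBBBBBBBB)
mmx = packed_add_dwords(bytes(8), mem, 2)    # reads 8 bytes
xmm = packed_add_dwords(bytes(16), mem, 4)   # reads 16 bytes

record = bytearray(b"\xEE" * 24)             # hypothetical p24h_list tail
store(record, 0, xmm)                        # writes 16 bytes, 8 beyond the field
store(record, 8, bytes(range(16)))           # assumed follow-up store
print(low_dword(mmx), low_dword(xmm), record.hex())
```

Because the lanes never carry into each other, the low 8 bytes of the XMM result match the MMX result, so the value compared with _24h_max is the same in both builds as long as XMM0 survives the call.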
      
   So the questions are,   
      
   1) how do I figure out where XMM0 is clobbered, and
   2) how can it be that it's only clobbered in two (out of several hundred) cases?
      
   The system is Windows 7 Pro 64-bit, and it happens on two machines, an AMD
   FX8150 and an Intel 4710MQ, which almost certainly rules out a hardware
   problem. Speaking of hardware, I don't think there is a way to actually
   trap access to registers?
      
      
      
      
   --   
   Robert AH Prins   
   robert(a)prino(d)org   
      