From: james.harris.1@nospicedham.gmail.com   
      
   On 27/06/2017 01:14, Rod Pemberton wrote:   
   > On Mon, 26 Jun 2017 01:45:25 -0700 (PDT)   
   > "Rick C. Hodgin" wrote:   
   >   
   >> Oh my ... AT&T syntax.   
   >   
   > With -O2, 32-bit GCC (DJGPP for DOS) generates:   
   >   
   > .globl _rp_stricmp   
   > _rp_stricmp:   
   > push ebp   
   > mov ecx, 1   
   > mov ebp, esp   
   > push esi   
   > push ebx   
   > mov edx, DWORD PTR [ebp+12]   
   > mov ebx, DWORD PTR [ebp+8]   
   > L10:   
   > test ecx, ecx   
   > je L3   
   > movsx ecx, BYTE PTR [ebx]   
   > movsx esi, BYTE PTR [edx]   
   > inc ebx   
   > inc edx   
   > cmp ecx, esi   
   > je L10   
   > mov al, BYTE PTR _lower[esi]   
   > cmp BYTE PTR _lower[ecx], al   
      
   It's interesting that for lower() DJGPP uses a lookup table.   
      
   > je L10   
   > L3:   
   > sub ecx, esi   
   > pop ebx   
   > mov eax, ecx   
   > pop esi   
   > pop ebp   
   > ret   
      
   That's good code but it occurred to me that because the offset is the   
   same into each string, one offset could be incremented instead of two   
   string pointers. Then, rather than the following (if ESI and EDI are the   
   string pointers)   
      
    add esi, 1   
    add edi, 1   
    movsx eax, byte [esi]   
    movsx ebx, byte [edi]   
      
   if the offset is in EDX then the equivalent would be a bit shorter. That   
   could end up being faster. And it saves a register.   
      
    add edx, 1   
    movsx eax, byte [esi + edx]   
    movsx ebx, byte [edi + edx]   
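
   A rough C sketch of the single-offset idea, for anyone who wants to see
   what a compiler makes of it. The name stricmp_indexed is made up, and the
   sketch assumes the standard tolower() in the default "C" locale rather
   than the _lower table in the quoted listing. Unlike that listing, it
   returns the difference of the lower-cased characters.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical name. Case-insensitive compare that walks both strings
   with one shared index instead of two advancing pointers. */
int stricmp_indexed(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    for (size_t i = 0; ; i++) {
        /* Go through unsigned char: tolower() is undefined for
           negative values other than EOF. */
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);

        /* Stop at the first difference or at the shared terminator. */
        if (ca != cb || ca == 0)
            return ca - cb;
    }
}
```

   Whether the compiler keeps the index form or turns it back into two
   pointer increments depends on the compiler and options, so the generated
   code is worth checking.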
      
      
   I saw you (Rod) make a good point in another thread that if lower() is a   
   function call then its overhead can be avoided in many cases by XOR of   
   the two bytes to see if they /might/ match. If the XOR is 0 then they   
   match. If it is 0x20 then they might match. Otherwise, they cannot match   
   and there's no need to lower-case either of them.   
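
   The three cases can be sketched in C roughly as follows. The helper name
   chars_match_nocase is hypothetical, and the 0x20 test assumes ASCII,
   where upper- and lower-case letters differ only in bit 5.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helper: returns 1 if the two bytes match ignoring
   case, 0 otherwise. Assumes ASCII. */
int chars_match_nocase(unsigned char x, unsigned char y)
{
    unsigned char diff = x ^ y;

    if (diff == 0)
        return 1;   /* identical bytes */
    if (diff != 0x20)
        return 0;   /* differ in more than bit 5: cannot be a case pair */

    /* Only bit 5 differs, so the bytes might match ('a' and 'A') or
       might not ('@' and '`'). Only now is lower-casing needed. */
    assert(diff == 0x20);
    return tolower(x) == tolower(y);
}
```

   So chars_match_nocase('a', 'A') gives 1, while '@' and '`' also XOR to
   0x20 but give 0, which is why 0x20 only means "might match".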
      
   Even better, the XOR operation can set the flags for the equality test   
   so we don't need a CMP. Instead of the initial test   
      
    cmp eax, ecx ;Compare the two chars   
    je these_chars_match   
      
   we could use   
      
    xor eax, ecx   
    jz these_chars_match   
      
   And then EAX is ready to be tested for whether the two bytes /might/ be   
   a case-insensitive match.   
      
    cmp eax, 0x20   
    jne found_a_mismatch   
    ;The chars might match   
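
   Put together as a whole loop in C, reusing the one XOR result for both
   tests, it might look like the sketch below. The name stricmp_xor is
   made up, ASCII is assumed, and tolower() stands in for whatever
   lower-casing routine is really used. Note that in the asm the XOR
   overwrites EAX, so a real routine would need a copy of that character
   if it has to return a difference afterwards.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical name. Case-insensitive compare that lower-cases only
   when the XOR of the two bytes is exactly 0x20. Assumes ASCII. */
int stricmp_xor(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    for (size_t i = 0; ; i++) {
        unsigned char ca = (unsigned char)a[i];
        unsigned char cb = (unsigned char)b[i];
        unsigned char diff = ca ^ cb;

        if (diff == 0) {            /* bytes identical */
            if (ca == 0)
                return 0;           /* both strings ended */
            continue;
        }
        /* Lower-case only when the bytes differ in bit 5 alone.
           A terminator against a space also lands here and is
           correctly reported as a mismatch. */
        if (diff != 0x20 || tolower(ca) != tolower(cb))
            return tolower(ca) - tolower(cb);
    }
}
```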
      
   (All code untested and may well contain errors....)   
      
   --   
   James Harris   
      