... darkrealms ...

Forums before death by AOL, social media and spammers... "We can't have nice things"
comp.lang.asm.x86
Ahh, the lost art of x86 assembly
4,675 messages
[ << oldest | < older | list | newer > | newest >> ]
Message 2,765 of 4,675
Rod Pemberton to James Harris
Re: Optimize stricmp() algorithm (casele
29 Jun 17 18:40:19
   From: NeedNotReplyHere@nospicedham.xrsevnneqk.cem   
      
   On Thu, 29 Jun 2017 09:57:41 +0100   
   James Harris  wrote:   
      
   > On 29/06/2017 01:40, Rod Pemberton wrote:   
   > > On Wed, 28 Jun 2017 11:21:21 +0100   
   > > James Harris  wrote:   
   > >> On 28/06/2017 04:31, Rod Pemberton wrote:   
      
   > > Also, recent versions of GCC were reported to optimize away   
   > > legitimate code in some situations.  I happened to see one instance   
   > > on Linux when compiling for this thread.  I had the "if-break"   
   > > statement in a different location, and even though the C code was   
   > > correct, GCC "optimized", i.e., deleted, away the rest of my   
   > > loop ...  This was noticed upon execution, and then confirmed by   
   > > the missing assembly.   
   >   
   > I posted something similar to comp.lang.c recently. With higher   
   > optimisation levels gcc would optimise away a certain loop test.   
   > Turned out it was within its rights to do so, in a sense, because the   
   > code depended on overflow of a signed integer - and that has no   
   > defined behaviour in C. I can accept that. That's how C is specified.   
      
   No ...  We should not accept that (in the general sense).   
      
   C has a whole bunch of "standards" including early C documents, early   
   books, articles by the authors, and the C Rationale, various Technical   
   Corrigendums, all of which explain required behavior, some of which is   
   not part of the formal ANSI or ISO standards.  This historical behavior   
   must be preserved for C to compile correctly.   
      
   > Where I am not comfortable is that gcc didn't issue any warnings and   
   > that it made different assumptions at different optimisation levels   
   > and, thereby, produced different programs. I tended to assume, before   
   > that, that higher levels of compiler optimisation would just take   
   > longer but would still produce programs which operated in the same   
   > way.   
      
   Well, this is what happens when you get C specification pedants, like   
   those on comp.lang.c, bringing C compilers into compliance with the   
   C specifications.  What you expected to work doesn't anymore.  I.e.,   
   they force and maximize situations to have undefined behavior etc where   
   they think it should exist.  Not good.   
      
   > But the earlier code in this thread allowed any of 256 char codes and   
   > if we keep to that, then the xor value is only a hint. It can be used   
   > to avoid the lower() or lower[] operations, along the following lines.   
   >   
   >    if the xor is 0 then go on to next pair. The chars are the same   
   >    if the xor is 0x20 then we _might_ have a match   
   >      check the two lower()s. If they are the same, we have a match   
   >   
   > And with that, if one string ends before the other then we will still   
   > have a mismatch and the loop will terminate. lower(0) will return 0.   
   > So AFAICS we only need to check one string for a terminating zero.   
      
   Ok.   
      
   > I understand that in the other thread you were thinking of using xor   
   > as a test in its own right, which would work on limited inputs.   
   > In /this/ thread I suggested it for a different purpose - a possible   
   > time optimisation. The inputs would still be all 256 potential byte   
   > values, not a limited range.   
   >   
   > To illustrate, here's the complete idea as code. It is all untested   
   > (or even compiled) so might have bugs. Read it as just a way to show   
   > the idea.   
   >   
   > #define MIGHT ('A' ^ 'a') /* To indicate that the chars might match */   
   >                            /* 0x20 in ASCII, 0x40 in EBCDIC */   
   >   
   > char lower(unsigned char ch) /* ASCII, not EBCDIC */   
   > {   
   >    if (ch >= 'A' && ch <= 'Z')  /* If uppercase */   
   >      return ch - 'A' + 'a';  /* Return the lowercase version */   
   >    return ch;  /* Else return the original */   
   > }   
   >   
   > int jh_stricmp(char *s, char *t)   
   > {   
   >    int i;   
   >    char x; /* The xor of each pair of characters */   
   >   
   >    for (i = 0; s[i]; ++i)   
   >    {   
   >      x = s[i] ^ t[i];   
   >      if (x == 0) /* The chars match (in this case, they are   
   > identical) */ continue;   
   >      if (x == MIGHT) /* The chars might match (upper & lower case) */   
   >        if (lower(s[i]) == lower(t[i])) /* They do match */   
   >          continue;   
   >      break;   
   >    }   
   >    return s[i] - t[i];   
   > }   
   >   
   > Note that (x == MIGHT) is only used to avoid unnecessary function   
   > calls, not to replace them.   
      
   To avoid the function calls for multiple character translations, you   
   can create a lookup table.   
      
   #define ARY 256   
   char lower[ARY];   
      
   /* In main() or as a void func(void) */   
      int i;   
      for(i=0;i In practice, on long identical strings the xor trick might cost more   
   > than it saves, but would help if the routine were called frequently   
   > on short non-matching strings.   
   >   
      
   I would approach the problem slightly differently ...   
      
   When "if (x == MIGHT)", the XOR result is 0x20, meaning that the two   
   characters are either an upper- and lower-case alphabetic which are the   
   same when lower-cased, or they're two different graphics characters   
   0x20 apart.  So, I think you only need to know if _one_ of the   
   characters is alphabetic or not.  I.e., if one is alphabetic, both   
   should be, if XOR==0x20, e.g., 'A' and 'a'.  If one is a graphic, both   
   should be, if XOR==0x20, e.g., '[' and '{'.  In C, detecting an   
   alphabet character is a call to islapha().   
      
   So, my routine using this identity/constraint would be something like   
   this, which replaces two calls to lower() or two lower[] lookups with   
   one alpha[] lookup.   
      
   int rp_xor_stricmp(void *p0, void *p1)   
   {   
     int t;   
     char *x,*y;   
      
     x=p0;   
     y=p1;   
      
     while(1)   
     {   
       if(!((*x)&(*y)))   
       {   
         x++;   
         y++;   
         break;   
       }   
       t=(*x)^(*y);   
       x++;   
       y++;   
       if(!t)   
         continue;   
       if(t==0x20)   
       {   
         if(!alpha[(int)*(x-1)])   
           break;   
         continue;   
       }   
       break;   
     }   
     x--;   
     y--;   
     return((*x)-(*y));   
   }   
      
   Note that I needed to dereference x minus one to check   
   for alpha. Otherwise, I'd have needed to introduce some more   
   temporaries. This might be better with an index i, as you prefer,   
   instead of pointers.   
      
      
   Using the same identity/constraint, your 'x==MIGHT' code should look   
   something like this:   
      
   > int jh_stricmp(char *s, char *t)   
   > {   
   >    int i;   
   >    char x; /* The xor of each pair of characters */   
   >   
   >    for (i = 0; s[i]; ++i)   
   >    {   
   >      x = s[i] ^ t[i];   
      
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)
[ << oldest | < older | list | newer > | newest >> ]