

   comp.lang.asm.x86      Ahh, the lost art of x86 assembly      4,675 messages   


   Message 2,766 of 4,675   
   James Harris to Rod Pemberton   
   Re: Optimize stricmp() algorithm (casele   
   30 Jun 17 07:41:09   
   
   From: james.harris.1@nospicedham.gmail.com   
      
   On 29/06/2017 23:40, Rod Pemberton wrote:   
   > On Thu, 29 Jun 2017 09:57:41 +0100   
   > James Harris  wrote:   
   >   
   >> On 29/06/2017 01:40, Rod Pemberton wrote:   
   >>> On Wed, 28 Jun 2017 11:21:21 +0100   
   >>> James Harris  wrote:   
   >>>> On 28/06/2017 04:31, Rod Pemberton wrote:   
      
   ...   
      
   > To avoid the function calls for multiple character translations, you   
   > can create a lookup table.   
   >   
   > #define ARY 256   
   > char lower[ARY];   
   >   
   > /* In main() or as a void func(void) */   
   >     int i;   
   >     for(i=0;i<ARY;i++)
   >     {
   >       lower[i]=tolower(i);   
   >     }   
      
   That's a good way to go. It's fast, simple and would work with any   
   character set that will fit in 8 bits.   
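
   For illustration, a minimal sketch of the table approach. The names
   lower_tab, init_lower_tab and tab_stricmp are made up for this example
   and are not from Rod's code. It indexes through unsigned char so that
   characters above 0x7F cannot give a negative index where plain char is
   signed.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical names, for illustration only. */
unsigned char lower_tab[UCHAR_MAX + 1];

/* Fill the table once, e.g. at program start. */
void init_lower_tab(void)
{
    int i;
    for (i = 0; i <= UCHAR_MAX; i++)
        lower_tab[i] = (unsigned char)tolower(i);
}

/* Case-insensitive compare using the table instead of per-character calls. */
int tab_stricmp(const char *s, const char *t)
{
    const unsigned char *a = (const unsigned char *)s;
    const unsigned char *b = (const unsigned char *)t;

    /* Advance while both characters lower-case to the same value. */
    while (*a && lower_tab[*a] == lower_tab[*b]) {
        a++;
        b++;
    }
    return lower_tab[*a] - lower_tab[*b];
}
```

   init_lower_tab() has to be called once before the first comparison;
   after that, tab_stricmp("Hello", "hELLO") returns 0.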
      
   ...   
      
   > When "if (x == MIGHT)", the XOR result is 0x20, meaning that the two   
   > characters are either an upper- and lower-case alphabetic which are the   
   > same when lower-cased, or they're two different graphics characters   
   > 0x20 apart.   
      
   That's far from obvious but I have to say that for ASCII it seems to be   
   correct. And it may work for EBCDIC too (with an XOR of 0x40).   
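
   A small sketch of why the 0x20 test alone is not enough, assuming ASCII
   and the default "C" locale. MIGHT is taken to be 0x20 as Rod describes,
   and same_ignoring_case is a hypothetical helper. 'A' ^ 'a' is 0x20, but
   so is '@' ^ '`', hence the extra alphabetic check.

```c
#include <ctype.h>

#define MIGHT 0x20  /* ASCII case bit; EBCDIC would presumably need 0x40 */

/* Hypothetical helper: returns 1 if a and b are the same character
   ignoring case, 0 otherwise. Assumes ASCII and the "C" locale. */
int same_ignoring_case(unsigned char a, unsigned char b)
{
    unsigned char x = a ^ b;

    if (x == 0)
        return 1;               /* identical characters */
    if (x == MIGHT)
        return isalpha(a) != 0; /* differ only in the case bit:
                                   a match only if alphabetic */
    return 0;                   /* differ in some other bit */
}
```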
      
   ...   
      
   > Using the same identity/constraint, your 'x==MIGHT' code should look   
   > something like this:   
   >   
   >> int jh_stricmp(char *s, char *t)   
   >> {   
   >>     int i;   
   >>     char x; /* The xor of each pair of characters */   
   >>   
   >>     for (i = 0; s[i]; ++i)   
   >>     {   
   >>       x = s[i] ^ t[i];   
   >>       if (x == 0) /* The chars match (in this case, they are identical) */
   >>         continue;   
   >>       if (x == MIGHT) /* The chars might match (upper & lower case) */   
   >           if (alpha[(int)s[i]]) /* They do match */   
   >            continue;   
   >>       break;   
   >>     }   
   >>     return s[i] - t[i];   
   >> }   
      
   I could understand an (unsigned char) cast, because chars might be signed,
   but why the (int) cast? Wouldn't it just sign-extend a negative value and
   still leave it negative?
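
   To illustrate the question, a sketch with two hypothetical helpers
   (neither is from the code above). It uses signed char explicitly so the
   behaviour does not depend on whether plain char is signed, and assumes
   8-bit chars.

```c
/* Hypothetical helpers, for illustration only. */

/* (int) preserves the value: a negative char stays negative,
   so it would index before the start of a table. */
int index_via_int(signed char c)
{
    return (int)c;
}

/* (unsigned char) reduces the value modulo 256 (with 8-bit chars),
   giving 0..255, which is safe as an index into a 256-entry table. */
int index_via_uchar(signed char c)
{
    return (unsigned char)c;
}
```

   With these, index_via_int(-23) gives -23 while index_via_uchar(-23)
   gives 233.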
      
   ...   
      
   > Sigh, the true value for Linux's is___() ctype.h functions doesn't fit
   > into a char ...  When cast to a char or masked with 0xFF, the true value   
   > is zero (false).  Unbelievable!  So, it needs two logical-nots !! to   
   > fix.  A simple assignment works for tolower().  That tells me someone   
   > once knew what they were doing.  It also makes you wonder if anyone   
   > coding for Linux had any C experience prior to Linux.  Who would   
   > intentionally cause a failure for a function intended to be used to   
   > fill a character array for use as a lookup table?  This is just F-word   
   > insane.  They must've broken every C program in existence that uses
   > ctype.h functions and was not written specifically for Linux.  I
   > guess it's a good thing I don't usually use the ctype.h functions.   
   >    
      
   Understood.   
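
   For reference, a sketch of the "two logical-nots" fix when filling such
   a table. The C standard only promises that isalpha() returns nonzero
   for a match, so the raw value may be larger than a char can hold.
   alpha_tab and init_alpha_tab are hypothetical names; the declaration of
   alpha[] is not shown in the thread.

```c
#include <ctype.h>
#include <limits.h>

/* Hypothetical table name, for illustration only. */
unsigned char alpha_tab[UCHAR_MAX + 1];

/* Store 0 or 1 rather than the raw return value of isalpha().
   (isalpha(i) != 0) is equivalent to !!isalpha(i). */
void init_alpha_tab(void)
{
    int i;
    for (i = 0; i <= UCHAR_MAX; i++)
        alpha_tab[i] = (unsigned char)(isalpha(i) != 0);
}
```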
      
      
   --   
   James Harris   
      