Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.lang.asm.x86    |    Ahh, the lost art of x86 assembly    |    4,675 messages    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
|    Message 2,710 of 4,675    |
|    Terje Mathisen to Rick C. Hodgin    |
|    Re: Optimize stricmp() algorithm (casele    |
|    25 Jun 17 07:48:03    |
   
   From: terje.mathisen@nospicedham.tmsw.no   
      
   Rick C. Hodgin wrote:   
   > On Saturday, June 24, 2017 at 5:50:14 PM UTC-4, Terje Mathisen wrote:   
   >> Rick C. Hodgin wrote:   
   >>> Can anybody help me optimize this code?   
   >> [snip]   
   >>> It's designed to be used as a custom assembly algorithm for a   
   >>> stricmp() algorithm which follows this general pattern, which   
   >>> is designed to be the target of a qsort() callback:   
   >>>   
   >>> int caseless_compare(const void *p1, const void *p2)   
   >>> {   
   >>> int d;   
   >>> const unsigned char *s1 = p1;   
   >>> const unsigned char *s2 = p2;   
   >>>   
   >>> while ((d = tolower(*s1) - tolower(*s2)) == 0 && *s1)   
   >>> s1++, s2++;   
   >>>   
   >>> return d;   
   >>> }   
   >>>   
   >>   
   >> Just writing a version of this code which can handle national 8-bit   
   >> character sets is more interesting, you pretty much have to use some   
   >> form of lookup table, i.e.   
   >>   
   >> while (d = tolower[c = *s1++] - tolower[*s2++] && c) {};   
   >   
   > This minor tweak has given the best results so far:   
   >   
   > Rick2 = 123   
   > Bart = 117   
   > Ben = 116   
   > RickAsm = 105   
   > BartAsm = 104   
   > Terje C = 96   
   >   
   > The original algorithm yours was based off is the Ben score above.   
   > Your little tweak gave it a 20-point gain.   
      
   Oops, as you've already noticed I forgot to keep the comparison, i.e.   
   the diff needs to be zero. :-(   
      
   However, as long as this is ~the same speed as the 7-bit only code then   
   it is a big win in my book since it would support any 8-bit character   
   set including national language versions of both ASCII and EBCDIC.   
      
   What I intended to write more about yesterday but was how to make a   
   reasonable fast unicode version:   
      
   Do a setup stage first, either to monocase all the keys, or to create a   
   hash table consisting of the lowercase version of all the individual   
   characters seen in those keys.   
      
   With monocased keys the problem is of course already solved, at a cost   
   of N*M (nr of keys * average key length) tolower() calls, while a hash   
   table would give near-linear lookup speeds after first spending the same   
   time calling tolower() the same number of times.   
      
   Personally I would have gone to great lengths in the old days to   
   minimize ram use, but today I would much rather use the nomocased keys   
   approach.   
      
   Terje   
      
   --   
   -
|
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
(c) 1994, bbs@darkrealms.ca