   comp.lang.c      Meh, in C you gotta define EVERYTHING      243,242 messages   

   Message 242,060 of 243,242   
   Philipp Klaus Krause to All   
   Re: Unicode...   
   23 Nov 25 12:42:20   
   
   From: pkk@spth.de   
      
   On 14.11.25 at 22:03, Michael Sanders wrote:   
   > static int utf8_width(const char *s) {   
   >      int w = 0;   
   >      const unsigned char *p = (const unsigned char *)s;   
   >   
   >      while (*p) {   
   >          if (*p < 0x80) { w++; p++; } // ASCII 1-byte   
   >          else if ((*p & 0xE0) == 0xC0) { w++; p += 2; } // 2-byte UTF-8   
   >          else if ((*p & 0xF0) == 0xE0) { w++; p += 3; } // 3-byte UTF-8   
   >          else if ((*p & 0xF8) == 0xF0) { w++; p += 4; } // 4-byte UTF-8   
   >          else { w++; p++; } // fallback   
   >      }   
   >   
   >      return w;   
   > }   
   Do you need this to work under non-UTF-8 locales? If you only need that   
   length when the locale is UTF-8, why not just use mblen from stdlib.h?   
      
   Philipp   
      

(c) 1994,  bbs@darkrealms.ca