Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.lang.c    |    Meh, in C you gotta define EVERYTHING    |    243,242 messages    |
|    Message 242,060 of 243,242    |
|    Philipp Klaus Krause to All    |
|    Re: Unicode...    |
|    23 Nov 25 12:42:20    |
   
   From: pkk@spth.de   
      
   On 14.11.25 at 22:03, Michael Sanders wrote:   
   > static int utf8_width(const char *s) {   
   > int w = 0;   
   > const unsigned char *p = (const unsigned char *)s;   
   >   
   > while (*p) {   
   > if (*p < 0x80) { w++; p++; } // ASCII 1-byte   
   > else if ((*p & 0xE0) == 0xC0) { w++; p += 2; } // 2-byte UTF-8   
   > else if ((*p & 0xF0) == 0xE0) { w++; p += 3; } // 3-byte UTF-8   
   > else if ((*p & 0xF8) == 0xF0) { w++; p += 4; } // 4-byte UTF-8   
   > else { w++; p++; } // fallback   
   > }   
   >   
   > return w;   
   > }   
   Do you need this to work under non-UTF-8 locales? If you only need that   
   length when the locale is UTF-8, why not just use mblen from stdlib.h?   
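
   To illustrate the mblen suggestion, here is a minimal sketch rather than a
   drop-in replacement. The name mb_count is made up for this example. It
   assumes the caller has already selected a UTF-8 locale with setlocale, and,
   like the quoted function, it counts characters, not display columns.

```c
#include <locale.h>   /* setlocale, needed by the caller */
#include <stdlib.h>   /* mblen */
#include <string.h>   /* strlen */

/* Hypothetical helper: count the multibyte characters in s as decoded
 * under the current LC_CTYPE locale. Returns -1 if s contains an invalid
 * or truncated sequence. */
static int mb_count(const char *s)
{
    size_t left = strlen(s);
    int n = 0;

    (void)mblen(NULL, 0);          /* reset any internal shift state */

    while (left > 0) {
        int len = mblen(s, left);  /* bytes used by the next character */
        if (len <= 0)              /* -1 means invalid or incomplete input */
            return -1;
        s += len;
        left -= (size_t)len;
        n++;
    }
    return n;
}
```

   A caller would typically run setlocale(LC_CTYPE, "") once at startup and
   then call mb_count(str). Because mblen is never asked to read more than
   the bytes that remain, a truncated sequence at the end of the string is
   reported as an error instead of being skipped over.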
      
   Philipp   
      
(c) 1994, bbs@darkrealms.ca