   comp.lang.c      Meh, in C you gotta define EVERYTHING      243,242 messages   

   Message 241,959 of 243,242   
   Michael Sanders to All   
   Unicode...   
   14 Nov 25 21:03:38   
   
   From: porkchop@invalid.foo   
      
   Well, I finally got bitten by Unicode.   
      
   Managed a workaround, but I don't have enough experience
   with Unicode to know exactly what I'm doing...
      
   #include <stdio.h>
   #include <string.h>
      
   static int utf8_width(const char *s) {   
       int w = 0;   
       const unsigned char *p = (const unsigned char *)s;   
      
       while (*p) {
           int n = 1; // bytes in this sequence (ASCII or fallback)
           if ((*p & 0xE0) == 0xC0) n = 2; // 2-byte UTF-8
           else if ((*p & 0xF0) == 0xE0) n = 3; // 3-byte UTF-8
           else if ((*p & 0xF8) == 0xF0) n = 4; // 4-byte UTF-8
           w++; // one per sequence; stray bytes also count as 1
           while (n-- > 0 && *p) p++; // never step past the NUL
       }
      
       return w;   
   }   
      
   int main(void) {   
       const char *s = "élan";   
       printf("string:     %s\n", s);   
       printf("strlen:     %zu\n", strlen(s)); // 5 (bytes; 'é' is 2 bytes in UTF-8)
       printf("utf8_width: %d\n", utf8_width(s)); // 4 (code points)
      
       return 0;   
   }   
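
   For comparison, here is a minimal sketch of another way to count
   code points: count every byte that is not a UTF-8 continuation
   byte (10xxxxxx). The name utf8_count is hypothetical, not from any
   library, and the sketch assumes the input is a NUL-terminated
   UTF-8 string. It does not validate the encoding; because it
   advances one byte at a time, a truncated sequence at the end of
   the string cannot carry the pointer past the terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (an assumption, not part of the original post).
   Counts UTF-8 code points by counting every byte that is NOT a
   continuation byte (bit pattern 10xxxxxx). No validation is done. */
static size_t utf8_count(const char *s) {
    assert(s != NULL);
    const unsigned char *p = (const unsigned char *)s;
    size_t n = 0;

    for (; *p; p++) {
        if ((*p & 0xC0) != 0x80) /* ASCII or lead byte starts a code point */
            n++;
    }
    return n;
}
```

   Note that this counts code points, not display columns: combining
   marks occupy no column of their own and East Asian wide characters
   usually occupy two, so the result is only a first approximation of
   on-screen width.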
      
   --   
   :wq   
   Mike Sanders   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   


(c) 1994,  bbs@darkrealms.ca