Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.lang.c    |    Meh, in C you gotta define EVERYTHING    |    243,242 messages    |
|    Message 242,003 of 243,242    |
|    James Kuyper to Michael Sanders    |
|    Re: Unicode...    |
|    19 Nov 25 09:08:10    |
From: jameskuyper@alumni.caltech.edu

On 2025-11-18 15:17, Michael Sanders wrote:
> On Tue, 18 Nov 2025 14:27:53 -0500, James Kuyper wrote:
>
>> Could you identify which document guarantees that every Unicode locale
>> contains "UTF-8"? Do you know what the domain of applicability of that
>> document is? It apparently does not cover my Ubuntu Linux system. The
>> command "locale -a" provides a list of all supported locales. Here's
>> what it says:
>>
>> [...]
>
> Hi James, umm 'guarantees'? No no... It does NOT verify:
>
> - whether the environment actually supports UTF8 fully
> - whether multibyte functions are enabled
> - whether the terminal supports UTF8
> - whether the C library supports UTF8 normalization
>   (combining characters, etc. but it seems to work well here)
>
> To be sure: It's not a UTF-8 capability test. It's only a
> locale-string check. So it likely misses many valid UTF8
> locale variants...

If intended for use by anyone other than yourself, you should document
its limitations in that regard, either with in-code comments or in user
documentation.

> Here I'm running any mixture of: Windows/BSD/Linux Mint LMDE.
>
> The best I can tell you at this stage is that it works on my end,
> not a very satisfying reply I'm sure you'd agree. But till I learn
> more about the issue that's the best I can offer.
>
> If you manage an improvement, please do post it here in the group
> so I can learn more too.

There might be documents specifying locale naming standards, but I'm not
aware of any. In the absence of such standards, or on systems not
covered by such standards, there's not much you can do about this.

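As an illustration of how a locale-string check of this kind could
tolerate more spellings, here is a minimal sketch in C. The helper name
locale_name_mentions_utf8 is hypothetical and is not taken from the code
under discussion. It accepts "utf" in any letter case, followed by an
optional '-', followed by '8'. Like the check described above, it only
inspects the name and says nothing about actual UTF-8 capability.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration only).
   Returns true if name contains "utf", then an optional '-', then '8',
   ignoring letter case. It inspects the string and nothing else. */
static bool locale_name_mentions_utf8(const char *name)
{
    if (name == NULL)
        return false;

    for (const char *p = name; *p != '\0'; ++p) {
        /* Short-circuit evaluation keeps every index inside the string:
           p[1] is read only if p[0] is a letter, and so on. */
        if (tolower((unsigned char)p[0]) == 'u' &&
            tolower((unsigned char)p[1]) == 't' &&
            tolower((unsigned char)p[2]) == 'f') {
            const char *q = p + 3;

            if (*q == '-')      /* the hyphen is optional */
                ++q;
            if (*q == '8')
                return true;
        }
    }
    return false;
}
```

A typical input would be the string returned by setlocale(LC_CTYPE, NULL)
after a call to setlocale(LC_ALL, ""). On POSIX systems,
nl_langinfo(CODESET) may be a more direct way to ask for the encoding,
though that is outside standard C.
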
If your targets include Linux Mint, there's a chance the locale names
might be similar to those on my Ubuntu Linux system - but I'm no expert
on the differences between Linux distributions. If so, you should make
the "UTF" search case-insensitive, and make the '-' optional, which
would add considerable complexity to what is currently a very simple
routine.

I'm curious - if you're interested in Unicode, why are you not making
any use of the Unicode support available in the current version of C?
Does your code need to work under older versions of C?

Since C2023, a conforming implementation of C is required to support
character constants and string literals that use UTF-8, UTF-16, and
UTF-32 encodings when prefixed with u8, u or U, respectively. Those use
the char8_t, char16_t, and char32_t types. Also new in C2023 are
mbrtoc8() and c8rtomb().

Those prefixes and types go back to C2011, where it was optional whether
they used those encodings. There were predefined macros which could be
queried to determine whether or not they did. Routines for converting
between those types and multi-byte strings or wchar_t also go back to
that time.

--- SoupGate-Win32 v1.05
 * Origin: you cannot sedate... all the things you hate (1:229/2)
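
For illustration, here is a minimal sketch using C2011 facilities of the
kind mentioned in the message: the predefined macro __STDC_UTF_32__ and
the conversion routine mbrtoc32() from <uchar.h>. Both helper names are
hypothetical. The sketch stays with C2011 on the assumption that library
support for the C2023 additions mbrtoc8() and c8rtomb() may not yet be
available on every target.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper: 1 if the implementation promises that char32_t
   values are UTF-32 encoded, 0 if it makes no such promise. */
static int char32_is_utf32(void)
{
#ifdef __STDC_UTF_32__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: counts the characters in the multibyte string s
   as interpreted by the current locale's LC_CTYPE setting.
   Returns (size_t)-1 if the input is invalid or ends mid-character. */
static size_t count_characters(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);
    size_t count = 0;

    memset(&state, 0, sizeof state);    /* initial conversion state */
    while (remaining > 0) {
        char32_t c32;
        size_t n = mbrtoc32(&c32, s, remaining, &state);

        /* (size_t)-1 is an invalid sequence, (size_t)-2 an incomplete
           one; (size_t)-3 is not expected from mbrtoc32. All three are
           treated as failure here for simplicity. */
        if (n >= (size_t)-3)
            return (size_t)-1;
        if (n == 0)     /* null character decoded; guards the loop */
            break;
        s += n;
        remaining -= n;
        ++count;
    }
    return count;
}
```

The result depends on the locale in effect, so a program would normally
call setlocale(LC_ALL, "") first; in the default "C" locale only plain
ASCII input is safe to assume. For comparison, a u8 literal is UTF-8
regardless of locale: sizeof u8"\u00e9" is 3, two code units plus the
terminating null.
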
(c) 1994, bbs@darkrealms.ca