|    rec.arts.sf.fandom    |    Discussions of SF fan activities    |    137,311 messages    |
|    Message 137,251 of 137,311    |
|    Torbjorn Lindgren to Keith F. Lynch    |
|    Re: AKICIF: Capitalizing Book Titles    |
|    21 Jan 26 13:45:04    |
   
   From: tl@none.invalid   
      
   Keith F. Lynch wrote:   
   >Lynn McGuire wrote:   
   >> I ignore many rules in life and keep on using two spaces between each   
   >> sentence. Like this. And this.   
   >   
   >As do I. It just looks better. And is more compatible with emacs.   
   >And apparently also with vi.   
      
      
      
      
   >Speaking of sort orders, there's one person here whose name has been   
   >variously rendered as:   
   >   
   >Lawrence =?iso-8859-13?q?D=FFOliveiro?=   
      
   So it's ISO 8859-13 character 0xFF[1], which maps to Unicode U+2019[2],
   aka "Right Single Quotation Mark", as opposed to U+0027[3]
   (Apostrophe), which is in the 7-bit ASCII set.
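
   For anyone who wants to check that mapping locally, here is a minimal
   Python sketch. It assumes the standard library's "iso8859_13" codec
   table agrees with the Wikipedia table in [1]; decode_name is just a
   hypothetical helper name, not anything from a real newsreader.

```python
from email.header import decode_header

# The From-name exactly as quoted above (an RFC 2047 "encoded word").
raw = "Lawrence =?iso-8859-13?q?D=FFOliveiro?="

def decode_name(header):
    """Hypothetical helper: decode each part with its declared charset."""
    out = []
    for part, charset in decode_header(header):
        if isinstance(part, bytes):
            # Unencoded parts carry no charset; treat them as ASCII.
            part = part.decode(charset or "ascii")
        out.append(part)
    return "".join(out)

name = decode_name(raw)
mark = b"\xff".decode("iso8859_13")  # byte 0xFF under the declared charset

# ascii() keeps the output readable on terminals that can't show U+2019.
print(ascii(name), f"U+{ord(mark):04X}")
```

   If the declared charset is honoured, the character between "D" and
   "Oliveiro" should come out as U+2019 rather than U+0027.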
      
      
   >Lawrence D\377Oliveiro   
   >Lawrence D\342\200\231Oliveiro   
   >Lawrence D\303\277Oliveiro   
   >Lawrence D\222Oliveiro   
      
   These all need context to be decodable.
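
   To illustrate what "context" means here, a small Python sketch that
   decodes the escaped bytes under a few candidate encodings. The pairing
   of each byte sequence with an encoding is my guess at what produced
   it, not something the quoted renderings state.

```python
# Octal escapes from the quoted renderings, paired with a guessed codec.
candidates = [
    (b"\377", "iso8859_13"),     # 0xFF read as ISO 8859-13
    (b"\377", "latin_1"),        # the same byte read as ISO 8859-1
    (b"\342\200\231", "utf_8"),  # three bytes read as UTF-8
    (b"\303\277", "utf_8"),      # two bytes read as UTF-8
    (b"\222", "cp1252"),         # 0x92 read as Windows-1252
]

# Each sequence decodes to exactly one character under its guessed codec.
results = [(raw, codec, raw.decode(codec)) for raw, codec in candidates]

for raw, codec, ch in results:
    print(raw, codec, f"U+{ord(ch):04X}")
```

   The same byte 0xFF gives U+2019 under ISO 8859-13 but U+00FF (y with
   diaeresis) under ISO 8859-1, and the two-byte form is just UTF-8 for
   that U+00FF, which is why the bare escapes can't be decoded without
   knowing the charset.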
      
      
   >I don't know what character(s) belong between the "D" and the "live,"
   >nor do I know what the intended sort order is.   
      
   Looking at an online Unicode Collation Demo[4] I can see that in the   
   Unicode "standard sort order" U+0027 and U+2019 sort together   
   (basically as if they were both regular apostrophes) and after space   
   but before any letter. Seems sensible.   
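
   Python's standard library doesn't ship the Unicode Collation
   Algorithm, so the sketch below is only a crude stand-in for what the
   demo does: it folds U+2019 to U+0027 before comparing. The helper name
   fold_apostrophes and the sample names other than the D'Oliveiro one
   are made up for illustration.

```python
# Sample data: only the curly-apostrophe name comes from the thread.
names = ["D\u2019Oliveiro", "Da Silva", "D'Oliveira", "D Oliveiro"]

def fold_apostrophes(s):
    """Hypothetical sort key: treat U+2019 as if it were U+0027."""
    return s.replace("\u2019", "'")

# Plain code-point order pushes U+2019 after every ASCII letter.
by_code_point = sorted(names)

# Folded order keeps both apostrophes after space and before letters.
folded = sorted(names, key=fold_apostrophes)

print([ascii(n) for n in by_code_point])
print([ascii(n) for n in folded])
```

   A naive code-point sort puts the curly form after all the letters,
   while the folded key puts both apostrophe forms after space and before
   any letter, as described above. A real collation library would be
   needed to reproduce the demo's exact results.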
      
   But as people have discussed, the "correct" collation order can depend
   on country and/or language, and there's not enough data to be sure what
   the intended sort order would be. Given that it's declared as 8859-13,
   it MIGHT be the sort order of one of the Baltic countries.
      
   The demo has a long list. I sampled one Baltic country (Lithuanian)
   and it sorted this the same way, but I can't be bothered to check if
   they all do (seems likely given the small region).
      
   And that assumes it wasn't a normal apostrophe that got transformed
   into the "curly typographic" form by something like Microsoft smart
   quotes (spit), and also that the declared ISO character set is
   "correct". If it started as keyboard input or Unicode, it could be a
   case of "declare an ISO 8859 encoding that has that character". It
   looks like it might only be in 8859-7 (Latin/Greek) and 8859-13
   (Baltic Rim), so it's definitely possible.
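
   The "which 8859 parts have that character" question can be checked
   mechanically. A rough Python sketch, assuming the standard library's
   iso8859_N codec tables are accurate; parts_containing is a
   hypothetical helper name.

```python
TARGET = "\u2019"  # Right Single Quotation Mark

# Parts 1..16, skipping 12, which was never published.
PARTS = [n for n in range(1, 17) if n != 12]

def parts_containing(ch):
    """Hypothetical helper: list the ISO 8859 parts that can encode ch."""
    found = []
    for n in PARTS:
        try:
            ch.encode(f"iso8859_{n}")
        except UnicodeEncodeError:
            continue  # this part has no slot for the character
        found.append(n)
    return found

print(parts_containing(TARGET))
```

   If the codec tables are right, the result should include 7 and 13 and
   exclude the Latin-1 family, which fits the guess that software picked
   whichever 8859 part happened to have the character.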
      
      
   1. https://en.wikipedia.org/wiki/ISO/IEC_8859-13   
   2. https://www.compart.com/en/unicode/U+2019   
   3. https://www.compart.com/en/unicode/U+0027   
   4. https://icu4c-demos.unicode.org/icu-bin/collation.html   
      