rec.arts.sf.fandom

Discussions of SF fan activities

137,311 messages

Message 137,191 of 137,311

Bernard Peek to Dorothy J Heydt

Re: AKICIF: Capitalizing Book Titles

17 Jan 26 21:00:39

   From: bap@shrdlu.com   

   On 2026-01-17, Dorothy J Heydt wrote:
   > In article <10kdp51$1kju4$1@dont-email.me>,   
   > Evelyn C. Leeper wrote:
   >>In 2010, the Real Academia Española declared that 'ch' and 'll' were no
   >>longer letters in their own right, but digraphs (like 'ph' in English).
   >>As such, words with 'ch' would be alphabetized after 'cg' and before
   >>'ci', and those with 'll' would fall between 'lk' and 'lm'.
   >>   
   >>I personally think this was because computers could not handle them as   
   >>single letters, and sort algorithms in particular would just break.   

   No, there are international standards for collating sequences.  At the 1990   
   Worldcon in the Netherlands I found a small huddle of Confused Americans   
   trying to find their membership numbers in an alphabetical list.  The list   
   followed the Dutch collating sequence standards where the prefix "van" in a   
   surname is ignored.  So "van Gelder" appears between F...  and H...  names.   
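
   A minimal sketch of that kind of ordering, assuming a short hypothetical
   prefix list (the real Dutch rules cover more cases than this, and the
   names other than "van Gelder" are made up for illustration):

```python
# Hypothetical prefix list for illustration only; longer prefixes come first
# so that "van der " is matched before "van ".
PREFIXES = ("van der ", "van den ", "van de ", "van ", "de ", "ter ")


def dutch_sort_key(surname: str) -> str:
    """Return a lower-cased sort key with any leading prefix removed."""
    lowered = surname.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            return lowered[len(prefix):]
    return lowered


names = ["Hendriks", "van Gelder", "Fischer", "Visser"]

# Plain code-point order puts the lower-case "van Gelder" after everything.
print(sorted(names))

# Ignoring the prefix files "van Gelder" under G, between the F and H names.
print(sorted(names, key=dutch_sort_key))
```

   The point is only that the ordering lives in the sort key, not in the
   stored name: the list still prints "van Gelder" in full.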

   >   
   > [Hal Heydt]   
   > Not really true...  It depends on how the text is coded   
   > internally.   

   No.  The collating sequence should be the same whatever coding is used.   
   Information should always appear in the first place your users look for it.   
   I posted a comment on a mailing list about computing standards suggesting
   that, where there are two or more valid positions in an alphabetic list, the
   data should appear in every valid position.

   Of course that breaks things in different ways. The response to my   
   suggestion was that as it doesn't affect Americans it's unnecessary.   
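
   For what it's worth, the "every valid position" idea can be sketched in a
   few lines. This is an illustration under assumptions, not anything from a
   standard: the function names and the single-entry prefix list are
   hypothetical.

```python
def index_entries(surname, prefixes=("van ",)):
    """Return every index form of a surname: as written, plus an
    inverted form ("Gelder, van") if it starts with a known prefix."""
    entries = [surname]
    for prefix in prefixes:
        if surname.lower().startswith(prefix):
            rest = surname[len(prefix):]
            written_prefix = surname[:len(prefix)].strip()
            entries.append(f"{rest}, {written_prefix}")
    return entries


def build_index(surnames):
    """Build a case-insensitive alphabetical list with cross-listed entries."""
    rows = [entry for name in surnames for entry in index_entries(name)]
    return sorted(rows, key=str.lower)


for row in build_index(["Hendriks", "van Gelder", "Fischer"]):
    print(row)
```

   The obvious cost is that the list no longer has one row per member, so
   anything that counts rows or expects unique entries has to cope.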

   > If you're using ASCII or EBCDIC, then those digraphs
   > would be two symbols each.  That's because those are 7- or 8-bit
   > code schemes.  If you're using Unicode, each character uses 16
   > bits and those could be very easily defined as single symbols.
   > (Unicode handles a great many more symbol sets than the Roman   
   > alphabet.)   

   I'm not sure whether the libraries will require a change to Unicode or to
   the collating sequence.
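
   A rough sketch of the collating-sequence route, with no change to the
   encoding. Assumptions: the alphabet below is my reconstruction of the older
   Spanish dictionary order with "ch" and "ll" as letters, the word list is
   made up, and accented vowels are simply ranked last rather than handled
   properly. As far as I know Unicode has no single code point for either
   digraph, so "ch" is two characters whatever encoding is used.

```python
# Older-style Spanish alphabet with "ch" and "ll" as letters (illustrative).
ALPHABET = [
    "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l", "ll",
    "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
RANK = {letter: position for position, letter in enumerate(ALPHABET)}


def traditional_key(word):
    """Split a word into collation elements, treating "ch" and "ll"
    as single letters, and return their ranks as a sortable list."""
    word = word.lower()
    key, i = [], 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in ("ch", "ll"):
            key.append(RANK[pair])
            i += 2
        else:
            # Unknown characters (e.g. accented vowels) sort last here.
            key.append(RANK.get(word[i], len(ALPHABET)))
            i += 1
    return key


words = ["cura", "chico", "llama", "luz", "dama"]

print(sorted(words))                       # digraph order: ch between cg and ci
print(sorted(words, key=traditional_key))  # "ch" after all of c, "ll" after all of l

# The digraph is two characters regardless of encoding.
print(len("ch"), len("ch".encode("utf-16-le")))
```

   Both orderings come out of the same stored text, which is the sense in
   which the collating sequence is independent of the coding.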
   --   
   Bernard Peek   
   bap@shrdlu.com   
   Wigan   
