   comp.lang.c++.moderated      Moderated discussion of C++ superhackery      33,346 messages   

   Message 31,353 of 33,346   
   Ulrich Eckhardt to Dilip   
   Re: Dealing with encoding in basic_strea   
   25 May 11 15:29:22   
   
   From: ulrich.eckhardt@dominolaser.com   
      
   Dilip wrote:   
   > I must be missing something here but I thought the only way characters   
   > outside of BMP are encoded in UTF-16 is precisely by using surrogate   
   > pairs? Am I misunderstanding something?   
      
   With strings, any operation that works on individual elements (substr,
   erase, resize and so on) can cut such a surrogate pair in the middle,
   leaving you with an invalid UTF-16 string. The same holds for multi-byte
   sequences in UTF-8. Nothing in the basic_string class prevents this from
   happening.   
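      
   A minimal sketch of the slicing problem, assuming a C++11 compiler with
   std::u16string and u"" literals. The helper names (is_high_surrogate,
   is_low_surrogate, is_well_formed_utf16, sample) are made up for this
   illustration and are not part of the standard library:   
      
```cpp
#include <cstddef>
#include <string>

// True if c is a UTF-16 high (leading) surrogate, 0xD800..0xDBFF.
inline bool is_high_surrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }

// True if c is a UTF-16 low (trailing) surrogate, 0xDC00..0xDFFF.
inline bool is_low_surrogate(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Hypothetical helper: checks that every surrogate belongs to a proper
// high/low pair. basic_string itself performs no such check.
inline bool is_well_formed_utf16(const std::u16string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (is_high_surrogate(s[i])) {
            if (i + 1 >= s.size() || !is_low_surrogate(s[i + 1]))
                return false;   // high surrogate without its partner
            ++i;                // skip the low surrogate of the pair
        } else if (is_low_surrogate(s[i])) {
            return false;       // low surrogate without a preceding high one
        }
    }
    return true;
}

// "a" followed by U+1D11E (MUSICAL SYMBOL G CLEF), which lies outside the
// BMP and is stored as the surrogate pair 0xD834 0xDD1E: three code units.
inline std::u16string sample() { return std::u16string(u"a\U0001D11E"); }
```
      
   Here sample().substr(0, 2) and sample().substr(2) are both accepted by
   basic_string without complaint, yet each half contains a lone surrogate
   and is_well_formed_utf16 rejects it.   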
      
   With IOStreams' encoding support (codecvt facets), there is no way to signal
   to the caller that you need more internal elements before you can write a
   complete piece of output. While reading, a facet can request more input
   bytes to build a full internal element, and while writing, it can request
   more output buffer for the bytes, but the design assumes that an internal
   element can always be written in full on its own.   
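      
   To illustrate why a single UTF-16 element cannot always be written on its
   own, here is a rough sketch of a stand-alone encoder that is fed one code
   unit at a time. The class Utf16ToUtf8 and its members are hypothetical;
   this is not the codecvt interface, just one way to show that a high
   surrogate produces no bytes until its partner arrives, so the encoder has
   to carry state between calls:   
      
```cpp
#include <string>

// Hypothetical incremental UTF-16 to UTF-8 encoder (not a standard facet).
class Utf16ToUtf8 {
public:
    // Feed one UTF-16 code unit. Appends to out only the bytes that are
    // complete so far. Returns false if the input is ill-formed.
    bool put(char16_t c, std::string& out) {
        if (pending_ != 0) {
            if (c < 0xDC00 || c > 0xDFFF) {   // high surrogate not followed by low
                pending_ = 0;
                return false;
            }
            char32_t cp = 0x10000
                        + ((static_cast<char32_t>(pending_) - 0xD800) << 10)
                        + (static_cast<char32_t>(c) - 0xDC00);
            pending_ = 0;
            append(cp, out);
            return true;
        }
        if (c >= 0xD800 && c <= 0xDBFF) {     // nothing can be written yet
            pending_ = c;
            return true;
        }
        if (c >= 0xDC00 && c <= 0xDFFF)       // lone low surrogate
            return false;
        append(c, out);
        return true;
    }

    // True if no half-finished surrogate pair is being held back.
    bool complete() const { return pending_ == 0; }

private:
    // Appends the UTF-8 encoding of code point cp to out.
    static void append(char32_t cp, std::string& out) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }

    char16_t pending_ = 0;   // held-back high surrogate, 0 if none
};
```
      
   After put(0xD834, out) the output is still empty and complete() is false;
   only the following put(0xDD1E, out) emits the four bytes for U+1D11E. A
   caller that flushes or stops between the two calls ends up with a
   truncated stream.   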
      
   Uli   
      
      
   --   
   Domino Laser GmbH   
   Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932   
      
      
         [ See http://www.gotw.ca/resources/clcm.htm for info about ]   
         [ comp.lang.c++.moderated.    First time posters: Do this! ]   
      