   comp.lang.c++.moderated      Moderated discussion of C++ superhackery      33,346 messages   

   Message 33,052 of 33,346   
   Bart van Ingen Schenau to James K. Lowden   
   Re: compilers, endianness and padding   
   17 May 13 02:49:47   
   
   From: bart@ingen.ddns.info.invalid   
      
   On Thu, 16 May 2013 05:47:52 -0700, James K. Lowden wrote:   
      
   > On Tue, 14 May 2013 15:08:22 CST   
   > Bart van Ingen Schenau  wrote:   
   >   
   >> Trees are not that difficult to serialize. How about a slightly   
   >> more complex structure:   
   >>   
   >> class X {   
   >>   struct t {   
   >>      size_t a;   
   >>      char* b;   
   >>   };   
   >   
   > As I mentioned elsewhere, it's necessary in the general case for the   
   > compiler to provide the extent as well as the value of a pointer.   
   >   
   >>   size_t c;   
   >>   union {   
   >>     char d[sizeof(t)];   
   >>     t e;   
   >>   } f;   
   >> };   
   >   
   > At first glance, this seems no problem at all, insofar as sizeof(f)   
   > is known at compile time.  The problem I think you're alluding to is   
   > that two different compilers might arrange f differently, and   
   > nothing about the bit pattern of the union tells us what to do.   
      
   No, the problem I am alluding to is that the compiler can't know which   
   member of the union is supposed to be valid and thus if it can chase   
   the pointer in the structure at all.   
      
   Here is the same class again with more meaningful names:   
      
   class String {   
      struct large_t {   
        size_t capacity;   
        char*  data;   
      };   
      
      size_t length;   
      union {   
        char small[sizeof(large_t)];   
        large_t large;   
      } value;   
   };   
      
   For short strings (up to sizeof(large_t) characters), the string data   
   is stored directly in value.small. For larger strings, a dynamically   
   allocated area, referred to by value.large.data, is used.   
      
   How would the compiler decide when to chase the value.large.data   
   pointer and when to just dump the bytes from value.small?   
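
To make the point concrete, here is a minimal sketch of how such a class might serialize itself by hand. It assumes that `length` is what selects the active union member, which is knowledge only the class has. The inline buffer is called `small` here because `short` is a reserved word in C++; `is_small`, `bytes` and `serialize` are hypothetical names chosen for illustration and are not part of the class as posted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Illustrative sketch only: a small-string layout like the one in the post.
class String {
    struct large_t {
        std::size_t capacity;
        char*       data;
    };

    std::size_t length;
    union {
        char    small[sizeof(large_t)];  // inline storage for short strings
        large_t large;                   // heap storage for everything else
    } value;

    // The discriminator lives in the class logic, not in the union itself.
    bool is_small() const { return length <= sizeof(large_t); }

public:
    explicit String(const char* s) : length(0), value() {
        assert(s != nullptr);
        length = std::strlen(s);
        if (is_small()) {
            std::memcpy(value.small, s, length);
        } else {
            value.large.capacity = length;
            value.large.data = new char[length];
            std::memcpy(value.large.data, s, length);
        }
    }

    ~String() {
        if (!is_small()) delete[] value.large.data;
    }

    // Copying is disabled to keep the sketch short.
    String(const String&) = delete;
    String& operator=(const String&) = delete;

    // Chase the pointer only when the length says the large member is active.
    const char* bytes() const {
        return is_small() ? value.small : value.large.data;
    }

    // Hand-written serialization: emit the characters, never the raw union.
    std::string serialize() const { return std::string(bytes(), length); }
};
```

A compiler-generated serializer sees only the union and has no way to evaluate `is_small()`, which is the difficulty described above.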
      
   > My answer is simple, once again, although at a trivial cost.  It   
   > must be possible to know which member of f was last written.  Why?   
   > Because if f.e was written, serialization demands its endianism be   
   > honored.   
      
   Are you really proposing to add a hidden member to all unions to track   
   which member was last written to? Just in case it might need   
   serialization and the endianness might matter?  And have you   
   considered that your proposed serialization feature might be   
   standardized to use an endian-neutral serialization format?   
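
As a sketch of what "endian-neutral" could mean in practice: the value is written byte by byte with shifts, so the output is the same on any host. The function names below are hypothetical, and a fixed 8-byte, most-significant-byte-first encoding is an assumption made for illustration, not a format anyone in this thread has proposed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Encode a 64-bit value as 8 bytes, most significant byte first.
// Shifts operate on the value, not on memory, so host byte order is irrelevant.
std::array<unsigned char, 8> encode_u64(std::uint64_t v) {
    std::array<unsigned char, 8> out{};
    for (std::size_t i = 0; i < 8; ++i) {
        out[i] = static_cast<unsigned char>((v >> (56 - 8 * i)) & 0xFF);
    }
    return out;
}

// Decode the same 8-byte sequence back into a value.
std::uint64_t decode_u64(const std::array<unsigned char, 8>& in) {
    std::uint64_t v = 0;
    for (unsigned char b : in) {
        v = (v << 8) | b;
    }
    return v;
}

// Round-trip check a caller might run in a unit test.
void check_round_trip(std::uint64_t v) {
    assert(decode_u64(encode_u64(v)) == v);
}
```

With a format like this, no information about which union member was last written is needed for endianness purposes; it is needed only to decide what to write at all.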
      
   > One might hope, though, that this sort of malarkey might fade into   
   > history if endianism were dealt with in the language proper.   
      
   I would not hold my breath. For either.   
      
   { Quoted signature removed -mod }   
      
   Bart v Ingen Schenau   
      
      
   --   
         [ See http://www.gotw.ca/resources/clcm.htm for info about ]   
         [ comp.lang.c++.moderated.    First time posters: Do this! ]   
      