   From: alf.p.steinbach+usenet@gmail.com   
      
   On 10.04.2012 11:18, rossmpublic@gmail.com wrote:   
   > I have a very simple question that I have been unable to find a   
   > satisfactory answer. The question is why do I need to manually   
   > optimize my functions using const references?   
   >   
   > For example:   
   >   
   > // Optimized passing of string parameter   
   > Widget(std::string const& name);   
   > SetName(std::string const& name);   
   >   
   > // Non-optimized passing of string parameter   
   > Widget(std::string name);   
   > SetName(std::string name);   
   >   
   > I understand that with the latter notation there is an additional   
   > copy involved on most compilers, but why is that exactly?   
   > Why is it that the compiler (as smart as it is today) is unable to   
   > optimize away the additional copy?   
      
   The compiler is able to do it within the current language, but it's   
   constrained by   
      
    * the problem of aliasing, i.e. correctness violation, and   
      
    * depending on the solution, a combinatorial explosion, and that   
      
    * depending on the solution, the linker must support the scheme.   
      
   A partial solution, to avoid all three of those problems, is to use   
   immutable types with reference semantics.   
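   
   As a minimal sketch of that idea (not something from the original
   question), one can emulate an immutable string with reference semantics
   via `std::shared_ptr<std::string const>`. The names `ImmutableString`,
   `current_name`, `rename_global` and `describe` are hypothetical, chosen
   only for illustration:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical alias: an immutable string shared via reference counting.
using ImmutableString = std::shared_ptr<std::string const>;

ImmutableString current_name = std::make_shared<std::string>( "john" );

// Rebinds the global to a new string object. The old object is never modified.
void rename_global()
{
    current_name = std::make_shared<std::string>( "the universe" );
}

// Pass by value copies only the pointer (plus a reference count update),
// never the characters.
std::string describe( ImmutableString name )
{
    rename_global();            // Cannot change the string that `name` refers to.
    return *name + " is here";
}

void demo()
{
    assert( describe( current_name ) == "john is here" );
    assert( *current_name == "the universe" );
}
```

   Since the pointed-to string can't be modified through this type, a cheap
   by-value pass can't be spoiled by aliasing, and there is only one calling
   convention, so neither the combinatorial explosion nor the linker enters
   the picture.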
      
   Also, for the particular case of `std::string` another partial solution   
   is to use a COW (Copy On Write) implementation, which I believe is still   
   how the implementation for g++ works. It works in spite of the severe   
   shortcomings of the `std::string` class that theoretically should foil   
   its positive effect. It's like the magic of the horseshoe that Niels   
   Bohr had over his desk: theoretically it shouldn't work, but as Niels   
   remarked, "I am scarcely likely to believe in such foolish nonsense.   
   However, I am told that a horseshoe will bring you good luck whether you   
   believe in it or not".   
      
    ---   
      
   Here is an example of aliasing at work:   
      
      
   #include <iostream>   
   #include <string>   
   using namespace std;   
      
   void spoiler();   
      
   int ageOf( string name )   
   {   
    return 0?0   
    : name=="john"? 18   
    : name=="mary"? 22   
    : 0;   
   }   
      
   void foo( string const& name )   
   {   
    int const age = ageOf( name );   
    spoiler();   
    cout << name << " is " << age << " years old." << endl;   
   }   
      
   string bah = "john";   
      
   void spoiler() { bah = "the universe"; }   
      
   int main()   
   {   
    foo( bah );   
   }   
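   
   With `foo` taking its argument by reference, the program above reports
   "the universe is 18 years old.", because `name` is an alias for `bah`.
   The following sketch isolates that difference between the two parameter
   forms. The names `person`, `overwrite_person`, `report_by_ref` and
   `report_by_value` are hypothetical stand-ins, not part of the example
   above:

```cpp
#include <cassert>
#include <string>

std::string person = "john";                            // Plays the role of `bah`.

void overwrite_person() { person = "the universe"; }    // Plays the role of `spoiler`.

// Reference parameter: `name` is an alias for the caller's object.
std::string report_by_ref( std::string const& name )
{
    overwrite_person();
    return name;        // Sees the overwritten value when `name` aliases `person`.
}

// Value parameter: `name` is a private copy made at the call.
std::string report_by_value( std::string name )
{
    overwrite_person();
    return name;        // Unaffected by the overwrite.
}

void demo()
{
    person = "john";
    assert( report_by_value( person ) == "john" );
    person = "john";
    assert( report_by_ref( person ) == "the universe" );
}
```

   A compiler that silently turned `report_by_value` into `report_by_ref`
   would change the observable result, which is the correctness violation.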
      
      
    ---   
      
   In many if not most cases, however, the compiler can easily prove that   
   there is no possible aliasing. It can also emit information that makes   
   it better able to prove that for later compilations of other code. But   
   here's where both the combinatorial explosion and the possible need   
   for linker support enter the picture.   
      
   For consider a function declared like   
      
    void foo( string s );   
      
   If that function is non-optimized, then machine code must be emitted to   
   /copy/ the actual argument, while if the function is optimized like ...   
      
    void foo( string const& s );   
      
   then machine code to pass an address must be generated.   
      
   Consider then that this binary choice is present for each sufficiently   
   large argument where the optimization is relevant, and so that with n   
   such arguments we're talking about 2^n implementation variants: a   
   /combinatorial explosion/ akin to the one for perfect forwarding.   
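   
   To make the count concrete, here is a hand-written sketch of the 2^2 = 4
   variants for a function with two string arguments. The names `total_vv`,
   `total_vr`, `total_rv`, `total_rr` and `variant_count` are hypothetical;
   a compiler doing this automatically would presumably distinguish the
   variants via name mangling instead:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// v = argument copied by the caller, r = address of the argument passed.
std::size_t total_vv( std::string a,        std::string b        ) { return a.size() + b.size(); }
std::size_t total_vr( std::string a,        std::string const& b ) { return a.size() + b.size(); }
std::size_t total_rv( std::string const& a, std::string b        ) { return a.size() + b.size(); }
std::size_t total_rr( std::string const& a, std::string const& b ) { return a.size() + b.size(); }

// Number of variants for n arguments where the optimization is relevant.
constexpr unsigned variant_count( unsigned n ) { return 1u << n; }

void demo()
{
    std::string const first = "john";
    std::string const second = "mary";
    assert( total_vv( first, second ) == 8 );
    assert( total_rr( first, second ) == 8 );
    assert( variant_count( 2 ) == 4 );
}
```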
      
   With the now most popular compilation model of C++ the compiler can't   
   know which variant it should assume, if there is only one. One possible   
   solution is to assume that /all/ 2^n variants exist, and to use all of   
   them freely with different linkage level name mangling. But then the   
   linker has to remove all the unused function implementations, lest the   
   final program increase greatly in size, generally almost doubling   
   (which might counter any positive effect).   
      
    ---   
      
   Another possible solution, one that avoids both the combinatorial   
   explosion and the need for linker support, can be based on David   
   Wheeler's well known aphorism, "Any problem in computer science can be   
   solved by another level of indirection".   
      
   Since the reference optimization only makes sense for sufficiently large   
   arguments that anyway are handled via pointers/addresses, the caller can   
   simply, for each argument, pass a flag, e.g. in a processor register,   
   that tells the implementation /whether to copy/ that argument. If, in a   
   particular call, a particular argument is so flagged and is not of   
   primitive type, then the implementation must copy it and update its   
   pointer to point at the copy. Then it can just proceed normally.   
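   
   The flag scheme can be emulated by hand in today's C++, which may make
   it easier to see. This is only a sketch under assumptions: the flag is
   an ordinary `unsigned` parameter rather than a processor register, and
   `CopyFlags`, `flagged_report`, `shared_text` and `overwrite_shared` are
   hypothetical names:

```cpp
#include <cassert>
#include <string>

std::string shared_text = "john";

void overwrite_shared() { shared_text = "the universe"; }

// One bit per argument where the optimization is relevant.
enum CopyFlags : unsigned { copy_none = 0u, copy_arg0 = 1u << 0 };

// Emulates a by-value `std::string` parameter: the caller always passes an
// address, plus a flag that says whether the callee must copy.
std::string flagged_report( std::string const* name, unsigned flags )
{
    std::string local_copy;
    if( flags & copy_arg0 )
    {
        local_copy = *name;     // Make the copy the caller asked for ...
        name = &local_copy;     // ... and update the pointer to point at it.
    }
    overwrite_shared();         // May modify the caller's original object.
    return *name;
}

void demo()
{
    // Possible aliasing, so the caller requests a copy.
    shared_text = "john";
    assert( flagged_report( &shared_text, copy_arg0 ) == "john" );

    // Provably unaliased local object, so the copy is skipped.
    std::string const local_name = "mary";
    assert( flagged_report( &local_name, copy_none ) == "mary" );
}
```

   There is only one implementation of the function, so nothing needs to
   be pruned at link time.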
      
   This set of flags imposes a slight overhead on every call, in that the   
   implementation must check the flags, but it removes the need for linker   
   support.   
      
   Perhaps, in order to let the programmer decide, functions that support   
   and need the flags could be marked with some attribute.   
      
   And even further, perhaps calls could also be annotated so that the   
   programmer could take responsibility for the arguments being non-aliased.   
      
      
   > Why should I be writing optimization code into my interfaces?? This   
   > seems very wrong to me.   
      
   From a purely idealistic point of view it is indeed very wrong to   
   hardcode optimization decisions into interfaces. Ideally there should be   
   "in", "out" and "in-out" designators as in Ada, and ideally the language   
   should then support proper Liskov substitution[1]. I.e., supporting   
   covariant "out" arguments, contravariant "in"-arguments, and enforcing   
   invariant "in-out" arguments.   
      
   C# is one step closer to that ideal than C++, by having an "out"   
   designator, but I'm not sure if it supports proper LSP for "out".   
      
   However, C++ is very much a language that's evolved to meet practical   
   needs. And apparently "in", "out" and "in-out" arguments have not been   
   very urgent practical needs. For if they had been, then they would   
   presumably have been supported already (of course this argument applies   
   to any desired feature, but I'm just sayin').   
      
    ---   
      
   It is possible to attain the /appearance/ of "in", "out" and "in-out"   
   support by using e.g. empty macros with suggestive names.   
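   
   For illustration, a minimal sketch of such empty macros. The macro names
   `ARG_IN`, `ARG_OUT` and `ARG_IN_OUT` and the two functions are
   hypothetical, and the compiler checks nothing about them:

```cpp
#include <cassert>
#include <string>

// Empty macros: they expand to nothing and are purely documentation.
#define ARG_IN
#define ARG_OUT
#define ARG_IN_OUT

void make_greeting( ARG_IN std::string const& name, ARG_OUT std::string& greeting )
{
    greeting = "Hello, " + name;
}

void append_exclamation( ARG_IN_OUT std::string& text )
{
    text += "!";
}

void demo()
{
    std::string greeting;
    make_greeting( "john", greeting );
    append_exclamation( greeting );
    assert( greeting == "Hello, john!" );
}
```

   Swapping `ARG_OUT` for `ARG_IN_OUT` in a declaration compiles just the
   same, so a wrong annotation goes unnoticed.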
      
   That pure appearance effect is apparently the main idea of   
   Microsoft's[2] "Standard Annotation Language" SAL wrt. the C++   
   programming language (for the C programming language the annotations   
   may however have some slight advantage, but at an extreme, mind-boggling   
   cost). For example, the last time I checked the SAL annotations for one   
   of the most used Windows API functions, MessageBox, they were still   
   wrong. Which is what one can expect when it's just comment-like   
   annotations, and not a language-supported feature checked by a compiler.   
      
   In my humble opinion such schemes, for C++, are worse than not having   
      
   [continued in next message]   
      