... darkrealms ...

Forums before death by AOL, social media and spammers... "We can't have nice things"
comp.lang.c++.moderated
Moderated discussion of C++ superhackery
33,346 messages
[ << oldest | < older | list | newer > | newest >> ]
Message 32,394 of 33,346
Dave Harris to Martin B.
Re: Techniques to avoid minimal scope in
09 Jun 12 11:36:03
   From: brangdon@cix.compulink.co.uk   
      
   0xCDCDCDCD@gmx.at (Martin B.) wrote (abridged):   
   > I really asked this question because I though there might be a   
   > technical, elegant solution out of this, but since noone posted   
   > anything that I found particularly enlightening(*), I guess there   
   > really isn't a technical solution to this, but it has to be   
   > addressed on the pedagogical and psychological level :-)   
      
   Well, move semantics should help. Consider some straight-forward   
   code for expressing what you are doing:   
      
       for (int i = 0; i != n; ++i)   
           set_lower_text( i, to_lower( get_text( i ) ) );   
      
   I hope this passes your guidelines. Suppose the signatures are:   
      
       string get_text( int i );   
       string to_lower( string &&s );   
       void set_lower_text( int i, &&s );   
      
   So get_text() allocates and returns a copy. This is passed to to_lower() as an   
   r-value reference, so it can modify it in-place and return it without copying   
   the   
   characters or doing an allocation. Set_lower_text() receives another r-value   
   reference, which it can move to its final destination, again without   
   allocating or   
   copying. So we have one allocation and one copy, both in get_text(). If   
   instead the   
   signatures are:   
      
       const string &get_text( int i );   
       string to_lower( const string &s );   
       void set_lower_text( int i, &&s );   
      
   then the count is the same but now the allocation and copy happens inside   
   to_lower(). We can provide both overloads of to_lower().   
      
   Now compare with your optimised code:   
      
   >        string s;   
   >        for (int i=0; i!=n; ++i) {   
   >          get_text(i, s); // void get_text(int, string&);   
   >          to_lower(s);   
   >          set_lower_text(i, s);   
   >        }   
      
   with signatures:   
      
       void get_text( int i, string &s );   
       void to_lower( string &s );   
       void set_lower_text( int i, const string &s );   
      
   Your version necessarily copies the string in get_text(), and again in   
   set_lower_text(). Set_lower_text() will probably have to do an allocation. (If   
   instead it takes an r-value reference and pilfers its argument's memory, s is   
   left   
   empty and we gain nothing by moving it outside the loop.) So we have 1   
   allocation   
   and 2 copies. You've turned two lines of code into five and not really gained   
   anything over the plain version. You might have saved an allocation, but you've   
   definitely copied the characters an extra time, so if the allocator is   
   efficient   
   (as I've argued elsewhere) you might even be slower.   
      
   Just write plain, simple code in C++11 and it will usually be fine.   
   If you can use iterators, so much the better:   
      
       std::transform( begin(), end(), lower_begin(), to_lower );   
      
   What's not to like?   
      
   If it really matters, I would look more at what get_text() and   
   set_lower_text() are actually doing. For example, supposing   
   set_lower_text() actually stores its argument in a vector called   
   lower_text:   
      
        void set_lower_text( int i, const string &s ) {   
            lower_text[i] = s;   
            // Enforce invariants here, if any.   
        }   
      
   You might consider using in-place modification explicitly. Add:   
      
        void set_lower_text( int i,   
                const std::function fn ) {   
            fn( lower_text[i] );   
            // Enforce invariants here, if any.   
        }   
      
   so the caller code looks like:   
      
       for (int i = 0; i != n; ++i)   
           set_lower_text( i, [=]( string &lower_text ) {   
               get_text( i, lower_text );   
               to_lower( lower_text );   
           } );   
      
   This avoids the need for an intermediate buffer entirely. The work is   
   done directly in lower_text's own storage. If you can encapsulate the   
   loop, too, so much the better. You could present it as a kind-of   
   in-place transform:   
      
       template   
       void set_all_lower_text( Iterator i, Func f ) {   
           for (auto j = lower_text.begin(); j != lower_text.end();   
                   ++j, ++i)   
               f( *j, *i );   
               // Enforce invariants here...   
           }   
           // ... or here.   
       }   
      
   Used like:   
      
       set_all_lower_text( text_cbegin(),   
         []( string &lower_text, const string &text ) {   
               lower_text = text;   
               to_lower( lower_text );   
           } );   
      
   This version allows more control and economies of scale. For example,   
   if lower_text must be kept sorted, you can resort it once at the end,   
   instead of repeatedly after each string is changed. If you need to   
   notify listeners when the strings are changed, you can do so once   
   instead of once per string. If you need to lock the data against   
   multi-threaded access, you can do so with less overhead.   
      
   The key thing is for the lambda expression to express what we want to   
   do to each string in its purest and most efficient form, and then use   
   that with a custom algorithm that manages everything else. Separation   
   of concerns.   
      
   The reason I don't think this is a good solution for your particular   
   case is because, without measurements, it's premature optimisation   
   again. I don't advocate that every signature like:   
      
       void set_lower_text( int i, const string &s );   
      
   also have an in-place version like:   
      
       void set_lower_text( int i,   
               const std::function fn );   
      
   However, where there is a proven need for performance, the in-place   
   version may be worth considering. And just adding the r-value   
   reference version:   
      
       void set_lower_text( int i, string &&s ) {   
           lower_text[i] = std::move( s );   
           // Enforce invariants here, if any.   
       }   
      
   gains efficiency without disrupting caller code at all.   
      
      
   (I have reordered these quotes for clarity.)   
   > Of course, iff I'm going to write an algorithm, it's gonna be   
   > as efficient as it gets without contorting myself.   
   >   
   > BUT -- that's really missing the point I think: The examples I   
   > gave in the OP were just that: examples. People are *going to*   
   > write loops that don't make much sense as algorithms (-> used   
   > once).   
      
   We may be saying the same thing here. Perhaps set_all_lower_text() is   
   what you had in mind for the efficient algorithm.   
      
   On the other hand, perhaps you are saying that anything used only   
   once shouldn't be an algorithm? If so, then I disagree. Even if   
   set_all_lower_text() is only used in one place, it's still providing   
   benefits like abstraction, information hiding, de-coupling and   
   separation of concerns. We have three components here - the source   
   of get_text(), the destination for set_lower_text(), and the code   
   that transfers between them - and none of them needs to know   
   implementation details of the other two.   
      
   I would further say that thinking in algorithms often leads to code   
   that is both elegant and efficient. Elegant because it is more   
      
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)
[ << oldest | < older | list | newer > | newest >> ]