... darkrealms ...

Forums before death by AOL, social media and spammers... "We can't have nice things"
comp.lang.c++.moderated
Moderated discussion of C++ superhackery
33,346 messages
[ << oldest | < older | list | newer > | newest >> ]
Message 32,949 of 33,346
Seungbeom Kim to All
Re: Passing multi-dimensional arrays to
28 Mar 13 01:06:59
   From: musiphil@bawi.org   
      
   On 2013-03-27 08:09, Öö Tiib wrote:   
   > On Wednesday, 27 March 2013 13:10:02 UTC+2, Seungbeom Kim  wrote:   
   >> On 2013-03-26 19:40, ï¿½ï¿½ Tiib wrote:   
   >>>   
   >>> That factor is not too important ... more idiomatic would be to   
   >>> assume that typedefs are always done ...:   
   >>>   
   >>>      typedef std::vector Column;   
   >>>      typedef std::vector Matrix;   
   >>>      Matrix::size_type sz = matrix.size();   
   >>   
   >> I often do that, too. And that turns a one-line declaration of an   
   >> array ('double matrix[M][N];') into three lines. Which is not too   
   >> bad, but while I like typedefs that yield meaningful abstractions,   
   >> this is rather just to replace repeating long names with repeating   
   >> short names, and/or typing too much on a single line with typing   
   >> reasonable amounts on more lines, which I feel to be like a   
   >> necessary evil.   
   >   
   > The typedefs are part of class that has such multidimensional   
   > relation.  Multidimensional relation is complex and expensive.   
   > Expensive things result with desire to refactor.  For example to   
   > consider switching Column with 'std::map'.  You have   
   > difficulties to measure performance of such alternatives without   
   > typedefs.  Typedefs together with 'auto' make it often almost   
   > effortless.   
      
   I agree with you in general, but not with your example. And I should   
   have made it more clear that my criticism was more toward the   
   intermediate typedef for Column than the one for the entire Matrix.   
      
   A matrix is neither necessarily a vector of columns nor necessarily a   
   vector of rows, and it may be thought of as both. If you decide to   
   represent a Matrix with a vector of Columns, that's an implementation   
   detail rather than a meaningful abstraction to be published, and you   
   need the typedef just to reduce some typing.   
      
   For example, if you were using built-in arrays, you wouldn't need the   
   intermediate typedef: instead of   
      
        typedef double Column[N];   
        typedef Column Matrix[M];   
      
   you would just write   
      
        typedef double Matrix[M][N];   
      
   because even without having Column typedefed, there's no more   
   verbosity, and the typedef serves no other purpose.   
      
   For another example, you may want to switch your representation of   
   Matrix into a vector of Rows instead, then your Column typedef is no   
   longer useful and you have to rewrite many parts of your code anyway.   
   You mentioned switching Column into 'std::map', but What   
   if you want to switch the entire Matrix into 'std::map, double>' or 'std::map>'? Again the   
   Column typedef is no longer useful. Such switching can happen, but it   
   cannot be made effortless just by having more typedefs.   
      
   The point of this story is that the verbosity of data structures such   
   as 'std::vector>' often encourages us to have more   
   typedefs that may alleviate the verbosity but don't serve as   
   meaningful abstractions, that we wouldn't need (all of) them if we had   
   more concise data structures available.   
      
   >>> ....also 'auto' was modified exactly because of that:   
   >>>   
   >>>      auto it = matrix.end();   
   >>   
   >> This is great, as long as you can use C++11 (which is, sadly, not   
   >> always true).   
   >   
   > The feature is in language. There will be constantly diminishing   
   > amount of poor souls who can't use it. Most helpful suggestion for   
   > them is to migrate. What else we can suggest?   
      
   It's not always that they don't want to migrate, as I implied with the   
   word "sadly." There are situations where such syntactic improvements   
   don't justify the cost of migration.   
      
   (And I don't want to have to say that we have to upgrade to a C++11   
   compiler in order to use a 2D array with such great ease. :D)   
      
   >>>> in exchange for the nice access syntax 'm[i][j]'.   
   >>>   
   >>> That is nicest but rarely optimal access syntax.   
   >>   
   >> Is it suboptimal because of the underlying representation of a   
   >> vector of vectors, or is there anything fundamental in the access   
   >> syntax that prevents the access through it from being optimal?   
   >   
   > Even with raw array it is commonly suboptimal. Say we have 'T   
   > m[X][Y];', address of 'm[i][j]' is at 'm + i * sizeof(T[Y]) + j *   
   > sizeof(T)'.  That can lose 20% of performance on edge cases (that   
   > rarely matters of course).   
   >   
   > The algorithms that traverse multidimensional relations are usually   
   > quite sequential so iterators (or pointers for case of C array) are   
   > usually more optimal (one addition for every increment). Standard   
   > algorithm library for example often performs quite close to optimal   
   > (and takes iterators as rule).   
      
   It's not only that it rarely matters, but if we traverse a container   
   sequentially, as we'd do with an iterator, I don't think many decent   
   compilers will actually generate those multiplications and additions   
   verbatim. What I've heard is that vector traversals with indexing and   
   iterators will generate equally optimal code.   
      
   And if you represent a matrix with a vector of rows, how do you do   
   column-wise traversal, or even a diagonal one with iterators? How do   
   you traverse an upper triangular matrix? These things are so much   
   easier with indexing, and an iterator that's tied to the incidental   
   choice of the internal representation cannot solve them.   
      
   >>> IMHO the choice of container should be always implementation   
   >>> detail. In OOP sense two-dimensional container is just a 1 to M*N   
   >>> relation. It is not good to expose carriers of relations for any   
   >>> by-passer to adjust.   
   >>   
   >> I'm not sure if I implied anything opposite.   
   >   
   > I was more about the immersion of 'm[i][j]' syntax that is usually   
   > sub-optimal anyway so the only thing for what I would consider it   
   > would be exposing the relation in interface.   
      
   In cases that I'm talking about (and that I think we're talking   
   about), the multidimensional relation is part of the interface.   
      
   >> And my question was more towards comparison with std::vector: even   
   >> for numeric matrices, I have seen many people recommend std::vector   
   >> but not as many recommend std::valarray. Why is that?   
   >   
   > I was just answering your question ("why not valarray?"). My   
   > impression is that it is not generic enough when discussing   
   > arrays. It is not feature-rich enough when discussing computations   
   > with dense matrices.   
      
   I haven't found yet anything lacking in std::valarray that std::vector   
   has that is necessary in implementing numerical vectors or matrices.   
   On the other hand, std::valarray allows optimization through   
   expression templates, which the GNU C++ Library seems to have indeed.   
      
   My guess is that people are just more familiar with std::vector.   
      
      
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)
[ << oldest | < older | list | newer > | newest >> ]