

   comp.arch      Apparently more than just beeps & boops      131,241 messages   


   Message 130,413 of 131,241   
   Anton Ertl to Scott Lurndal   
   Re: Multi-precision addition and archite   
   30 Nov 25 15:18:21   
   
   From: anton@mips.complang.tuwien.ac.at   
      
   scott@slp53.sl.home (Scott Lurndal) writes:   
   >anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:   
   >>scott@slp53.sl.home (Scott Lurndal) writes:   
   >>>Thomas Koenig  writes:   
   >>>>Anton Ertl  schrieb:   
   >>>>> Thomas Koenig  writes:   
   >>>>>>I recently heard that CS graduates from ETH Zürich had heard about   
   >>>>>>pipelines, but thought it was fetch-decode-execute.   
   >>>>>   
   >>>>> Why would a CS graduate need to know about pipelines?   
   >>>   
   >>>So they can properly simulate a pipelined processor?   
   >>   
   >>Sure, if a CS graduate works in an application area, they need to   
   >>learn about that application area, whatever it is.   
   >   
   >It's useful for code optimization, as well.   
      
   In what way?   
      
   >In general,   
   >any programmer should have a solid understanding of the   
   >underlying hardware - generically, and specifically   
   >for the hardware being programmed.   
      
   Certainly.  But do they need to know the difference between a Wallace
   multiplier and a Dadda multiplier?  If not, what is it about pipelined
   processors that would require CS graduates to know about them?
      
   >>Processor pipelines are not the basics of what a CS graduate is doing.   
   >>They are an implementation detail in computer engineering.   
   >   
   >Which affect the performance of the software created by the   
   >software engineer (CS graduate).   
      
   By a constant factor; and the software creator does not need to know
   that a CPU that executes instructions at 2 CPI (486) instead of at
   10 CPI (VAX-11/780) is pipelined; and these days both the 486 and the
   VAX are irrelevant to software creators.
      
   >>A few more examples where compilers are not as good as even I expected:   
   >>   
   >>Just today, I compiled   
   >>   
   >>u4 = u1/10;   
   >>u3 = u1%10;   
   >>   
   >>(plus some surrounding code) with gcc-14 in three contexts.  Here's   
   >>the code for two of them (the third one is similar to the second one):   
   >>   
   >>movabs $0xcccccccccccccccd,%rax   movabs $0xcccccccccccccccd,%rsi   
   >>sub    $0x8,%r13                  mov    %r8,%rax   
   >>mul    %r8                        mov    %r8,%rcx   
   >>mov    %rdx,%rax                  mul    %rsi   
   >>shr    $0x3,%rax                  shr    $0x3,%rdx   
   >>lea    (%rax,%rax,4),%rdx         lea    (%rdx,%rdx,4),%rax   
   >>add    %rdx,%rdx                  add    %rax,%rax   
   >>sub    %rdx,%r8                   sub    %rax,%r8   
   >>mov    %r8,0x8(%r13)              mov    %rcx,%rax   
   >>mov    %rax,%r8                   mul    %rsi   
   >>                                  shr    $0x3,%rdx   
   >>                                  mov    %rdx,%r9   
   >>   
   >>The major difference is that in the left context, u3 is stored into   
   >>memory (at 0x8(%r13)), while in the right context, it stays in a   
   >>register.  In the left context, gcc managed to base its computation of   
   >>u1%10 on the result of u1/10; in the right context, gcc first computes   
   >>u1%10 (computing u1/10 as part of that), and then computes u1/10   
   >>again.   
   >   
   >Sort of emphasizes that programmers need to understand the   
   >underlying hardware.   
      
   I am the programmer of the code shown above.  In what way would better
   knowledge of the hardware have made me aware that gcc would produce
   suboptimal code in some cases?
      
   >What were u1, u3 and u4 declared as?   
      
   unsigned long (on that platform).   
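
For reference, the single-multiply sequence from the left-hand listing above can be written out in C. This is only a sketch under assumptions: the names divmod10 and divmod10_t are hypothetical, uint64_t stands in for unsigned long on that platform, and unsigned __int128 is a GCC/Clang extension used here to obtain the high half of the 64x64-bit product (the value that mul leaves in %rdx). The constant 0xcccccccccccccccd is the one visible in the disassembly; it equals ceil(2^67/10).

```c
#include <stdint.h>

/* Hypothetical result type: quotient and remainder of a division by 10. */
typedef struct {
    uint64_t quot;
    uint64_t rem;
} divmod10_t;

/* Computes u1/10 and u1%10 with one widening multiply, mirroring the
   left-hand gcc output: mul, shr $3, then remainder = u1 - 10*quotient. */
static divmod10_t divmod10(uint64_t u1)
{
    /* ceil(2^67 / 10), the constant loaded by movabs in the listing */
    const uint64_t magic = 0xcccccccccccccccdULL;

    /* High 64 bits of the 128-bit product (what mul leaves in %rdx).
       unsigned __int128 is a GCC/Clang extension, not standard C. */
    uint64_t hi = (uint64_t)(((unsigned __int128)u1 * magic) >> 64);

    divmod10_t r;
    r.quot = hi >> 3;            /* total shift of 67 bits: u1 / 10 */
    r.rem  = u1 - r.quot * 10;   /* lea/add/sub in the listing: u1 % 10 */
    return r;
}
```

A caller would use r.quot for u4 and r.rem for u3. The point of the sketch is that the remainder reuses the quotient, so only one multiply is needed; the right-hand listing performs the multiply twice.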
      
   - anton   
   --   
   'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'   
     Mitch Alsup,    
      


