comp.arch
MitchAlsup to All
Re: sign/zero/garbage extension (was: Ti
07 Oct 25 20:18:11
   From: user5857@newsgrouper.org.invalid   
      
   anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:   
      
   > MitchAlsup  writes:   
   > >   
   > >anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:   
   > ..   
   > >My 66000 CMP is signless--it compares two integer registers and delivers   
   > >a bit vector of all possible comparisons {2 equality, 4 signed, 4 unsigned,   
   > >4 range checks, [and in FP land 10-bits are the class of the RS1 operand]}   
   >   
   > With an 88000-style compare and a result register of 64 bits, you can   
   > spend 14 bits on 64-bit comparison, 14 bits on 32-bit comparison, 14   
   > bits on 16-bit comparison, and 14 bits on 8-bit comparison, and still   
   > have 8 bits left.  What is a "range check" and why does it take 4   
   > bits?   
      
   CIN 0 <= Reg <  Max   
   FIN 0 <  Reg <= Max   
   RIN 0 <  Reg <  Max   
   SIN 0 <= Reg <= Max   
      
   >   
   > >> It is certainly part of the way towards my idea of having sign- and   
   > >> zero-extended 32-bit operands for every operand of every instruction.   
   > >   
   > >Unnecessary if the integer calculations deliver properly range-limited   
   > >64-bit results.   
   >   
   > Sign- or zero extension will still be necessary for things like   
   >   
   > long a=...   
   > int b=a;   
   > .. c[b];   
      
   The movement of long to int will 'smash' out extraneous significance.   
   As written: b has range [-2G..+2G] and the register holding b's value   
   will too.   
      
   The important property is that registers contain 64-bits and the value   
   in the register is range-limited to the calculated (or LDed) result.   
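   As a rough C model of that property (a sketch only; the function name is
   hypothetical and a 64-bit register is modelled as int64_t), narrowing a
   long to an int keeps the low 32 bits and sign-extends them, so the
   64-bit value is already within int range and can index directly:

```c
#include <assert.h>
#include <stdint.h>

/* Model of "int b = a;" on a 64-bit register: keep the low 32 bits of a
   and sign-extend them back to 64 bits. Written with unsigned arithmetic
   so the result does not rely on implementation-defined conversions. */
int64_t narrow_to_int(int64_t a)
{
    uint32_t low = (uint32_t)(uint64_t)a;           /* drop the upper 32 bits */
    return (int64_t)(low ^ 0x80000000u)             /* flip the sign bit ...  */
         - (int64_t)0x80000000;                     /* ... and re-bias        */
}
```

   After this step the register value lies in [-2^31, 2^31-1], so a later
   c[b] needs no further extension instruction.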
      
   > With the extension in the operands, you do not need any extension   
   > instructions, not even for division, right-shift etc.   
   >   
   > The question, however, is if the extensions occur often enough to   
   > merit such features.  I lean towards the SPARC/PowerPC/My 66000-v1   
   > approach here.   
      
   I did too, until conversations with an LLVM compiler writer.   
   GNUPLOT seems to be a banner application wrt range-limited calculations.   
      
   > >> It would be interesting to see how many sign-extensions and   
   > >> zero-extensions (whether explicit or implicitly part of the   
   > >> instruction) are executed in code that is generated from various C   
   > >> sources (with and without -fwrapv).   
   > >   
   > >In GNUPLOT it is just over 4% of instruction count for 64-bit-only   
   > >integer calculations.   
   >   
   > Now what if you had a calling convention with garbage-extension?  A   
   > number of extensions in your examples would go away.   
      
   Not many: few are at ABI boundaries, and most of those are dealt with   
   when moving arguments to preserved registers. So you could send HoBs   
   (high-order bits) that are never observed, since the   
   MOV Rpreserved,Rargument gets changed into a   
   SR[AL] Rpreserved,Rargument<32:0> at no space or time cost.   
      
   > >Counted for() loops are somewhat special in that it is quite easy to   
   > >determine that the loop index never exceeds the range-limit of the   
   > >container.   
   >   
   > There have been enough cases where such reasoning led to "optimizing"   
   > code into an infinite loop and other fallout of adversarial compilers.   
   >   
   > >>                      If n is unsigned, you can also choose unsigned,   
   > >> but then this code will be slow on RV64 (and MIPS64 and SPARC V9 and   
   > >> PowerPC64 and Alpha).   
   > >   
   > >Example please !?!   
   >   
   > With a slightly different loop:   
   >   
   > long foo(long a[], unsigned l, unsigned h)   
   > {   
   >   unsigned i;          // <---this variable should be uint64_t   
   >   long r=0;   
   >   for (i=l; i!=h; i++)   
   >     r+=a[i];   
   >   return r;   
   > }   
   >   
   > gcc-10 -O3 produces on RV64G:   
   >   
   > 0000000000000000 <foo>:   
   >    0:   872a                    mv      a4,a0   
   >    2:   4501                    li      a0,0   
   >    4:   00c58c63                beq     a1,a2,1c <.L4>   
   >   
   > 0000000000000008 <.L3>:   
   >    8:   02059793                slli    a5,a1,0x20 // eliminate HoBs   
   >    c:   83f5                    srli    a5,a5,0x1d // does not have scaled indexing   
   >    e:   97ba                    add     a5,a5,a4   // does not have indexing   
   >   10:   639c                    ld      a5,0(a5)   // all that work   
   >   12:   2585                    addiw   a1,a1,1   
   >   14:   953e                    add     a0,a0,a5   // loop induction   
   >   16:   feb619e3                bne     a2,a1,8 <.L3>   
   >   1a:   8082                    ret   
   >   
   > 000000000000001c <.L4>:   
   >   1c:   8082                    ret   
   >   
   foo:   
        MOV   R4,#0   
        MOV   R5,#1   
        VEC   R7,{}   
        LDD   R6,[R1,R5<<3]   
        ADD   R4,R4,R6   
        LOOP2 NE,R5,#1,R3   
        MOV   R1,R4   
        RET   
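   To make the cost in the quoted RV64G loop concrete, here is a small C
   model of its two index-related steps. It is a sketch under assumptions:
   the function names are hypothetical and a register is modelled as
   uint64_t. RV64 keeps 32-bit values sign-extended in registers, so the
   unsigned index has to be zero-extended and scaled before each access,
   and the increment has to wrap at 2^32:

```c
#include <assert.h>
#include <stdint.h>

/* Models "slli a5,a1,32 ; srli a5,a5,29": discard the upper 32 bits of
   the register, then leave the zero-extended index multiplied by 8. */
uint64_t index_to_byte_offset(uint64_t reg)
{
    return (reg << 32) >> 29;
}

/* Models "addiw a1,a1,1": add in 32 bits, then sign-extend the 32-bit
   result to 64 bits, as RV64 does for all *W instructions. */
uint64_t addiw_one(uint64_t reg)
{
    uint32_t sum = (uint32_t)reg + 1u;              /* wraps modulo 2^32 */
    int64_t  ext = (int64_t)(sum ^ 0x80000000u) - (int64_t)0x80000000;
    return (uint64_t)ext;
}
```

   Because i may legitimately wrap from 0xFFFFFFFF to 0 when l > h, the
   compiler cannot simply widen i to 64 bits, which is why the shift pair
   stays inside the loop.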
   >   
   >   
   > >   
   > >> If n is int, you can also choose int, and there is actually enough   
   > >> information here to make the code efficient (even with -fwrapv),   
   > >> because in this code int overflow really cannot happen,   
   > >   
   > >Consider the case where n is int64_t or uint64_t !?!   
   >   
   > Then the first condition does not hold on I32LP64.   
   >   
   > >Consider the C-preprocessor with::   
   > ># define int (short int) // !!   
   > >in scope.   
   >   
   > Then the compiler will see short int, and generate code accordingly.   
   > What's your point?   
   >   
   > - anton   
      