From: anton@mips.complang.tuwien.ac.at   
      
   kegs@provalid.com (Kent Dickey) writes:   
   >In article <2025Oct4.121741@mips.complang.tuwien.ac.at>,   
   >Anton Ertl wrote:   
   >>MitchAlsup writes:   
   >>>int subroutine( int a, int b )   
   >>>{   
   >>> return a+b;   
   >>>}   
   ...   
   >>I tested this on AMD64, and did not find sign-extension in the caller,   
   >>neither with gcc-14 nor with clang-19; both produce the following code   
   >>for your example (with "subroutine" renamed into "subroutine1").   
   >>   
   >>0000000000000000 <subroutine1>:
   >> 0: 8d 04 37 lea (%rdi,%rsi,1),%eax   
   >> 3: c3 ret   
   ...   
   >AMD64 in hardware does 0 extension of 32-bit operations. From your   
   >example "lea (%rdi,%rsi,1),%eax" (AT&T notation, so %eax is the dest),   
   >the 64-bit register %rax will have 0's written into bits [63:32].   
   >So the AMD64 convention for 32-bit values in 64-bit registers is to   
   >zero-extend on writes. And to ignore the upper 32-bits on reads, so   
   >using a 64-bit register should use the %exx name.   
      
   Interesting. At some point I got the impression that LEA produces a   
   64-bit result, because it produces an address, but testing reveals   
   that LEA has a 32-bit zero-extended variant indeed.   
      
   >I agree with you that I32LP64 was a mistake, but it exists, and I   
   >think ARM64 did a good job handling it. It has all integer operations   
   >working on two sizes: 32-bit and 64-bit, and when writing a 32-bit result,   
   >it 0-extends the register value.   
   >   
   >You don't want "garbage extend" since you want a predictable answer.   
      
   Zero-extended for unsigned and sign-extended for int are certainly   
   more forgiving when some function is called without a prototype and   
   the actual type does not match the implied type (IIRC I once read about
   "Miranda prototypes", but a web search for that term only gives me Star
   Trek stuff).
      
   Zero-extending for int is less forgiving. Apparently by 2003 (when   
   AMD64 appeared) the use of prototypes was widespread enough that such   
   a calling convention was acceptable.   
      
   But once all the functions have correct prototypes, garbage-extension   
   is just as workable as other alternatives.   
      
   >Your choices for writing 32-bit results in a 64-bit register are thus   
   >sign-extend (not a good choice) or zero-extend (what almost everyone chose).   
      
   What makes you think that one is a better choice than the other?   
      
   The most obvious choices to me are:   
      
   Sign-extend int and zero-extend unsigned: That has the best chance at   
   the expected behaviour when the prototype is missing and would be   
   required.   
      
   If you rely on prototypes being present, you can take any choice,   
   including garbage-extension. Then you can use the full 64-bit   
   operation in many cases, and only insert sign or zero extension when a   
   conversion from 32-bit to 64-bit is needed (and that extension can be
   part of an instruction, as in ARM A64 addressing modes).   
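
The trade-off between these choices can be sketched in C by modelling a 64-bit register as a uint64_t. This is only an illustration under stated assumptions: all function names are hypothetical, the "junk" parameter stands in for leftover upper bits, and the narrowing casts assume the usual two's-complement wraparound that gcc and clang provide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: each returns the contents of a 64-bit register
   that holds the 32-bit value v under one convention. */
static uint64_t reg_sign_extended(uint32_t v)
{
    /* Bits 63..32 are copies of bit 31. */
    return (uint64_t)(int64_t)(int32_t)v;
}

static uint64_t reg_zero_extended(uint32_t v)
{
    /* Bits 63..32 are zero. */
    return (uint64_t)v;
}

static uint64_t reg_garbage_extended(uint32_t v, uint32_t junk)
{
    /* Bits 63..32 are whatever was left there; junk stands in for it. */
    return ((uint64_t)junk << 32) | v;
}

/* Callee compiled with the correct prototype: only the low 32 bits count. */
static int32_t callee_reads_int(uint64_t reg)
{
    return (int32_t)(uint32_t)reg;
}

/* Callee that expects a long although the caller passed an int (a mismatch
   that can go unnoticed without a prototype): the whole register counts. */
static int64_t callee_reads_long(uint64_t reg)
{
    return (int64_t)reg;
}

void check_conventions(void)
{
    uint32_t minus_one = 0xffffffffu;   /* bit pattern of (int)-1 */

    /* With matching types, every convention delivers -1. */
    assert(callee_reads_int(reg_sign_extended(minus_one)) == -1);
    assert(callee_reads_int(reg_zero_extended(minus_one)) == -1);
    assert(callee_reads_int(reg_garbage_extended(minus_one, 0x12345678u)) == -1);

    /* With the mismatch, only sign-extension still delivers -1. */
    assert(callee_reads_long(reg_sign_extended(minus_one)) == -1);
    assert(callee_reads_long(reg_zero_extended(minus_one)) == 4294967295LL);
    assert(callee_reads_long(reg_garbage_extended(minus_one, 0x12345678u)) != -1);
}
```

In this model the conventions differ only when the callee reads more bits than the caller defined, which is the missing-prototype case.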
      
   As for what "almost everyone chose", here's some data:   
      
   int            unsigned       ABI
   sign-extended  sign-extended  MIPS o64 and 64
   sign-extended  zero-extended  SPARC V9
   sign-extended  zero-extended  PowerPC64
   zero-extended  zero-extended  AMD64
   zero-extended  zero-extended  ARM A64
   sign-extended  sign-extended  RV64
      
   I determined this by looking at the code for   
      
   unsigned usubroutine( unsigned a, unsigned b )   
   {   
    return a+b;   
   }   
      
   int isubroutine( int a, int b )   
   {   
    return a+b;   
   }   
      
   The code on various architectures (as compiled with gcc -O) is:
      
   MIPS64 (gcc -mabi=64 -O and gcc -mabi=o64 -O):   
   0000000000000034 <usubroutine>:
    34: 03e00008 jr ra   
    38: 00851021 addu v0,a0,a1   
      
   000000000000003c <isubroutine>:
    3c: 03e00008 jr ra   
    40: 00851021 addu v0,a0,a1   
      
   SPARC V9:   
   0000000000000018 <usubroutine>:
    18: 9d e3 bf 50 save %sp, -176, %sp   
    1c: b0 06 00 19 add %i0, %i1, %i0   
    20: 81 cf e0 08 return %i7 + 8   
    24: 91 32 20 00 srl %o0, 0, %o0   
      
   0000000000000028 <isubroutine>:
    28: 9d e3 bf 50 save %sp, -176, %sp   
    2c: b0 06 00 19 add %i0, %i1, %i0   
    30: 81 cf e0 08 return %i7 + 8   
    34: 91 3a 20 00 sra %o0, 0, %o0   
      
   PowerPC64:   
   0000000000000030 <.usubroutine>:   
    30: 7c 63 22 14 add r3,r3,r4   
    34: 78 63 00 20 clrldi r3,r3,32   
    38: 4e 80 00 20 blr   
    ...   
      
   0000000000000048 <.isubroutine>:   
    48: 7c 63 22 14 add r3,r3,r4   
    4c: 7c 63 07 b4 extsw r3,r3   
    50: 4e 80 00 20 blr   
      
   >RISC-V is in another land, where they effectively have   
   >no 32-bit operations, but rather a convention that all 32-bit inputs   
   >must be sign-extended in a 64-bit register.   
      
   RISC-V has a number of sign-extending 32-bit instructions, and a   
   calling convention to go with it.   
      
   There seem to be the following options:   
      
   Have no 32-bit instructions, and insert sign-extension or   
   zero-extension instructions where necessary (or implicitly in all   
   operands, as I outlined earlier). SPARC V9 and PowerPC64 seem to take   
   this approach.   
      
   Have 32-bit instructions that sign-extend: MIPS64, Alpha, and RV64.   
      
   Have 32-bit instructions that zero-extend: AMD64 and ARM A64.   
      
   Have 32-bit instructions that sign-extend and 32-bit instructions that   
   zero-extend. No architecture that does that is known to me. It would   
   be a good match for the SPARC-V9 and PowerPC64 calling convention.   
      
   There is also one instruction set (ARM A64) that has special 32-bit   
   sign-extension and zero-extension forms for some operands.   
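
The first three options can be modelled roughly in C, again treating a register as a uint64_t. This is a sketch, not a description of any particular instruction: the function names are hypothetical, and the 32-bit adds are assumed to look only at the low halves of their inputs (real instructions may place requirements on the upper input bits).

```c
#include <assert.h>
#include <stdint.h>

/* Explicit extension steps, comparable to extsw / clrldi on PowerPC64
   or sra / srl by 0 on SPARC V9. */
static uint64_t sext32(uint64_t r)
{
    return (uint64_t)(int64_t)(int32_t)(uint32_t)r;
}

static uint64_t zext32(uint64_t r)
{
    return (uint64_t)(uint32_t)r;
}

/* Option 1: only a full-width add exists; extension is a separate step. */
static uint64_t add64(uint64_t a, uint64_t b)
{
    return a + b;
}

/* Option 2: a 32-bit add that sign-extends its result. */
static uint64_t add32_sign(uint64_t a, uint64_t b)
{
    return sext32(a + b);
}

/* Option 3: a 32-bit add that zero-extends its result. */
static uint64_t add32_zero(uint64_t a, uint64_t b)
{
    return zext32(a + b);
}

void check_add_options(void)
{
    /* Unsigned 0xffffffff + 2 wraps to 1 in 32 bits. */
    uint64_t a = 0xffffffffu, b = 2;
    assert(add64(a, b) == 0x100000001ull);   /* carry leaks into bit 32 */
    assert(zext32(add64(a, b)) == 1);        /* fixed by an explicit step */
    assert(add32_zero(a, b) == 1);           /* or by the instruction itself */
    assert(add32_sign(a, b) == 1);

    /* A negative int result: -2 + 1 = -1. */
    uint64_t m2 = sext32(0xfffffffeu);
    assert(add64(m2, 1) == 0xffffffffffffffffull);
    assert(add32_sign(m2, 1) == 0xffffffffffffffffull);
    assert(add32_zero(m2, 1) == 0xffffffffull);
}
```

The low 32 bits agree in every case; the options differ only in who is responsible for the upper half.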
      
   And you can then adapt the calling convention to match the instruction   
   set. For "no 32-bit instructions", garbage-extension seems to be the   
   cheapest approach to me, but I expect that when SPARC-V9 and PowerPC64   
   came on the market, there was enough C code with missing prototypes   
   around that they preferred a more forgiving calling convention.   
      
   >If you pick ILP64 for your ABI, then you will get rid of almost all of   
   >these zero- and sign-extensions of 32-bit C and C++ code. It will just   
   >work. If you pick I32LP64, then you should have a full suite of 32-bit   
   >operations and 64-bit operations, at least for all add, subtract, and   
   >compare operations.   
      
   For compare, divide, shift-right and rotate, you either first need to   
   sign/zero-extend the register, or you need 32-bit versions (possibly   
   both signed and unsigned).   
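
A small C model may make the difference concrete: with garbage in the upper half, a full-width add still yields the right low 32 bits, while a full-width compare or shift-right does not. The helper names and the junk values below are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t sext32(uint64_t r)
{
    return (uint64_t)(int64_t)(int32_t)(uint32_t)r;
}

static uint64_t zext32(uint64_t r)
{
    return (uint64_t)(uint32_t)r;
}

/* Put v in the low half of a register and junk in the high half. */
static uint64_t garbage(uint32_t v, uint32_t junk)
{
    return ((uint64_t)junk << 32) | v;
}

/* Full-width operations, as on a machine without 32-bit variants. */
static uint64_t add64(uint64_t a, uint64_t b)
{
    return a + b;
}

static int lt_signed64(uint64_t a, uint64_t b)
{
    return (int64_t)a < (int64_t)b;
}

static uint64_t shr_logical64(uint64_t a, unsigned n)
{
    return a >> n;
}

void check_garbage_sensitivity(void)
{
    uint64_t two   = garbage(2, 0xffffffffu);
    uint64_t one   = garbage(1, 0);
    uint64_t eight = garbage(8, 1);

    /* Addition: the low 32 bits are right whatever the high bits hold. */
    assert((uint32_t)add64(two, one) == 3);

    /* Signed compare: "2 < 1" comes out true on the raw registers ... */
    assert(lt_signed64(two, one) == 1);
    /* ... and false, as it should, after sign-extending both operands. */
    assert(lt_signed64(sext32(two), sext32(one)) == 0);

    /* Logical shift right: the junk bit moves into the low half ... */
    assert((uint32_t)shr_logical64(eight, 4) == 0x10000000u);
    /* ... unless the operand is zero-extended first. */
    assert((uint32_t)shr_logical64(zext32(eight), 4) == 0);
}
```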
      
   >And if you do I32LP64, your indexed addressing   
   >modes should have 3 types of indexed registers: 64-bit, 32-bit signed,   
   >and 32-bit unsigned. That worked well for ARM64.   
      
   It is certainly part of the way towards my idea of having sign- and   
   zero-extended 32-bit operands for every operand of every instruction.   
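
In C source, the three kinds of index register correspond to the three index types below. This is a sketch assuming an I32LP64 target; the function names are hypothetical, and whether a compiler actually folds the extension into the load depends on the compiler and options.

```c
#include <assert.h>

/* Three ways to index an array of long. On ARM A64 a compiler can fold
   the index extension into the load's register-offset addressing mode
   (LSL for a 64-bit index, SXTW or UXTW for a 32-bit one). */

/* 64-bit index: no extension needed. */
long load_index64(const long *p, long i)
{
    return p[i];
}

/* 32-bit signed index: must be sign-extended before scaling. */
long load_index_s32(const long *p, int i)
{
    return p[i];
}

/* 32-bit unsigned index: must be zero-extended before scaling. */
long load_index_u32(const long *p, unsigned i)
{
    return p[i];
}

void check_indexing(void)
{
    long a[5] = {10, 20, 30, 40, 50};

    assert(load_index64(a + 2, -1) == 20);    /* negative 64-bit index */
    assert(load_index_s32(a + 2, -2) == 10);  /* negative 32-bit index */
    assert(load_index_u32(a, 4u) == 50);      /* unsigned 32-bit index */
}
```

Compiling these three functions with -O for each target and counting the separate extension instructions would be one way to compare the instruction sets on this point.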
      
   It would be interesting to see how many sign-extensions and   
   zero-extensions (whether explicit or implicitly part of the   
   instruction) are executed in code that is generated from various C   
      
   [continued in next message]   
      