From: bc@freeuk.com   
      
   On 08/02/2026 19:21, Waldek Hebisch wrote:   
   > Bart wrote:   
   >> On 07/02/2026 22:48, Waldek Hebisch wrote:   
      
   >>> In SYS V convention argument is passed in exactly one place. It may   
   >>> be GPR, may be XMM register, may be on the stack. If you put right   
   >>> thing in RAX, then your arguments are valid regardless if the function   
   >>> is a vararg function or not.   
   >>   
   >> I had to go and check this, and you're right. SYS V does nothing special   
   >> when calling variadic functions.   
   >   
   > Well, there is special thing: RAX should contain number of SSE   
   > registers used for passing parameters. You do not need to set   
   > RAX for normal calls (at least on Linux, some other systems   
   > require it for all calls).   
      
   I looked out for that but don't remember seeing it on godbolt.org, and I
   think it was for SYS V.   
      
   But I tried it again, and AL is being set to some count, which appears   
   to be the total number of float arguments (and rereading your comment,   
   you say the same thing).   
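
   A minimal sketch of the kind of call I tried, for anyone who wants to
   repeat it. The names 'sum_doubles' and 'demo_sum' are made up for
   illustration, and the assembly in the comment is only what a compiler
   typically emits for SYS V x86-64, not a guaranteed sequence:

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical variadic function: sums 'count' double arguments. */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);   /* the callee must name the type */
    va_end(ap);
    return total;
}

/* For the call below, a SYS V x86-64 compiler typically emits
   something like:
       mov  edi, 3          ; fixed argument 'count'
       mov  eax, 3          ; AL = number of vector registers used
       call sum_doubles
   with the three doubles in xmm0..xmm2. The ABI only requires AL to
   be an upper bound on that number. */
double demo_sum(void)
{
    return sum_doubles(3, 1.0, 2.0, 4.0);
}
```

   Compiling 'demo_sum' on godbolt.org with a Linux x86-64 target should
   show the AL/EAX setup just before the call.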
      
   >   
   >> I guess that makes implementing the body of variadic functions harder,   
   >> since it doesn't know where to look for the n'th variadic argument   
   >> unless it knows the type.   
   >   
   > Well, if a function wants to do actual computation with an argument   
   > it should better know its type.   
      
   On Windows, it will know the location of the next vararg and can access   
   its value before it knows the type. The user-provided type (eg.
   'va_arg(p, int)') can simply do a type-punning cast on the value.
      
   All args: fixed, variadic-reg, variadic-pushed, will also all be in   
   consecutive stack slots, regardless of type (This is the real reason why   
   floats should be loaded to GPRs for variadics: entry code just needs to   
   spill those 4 GPRs, it anyway won't know the mix of types.)   
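
   As a rough model of that in portable C (this is not the real Windows
   'va_list'; 'slot_list', 'next_slot' and the conversion helpers are
   hypothetical names): every argument is one 8-byte slot, the next slot
   is found without knowing its type, and the type is only applied
   afterwards as a reinterpretation of the bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a Win64-style argument area: each argument,
   whatever its type, occupies one consecutive 8-byte slot. */
typedef struct { const uint64_t *next; } slot_list;

/* Fetch the raw 64-bit value; no type is needed to locate it. */
uint64_t next_slot(slot_list *p)
{
    return *p->next++;
}

/* Apply the user-provided type afterwards, as a type-pun on the bits. */
int slot_as_int(uint64_t raw)
{
    uint32_t low = (uint32_t)raw;      /* an int uses the low 32 bits */
    int32_t v;
    memcpy(&v, &low, sizeof v);
    return v;
}

double slot_as_double(uint64_t raw)
{
    double d;
    memcpy(&d, &raw, sizeof d);
    return d;
}

/* Helpers that build slots, standing in for the caller's side. */
uint64_t int_to_slot(int i)
{
    return (uint64_t)(uint32_t)i;
}

uint64_t double_to_slot(double d)
{
    uint64_t raw;
    memcpy(&raw, &d, sizeof raw);
    return raw;
}
```

   Under SYS V the same walk is not possible, because the location of
   the next argument depends on whether it is an integer or a float.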
      
   >> I started generating code for ARM64, but gave up because it was too hard   
   >> and not fun (the RISC processor turned out to be a LOT more complex than   
   >> the CISC x64!).   
   >   
   > Well, RISC processor means that compiler has to do work which is
   > frequently done by hardware on a CISC. Concerning arm32, most   
   > annoying for me was limited range of constants, especially limit   
   > on offsets that can be part of an instruction. With my current   
   > implementation that puts something like 2kB limit on size of local   
   > variables. And my generator mixes instructions and constant data   
   > (otherwise it could not access constant data using limited available   
   > offsets), which works but complicates code generator and probably
   > gives suboptimal performance.   
      
   There are a dozen annoying things like this on arm64. Even when you give   
   up and decide to load 64-bit constants from a memory pool, you find you   
   can't even directly access that pool as it has an absolute address. That   
   can involve first loading the page address (ie. minus lower 12 bits) to   
   R, then you have to use an address mode involving R and the lower 12   
   bits as an offset.   
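
   The split itself is simple arithmetic; a sketch in C, with made-up
   function names. In GNU assembler syntax the two steps are commonly
   written as an 'adrp' followed by an 'ldr' whose offset uses a ':lo12:'
   relocation, but the code below only models the address calculation:

```c
#include <assert.h>
#include <stdint.h>

/* Part that the first instruction puts in a register R:
   the address with its low 12 bits cleared (the 4 KiB page). */
uint64_t page_of(uint64_t addr)
{
    return addr & ~(uint64_t)0xFFF;
}

/* Part that goes in the offset field of the load instruction. */
uint32_t lo12_of(uint64_t addr)
{
    return (uint32_t)(addr & 0xFFF);
}

/* Model of the two-step access: R = page, then load from [R + lo12]. */
uint64_t load_via_page(const uint64_t *pool_entry)
{
    uintptr_t addr = (uintptr_t)pool_entry;
    uintptr_t r = (uintptr_t)page_of(addr);          /* step 1 */
    const uint64_t *p =
        (const uint64_t *)(r + lo12_of(addr));       /* step 2 */
    return *p;
}
```

   So a single 64-bit constant load costs two instructions plus the pool
   entry, before any other addressing limits come into play.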
      
   >> The last straw was precisely to do with the SYS V call-conventions, and   
   >> I hadn't even gotten to variadic arguments yet, nor to structs passed   
   >> by-value, where the rules are labyrinthine.   
   >   
   > My low-level code only handles scalar arguments. That includes pointer   
   > to structures, but not structures passed by value. Structures passed by   
   > value could be handled by higher-level code, but up to now there was   
   > no need to do this.   
   >   
   > BTW, my amd64 code is assembler, so off-topic here, but arm32 code   
   > is mostly C. I use two helper structures:   
   >   
   > struct registers_buffer {
   >     int i_reg[4];
   >     union {double d; struct {float sl; float sh;} sf2;} f_reg[8];
   > };
   >   
   > typedef struct registers_buffer reg_buff;   
   >   
   > typedef struct arg_state { int ni; int sfi; int dfi; int si;} arg_state;   
   >   
   > C code fills 'reg_buff' with values and later low-level assembly   
   > copies values from the buffer to registers. I allocate enough space on   
   > the stack so that C code can write to the stack without risk of   
   > stack overflow.   
   >   
   > There are 3 helper routines:   
      
      
   This looks pretty complicated, but what is it for: is it still to do
   with variadic functions, or is it to do with the LIBFFI problem?
      