comp.arch
Anton Ertl to Dan Cross
Re: System calls (was: VAX)
13 Aug 25 21:23:34
   From: anton@mips.complang.tuwien.ac.at   
      
   cross@spitfire.i.gajendra.net (Dan Cross) writes:   
   >In article <2025Aug13.181010@mips.complang.tuwien.ac.at>,   
   >Anton Ertl  wrote:   
   >>For lseek(2):   
   >>   
   >>| Upon successful completion, lseek() returns the resulting offset   
   >>| location as measured in bytes from the beginning of the file.   
   >>   
   >>Given that off_t is signed, lseek(2) can only return positive values.   
   >   
   >This is incorrect; or rather, it's accidentally correct now, but   
   >was not previously.  The 1990 POSIX standard did not explicitly   
   >forbid a file that was so large that the offset couldn't   
   >overflow, hence why in 1990 POSIX you have to be careful about   
   >error handling when using `lseek`.   
   >   
   >It is true that POSIX 2024 _does_ prohibit seeking so far that   
   >the offset would become negative, however.   
      
   I don't think that this is accidental.  In 1990 signed overflow had
   reliable behaviour on common 2s-complement hardware with the C   
   compilers of the day.  Nowadays the exotic hardware where this would   
   not work that way has almost completely died out (and C is not used on   
   the remaining exotic hardware), but now compilers sometimes do funny   
   things on integer overflow, so it is better not to go there or
   anywhere near it.
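A minimal sketch of staying away from signed overflow when computing file offsets: test against the limits before adding, so the overflowing addition is never executed. The helper name offset_add_checked() is hypothetical, and int64_t stands in for a 64-bit off_t, which is an assumption about the platform.

```c
#include <stdint.h>

/* Hypothetical helper: computes base + delta without ever executing an
 * overflowing signed addition.  int64_t models a 64-bit off_t (assumption).
 * Returns 0 and stores the sum in *out on success, -1 if the sum would
 * not fit. */
int offset_add_checked(int64_t base, int64_t delta, int64_t *out)
{
    /* Adding a positive delta: would overflow if base is above MAX - delta. */
    if (delta > 0 && base > INT64_MAX - delta)
        return -1;
    /* Adding a negative delta: would overflow if base is below MIN - delta. */
    if (delta < 0 && base < INT64_MIN - delta)
        return -1;
    *out = base + delta;
    return 0;
}
```

The comparisons themselves cannot overflow, because INT64_MAX - delta is only evaluated for positive delta and INT64_MIN - delta only for negative delta.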
      
   >But, POSIX 2024   
   >(still!!) supports multiple definitions of `off_t` for multiple   
   >environments, in which overflow is potentially unavoidable.   
      
   POSIX also has the EOVERFLOW error for exactly that case.   
      
   Bottom line: The off_t returned by a successful lseek(2) is signed
   and never negative.
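As an illustration of what this means for a caller: because a successful lseek() never yields a negative offset, any negative return can be treated as failure, with errno saying why (EOVERFLOW being the case where the result does not fit in off_t). This is only a sketch; file_size() and report_seek_error() are hypothetical helper names, not anything from POSIX.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the size in bytes of the file behind fd,
 * or -1 with errno set by lseek().  The current file position is restored. */
off_t file_size(int fd)
{
    off_t here = lseek(fd, 0, SEEK_CUR);   /* remember where we are */
    if (here < 0)
        return -1;
    off_t end = lseek(fd, 0, SEEK_END);    /* offset of end of file */
    if (end < 0)
        return -1;
    if (lseek(fd, here, SEEK_SET) < 0)     /* go back */
        return -1;
    return end;
}

/* Hypothetical helper: prints why the last seek failed. */
void report_seek_error(const char *what)
{
    if (errno == EOVERFLOW)
        fprintf(stderr, "%s: offset does not fit in off_t\n", what);
    else
        perror(what);
}
```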
      
   >>For mmap(2):   
   >>   
   >>| On success, mmap() returns a pointer to the mapped area.   
   >>   
   >>So it's up to the kernel which user-level addresses it returns.  E.g.,   
   >>32-bit Linux originally only produced user-level addresses below 2GB.   
   >>When memories grew larger, on some architectures (e.g., i386) Linux   
   >>increased that to 3GB.   
   >   
   >The point is that the programmer shouldn't have to care.   
      
   True, but completely misses the point.   
      
   >>Sure, but system calls are first introduced in real kernels using the   
   >>actual system call interface, and are limited by that interface.  And   
   >>that interface is remarkably similar between the early days of Unix   
   >>and recent Linux kernels for various architectures.   
   >   
   >Not precisely.  On x86_64, for example, some Unixes use a flag   
   >bit to determine whether the system call failed, and return   
   >(positive) errno values; Linux returns negative numbers to   
   >indicate errors, and constrains those to values between -4095   
   >and -1.   
   >   
   >Presumably that specific set of values is constrained by `mmap`:   
   >assuming a minimum 4KiB page size, the last architecturally   
   >valid address where a page _could_ be mapped is equivalent to   
   >-4096 and the first is 0.  If they did not have that constraint,   
   >they'd have to treat `mmap` specially in the system call path.   
      
   I am pretty sure that in the old times, Linux-i386 indicated failure   
   by returning a value with the MSB set, and the wrapper just checked   
   whether the return value was negative.  And for mmap() that worked   
   because user-mode addresses were all below 2GB.  Addresses further up
   were reserved for the kernel.
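To make the difference between the two conventions concrete, here is a small model of both checks on a 32-bit return register. It is a sketch, not code from any kernel or libc: the function names are made up, and uint32_t stands in for the raw i386 register value.

```c
#include <stdint.h>

/* Old-style check as described above: any raw return value with the
 * most significant bit set counts as a failure. */
int is_error_msb(uint32_t raw)
{
    return (raw & 0x80000000u) != 0;
}

/* Range check: only the raw values corresponding to -4095..-1 count as
 * failures, so addresses up to the last 4KiB page stay usable. */
int is_error_range(uint32_t raw)
{
    return raw >= (uint32_t)-4095;   /* 0xfffff001 .. 0xffffffff */
}

/* Recovers the errno value from a raw return value, or 0 if the value
 * is not in the error range. */
int raw_to_errno(uint32_t raw)
{
    return is_error_range(raw) ? (int)(uint32_t)(0u - raw) : 0;
}
```

With the mmap2() result 0xf7f64000 from the strace output in this post, is_error_msb() reports a failure while is_error_range() does not, which is why the MSB check only works as long as user-mode addresses stay below 2GB.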
      
   >>I wonder how the kernel is informed that it can now return more   
   >>addresses from mmap().   
   >   
   >Assuming you mean the Linux kernel, when it loads an ELF   
   >executable, the binary image itself is "branded" with an ABI   
   >type that it can use to make that determination.   
      
   I have checked that with binaries compiled in 2003 and 2000:   
      
   -rwxr-xr-x 1 root root 44660 Sep 26  2000 /usr/local/bin/gforth-0.5.0*   
   -rwxr-xr-x 1 root root 92352 Sep  7  2003 /usr/local/bin/gforth-0.6.2*   
      
   [~:160080] file /usr/local/bin/gforth-0.5.0
   /usr/local/bin/gforth-0.5.0: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, stripped
   [~:160081] file /usr/local/bin/gforth-0.6.2
   /usr/local/bin/gforth-0.6.2: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.0.0, stripped
      
   So there is actually a difference between these two.  However, if I   
   just strace them as they are now, they both happily produce very high   
   addresses with mmap, e.g.,   
      
   mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xf7f64000
      
   I don't know what the difference is between "for GNU/Linux 2.0.0" and   
   not having that, but the addresses produced by mmap() seem unaffected.   
      
   However, by calling the binaries with setarch -L, mmap() returns only   
   addresses < 2GB in all calls I have looked at.  I guess if I had   
   statically linked binaries, i.e., with old system call wrappers, I   
   would have to use   
      
   setarch -L    
      
   to make it work properly with mmap().  Or maybe Linux is smart enough   
   to do it by itself when it encounters a statically-linked old binary.   
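For anyone who wants to repeat the experiment without strace, a probe along the following lines could be compiled as a 32-bit binary and run once directly and once under setarch -L. It is a sketch under assumptions: map_probe() and below_2gb() are made-up names, and the 2GB boundary is simply the one discussed in this thread. As far as I know, -L corresponds to the ADDR_COMPAT_LAYOUT personality flag, which takes effect when the program is exec'd, so the probe itself does not change any personality.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Returns nonzero if addr lies below the 2GB boundary, i.e. if a wrapper
 * that treats "MSB set" as failure would still accept it on i386. */
int below_2gb(uintptr_t addr)
{
    return addr < (uintptr_t)0x80000000u;
}

/* Hypothetical probe: maps len bytes of anonymous private memory and
 * returns the address chosen by the kernel, or NULL on failure. */
void *map_probe(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A caller would print the result of map_probe(8192) and below_2gb() on it, then compare the two runs.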
      
   - anton   
   --   
   'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'   
     Mitch Alsup,    
      