
   comp.arch      Apparently more than just beeps & boops      131,241 messages   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]

   Message 129,402 of 131,241   
   Dan Cross to david.brown@hesbynett.no   
   Re: System calls   
   14 Aug 25 21:44:42   
   
   From: cross@spitfire.i.gajendra.net   
      
   In article <107l5ju$k78a$1@dont-email.me>,   
   David Brown   wrote:   
   >On 14.08.2025 17:44, Dan Cross wrote:   
   >> In article ,   
   >> Scott Lurndal  wrote:   
   >>> Both Burroughs Large Systems (48-bit stack machine) and the   
   >>> Sperry 1100/2200 (36-bit) systems had (have, in emulation today)   
   >>> C compilers.   
   >>   
   >> Yup.  The 1100-series machines were (are) 1's complement.  Those   
   >> are the ones I usually think of when cursing that signed integer   
   >> overflow is UB in C.   
   >>   
   >> I don't think anyone is compiling C23 code for those machines,   
   >> but back in the late 1980s, they were still enough of a going   
   >> concern that they could influence the emerging C standard.  Not
   >> so much anymore.   
   >   
   >They would presumably have been part of the justification for supporting   
   >multiple signed integer formats at the time.   
      
   C90 doesn't have much to say about this at all, other than   
   saying that the actual representation and ranges of the integer   
   types are implementation defined (G.3.5 para 1).   
      
   C90 does say that, "The representations of integral types shall   
   define values by use of a pure binary numeration system" (sec   
   6.1.2.5).   
      
   C99 tightens this up and talks about 2's comp, 1's comp, and   
   sign/mag as being the permissible representations (J.3.5, para   
   1).   
      
   >UB on signed integer   
   >arithmetic overflow is a different matter altogether.   
      
   I disagree.   
      
   >> Regardless, signed integer overflow remains UB in the current C   
   >> standard, never mind definitionally following 2s complement
   >> semantics.  Usually this is done on the basis of performance   
   >> arguments: some seemingly-important loop optimizations can be   
   >> made if the compiler can assert that overflow Cannot Happen.   
   >   
   >The justification for "signed integer arithmetic overflow is UB" is in   
   >the C standards 6.5p5 under "Expressions" :   
      
   Not in ANSI/ISO 9899-1990.  In that revision of the standard,   
   sec 6.5 covers declarations.   
      
   >"""   
   >If an exceptional condition occurs during the evaluation of an   
   >expression (that is, if the result is not mathematically defined or not   
   >in the range of representable values for its type), the behavior is   
   >undefined.   
   >"""   
      
   In C90, this language appears in sec 6.3 para 5.  Note, however,   
   that they do not define what an exception _is_, only a few   
   things that _may_ cause one.  See below.   
      
   >It actually has absolutely nothing to do with signed integer   
   >representation, or machine hardware.   
      
   Consider this language from the (non-normative) example 4 in sec   
   5.1.2.3:   
      
   |On a machine in which overflows produce an exception and in   
   |which the range of values representable by an *int* is   
   |[-32768,+32767], the implementation cannot rewrite this   
   |expression as [continues with the specifics of the example]....   
      
   That seems pretty clear that they're thinking about machines   
   that actually generate a hardware trap of some kind on overflow.   
      
   >It doesn't even have much to do   
   >with integers at all.  It is simply that if the calculation can't give a   
   >correct answer, then the C standards don't say anything about the
   >results or effects.   
   >   
   >The point is that when the results of an integer computation are
   >too big, there is no way to get the correct answer in the types used.   
   >Two's complement wrapping is /not/ correct.  If you add two real-world   
   >positive integers, you don't get a negative integer.   
      
   Sorry, but I don't buy this argument as anything other than a   
   justification after the fact.  We're talking about history and   
   motivation here, not the behavior described in the standard.   
      
   In particular, C is a programming language for actual machines,   
   not a mathematical notation; the language is free to define the   
   behavior of arithmetic expressions in any way it chooses, though   
   one presumes it would do so in a way that makes sense for the   
   machines that it targets.  Thus, it could have formalized the   
   result of signed integer overflow to follow 2's complement   
   semantics had the committee so chosen, in which case the result   
   would not be "incorrect", it would be well-defined with respect   
   to the semantics of the language.  Java, for example, does this,   
   as do C11's (and later) atomic integer operations.  Indeed, the
   C99 rationale document makes frequent reference to two's
   complement, where overflow and modular behavior are frequently
   equivalent, as the common case.  But aside from the more
   recent atomics support, C _chose_ not to do this.
      
   Also, consider that _unsigned_ arithmetic is defined as having   
   wrap-around semantics similar to modular arithmetic, and thus   
   incapable of overflow.  But that's simply a fiction invented for   
   the abstract machine described informally in the standard: it   
   requires special handling on machines like the 1100 series,
   because those machines might trap on overflow.  The C committee   
   could just as well have said that the unsigned arithmetic   
   _could_ overflow and that the result was UB.   
      
   So why did C choose this way?  The only logical reason is that
   there were machines at the time where a) integer overflow
   caused machine exceptions, and b) the representation of signed
   integers was not well-defined, so that the actual value
   resulting from overflow could not be rigorously defined.  Given
   that C90 mandated a binary representation for integers, and so
   the representation of unsigned integers is basically common,
   there was no need to do that for unsigned arithmetic.
      
   >> And of course, even today, C still targets oddball platforms   
   >> like DSPs and custom chips, where assumptions about the ubiquity   
   >> of 2's comp may not hold.   
   >   
   >Modern C and C++ standards have dropped support for signed integer   
   >representation other than two's complement, because they are not in use   
   >in any modern hardware (including any DSP's) - at least, not for   
   >general-purpose integers.  Both committees have consistently voted to   
   >keep overflow as UB.   
      
   Yes.  As I said, performance is often the justification.   
      
   I'm not convinced that no such custom chips and/or DSPs are
   being manufactured today.  They may not be common, and their
   mere existence is certainly dumb and offensive, but that does
   not mean that they don't exist.  Note that the survey in, e.g.,
   https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm   
   only mentions _popular_ DSPs, not _all_ DSPs.   
      
   Of course, if such machines exist, I will certainly concede that   
   I doubt very much that anyone is targeting them with C code   
   written to a modern standard.   
      
   	- Dan C.   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   



(c) 1994,  bbs@darkrealms.ca