From: david.brown@hesbynett.no   
      
   On 14.08.2025 23:44, Dan Cross wrote:   
   > In article <107l5ju$k78a$1@dont-email.me>,   
   > David Brown wrote:   
   >> On 14.08.2025 17:44, Dan Cross wrote:   
   >>> In article ,   
   >>> Scott Lurndal wrote:   
   >>>> Both Burroughs Large Systems (48-bit stack machine) and the   
   >>>> Sperry 1100/2200 (36-bit) systems had (have, in emulation today)   
   >>>> C compilers.   
   >>>   
   >>> Yup. The 1100-series machines were (are) 1's complement. Those   
   >>> are the ones I usually think of when cursing that signed integer   
   >>> overflow is UB in C.   
   >>>   
   >>> I don't think anyone is compiling C23 code for those machines,   
   >>> but back in the late 1980s, they were still enough of a going   
   >>> concern that they could influence the emerging C standard. Not
   >>> so much anymore.   
   >>   
   >> They would presumably have been part of the justification for supporting   
   >> multiple signed integer formats at the time.   
   >   
   > C90 doesn't have much to say about this at all, other than   
   > saying that the actual representation and ranges of the integer   
   > types are implementation defined (G.3.5 para 1).   
   >   
   > C90 does say that, "The representations of integral types shall   
   > define values by use of a pure binary numeration system" (sec   
   > 6.1.2.5).   
   >   
   > C99 tightens this up and talks about 2's comp, 1's comp, and   
   > sign/mag as being the permissible representations (J.3.5, para   
   > 1).   
      
   Yes. Early C didn't go into the details, then C99 described the systems   
   that could realistically be used. And now in C23 only two's complement   
   is allowed.   
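
To make the representation differences concrete: under C99's three permitted
formats, the same value -1 has three different bit patterns. A small helper
(illustrative only - "bits_of" is not a standard function) can expose the
pattern on the machine at hand:

```c
#include <limits.h>
#include <string.h>

/* View the raw object representation of an int as an unsigned int.
   On a two's complement machine (the only format C23 allows),
   bits_of(-1) is all bits set, i.e. UINT_MAX; a ones' complement
   machine would give one less than that, and sign-magnitude would
   give the sign bit plus a magnitude of 1. */
static unsigned int bits_of(int x)
{
    unsigned int u;
    memcpy(&u, &x, sizeof u);   /* inspect the bits without conversion */
    return u;
}
```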
      
   >   
   >> UB on signed integer   
   >> arithmetic overflow is a different matter altogether.   
   >   
   > I disagree.   
   >   
      
   You have overflow when the mathematical result of an operation cannot be   
   expressed accurately in the type - regardless of the representation   
   format for the numbers. Your options, as a language designer or
   implementer, for handling the overflow are the same regardless of the
   representation. You can pick a fixed value to return, or saturate, or   
   invoke some kind of error handler mechanism, or return a "don't care"   
   unspecified value of the type, or perform a specified algorithm to get a   
   representable value (such as reduction modulo 2^n), or you can simply   
   say the program is broken if this happens (it is UB).   
      
   I don't see where the representation comes into it - overflow is a   
   matter of values and the ranges that can be stored in a type, not how   
   those values are stored in the bits of the data.   
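
Two of those options can be sketched in a few lines of C - a sketch only, not
what the C language itself does on signed overflow (__builtin_add_overflow is
a GCC/Clang extension, and the signed result of the unsigned round trip in
wrap_add is implementation-defined before C23, though mainstream compilers
wrap):

```c
#include <limits.h>

/* Option: saturate - clamp the result at INT_MAX / INT_MIN
   instead of overflowing. */
static int sat_add(int a, int b)
{
    int r;
    if (__builtin_add_overflow(a, b, &r))
        return (a > 0) ? INT_MAX : INT_MIN;
    return r;
}

/* Option: reduction modulo 2^N - do the arithmetic in unsigned,
   where wraparound is fully defined by the standard, then convert
   back to the signed type. */
static int wrap_add(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}
```

Neither of these depends in any way on how negative values are represented in
the bits - which is the point being made above.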
      
   >>> Regardless, signed integer overflow remains UB in the current C   
   >>> standard, nevermind definitionally following 2s complement   
   >>> semantics. Usually this is done on the basis of performance   
   >>> arguments: some seemingly-important loop optimizations can be   
   >>> made if the compiler can assert that overflow Cannot Happen.   
   >>   
   >> The justification for "signed integer arithmetic overflow is UB" is in   
   >> the C standards 6.5p5 under "Expressions" :   
   >   
   > Not in ANSI/ISO 9899-1990. In that revision of the standard,   
   > sec 6.5 covers declarations.   
   >   
   >> """   
   >> If an exceptional condition occurs during the evaluation of an   
   >> expression (that is, if the result is not mathematically defined or not   
   >> in the range of representable values for its type), the behavior is   
   >> undefined.   
   >> """   
   >   
   > In C90, this language appears in sec 6.3 para 5. Note, however,   
   > that they do not define what an exception _is_, only a few   
   > things that _may_ cause one. See below.   
   >   
      
   It's basically the same in C90 onwards, with just small changes to the   
   wording. And it /does/ define what is meant by an "exceptional   
   condition" (or just "exception" in C90) - that is done by the part in   
   parentheses.   
      
   >> It actually has absolutely nothing to do with signed integer   
   >> representation, or machine hardware.   
   >   
   > Consider this language from the (non-normative) example 4 in sec   
   > 5.1.2.3:   
   >   
   > |On a machine in which overflows produce an exception and in   
   > |which the range of values representable by an *int* is   
   > |[-32768,+32767], the implementation cannot rewrite this   
   > |expression as [continues with the specifics of the example]....   
   >   
   > That seems pretty clear that they're thinking about machines   
   > that actually generate a hardware trap of some kind on overflow.   
   >   
      
   They are thinking about that possibility, yes. In C90, the term   
   "exception" here was not clearly defined - and it is definitely not the   
   same as the term "exception" in 6.3p5. The wording was improved in C99   
   without changing the intended meaning - there the term in the paragraph   
   under "Expressions" is "exceptional condition" (defined in that   
   paragraph), while in the example in "Execution environments", it says   
   "On a machine in which overflows produce an explicit trap". (C11   
   further clarifies what "performs a trap" means.)   
      
   But this is about re-arrangements the compiler is allowed to make, or   
   barred from making - it can't make re-arrangements that would mean   
   execution failed when the direct execution of the code according to the   
   C abstract machine would have worked correctly (without ever having   
   encountered an "exceptional condition" or other UB). Representation is   
   not relevant here - there is nothing about two's complement, ones'   
   complement, sign-magnitude, or anything else. Even the machine hardware   
   is not actually particularly important, given that most processors   
   support non-trapping integer arithmetic instructions and for those that   
   don't have explicit trap instructions, a compiler could generate "jump   
   if overflow flag set" or similar instructions to emulate traps   
   reasonably efficiently. (Many compilers support that kind of thing as   
   an option to aid debugging.)   
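
The emulated trap described above might look roughly like this - not any
particular compiler's output, just a portable range test standing in for the
"jump if overflow flag set" instruction, with abort() playing the role of the
trap (options such as GCC's -ftrapv or UBSan's signed-overflow check arrange
this kind of thing automatically):

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Checked addition of the kind a compiler could emit to emulate a
   trapping machine.  The range test is evaluated before the add, so
   no signed overflow ever occurs in this function itself. */
static int trapping_add(int a, int b)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        fprintf(stderr, "integer overflow in addition\n");
        abort();                /* the emulated trap */
    }
    return a + b;
}
```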
      
      
   >> It doesn't even have much to do   
   >> with integers at all. It is simply that if the calculation can't give a   
   >> correct answer, then the C standards don't say anything about the
   >> results or effects.   
   >>   
   >> The point is that when the results of an integer computation are
   >> too big, there is no way to get the correct answer in the types used.   
   >> Two's complement wrapping is /not/ correct. If you add two real-world   
   >> positive integers, you don't get a negative integer.   
   >   
   > Sorry, but I don't buy this argument as anything other than a   
   > justification after the fact. We're talking about history and   
   > motivation here, not the behavior described in the standard.   
      
   It is a fair point that I am describing a rational and sensible reason   
   for UB on arithmetic overflow - and I do not know the motivation of the   
   early C language designers, compiler implementers, and authors of the   
   first C standard.   
      
   I do know, however, that the principle of "garbage in, garbage out" was   
      
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   