|    comp.arch    |    Apparently more than just beeps & boops    |    131,241 messages    |
|    Message 129,409 of 131,241    |
|    David Brown to Dan Cross    |
|    Re: System calls (2/3)    |
|    15 Aug 25 17:49:53    |
[continued from previous message]

well established long before C was conceived. And programmers of that
time were familiar with the concept of functions and operations being
defined for appropriate inputs, and having no defined behaviour for
invalid inputs. C is full of other things where behaviour is left
undefined when no sensible correct answer can be specified, and that is
not just because the behaviour of different hardware could vary. It
seems perfectly reasonable to me to suppose that signed integer
arithmetic overflow is just another case, no different from
dereferencing an invalid pointer, dividing by zero, or any one of the
other UB's in the standards.

>
> In particular, C is a programming language for actual machines,
> not a mathematical notation; the language is free to define the
> behavior of arithmetic expressions in any way it chooses, though
> one presumes it would do so in a way that makes sense for the
> machines that it targets.

Yes, that is true. It is, however, also important to remember that it
was based on a general abstract machine, not any particular hardware,
and that the operations were intended to follow standard mathematics as
well as practically possible - operations and expressions in C were not
designed for any particular hardware. (Though some design choices were
biased by particular hardware.)

> Thus, it could have formalized the
> result of signed integer overflow to follow 2's complement
> semantics had the committee so chosen, in which case the result
> would not be "incorrect", it would be well-defined with respect
> to the semantics of the language. Java, for example, does this,
> as does C11 (and later) atomic integer operations.
> Indeed, the C99 rationale document makes frequent reference to
> twos complement, where overflow and modular behavior are
> frequently equivalent, being the common case. But aside from the
> more recent atomics support, C _chose_ not to do this.
>

It could have made signed integer overflow defined behaviour, but it did
not. The C standards committee have explicitly chosen not to do that,
even after deciding that two's complement is the only supported
representation for signed integers in C23 onwards. It is fine to have
two's complement representation, and fine to have modulo arithmetic in
some circumstances, while leaving other arithmetic overflow undefined.
Unsigned integer operations in C have always been defined as modulo
arithmetic - addition of unsigned values is a different operation from
addition of signed values. Having some modulo behaviour does not in any
way imply that signed arithmetic should be modulo.

In Java, the language designers decided that integer arithmetic
operations would be modulo operations. Wrapping therefore gives the
correct answer for those operations - it does not give the correct
answer for mathematical integer operations. And Java loses common
mathematical identities which C retains - such as the identity that
adding a positive integer to another integer will increase its value.
Something always has to be lost when approximating unbounded
mathematical integers in a bounded implementation - I think C made the
right choices here about what to keep and what to lose, and Java made
the wrong choices. (Others may of course have different opinions.)

In Zig, unsigned integer arithmetic overflow is also UB as these
operations are not defined as modulo.
I think that is a good natural choice too - but it is useful for a
language to have a way to do wrapping arithmetic on the occasions you
need it.

> Also, consider that _unsigned_ arithmetic is defined as having
> wrap-around semantics similar to modular arithmetic, and thus
> incapable of overflow.

Yes. Unsigned arithmetic operations are different operations from
signed arithmetic operations in C.

> But that's simply a fiction invented for
> the abstract machine described informally in the standard: it
> requires special handling on machines like the 1100 series,
> because those machines might trap on overflow. The C committee
> could just as well have said that the unsigned arithmetic
> _could_ overflow and that the result was UB.
>

They could have done that (as the Zig folk did).

> So why did C choose this way? The only logical reason is that
> there were machines at the time where a) integer overflow
> caused machine exceptions, and b) the representation of signed
> integers was not well-defined, so that the actual value
> resulting from overflow could not be rigorously defined. Given
> that C90 mandated a binary representation for integers and so
> the representation of unsigned integers is basically common,
> there was no need to do that for unsigned arithmetic.
>

Not at all. Usually when someone says "the only logical reason is...",
they really mean "the only logical reason /I/ can think of is...", or
"the only reason that /I/ can think of that /I/ think is logical is...".

For a language that can be used as a low-level systems language, it is
important to be able to do modulo arithmetic efficiently.
It is needed for a number of low-level tasks, including the
implementation of large arithmetic operations, handling timers,
counters, and other bits and pieces. So it was definitely a useful
thing to have in C.

For a language that can be used as a fast and efficient application
language, it must have a reasonable approximation to mathematical
integer arithmetic. Implementations should not be forced to have
behaviours beyond the mathematically sensible answers - if a calculation
can't be done correctly, there's no point in doing it. Giving nonsense
results does not help anyone - C programmers or toolchain implementers -
so the language should not specify any particular result. More sensible
defined overflow behaviour - saturation, error values, language
exceptions or traps, etc. - would be very inefficient on most hardware.
So UB is the best choice - and implementations can do something
different if they like.

Too many options make a language bigger - harder to implement, harder to
learn, harder to use. So it makes sense to have modulo arithmetic for
unsigned types, and normal arithmetic for signed types.

I am not claiming to know that this is the reasoning made by the C
language pioneers. But it is definitely an alternative logical reason
for C being the way it is.

>>> And of course, even today, C still targets oddball platforms
>>> like DSPs and custom chips, where assumptions about the ubiquity
>>> of 2's comp may not hold.
>>
>> Modern C and C++ standards have dropped support for signed integer
>> representation other than two's complement, because they are not in use
>> in any modern hardware (including any DSP's) - at least, not for
>> general-purpose integers. Both committees have consistently voted to
>> keep overflow as UB.
>
> Yes.
> As I said, performance is often the justification.
>
> I'm not convinced that there are no custom chips and/or DSPs
> that are not manufactured today. They may not be common, their
> mere existence is certainly dumb and offensive, but that does
> not mean that they don't exist. Note that the survey in, e.g.,
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm
> only mentions _popular_ DSPs, not _all_ DSPs.
>

I think you might have missed a few words in that paragraph, but I
believe I know what you intended. There are certainly DSPs and other

[continued in next message]

--- SoupGate-Win32 v1.05
 * Origin: you cannot sedate... all the things you hate (1:229/2)
(c) 1994, bbs@darkrealms.ca