
Forums before death by AOL, social media and spammers... "We can't have nice things"

   comp.lang.c      Meh, in C you gotta define EVERYTHING      243,242 messages   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]

   Message 241,377 of 243,242   
   David Brown to Kaz Kylheku   
   Re: signed vs unsigned and gcc -Wsign-co   
   21 Oct 25 12:42:20   
   
   From: david.brown@hesbynett.no   
      
   On 20/10/2025 22:09, Kaz Kylheku wrote:   
   > On 2025-10-20, David Brown  wrote:   
   >> On 20/10/2025 17:03, pozz wrote:   
   >>> After many years programming in C language, I'm always unsure if it is   
   >>> safer to use signed int or unsigned int.   
   >>>   
   >>> Of course there are situations where signed or unsigned is clearly   
   >>> better. For example, if the values could assume negative values, signed   
   >>> int is the only solution. If you are manipulating single bits (&, |, ^,   
   >>> <<, >>), unsigned ints are your friends.   
   >>>   
   >>> What about other situations? For example, what do you use for the "i"   
   >>> loop variable?   
   >>>   
   >>> I recently activated gcc -Wsign-conversion option on a codebase and   
   >>> received a lot of warnings. I started to fix them, usually with
   >>> explicit casts. Is that the way to go, or is it better to avoid the
   >>> warnings from the start by choosing the right signed or unsigned type?
   >>>   
   >>>   
   >>   
   >> Signed and unsigned types are equally safe.  If you are sure you are   
   >> within the ranges you know will work for the types you use, your code is   
   >> safe.  If you are not sure, you are unsafe.   
   >   
   > Safe generally means that the language somehow protects from harm, not   
   > that you protect yourself.   
      
   No - "safe" means lower risk of harm, at least in /my/ book.  It doesn't   
   matter if it is something /you/ do, or something the language does, or   
   something the tools do.  (Ideally, of course, you want these all working   
   together.)   
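A minimal sketch of the kind of warning the original question describes (the function and names here are invented for illustration): with `int i`, the comparison `i < len` implicitly converts `i` to `size_t`, which is what gcc's -Wsign-conversion flags; declaring the index as `size_t` keeps the expression unsigned throughout and needs no cast.

```c
#include <stddef.h>

/* Hypothetical example of the loop-index question: a size_t index
   matches the size_t bound, so -Wsign-conversion stays quiet.  With
   "int i" instead, the comparison converts i to size_t and warns. */
size_t count_zeros(const int *buf, size_t len)
{
    size_t zeros = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == 0)
            zeros++;
    return zeros;
}
```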
      
   >   
   > Correct code operating on correct inputs, using unsafe constructs,   
   > is still called unsafe code.   
      
   It will be called "unsafe code" by Rust salesmen, but not by software   
   developers who work on safe code.  "Safe code" is code used safely, it   
   is not an inherent property of code constructs, types, or languages.   
   All code constructs are unsafe if used incorrectly, while clear and   
   well-understood code constructs are safe if used correctly.  (Of course   
   some languages, tools, and programming practices make it easier to write   
   safe code, or harder to write unsafe code, or easier to tell the   
   difference.)   
      
   >   
   > However using unsigned types due to them being safe is often poorly   
   > considered because if something goes wrong contrary to the programmer's   
   > intent, there likely will be undefined behavior somewhere.   
      
   Exactly.  Unsigned types are not somehow "safer" than signed types, just   
   because signed types have UB on overflow.  Don't overflow your signed   
   types, then you have no UB.  And if you overflow your unsigned types   
   without that being an intentional and understood part of your code, you   
   will at the very least get unexpected behaviour - a bug - and just like   
   UB, there are no limits to how bad that can get.   
      
   >   
   > E.g. an array underflow using an unsigned index will not produce   
   > integer overflow undefined behavior, but the access will go out of
   > bounds, which is undefined behavior.   
   >   
      
   Yes - bugs of all sorts often lead to UB sooner or later, even if the   
   behaviour of the code is defined by the C language standards up to that   
   point.   
      
   > There are bugs which play out without any undefined behavior:   
   > the program calculates something contrary to its requirements,   
   > but stays within the confines of the defined language.   
   >   
   > The odds that by using unsigned numbers you will get only that type of   
   > bug are low, and even if so, it is not a big comfort.   
   >   
   > Signed numbers behave more like mathematical integers, in cases   
   > when there is no overflow.   
   >   
   > If a, b and c are small, non-negative quantities, you might be tempted   
   > to make them unsigned. But if you do so, then you can no longer make   
   > this derivation of inequalities:   
   >   
   >    a + b > c   
   >   
   >        c > a - b   
   >   
   > Under the unsigned types, we cannot add -b to both sides of the   
   > inequality, preserving its truth value, even if all the operands   
   > are tiny numbers that fit into a single decimal digit!   
   >   
   > If b happens to be greater than a, we get a huge value on the right   
   > side that is now larger than c, not smaller.   
   >   
   > Gratuitous use of unsigned types impairs our ability to use
   > algebra to simplify code, due to the "cliff" at zero.
   >   
      
   Yes.   
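A compilable sketch of that cliff (the values are invented): the same comparison `c > a - b`, evaluated once over int and once over unsigned, disagrees as soon as b exceeds a.

```c
/* With a = 1, b = 3, c = 2: the signed a - b is -2, so c > a - b
   holds; the unsigned a - b wraps to UINT_MAX - 1, so it does not. */
int cmp_signed(int a, int b, int c)
{
    return c > a - b;
}

int cmp_unsigned(unsigned a, unsigned b, unsigned c)
{
    return c > a - b;   /* a - b wraps when b > a */
}
```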
      
   > This is a nuanced topic where there isn't a one-type-fits-all answer,   
   > but I gravitate toward signed; use of unsigned has to be justified in   
   > some way.   
   >   
   > When sizes are being calculated and they come from functions or   
   > operators that produce size_t, then that tends to dictate unsigned.   
   >   
   > If the quantities are large and can possibly overflow, there are   
   > situations in which unsigned makes that simpler.   
   >   
      
   But normally, use of a bigger integer type makes the code significantly   
   simpler and easier to get correct - and often more efficient.   
      
   > For instance, suppose a and b are unsigned and a + b can semantically
   > overflow (i.e. the result of the natural addition of a + b doesn't
   > fit into the type).  This is simpler to detect: you can just do the
   > addition, and then test:   
   >   
   >    c = a + b;   
   >   
   > when there is no overflow, it must be that (c >= a && c >= b)   
   > so if either (c < a) or (c < b) is true, it overflowed.   
   >   
      
   Or you use a bigger type and check simply and clearly for a result that
   is too big for your needs.  Far too often, programmers go through
   reasoning like this, figure out what they see as "optimal" source
   code, then leave it in the source with no explanation of what is
   going on.  Aim to write code that does what it looks like it does -
   adding the two values to give the mathematically correct result, then
   checking the range.  Otherwise, good luck to the maintainer who
   changes the expression to "c = a + b + 1;".
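A sketch of that bigger-type approach (the function name is invented), assuming 32-bit operands and a 64-bit intermediate:

```c
#include <stdint.h>
#include <stdbool.h>

/* Add in a 64-bit type, then range-check the mathematically correct
   sum.  (uint64_t)a + b cannot overflow, so the check reads exactly
   as it means: "is the true result too big for uint32_t?" */
bool add_u32_checked(uint32_t a, uint32_t b, uint32_t *out)
{
    uint64_t sum = (uint64_t)a + b;
    if (sum > UINT32_MAX)
        return false;          /* result does not fit the type */
    *out = (uint32_t)sum;
    return true;
}
```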
      
   Or, with C23, use ckd_add().  (Many compilers have extensions with the
   same effect, like __builtin_add_overflow, if you are happy using them.)
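A sketch using the GCC/Clang builtin just mentioned, which C23's ckd_add() standardises (the wrapper name is invented; C23 code could call ckd_add(out, a, b) from <stdckdint.h> directly):

```c
#include <stdbool.h>

/* Thin wrapper over the GCC/Clang builtin: stores the wrapped sum in
   *out and returns true when the mathematical result does not fit in
   int.  Requires a compiler providing __builtin_add_overflow. */
bool int_add_overflowed(int a, int b, int *out)
{
    return __builtin_add_overflow(a, b, out);
}
```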
      
   > This is significantly less verbose than a correct overflow test   
   > for signed addition, which has to avoid doing the actual addition,   
   > and has to be split into three cases: a and b have opposite   
   > sign (always okay), a and b are both positive, and a and b are   
   > both negative.   
   >   
      
   Sure.  But it is still significantly worse than using "long long int"   
   (or "int_least64_t" if you prefer), or using ckd_add().   
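For comparison, the verbose pre-check Kaz describes can be sketched like this; the opposite-sign case never overflows, so it falls through both bound checks:

```c
#include <limits.h>
#include <stdbool.h>

/* Decide whether a + b would overflow int WITHOUT doing the addition.
   If b > 0 the sum can only exceed INT_MAX; if b < 0 it can only go
   below INT_MIN.  Neither INT_MAX - b nor INT_MIN - b can itself
   overflow under those conditions. */
bool int_add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return false;   /* b == 0 never overflows */
}
```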
      
   There are things in C23 that are somewhat controversial, but I think the   
   checked integer operations are clearly a good standardisation of   
   existing compiler-specific practice.   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   



(c) 1994,  bbs@darkrealms.ca