

   comp.lang.c      Meh, in C you gotta define EVERYTHING      243,242 messages   


   Message 242,690 of 243,242   
   Waldek Hebisch to highcrew   
   Re: On Undefined Behavior   
   02 Jan 26 05:53:13   
   
   From: antispam@fricas.org   
      
   highcrew  wrote:   
   > Hello,   
   >   
   > While I consider myself reasonably good as C programmer, I still   
   > have difficulties in understanding undefined behavior.   
   > I wonder if anyone in this NG could help me.   
   >   
   > Let's take an example.  There's plenty here:   
   > https://en.cppreference.com/w/c/language/behavior.html   
   > So let's focus on https://godbolt.org/z/48bn19Tsb   
   >   
   > For the lazy, I report it here:   
   >   
   >   int table[4] = {0};   
   >   int exists_in_table(int v)   
   >   {   
   >       // return true in one of the first 4 iterations   
   >       // or UB due to out-of-bounds access   
   >       for (int i = 0; i <= 4; i++) {   
   >           if (table[i] == v) return 1;   
   >       }   
   >       return 0;   
   >   }   
   >   
   > This is compiled (with no warning whatsoever) into:   
   >   
   >   exists_in_table:   
   >           mov     eax, 1   
   >           ret   
   >   table:   
   >           .zero   16   
   >   
   >   
   > Well, this is *obviously* wrong. And sure, so is the original code,   
   > but I find it hard to think that the compiler isn't able to notice it,   
   > given that it is even "exploiting" it to produce very efficient code.   
   >   
   > I understand the formalism: the resulting assembly is formally   
   > "correct", in that UB implies that anything can happen.   
   > Yet I can't think of any situation where the resulting assembly   
   > could be considered sensible.  The compiled function will   
   > basically return 1 for any input, and the final program will be   
   > buggy.   
      
   You do not get the formalism: the compiler applies a lot of
   transformations which are supposed to be correct for programs obeying
   the C rules.  However, the compiler does not understand the program.
   It may notice details that you missed, but it acts essentially blindly
   on the information it has.  And most transformations have only limited
   info (storing everything the compiler infers would take a lot of
   memory, and searching all that info would take a lot of time).
      
   The code that you see is the result of many transformations, possibly
   hundreds or more.  The result is a consequence of all the steps,
   but it can be hard to isolate a single "silly" step.
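
As a rough illustration only (the real passes inside gcc may work quite differently), one plausible chain of reasoning is: reading table[4] is undefined, a correct program never does that, so the loop must leave through "return 1" in one of the first four iterations, so the function always returns 1.  The sketch below puts a corrected loop next to a function that does what the posted assembly does.  The names exists_in_table_fixed and exists_in_table_as_compiled are made up for this example.

```c
#include <stddef.h>

int table[4] = {0};

/* Corrected version: the bound is the element count, so every
   access table[i] stays inside the array. */
int exists_in_table_fixed(int v)
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (table[i] == v) return 1;
    }
    return 0;
}

/* What the posted assembly amounts to: "mov eax, 1; ret".
   Under the assumption that table[4] is never read, the original
   loop can only exit through "return 1", so nothing else remains. */
int exists_in_table_as_compiled(int v)
{
    (void)v;   /* the argument is no longer looked at */
    return 1;
}
```

With the corrected bound the "return 0" path is reachable again, so the compiler has to keep the comparisons.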
      
   > Wouldn't it be more sensible to have a compilation error, or   
   > at least a warning?  The compiler will be happy even with -Wall -Wextra   
   > -Werror.   
      
   This case looks reasonably easy: when compiling 'exists_in_table'
   the compiler had the declaration of 'table' and knew its size is 4.
   The compiler probably generated its output after noticing that
   the loop would produce an out-of-bounds reference.  So with some
   extra effort it should be possible to generate a diagnostic.
   But in general, instead of an array you may have a pointer without
   bound information.  Or the upper bound may be variable.  As James
   wrote, for such reasons the C standard does not require a diagnostic.
   Also, in the past gcc and clang did not generate diagnostics
   in such situations.  gcc is a very complex beast and adding
   diagnostics now may require nontrivial effort.
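
To make the general case concrete, here is a minimal sketch in which the function sees only a pointer and a count.  The name exists_in and its parameters are hypothetical, not from the original example.  Compiled on its own, nothing in this function tells the compiler how many elements p really points to, so it cannot tell whether a given n is too large.

```c
#include <stddef.h>

/* Hypothetical helper: search the first n elements starting at p.
   The array size is not visible here; whether p[i] is in bounds
   depends entirely on what the caller passes. */
int exists_in(const int *p, size_t n, int v)
{
    for (size_t i = 0; i < n; i++) {
        if (p[i] == v) return 1;
    }
    return 0;
}
```

A caller with a 4-element array is fine passing n = 4, while passing n = 5 would be the same bug as in the original loop, only now located in a different place from the loop itself.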
      
   BTW: I expect that eventually gcc will warn.  Conceptually,
   various string functions can overflow buffers in
   similar ways.  In the past such buffer overflows just generated
   some (possibly "working") code.  Now most such uses produce
   warnings.  In fact, this problem looks like an outlier.
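
For comparison, a small sketch of the string case.  Copying a long literal into a 4-byte buffer with strcpy is the kind of overflow meant here; as far as I know, recent gcc versions report such cases through the -Wstringop-overflow family of warnings, but whether a given call is caught depends on the gcc version and optimization level.  The helper copy_bounded below is a made-up name showing one way to avoid the overflow, not a standard function.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, which has room for cap bytes
   including the terminating '\0'.  Truncates instead of overflowing.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t cap, const char *src)
{
    if (cap == 0) return 0;              /* no room even for '\0' */
    int n = snprintf(dst, cap, "%s", src);
    return n >= 0 && (size_t)n < cap;    /* n is the untruncated length */
}
```

The unbounded variant would be something like "char buf[4]; strcpy(buf, "too long");", which writes 9 bytes into a 4-byte buffer.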
      
   > There's plenty of documentation, articles and presentations that   
   > explain how this can make very efficient code... but nothing   
   > will answer this question: do I really want to be efficiently   
   > wrong?   
      
   By using C you implicitly gave "yes" as an answer.   
      
   > I mean, yes I would find the problem, thanks to my 100% coverage   
   > unit testing, but couldn't the compiler give me a hint?   
      
   Since it gave no hint, it probably could not.  In cases when it
   can, it warns (at least when you activate warnings).   
      
   > Could someone drive me into this reasoning? I know there is a lot of   
   > thinking behind it, yet everything seems to me very incorrect!   
   > I'm in deep cognitive dissonance here! :) Help!   
   >   
      
   --   
                                 Waldek Hebisch   
      


