

   comp.arch      Apparently more than just beeps & boops      131,241 messages   


   Message 130,184 of 131,241   
   Anton Ertl to Niklas Holsti   
   Re: branch splitting   
   07 Nov 25 08:08:42   
   
   From: anton@mips.complang.tuwien.ac.at   
      
   Niklas Holsti writes:
   >On 2025-11-06 10:46, Anton Ertl wrote:   
   >> Stephen Fuld  writes:   
   [Fortran's assigned goto]   
   >>> Because it could, and often did, make the code "unfollowable".  That is,   
   >>> you are reading the code, following it to try to figure out what it is   
   >>> doing and come to an assigned/alter goto, and you don't know where to go   
   >>> next.  The value was set some place else in the code, who knows where,   
   >>> and thus what value it was set to, and people/programmers just aren't   
   >>> used to being able to follow code like that.   
   >>   
   >> Take an example use: A VM interpreter.  With labels-as-values it looks   
   >> like this:   
   >>   
   >> void engine(char *source)   
   >> {   
   >>    void *insts[] = {&&add, &&load, &&store, ...};
   >>   
   >>    void **ip=compile_to_vm_code(source,insts);   
   >>   
   >>    goto *ip++;   
   >>   
   >>    add:   
   >>      ...   
   >>      goto *ip++;   
   >>    load:   
   >>      ...   
   >>      goto *ip++;   
   >>    store:   
   >>      ...   
   >>      goto *ip++;   
   >>    ...   
   >> }   
   >>   
   >> So of course you don't know where one of the gotos goes to, because   
   >> that depends on the VM code, which depends on the source code.   
   >   
   >I'm not sure if you are trolling or serious, but I will assume the latter.   
      
   This is the problem that Stephen Fuld mentioned, and it is actually
   a practical problem that I have experienced in some cases when
   debugging programs with indirect control flow, usually with various   
   forms of indirect calls, e.g., method calls.  I have not experienced   
   it for threaded-code interpreters that use labels-as-values (as   
   outlined above), because there I can always look at ip[0], ip[1]   
   etc. to see where the next executions of goto *ip will go.   
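
A minimal, runnable sketch of such a threaded-code interpreter may make the point concrete. It is not Gforth's code: the opcodes, the fixed-size buffers and the name run_vm() are made up for illustration, and it needs GNU C (GCC or Clang). Note that with ip declared as void **, the full GNU C spelling of the dispatch is goto **ip++, because goto * expects an expression of type void *. At any point in a debugger, ip[0], ip[1], ... are the labels that the next dispatches will jump to.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy stack VM (illustration only). */
enum { OP_PUSH1, OP_ADD, OP_HALT };

/* Translate opcodes into label addresses, then run them as threaded code.
   Returns the top of stack at halt, 0 if the stack is empty, or -1 if the
   program is rejected.  Requires the GNU C labels-as-values extension. */
long run_vm(const int *ops, size_t n)
{
    void *insts[] = { &&push1, &&add, &&halt };
    void *code[64];              /* the VM code: one label address per op */
    long stack[64];
    long *sp = stack;
    void **ip = code;

    if (n == 0 || n > 63)
        return -1;
    for (size_t i = 0; i < n; i++) {
        if (ops[i] < 0 || ops[i] > OP_HALT)
            return -1;
        code[i] = insts[ops[i]];
    }
    code[n] = &&halt;            /* guarantee termination */

    goto **ip++;                 /* ip[0], ip[1], ... show what comes next */

push1:
    *sp++ = 1;                   /* at most 63 pushes, so no overflow */
    goto **ip++;
add:
    if (sp - stack < 2)
        return -1;               /* stack underflow */
    sp--;
    sp[-1] += sp[0];
    goto **ip++;
halt:
    return sp > stack ? sp[-1] : 0;
}
```

For example, the program { OP_PUSH1, OP_PUSH1, OP_ADD, OP_HALT } pushes 1 twice, adds, and returns 2.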
      
   >The point is that without a deep analysis of the program you cannot be   
   >sure that these goto's actually go to one of the labels in the engine()   
   >function, and not to some other location in the code, perhaps in some   
   >other function. That analysis would have to discover that the   
   >compile_to_vm_code() function returns a pointer to a vector of addresses   
   >picked from the insts[] vector. That could need an analysis of many   
   >functions called from compile_to_vm_code(), the history of the whole   
   >program execution, and so on. NOT easy.   
      
   That has never been a problem in my experience, and I have been using
   labels-as-values since 1992.  Before gforth-0.6 (2003), all instances
   of &&label and all instances of goto *expr were in the same function,
   so the analysis would be trivial if GNU C were an Ada-like language
   in which labels have their own type that cannot be converted to or
   from other types, not even by casts.  As it is, Fortran's assigned
   goto uses integer numbers, and labels-as-values uses void *, so if
   anybody was really interested in performing such an analysis, they
   would have a lot of work to do.  But the decision to build these
   features on existing types makes it obvious that performing such an
   analysis was not intended.
      
   Interestingly, if somebody wanted to work in that direction, checking   
   at run-time that the target of a goto is inside the function that   
   contains the goto is easy and not particularly expensive.  With the   
   newfangled "control-flow integrity" features in hardware, you could   
   even check relatively cheaply that only &&label instances are targets   
   of goto *.   
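
As a rough source-level approximation of such a check (a compiler or CFI hardware would do it differently and more cheaply), one could validate each target against the table of &&label addresses before jumping. The sketch below is only an illustration under that assumption: is_known_label(), checked_dispatch(), the NEXT macro and the corrupt flag are hypothetical names, the linear search stands in for whatever a real implementation would use, and it needs GNU C.

```c
#include <stddef.h>

/* Returns 1 if target is one of the n addresses in table, else 0. */
static int is_known_label(void *target, void *const *table, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (table[i] == target)
            return 1;
    return 0;
}

/* Hypothetical demo: run a tiny threaded program, refusing to jump to any
   address that was not produced by &&label in this function.  Returns the
   number of "inc" instructions executed, or -1 if a target is rejected. */
long checked_dispatch(int corrupt)
{
    void *insts[] = { &&inc, &&done };
    void *code[]  = { &&inc, &&inc, &&done };
    void **ip = code;
    void *target;
    long count = 0;

    if (corrupt)
        code[1] = (void *)&count;   /* simulate a stray data pointer */

/* Fetch the next target, check it, and only then jump. */
#define NEXT do { \
        target = *ip++; \
        if (!is_known_label(target, insts, 2)) \
            return -1; \
        goto *target; \
    } while (0)

    NEXT;
inc:
    count++;
    NEXT;
done:
    return count;
#undef NEXT
}
```

With corrupt set to 0 the program runs to completion and returns 2; with corrupt set to 1 the second dispatch is rejected and the function returns -1 instead of jumping into data.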
      
   Ok, so what about gforth-0.6 (2003) and later?  First of all, they   
   contain two functions with goto * and &&label instances, so the   
   trivial analysis would no longer work.  Has there ever been any mixup   
   where a goto * jumped to a label in the other function?  Not that I   
   know of; if it happened, it would actually work, because the two   
   functions are identical apart from some code-space padding.   
      
   What's more relevant is that gforth-0.6 added code-copying dynamic   
   native code generation: It copies code snippets (using the addresses   
   gotten with &&label to determine where they start and where they end)   
   to some RWX data region, concatenating the snippets in this way,   
   resulting in a compiled program in the RWX region.  It then uses one   
   of the goto * in one of the functions to actually start executing this   
   dynamically-generated code.   
      
   This is probably outside of what Stallman had in mind for   
   labels-as-values, but fortunately Stallman did not try to limit what   
   can be done to what he had in mind, the way that many programming   
   language designers do, and the way that many people discussing   
   programming languages think.  This is a feature that Ritchie's C also   
   has, which cannot be said about the C of people who think that   
   "undefined behaviour" is enough justification to declare a program   
   "buggy".   
      
   - anton   
   --   
   'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'   
     Mitch Alsup,    
      


