|    comp.lang.forth    |    Forth programmers eat a lot of Bratwurst    |    117,927 messages    |
|    Message 117,582 of 117,927    |
|    Ruvim to Anton Ertl    |
|    Re: 0 vs. translate-none    |
|    26 Sep 25 02:19:36    |
From: ruvim.pinka@gmail.com

On 2025-09-17 20:53, Anton Ertl wrote:
> This posting is a more general reflection about designing types in
> Forth; it just uses recognizers as example.
>
> The original proposal for recognizers had R:FAIL as the result of a
> recognizer that did not recognize the input. Later that was renamed
> to NOTFOUND; then there was a proposal where 0 would be used instead,
> and Bernd Paysan changed all the uses of NOTFOUND in Gforth to 0.
> Finally, on last Thursday the committee decided to go with
> TRANSLATE-NONE for that result.
>
> Bernd Paysan thought that it would be easy to change back to a non-0
> value for TRANSLATE-NONE, by looking at the patch that changed
> NOTFOUND to 0. However, in the meantime there has been more work
> done, so it's not so easy.
>
> E.g., there was a word
>
>   ?FOUND ( x -- x )
>
> that would throw -13 if x=0. This word was used both with the result
> of recognizers and with nt|0 or xt|0. Fortunately, in this case the
> cases were easy to recognize, and they are now addressed by two words:
> ?REC-FOUND (for recognizer results) and ?FOUND (for x|0).

A better name than `?rec-found` is `?recognized`.

Given the pattern "rec-something ( sd -- qt|0 )", the pattern
"?rec-something ( sd -- qt )" should be for words that accept a string
and throw an exception if it is not recognized as "something".

> What do we learn from this? Merging two previously separate types
> such that they are dealt with (partly) the same words (e.g., 0= in
> this case) is easy, as is mixing two kinds of sand. Separating two
> previously (partly) merged types to use type-specific words is a lot
> more work.

Yes. But this work is not justified in any way.
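The naming pattern under discussion can be sketched in a few lines of standard Forth. This is a minimal illustration, not Gforth's actual code; `rec-number` is a hypothetical recognizer name used only for the example, and -13 is the standard "undefined word" throw code:

```forth
\ Generic check for an x|0 result: throw -13 when the value is 0.
\ ( flag -13 and ) yields -13 for a true flag and 0 for a false one,
\ and THROW with 0 is a no-op, so this throws only on failure.
: ?found ( x|0 -- x )  dup 0= -13 and throw ;

\ Given a recognizer  rec-number ( sd -- qt|0 )  [hypothetical name],
\ the throwing variant following the "?rec-something" pattern would be:
: ?rec-number ( sd -- qt )  rec-number dup 0= -13 and throw ;
```

The `dup 0= -13 and throw` idiom works because a well-formed true flag is -1 (all bits set), so ANDing it with the throw code passes the code through unchanged.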
I see the problem a little differently — in terms of subtyping and type
hierarchies.

If a type B is a subtype of a type A, then all words that accept any
member of A also accept any member of B.

So when introducing a new type C, the first challenge is to optimally
choose the nearest supertype (or supertypes) for it.

For example, if you make C a subtype of A, then all methods of A apply
to C. If you make C a subtype of B, all methods of A and B apply to C.

When choosing a supertype, the factors for consideration are:
- consistency with existing types and methods;
- minimizing the lexical code size of programs;
- applying existing techniques and methods to the new types;
- restrictions on implementations.

We generally don't plan for future changes to subtype relationships.
Yes, they can be changed during the design and experimentation phase,
but that doesn't constitute an argument for choosing one supertype over
another.

Obviously, the more general a supertype is, the more implementation
options are available and the fewer existing methods can be applied to
members of the type. However, this dependence alone is also not an
argument for choosing one supertype over another.

Returning to recognizers.

There is a quite general type: ( i*x x\0 ). Let's call it "any-nz".

The unique feature of this type is that there is a simple and general
method to check whether a data object is a member of this type — just
check whether the top single-cell value is non-zero. And this method
applies to *any* subtype of this type. This method is made even more
elegant by the fact that control-flow operators apply it automatically.

Note that nt, xt, and wid are subtypes of any-nz.
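The "control-flow operators apply it automatically" point can be illustrated with `find-name`. This is a sketch assuming Gforth's `find-name ( c-addr u -- nt|0 )` and the Forth-2012 TOOLS EXT word `name>string`:

```forth
\ find-name returns an nt (a member of any-nz) on success and 0 on
\ failure.  IF itself tests the top of the stack for non-zero, so no
\ explicit comparison word is needed; ?DUP duplicates the result only
\ when it is non-zero, so the nt survives for NAME>STRING.
: .name-or-warn ( c-addr u -- )
  find-name ?dup if  name>string type  else  ." not found"  then ;

s" dup" .name-or-warn    \ should print the name on typical systems
```

The same shape works unchanged for any other any-nz subtype — `find` with xt|0, `search-wordlist`, and so on — which is exactly the economy the post is arguing for.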
Another side of any-nz is that the union type ( any-nz | 0 ) is a natively
discriminated union. This has led to a common approach of returning
any-nz on success and 0 on failure.

The question is: should the recognizers follow this approach? I think
so. This effectively means that the type of a success result of a
recognizer is a subtype of any-nz, and the type of a failure result is a
subtype of the unit type "0".

The only counterargument is that 0 on failure is too restrictive for
implementations.

This does not seem convincing, because `search-wordlist`, `find`,
`find-name`, and `find-name-in` return 0 on failure and this is not too
restrictive for implementations.

OTOH, why, in this case, should we prefer the convenience of
implementations over the convenience of programs?

> You can fake it by defining 0 CONSTANT TRANSLATE-NONE, but then you
> never know if your code ports to other systems where TRANSLATE-NONE is
> non-zero. For now Gforth does it this way, but I don't expect that to
> be the final stage.
>
> Should we prefer to separate types or merge them?

In other words, should we restrict implementation options in this
regard? Yes, because this is a common approach, which makes programs
simpler.

> Both approaches have advantages:
>
> * With separate words for dealing with the types, we can easily find
>   all uses of that type and do something about it. E.g., a while ago
>   I changed the cs-item (control-flow stack item) in Gforth from three
>   to four cells. This was relatively easy because there are only a
>   few words in Gforth that deal with cs items.

The cs-item example does not demonstrate any advantages because the
formal type didn't change.
You only needed to find the places where the
system-specific subtype was used by system-specific methods. Places
where the formal type was used didn't change.

> * With a merged approach, we can use the same words for dealing with
>   several types, with further words building upon these words (instead
>   of having to define the further words n times for n types). But
>   that makes the separation problem even harder.

A separation (i.e., breaking a subtyping relationship) should not be
planned at all.

> Overall, I think that the merged approach is preferable, but only if
> you are sure that you will never need to separate the types (whether
> due to a committee decision or because some new requirement means that
> you have to change the representation of the type).

If an old data type does not fit new requirements in the future, a
new type (and new methods) should be introduced. Changing existing
subtypes of an old type cannot be planned in principle.

--
Ruvim

--- SoupGate-Win32 v1.05
 * Origin: you cannot sedate... all the things you hate (1:229/2)
(c) 1994, bbs@darkrealms.ca