comp.lang.forth
Ruvim to Anton Ertl
Re: Forth systems with address units >8
24 Jun 24 14:10:31
   From: ruvim.pinka@gmail.com   
      
   On 2024-06-24 10:41, Anton Ertl wrote:   
   > Ruvim  writes:   
   >> On 2024-06-23 21:10, Anton Ertl wrote:   
   >>> Ruvim  writes:   
   >>   
   >>>> It seems, in almost any system we can have a separate byte-based address   
   >>>> space. For an address in this space, 1+ produces the address of the next   
   >>>> consecutive byte.   
   >>>>   
   >>>>   
   >>>> For example, let's consider a cell-addressed, little-endian   
   >>>> Forth-system, where one cell is 32 bits, and several most significant   
   >>>> bits of addresses are always 0.   
   >>>>   
   >>>>   
   >>>> : byte-address ( addr -- b-addr ) #2 lshift ;   
   >>>   
   >>> The BCPL approach in reverse.  Just say No!   
   >>>   
   >>> Having two incompatible address types was bad in BCPL (and AmigaDOS   
   >>> programmers can show you their scars from this mistake), and it would   
   >>> be bad in Forth.   
   >>   
   >>   
   >> Well, it's not obvious to me why this is bad.   
   >   
   > It leads to bugs where the wrong kind of address is provided or   
   > expected.  It leads to complications in designing the words where you   
   > now have to deal with two kinds of addresses and design your words to   
   > expect or provide the right one.  And in cases where the usage of the   
   > word includes both kinds of addresses, perform the conversion before   
   > and/or after the call, or have two functionally parallel words, one   
   > for each kind of address; or maybe more, if you want to support   
   > various combinations for the different parameters and return values.   
      
   It seems that all these items are also true when using two types of   
   buffers: the native format, and the byte-per-address-unit format.   
      
      
   >   
   > Actually, it's worse than the BCPL approach: On the Amiga in BCPL the   
   > two kinds of addresses were clearly distinct, i.e., the conversions   
   > were not nops, and any mistakes in usage would be found quickly in   
   > testing.   
   >   
   > In the suggested approach, on the widely-used byte-addressed Forth   
   > systems the conversion words would be nops, and having one too many,   
   > too few, or in the wrong direction or in the wrong place would not   
   > become apparent in testing.  You would have to test on a system where   
   > the address unit >8 bits to find the mistake, and it's likely that the   
   > program won't test there for other reasons (e.g., because it uses   
   > OPEN-FILE).  We have seen with CHARS how well that has worked.  Even   
   > those who wanted to write Forth-94 standard programs could not test   
   > that their programs actually complied.  We finally accepted reality   
   > and standardized 1 chars = 1.   
   >   
   > My preferred alternative of reading and writing the data in a   
   > byte-per-address-unit format (or maybe converting between a packed and   
   > a byte-per-address-unit format) has the same problem, of course, but   
   > at a smaller scale: if we standardize words for doing this reading and   
   > writing, or this conversion, on a byte-addressed machine you cannot   
   > determine by testing that you did the OPEN-FILE without a BYTEWISE   
   > fam, where it would be appropriate.  However, the places where   
   > BYTEWISE would have to be inserted are far fewer, making it much easier   
   > to get right without testing, or to insert missing instances of   
   > BYTEWISE.   
   >   
   > For the variant where there is a conversion between packed and   
   > bytewise representations in memory the number of places to consider is   
   > between the fam approach and the two-kinds-of-address approach, so I   
   > would rather recommend the fam approach.   
   >   
   > If we do not standardize words for systems with address units >8 bits   
   > (and I currently don't plan to propose such words, because I would   
   > like to see some existing practice before proposing such words), the   
   > situation is actually not that much different from if we standardize   
   > them: many programs will not use these words either way, and it's best   
   > to go for the variant that requires the least changes.   
   >   
   > [two kinds of addresses]   
      
      
   Thank you very much for such a detailed explanation. This is very   
   convincing! Agree.   
      
      
      
   >>> If systems like jsforth want to go there, they should implement it and   
   >>> establish common practice about such things.  It will be interesting   
   >>> to see how this approach works out with, e.g., 20-bit cells.   
   >>   
   >>   
   >> It will not work if addresses use all bits in a cell.   
   >   
   > That's also the case for jsforth.  jsforth can address 16GB (4G   
   > cells), but 32-bit byte addresses can only address 4GB.   
      
   I think this is the strongest argument against a separate space of   
   byte-addresses.   
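
   To make the range loss concrete, here is a minimal sketch. It assumes
   32-bit cells, emulated by masking so that it also runs on systems with
   wider cells. The name byte-address32 is hypothetical and only mirrors
   the byte-address definition quoted above.

```forth
\ Hypothetical: byte-address on a cell-addressed system with 32-bit cells.
\ The mask emulates 32-bit wraparound on systems with wider cells.
: byte-address32 ( addr -- b-addr )  2 lshift $ffffffff and ;

\ Highest cell address that still converts to a distinct byte address:
$3fffffff byte-address32 hex u. decimal
\ One cell higher, the top bits are shifted out and the result aliases address 0:
$40000000 byte-address32 hex u. decimal
```

   Every cell address from $40000000 upwards collides with a lower one, so
   only a quarter of a full 32-bit cell address space gets a distinct byte
   address. This is the 16GB versus 4GB gap mentioned above.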
      
      
   >   
   >> The only way that I can see is to use double-cell size addresses to   
   >> refer to individual bytes (or even bits).   
   >   
   > On one hand, requiring double-cell addresses for W@ etc. will   
   > certainly ensure that most or all mistakes in converting between the   
   > address types will be found in testing.   
   >   
   > On the other hand, double-cell addresses for W@ etc. conflicts with   
   > existing practice   
      
   Of course, for words that accept double-cell addresses,   
   like  ( n.offset addr.base ), other names should be used.   
      
      
   > and is very likely to lead to a proposal that   
   > proposes it being rejected.  I also doubt that the users of   
   > systems with address units >8 bits would prefer it over the fam   
   > approach, and that the implementors of such systems would implement   
   > such words.   
      
      
      
   Concerning the fam-based approach.   
      
   If I understand it right, in this approach, a 32-bit little-endian   
   Forth system, in which an address unit is 16 bits (and then 1 char is 16   
   bits too), should provide:   
      
   : b@ ( addr -- x ) c@ $ff and ;   
   : w@ ( addr -- x ) dup b@ swap char+ b@  8 lshift or ;   
   : wle ( x -- x ) ;   
   : wbe ( x -- x ) $ffff and dup 8 lshift swap 8 rshift or $ffff and ;   
   : l@ ( addr -- x ) dup w@ swap cell+ w@ 16 lshift or ;   
   : lle ( x -- x ) ;   
   : lbe ( x -- x ) dup wbe 16 lshift swap 16 rshift wbe or ;   
   \ etc   
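
   The "etc" presumably covers the store counterparts. A possible sketch
   under the same assumptions (one byte per address unit, upper bits zero,
   little-endian order) follows. The names b!, w! and l! are chosen by
   analogy with the fetch words above and are assumptions, not established
   practice for such systems. The sketch uses "2 chars +" where l@ uses
   cell+. Both advance two address units on the described system, and the
   former also gives contiguous bytes on a byte-addressed system.

```forth
\ Assumed layout: each address unit holds one byte, little-endian order.

\ Store only the low 8 bits of x into the address unit at addr.
: b! ( x addr -- )  swap $ff and swap c! ;

\ Store a 16-bit value: low byte at addr, high byte in the next unit.
: w! ( x addr -- )  2dup b!  swap 8 rshift swap char+ b! ;

\ Store a 32-bit value: low wyde at addr, high wyde two units further.
: l! ( x addr -- )  2dup w!  swap 16 rshift swap 2 chars + w! ;
```

   Because b! masks its argument, w! and l! do not need to mask the
   shifted values themselves.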
      
   Also it should provide the "bytewise ( fam1 -- fam2 )" modifier (or   
   reuse the standard "bin"), which results in the conversion of a sequence   
   of bytes into a sequence of zero-extended wydes on reading, and the   
   reverse conversion on writing.   
      
   Probably, such a system should also provide words to unpack a buffer (to   
   copy each byte in the source buffer to an address-unit in the target   
   buffer), and to pack a buffer (to copy the byte from each address-unit   
   in the source buffer to a byte in the target buffer).   
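
   A rough sketch of such pack and unpack words follows. To keep it
   runnable on an ordinary byte-addressed system, it emulates each 16-bit
   address unit with one cell. The helper words unit@ and unit! and the
   names unpack-units and pack-units are hypothetical, and a little-endian
   byte order within each packed unit is assumed.

```forth
\ Emulation assumption: one cell stands in for one 16-bit address unit.
: unit@ ( i a-addr -- x )  swap cells + @ $ffff and ;
: unit! ( x i a-addr -- )  swap cells + >r $ffff and r> ! ;

\ Unpack: each source unit holds two bytes; every byte goes,
\ zero-extended, into its own target unit.
: unpack-units ( src dst u -- )   \ u = number of packed source units
  0 ?do
    over i swap unit@                    ( src dst w )
    2dup $ff and swap i 2* swap unit!    \ low byte  -> dst[2i]
    8 rshift over i 2* 1+ swap unit!     \ high byte -> dst[2i+1]
  loop 2drop ;

\ Pack: the low byte of two consecutive source units is combined
\ into one target unit.
: pack-units ( src dst u -- )     \ u = number of packed target units
  0 ?do
    over i 2* swap unit@ $ff and         ( src dst lo )
    2 pick i 2* 1+ swap unit@ $ff and    ( src dst lo hi )
    8 lshift or                          ( src dst w )
    over i swap unit!                    \ w -> dst[i]
  loop 2drop ;
```

   On a real system with 16-bit address units, unit@ and unit! would
   reduce to c@ and c! with plain address arithmetic. The unpacked buffer
   needs twice as many address units as the packed one.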
      
      
      
      
   --   
   Ruvim   
      