

   comp.lang.forth      Forth programmers eat a lot of Bratwurst      117,927 messages   


   Message 117,761 of 117,927   
   Hans Bezemer to Paul Rubin   
   Re: Idiomatic way to read a word of text   
   21 Nov 25 18:57:53   
   
   From: the.beez.speaks@gmail.com   
      
   On 18-11-2025 00:25, Paul Rubin wrote:   
   > I'm playing with the idea of writing a Roff-like text formatter in   
   > Forth.  The input is lines of text "blah blech, and this that the   
   > other...".  The text lines can be arbitrarily long so I don't want to   
   > read the entire line into a memory buffer using something like REFILL.   
   >   
   > Let's say I don't have to worry about individual words overflowing   
   > memory though (segfault is not allowed, but it's ok to panic and quit).   
   > So the main loop will be to copy an input word to the output buffer and   
   > maybe flush the output buffer.  The output buffer can be of fixed size.   
   >   
   > Also, some input lines will be formatting commands like ".i\n" (change   
   > font to italic).  Those lines should be given to the Forth text   
   > interpreter.   
   >   
   > I guess I could use the FILE word set to write something like getc()   
   > with its own buffering, but that seems messy.  I'm wondering if this is   
   > a common situation and there's an idiomatic solution.   
      
   I don't know if it's "idiomatic", but it works. In essence, it reads the
   file in binary mode, one buffer at a time. If a partial word is left at
   the end of the buffer, it copies that to the start, adjusts the buffer
   address and size, and continues. It doesn't return a word per call; you
   open the file and it applies a quotation ( a n -- ) to each word parsed.
      
   No, it's not beautiful, but it works. BTW, if you happen to be German   
   and your prose contains words that exceed 256 characters, you're on your   
   own.   
      
   Hans Bezemer   
      
   ---8<---   
   256 constant /line   
      
   /line buffer: linebuf   
      
   : eow?                                 ( c -- f)
      case
        bl of true endof
         9 of true endof
        10 of true endof
        13 of true endof
        false swap
      endcase
   ;
                                           \ correct for last word   
   : ?lastword over 0= if linebuf swap chars + + else drop then over - ;   
   : -leading begin dup while over c@ bl = while 1 /string repeat then ;   
                                           \ get a word   
   : get-word                             ( a1 n1 len -- a1 n2 f)
      >r over 0 2swap bounds ?do i c@ eow? if i + leave then loop
      r> over >r ?lastword r>              \ word is delimited by white space
   ;
      
   : parse-line                           ( xt a n -- xt a' n')
      dup >r begin                         \ save length
        2dup r@ -rot 2>r get-word          ( xt a1 n2 f)
      while                                \ if we read a complete word
        >r over r@ swap execute            \ execute the action
        r> 2r> rot 1+ /string              \ adjust the buffer
      repeat 2rdrop rdrop                  \ leave the unfinished tail
   ;
      
   : open-txt s" netstrng.4th" r/o bin open-file
      abort" Cannot open 'myfile.txt'" ;
   : adjust >r linebuf r@ cmove linebuf /line r> /string ;   
   : close-txt close-file abort" Cannot close 'myfile.txt'" ;   
      
   : parse-file                           ( h xt -- h)   
      swap >r linebuf /line                \ put xt on execution stack   
      begin   
        r@ read-file 0= over 0<> and       \ read the file buffer   
      while                                \ if not an empty line   
        linebuf swap parse-line adjust     \ parse line and adjust buffer   
      repeat drop drop r>                  \ return handle
   ;
      
   open-txt [: -leading -trailing type cr ;] parse-file close-txt   
   ---8<---   
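
   For comparison, here is a minimal sketch of the same carry-over idea
   written with standard (Forth-2012) words only, so it can be tried
   outside 4tH. It is not the code above: every name in it (white?,
   skip-white, scan-word, each-word, .word, feed, pending, buf) is made
   up for the example, and FEED merely stands in for one READ-FILE chunk
   so that the behaviour can be checked on in-memory strings. Like the
   original, it has no answer for a word longer than the buffer.

   ```forth
   \ Sketch only: all names below are made up for this example.
   64 constant /buf
   create buf /buf allot
   variable pending  0 pending !        \ length of the partial word carried over

   : white? ( c -- f )                  \ same four delimiters as EOW? above
     dup bl = over 9 = or over 10 = or swap 13 = or ;

   : skip-white ( c-addr u -- c-addr' u' )   \ step over leading delimiters
     begin dup while over c@ white? while 1 /string repeat then ;

   : scan-word ( c-addr u -- c-addr' u' )    \ stop at the next delimiter, if any
     begin dup while over c@ white? 0= while 1 /string repeat then ;

   \ Apply xt ( c-addr u -- ) to every word that is followed by a delimiter;
   \ return the unfinished tail (u' = 0 if there is none).
   : each-word ( c-addr u xt -- c-addr' u' )
     >r
     begin
       skip-white 2dup scan-word        ( start u1 delim u2 )
       dup                              \ u2 <> 0: a delimiter was found
     while
       2swap 2 pick -                   ( delim u2 start len )
       r@ execute                       ( delim u2 )
     repeat
     2drop r> drop ;                    \ leaves start u1, the tail

   : .word ( c-addr u -- ) type cr ;

   \ FEED stands in for one READ-FILE chunk. No overflow check: it assumes
   \ the carried-over tail plus the new chunk fit in BUF.
   : feed ( c-addr u -- )
     tuck buf pending @ + swap move     \ append chunk after the tail
     pending @ + buf swap               ( buf total )
     ['] .word each-word                ( tail u' )
     dup pending !  buf swap move ;     \ move the tail to the buffer start

   : demo ( -- )
     s" alpha beta gam" feed            \ "gam" has no delimiter yet
     s" ma delta " feed ;               \ completes "gamma"
   demo
   ```

   DEMO prints alpha, beta, gamma and delta, one per line: "gam" is held
   back after the first chunk and completed by the second. Returning the
   tail from EACH-WORD keeps the buffer shuffling in one place (FEED), so
   the word-splitting loop never needs to know where the data came from.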
      



(c) 1994,  bbs@darkrealms.ca