From: peter.noreply@tin.it   
      
   On Mon, 17 Nov 2025 15:25:20 -0800   
   Paul Rubin wrote:   
      
   > I'm playing with the idea of writing a Roff-like text formatter in   
   > Forth. The input is lines of text "blah blech, and this that the   
   > other...". The text lines can be arbitrarily long so I don't want to   
   > read the entire line into a memory buffer using something like REFILL.   
   >   
   > Let's say I don't have to worry about individual words overflowing   
   > memory though (segfault is not allowed, but it's ok to panic and quit).   
   > So the main loop will be to copy an input word to the output buffer and   
   > maybe flush the output buffer. The output buffer can be of fixed size.   
   >   
   > Also, some input lines will be formatting commands like ".i\n" (change   
   > font to italic). Those lines should be given to the Forth text   
   > interpreter.   
   >   
   > I guess I could use the FILE word set to write something like getc()   
   > with its own buffering, but that seems messy. I'm wondering if this is   
   > a common situation and there's an idiomatic solution.   
      
   In lxf and lxf64 I have the following words to support processing files:
      
   MAP-FILE ( addr len fam -- a2 l2 ior )   
   UNMAP-FILE ( a2 l2 -- ior )   
   MAP-FILE maps a file into memory; fam is r/o or r/w.
   UNMAP-FILE releases the mapping again.
      
   GET-LINE ( a1 l1 -- a3 l3 a2 l2 )
   GET-WORD ( a1 l1 -- a3 l3 a2 l2 )
   Each takes a memory region and returns the first line/word on top of
   the stack, with the remaining region below it.
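
For readers without lxf, the splitting behaviour described above can be approximated in standard Forth. This is only a sketch and not lxf's implementation: the names MY-GET-LINE and MY-GET-WORD are hypothetical, and it assumes that lines end with LF (character 10), that the terminator is not part of the returned line, and that any character up to and including BL counts as a word separator. It needs /STRING from the String word set.

```forth
decimal

\ Hypothetical helper: true for space and control characters.
: blank? ( c -- flag )  bl 1+ u< ;

\ Advance past leading blanks.
: skip-blanks ( a l -- a' l' )
  begin dup while over c@ blank? while 1 /string repeat then ;

\ Advance to the first blank, or to the end of the region.
: scan-blank ( a l -- a' l' )
  begin dup while over c@ blank? 0= while 1 /string repeat then ;

\ Split off the first line. a2 l2 is the line without its LF,
\ a3 l3 is the rest of the region after the LF.
: my-get-line ( a1 l1 -- a3 l3 a2 l2 )
  2dup
  begin dup while over c@ 10 <> while 1 /string repeat then
  dup >r                  \ remember how much was left at the LF
  dup if 1 /string then   \ step over the LF if one was found
  2swap r> - ;            \ line length = l1 minus what was left

\ Split off the first word. Blanks around the word are dropped,
\ so a3 l3 starts at the next word or is empty.
: my-get-word ( a1 l1 -- a3 l3 a2 l2 )
  skip-blanks 2dup scan-blank
  dup >r skip-blanks
  2swap r> - ;
```

Under these assumptions a line consisting only of blanks still yields one empty word, so a counting loop built on this sketch may want to test the word length before incrementing.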
      
   Here is a simple example that counts the lines and words in a file:
      
   \ Process a file   
      
   variable #words   
   variable #lines   
      
   : process-line ( a l -- )
      1 #lines +!
      begin
         dup while
         get-word 2drop 1 #words +!   \ discard the word, just count it
      repeat
      2drop ;

   : process-file ( a l -- )          \ a l is the file name
      r/o map-file throw              \ -- a2 l2
      0 #words !  0 #lines !
      2dup                            \ keep a2 l2 for unmap-file
      begin
         dup while
         get-line process-line
      repeat
      2drop
      unmap-file throw
      ." the file has " #lines @ .
      ." lines and " #words @ .
      ." words!" ;
      
   As Anton has already noted, lxf uses this internally for parsing source
   files. Mapping the file uses memory regions outside the current process.
   The effect is like first allocating memory and then reading in the
   entire file.
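
The "allocate, then read the whole file" equivalent can be sketched with the standard File-Access and Memory-Allocation word sets. The names READ-WHOLE-FILE and RELEASE-FILE are hypothetical and are not lxf words. Unlike MAP-FILE, this version throws on error instead of returning an ior, and it assumes the file size fits in a single cell.

```forth
\ Read an entire file into freshly allocated memory.
\ c-addr u is the file name, a2 l2 the contents.
: read-whole-file ( c-addr u -- a2 l2 )
  r/o open-file throw >r         \ R: fileid
  r@ file-size throw             \ -- ud
  abort" file too large"         \ high cell of the size must be 0
  dup allocate throw             \ -- u a2
  swap 2dup r@ read-file throw   \ -- a2 u u'
  nip                            \ keep the count actually read
  r> close-file throw ;

\ Counterpart to UNMAP-FILE for memory obtained this way.
: release-file ( a2 l2 -- )  drop free throw ;
```

On a system without MAP-FILE, these two words could stand in for `r/o map-file throw` and `unmap-file throw` in the example, at the cost of copying the file into the heap.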
      
   BR   
   Peter   
      