|    comp.programming    |    Programming issues that transcend langua    |    57,431 messages    |
|    Message 56,959 of 57,431    |
|    Stefan Ram to Stefan Ram    |
|    Re: Scanning    |
|    20 Jan 23 12:16:57    |
   
   From: ram@zedat.fu-berlin.de   
      
   ram@zedat.fu-berlin.de (Stefan Ram) writes:   
   >In the next iteration, I want to extend this to a sequence   
   >of paragraphs. Still without any real markup.   
      
    (As before, I was not able to shorten all lines of this post   
    to the 72 characters which are recommended for Usenet posts,   
    so please bear with me while some lines below will exceed   
    the length of 72 characters. I do not ignore Usenet customs   
    lightly, but only after painstaking consideration.)   
      
    This post is just a report, but contains no questions to the   
    group, so please read on only if you are interested in the topic!   
      
    It was a bit difficult for me to figure out how to properly   
    do things, so I resorted to reading Chapter 8 of the TeXbook   
    where the scanning of TeX is explained. To verify my understanding,   
    I wrote small snippets of TeX. For example,   
      
   \tracingscantokens1   
   \tracingcommands3   
   \tracingonline1   
   H   
      
   \tracingscantokens0   
      
    (That is, one line containing only an "H" and then one empty line.)   
      
    gave this output on TeX:   
      
   {the letter H}   
   {horizontal mode: the letter H}   
   {blank space }   
   {\par}   
      
    This is because TeX converts the first \n (directly after "H")
    to a blank space and the second (the empty line below "H") to
    the control sequence "\par".
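
    The end-of-line rule just described can be sketched in a few
    lines of Python. This is only an illustration, not the scanner
    whose tests follow: the names PAR, SPACE, end_of_line_token and
    convert are made up for this sketch, and leading or interior
    blanks are deliberately not handled here.

```python
PAR = "\\par"   # stands in for TeX's \par control sequence token
SPACE = " "     # stands in for TeX's space token

def end_of_line_token(line):
    """Return the token that the end of `line` turns into."""
    # A line that is empty once trailing spaces are gone ends a paragraph;
    # any other line end becomes a single blank space.
    return PAR if line.rstrip(" ") == "" else SPACE

def convert(text):
    """Turn text into a flat list of character, space and \\par tokens."""
    tokens = []
    for line in text.splitlines():
        tokens.extend(line.rstrip(" "))        # the characters of the line
        tokens.append(end_of_line_token(line))
    return tokens

print(convert("H\n\n"))   # ['H', ' ', '\\par']
```

    The three tokens correspond to the letter H, the blank space and
    the \par reported in the trace above.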
      
    I then tried to imitate this.   
      
    Here are the test cases I wrote for my code in Python:   
      
   catcode_dict[ '\t' ]= catcode_of_space # repeated here for clarification   
   process( 'Howdy___\nthere!' )   
   process( ' Howdy___\n there!_' )   
   process( 'H__\n\n' )   
   process( ' Howdy\n\n there!\n\n' )   
   process( ' Howdy\n \n there!\n \n' )   
   process( ' Howdy\n\n\n there!\n \n\n' )   
   process( 'Howdy\n\t\nthere!' )   
   catcode_dict[ '\t' ]= catcode_of_other   
   process( 'Howdy\n\t\nthere! (catcode of tab temporarily rededfined to "other")' )
   catcode_dict[ '\t' ]= catcode_of_space   
   process( '' )   
   process( ' ' )   
   process( ' ' )   
   process( r''' In a Galaxy, there lived a man.   
   He was happy when he was typing   
   paragraphs.''' )   
      
    As the output below shows, my scanner, just like TeX, ignores
    a tab at the end of a line when the tab character has been
    given the category of "space character" (as in plain TeX),
    but not when it has been given the category of "other
    character" (as in INITEX).
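
    One way such catcode-dependent behavior can come about is the
    three-state scheme from Chapter 8 of the TeXbook (state N: new
    line, M: middle of line, S: skipping blanks). The following is
    a minimal sketch under that assumption, not the code behind the
    tests in this post. The names SPACE, OTHER, DEFAULT_CATCODES,
    scan and render are hypothetical, only two categories are
    modeled, and "[par]" is just a string standing in for a
    paragraph token.

```python
# Hypothetical category names; only "space" and "other" are modeled here.
SPACE, OTHER = "space", "other"

# "_" is treated as a blank, as in the demonstration in this post.
DEFAULT_CATCODES = {" ": SPACE, "\t": SPACE, "_": SPACE}

def scan(text, catcodes=DEFAULT_CATCODES):
    """Return a list of tokens: characters, ' ' (space token) or '[par]'."""
    if text and not text.endswith("\n"):
        text += "\n"                       # complete the last line
    tokens = []
    for line in text.split("\n")[:-1]:     # the piece after the final "\n" is empty
        line = line.rstrip(" ")            # drop real spaces at the end of the line
        state = "N"                        # N: new line, M: mid-line, S: skipping blanks
        for ch in line:
            if catcodes.get(ch, OTHER) == SPACE:
                if state == "M":           # first blank after text: one space token
                    tokens.append(" ")
                    state = "S"
                # in states N and S, blanks are dropped
            else:
                tokens.append(ch)
                state = "M"
        if state == "N":                   # nothing but blanks on this line
            tokens.append("[par]")
        elif state == "M":                 # the line ended right after text
            tokens.append(" ")
        # in state S the end of the line is dropped
    return tokens

def render(text, catcodes=DEFAULT_CATCODES):
    """Join the tokens into one string for display."""
    return "".join(scan(text, catcodes))

print(repr(render("Howdy\n\t\nthere!")))
print(repr(render("Howdy\n\t\nthere!", {**DEFAULT_CATCODES, "\t": OTHER})))
```

    Under these assumptions the first call gives 'Howdy [par]there! '
    and the second gives 'Howdy \t there! ': a line holding only a
    tab counts as blank when the tab is a space character, but as
    text when the tab is an "other" character.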
      
    The output follows below. Most tests pass, but there is   
    still one error. (The error is: When the input is a sequence   
    of blanks, it produces [par], but should produce nothing.)   
    For demonstration purposes, the underscore "_" was made to   
    act like a blank space.   
      
    The actual output of the scanner is a sequence of tokens,   
    but it was assembled into a string for the demonstration   
    output below.   
      
    The output often ends with one space, because a '\n' is
    added to the end of the input if it's missing, and this
    '\n' is then converted to a space. So, ironically, while
    I set out to strip spaces at the end of lines, I now
    sometimes add them to the end of lines!
      
   'Howdy___\nthere!' (=input) ==>   
   'Howdy there! ' (=output)   
      
   ' Howdy___\n there!_' (=input) ==>   
   'Howdy there! ' (=output)   
      
   'H__\n\n' (=input) ==>   
   'H [par]' (=output)   
      
   ' Howdy\n\n there!\n\n' (=input) ==>   
   'Howdy [par]there! [par]' (=output)   
      
   ' Howdy\n \n there!\n \n' (=input) ==>   
   'Howdy [par]there! [par]' (=output)   
      
   ' Howdy\n\n\n there!\n \n\n' (=input) ==>   
   'Howdy [par][par]there! [par][par]' (=output)   
      
   'Howdy\n\t\nthere!' (=input) ==>   
   'Howdy [par]there! ' (=output)   
      
   'Howdy\n\t\nthere! (catcode of tab temporarily rededfined to "other")' (=input) ==>
   'Howdy \t there! (catcode of tab temporarily rededfined to "other") ' (=output)   
      
   '' (=input) ==>   
   '' (=output)   
      
   ' ' (=input) ==>   
   '[par]' (=output)   
      
   ' ' (=input) ==>   
   '[par]' (=output)   
      
   ' In a Galaxy, there lived a man.\nHe was happy when he was typing\nparagraphs.' (=input) ==>
   'In a Galaxy, there lived a man. He was happy when he was typing paragraphs. ' (=output)
      