|    comp.programming    |    Programming issues that transcend langua    |    57,431 messages    |
|    Message 56,948 of 57,431    |
|    Dmitry A. Kazakov to Stefan Ram    |
|    Re: Scanning    |
|    19 Jan 23 14:50:58    |
From: mailbox@dmitry-kazakov.de

On 2023-01-19 13:10, Stefan Ram wrote:

> But how do I know in advance if the line will fit into
> memory?

No idea, my parser reads the whole source line into the buffer.

> Perhaps because of such fears, traditional scanners¹ do not
> read lines or, Heaven forbid, files, but only characters!

I think it is more of a C/UNIX tradition, coming from having neither proper
strings in the language nor lines/records in the filesystem.

> So how would you do it with this style of programming (never
> reading the whole line into memory)?

By never following this style and never using scanners, lexers,
tokenizers and other primitive stuff. I do all that in a single pass
that produces either the code or else the AST.

> "I read a character. If it's a space, I peek at the next
> character, if that's a space, I start adding spaces to my
> look-ahead buffer. If an EOL is encountered, the look-ahead
> buffer is discarded. Otherwise, I have to start feeding my
> client from the lookahead buffer until the lookahead buffer
> is empty."

Reasonable languages apply the rule that one blank character is
equivalent to any number of blank characters, so you could simply pass
a single space on. Note that you have to annotate tokens with their
source location anyway (another reason for ditching the scanner
altogether), so you do not need to care what this blank was built
of. And yet another reason not to use a scanner is that the blank can be
part of a, possibly malformed, comment or literal.

> Is it worth the effort with a look-ahead buffer and
> sequential access? Should you just read a line, assuming
> that a line will always fit into memory, and strip the
> blanks the easy way, i.e., using random access?

My parser works with an abstract source object. The implementation of
the source object maintains an internal line buffer, whose size is a
parameter. Whether it is set to 1TB or 1024 bytes, the parser does not care.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
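
A minimal sketch of the kind of source object described in the reply, written in Python purely for illustration. The names `Source`, `next_line`, `LineTooLongError` and `words` are hypothetical and are not taken from the poster's parser. Raising an error when a line exceeds the buffer is also an assumption, since the post does not say what happens in that case.

```python
import io


class LineTooLongError(Exception):
    """Raised when a source line does not fit into the line buffer."""


class Source:
    """Hypothetical source object: exposes one buffered line at a time."""

    def __init__(self, stream, buffer_size=1024):
        self._stream = stream
        self._buffer_size = buffer_size  # maximum line length, in characters
        self.line_no = 0
        self.line = ""

    def next_line(self):
        """Load the next line into the buffer; return False at end of input."""
        # Request one character more than the limit, so an oversized line
        # is detected without ever reading it whole.
        chunk = self._stream.readline(self._buffer_size + 1)
        if chunk == "":
            return False
        self.line_no += 1
        text = chunk[:-1] if chunk.endswith("\n") else chunk
        if len(text) > self._buffer_size:
            raise LineTooLongError(
                f"line {self.line_no} exceeds {self._buffer_size} characters"
            )
        self.line = text
        return True


def words(source):
    """Yield (line_no, column, word); any run of blanks is one separator."""
    while source.next_line():
        line = source.line
        pos = 0
        while pos < len(line):
            if line[pos] in " \t":
                pos += 1  # random access within the line: just skip blanks
                continue
            start = pos
            while pos < len(line) and line[pos] not in " \t":
                pos += 1
            # Columns are 1-based; the location travels with the word.
            yield source.line_no, start + 1, line[start:pos]


if __name__ == "__main__":
    src = Source(io.StringIO("a   b\n  c\n"), buffer_size=16)
    for item in words(src):
        print(item)
```

The `words` function never sees the buffer size: it only asks the source for the next line and indexes into it. Because every word carries its line and column, the consumer does not need to know how many blanks separated two words.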