From: hebisch@math.uni.wroc.pl   
      
   dik wrote:   
   > i do a lot of ASCII text processing with TP and i am   
   > constantly impressed with how fast buffered reads and writes   
   > take place with WriteLn and ReadLn. however, i very much   
   > would like to filter the incoming character stream from a   
   > file so that i can accommodate UNIX-style line feed   
   > delimited strings, break-up zero delimited stream-style data   
   > and detect inadvertently chosen binaries.   
   >   
   > to this end i began experimenting with a ReadLn replacement   
   > that i named read_lin. as is my habit in experiments, i   
   > initially defined all parameters as globals and passed   
   > nothing to read_lin. i do this for flexibility and speed,   
   > anticipating that once feasibility is established, i can   
   > gather the decisive variables into a record and pass that to   
   > the procedure in the finished form.   
   >   
   > so, using ...   
   >   
   > procedure read_lin ;   
   >   
   > in a timed loop reading a large ASCIIZ file created on the   
   > fly with WriteLn ...   
   >   
   > set_ticks ; while not fil_eof do begin   
   >   
   > read_lin ; try_ior ;   
   >   
   > end ; sho_ticks ;   
   >   
   > ... after much fiddling i obtained the following results:   
   >   
   > TP7 > READ_RACE WriteLn 14 ReadLn 21 read_lin 21   
   > TP5 > READ_RACE WriteLn 25 ReadLn 22 read_lin 22   
   >   
   > that is, in the case of TP7, WriteLn made a (15 MB or so)   
   > ASCIIZ file in 14 BIOS ticks that ReadLn read in 21 ticks,   
   > and the experimental read_lin also read in 21 ticks.   
   >   
   > i was amazed and encouraged that the read_lin prototype -   
   > all TP no ASM so far - could even come close, much less   
   > match ReadLn's performance!   
   >   
   > but here's the rub and finally my question.
   >   
   > i gathered together the variables involved into a record and   
   > did the experiment again with ...   
   >   
   > procedure read_lin ( var fil_rec : a_fil_rec ) ;   
   >   
   > ... and got the following results:   
   >   
   > TP7 > READ_RACE WriteLn 14 ReadLn 21 read_lin 68   
   > TP5 > READ_RACE WriteLn 25 ReadLn 22 read_lin 129   
   >   
   > an unusable and dismal reduction in read_lin performance.   
   >   
   > i do not understand why this is so. i thought that passing a   
   > VAR to a procedure meant passing a pointer that should   
   > entail minimal stack overhead before the procedure gets   
   > going. likewise, in this experiment each line in the disk   
   > file is about 45 characters, so the number of calls to   
   > read_lin is relatively small anyway - the whole reason for   
   > embedding a read_chr sub-function into read_lin.   
   >   
      
   You pass a pointer, which is cheap, but then every access to a field of
   fil_rec goes through this pointer.
      
   > my experimental program is below. as listed, it runs without   
   > passing fil_rec, which is commented out ...   
   >   
   > procedure read_lin { var fil_rec : a_fil_rec } ;   
   >   
   > changing the code to ...   
   >   
   > procedure read_lin ( var fil_rec : a_fil_rec ) ;   
   >   
   > is the problem.   
   >   
   > any enlightenment will be greatly appreciated!   
   >   
   > DIK   
   >   
      
   > procedure read_lin { var fil_rec : a_fil_rec } ;   
   >   
   > label LOOP ;   
   >   
   > begin with fil_rec do begin   
   >   
   > lin_pos := 0 ;   
   > lin [0] := #0 ;   
   >   
   > LOOP : if buf_pos = buf_end then begin   
   >   
   > buf_pos := 0 ;   
   > buf_end := 0 ;   
   >   
      
   If the variable fil_rec is global, then TP knows where in memory buf_pos
   is, so the line
      
    buf_pos := 0 ;   
      
   is essentially a single machine instruction. But when fil_rec is a var
   parameter, you get something like:

    read the pointer to fil_rec
    put 0 at the proper location

   which is twice as much work. Worse yet: IIRC a TP pointer consists of
   a segment and an offset, so you have extra segment manipulation, which
   is rather expensive on modern processors.
      
   You have put all your work variables inside the fil_rec record. Make
   char, buf_pos, etc. into local variables (local to read_lin): load the
   ones that must persist between calls from fil_rec on entry and store
   them back on exit. That should make read_lin faster.
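
   A minimal sketch of that idea, not the original program: the record
   layout, BUF_SIZE, SAMPLE and the in-memory buffer are assumptions made
   up for illustration, and only the names read_lin, a_fil_rec, fil_rec,
   buf_pos, buf_end and lin come from the post. The hot fields are copied
   into locals once, the loop works on the locals, and the results are
   stored back once at the end. The buffer and line accesses still go
   through the pointer, so the gain has to be measured.

```pascal
program LocalCopyDemo;

const
  BUF_SIZE = 4096;              { assumed buffer size }
  SAMPLE   = 'abc'#10'de'#10;   { two LF-terminated test lines }

type
  { Hypothetical stand-in for the poster's a_fil_rec. }
  a_fil_rec = record
    buf     : array [0..BUF_SIZE - 1] of Char;
    buf_pos : Word;             { next unread position in buf }
    buf_end : Word;             { number of valid bytes in buf }
    lin     : string[255];      { current line, length byte in lin[0] }
  end;

{ Extract one LF-terminated line from the buffer into fil_rec.lin. }
procedure read_lin (var fil_rec : a_fil_rec);
var
  p, stop : Word;               { local copies of buf_pos and buf_end }
  len     : Byte;               { local line length }
  c       : Char;
  done    : Boolean;
begin
  { one pointer access per field on entry ... }
  p    := fil_rec.buf_pos;
  stop := fil_rec.buf_end;
  len  := 0;
  done := False;

  { ... then the loop counters live on the stack }
  while (p < stop) and not done do
  begin
    c := fil_rec.buf[p];
    Inc(p);
    if c = #10 then
      done := True
    else if (c <> #13) and (len < 255) then
    begin
      Inc(len);
      fil_rec.lin[len] := c;
    end;
  end;

  { ... and one store per field on exit }
  fil_rec.lin[0]  := Chr(len);
  fil_rec.buf_pos := p;
end;

var
  rec : a_fil_rec;
  i   : Word;

begin
  { fill the buffer from memory instead of a file to stay self-contained }
  for i := 1 to Length(SAMPLE) do
    rec.buf[i - 1] := SAMPLE[i];
  rec.buf_pos := 0;
  rec.buf_end := Length(SAMPLE);

  read_lin(rec); WriteLn(rec.lin);
  read_lin(rec); WriteLn(rec.lin);
end.
```

   Run as is, this should print abc and de on separate lines. In the real
   routine the buffer refill would sit inside the same loop, with buf_end
   reloaded into the local after each refill.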
      
   ----   
    Waldek Hebisch   
   hebisch@math.uni.wroc.pl   
      