
   comp.lang.forth


   Message 117,765 of 117,927   
   Anton Ertl to Anton Ertl   
   Re: Rust, Forth and performance   
   22 Nov 25 17:54:30   
   
   From: anton@mips.complang.tuwien.ac.at   
      
   anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:   
   >anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:   
   >>In order to check out how the Forth systems do without this system   
   >>call overheads, I defined synonyms for TYPE, SPACE and CR that work as   
   >>noops, apart from the stack effect.  So when the programs do not   
   >>output their results, the run times are:   
      
   I achieved a small speedup by optimizing #S.  The new one looks as   
   follows:   
      
   : #s      ( ud -- 0 0 ) \ core	number-sign-s   
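       dup if                  \ high cell nonzero: double-cell division needed
           begin
               #               \ produce one digit with the double-cell #
           dup 0= until        \ repeat until the high cell becomes zero
       then
       drop begin              \ from here on single-cell division suffices
           base @ u/mod swap digit hold
       dup 0= until            \ repeat until the remaining value is zero
       0 ;                     \ push the high cell again: result is 0 0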
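
For readers following the Rust side of the thread, here is a rough analogue of the same two-phase idea, not a translation of Gforth's code. It assumes that `u128` stands in for the double-cell `ud` and `u64` for a single cell. The names `digits_two_phase` and `digit_char` are made up for this sketch, and it returns a `String` instead of filling the pictured numeric output buffer.

```rust
/// Convert a digit value (0..36) to its ASCII character, uppercase for letters.
fn digit_char(d: u32) -> u8 {
    char::from_digit(d, 36)
        .expect("digit value must be below 36")
        .to_ascii_uppercase() as u8
}

/// Hypothetical analogue of the new #S: use wide division only while the
/// high half is nonzero, then finish with narrow division.
fn digits_two_phase(mut ud: u128, base: u32) -> String {
    assert!((2..=36).contains(&base), "base must be in 2..=36");
    let mut rev = Vec::new(); // digits, least significant first

    // Phase 1: 128-bit division, needed only while the high 64 bits are nonzero.
    let wide_base = u128::from(base);
    while (ud >> 64) != 0 {
        rev.push(digit_char((ud % wide_base) as u32));
        ud /= wide_base;
    }

    // Phase 2: the value now fits in 64 bits, so cheaper division suffices.
    // Like BEGIN ... UNTIL, this loop always produces at least one digit.
    let mut u = ud as u64;
    let narrow_base = u64::from(base);
    loop {
        rev.push(digit_char((u % narrow_base) as u32));
        u /= narrow_base;
        if u == 0 {
            break;
        }
    }

    rev.reverse(); // most significant digit first
    String::from_utf8(rev).expect("digits are ASCII")
}
```

For typical inputs the high half is zero from the start, so only the second loop runs. That is presumably where the saving over a loop that always divides a double-cell number comes from.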
      
   0.68user 0.01system 0:00.70elapsed 98%CPU gforth-fast old #S   
   0.57user 0.00system 0:00.59elapsed 97%CPU gforth-fast new #S   
      
   Performance counter results:   
      
     old #S       new #S   
   gforth-fast   gforth-fast      lxf         Rust   
   no output      no output     no output   buffered 2   
    3245_981222  2690_088360   945_394481  1062_756213 cycles   
   11679_661274  9813_132978  2648_084410  4471_844679 instructions   
    1391_034028  1204_585688   447_127429   888_494927 branches   
       1_521428     1_520834     1_243916     1_329412 branch-misses   
            0.4          3.3         18.2          3.3 %  tma_backend_bound   
            3.9          3.9          6.1          3.8 %  tma_bad_speculation   
           24.6         19.5          4.3         10.9 %  tma_frontend_bound   
           71.1         73.3         71.4         82.0 %  tma_retiring   
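
Two derived figures make the table easier to read: instructions per cycle (IPC) and the cycle ratio between the old and new #S. The sketch below computes them from the counts above. The `Run` struct and the function names are made up for illustration; the numbers are copied from the table with the same underscore grouping.

```rust
/// One column of the performance-counter table.
struct Run {
    name: &'static str,
    cycles: u64,
    instructions: u64,
}

/// The four measured configurations, with counts copied from the table.
fn runs() -> [Run; 4] {
    [
        Run { name: "gforth-fast old #S", cycles: 3245_981222, instructions: 11679_661274 },
        Run { name: "gforth-fast new #S", cycles: 2690_088360, instructions: 9813_132978 },
        Run { name: "lxf", cycles: 945_394481, instructions: 2648_084410 },
        Run { name: "Rust buffered 2", cycles: 1062_756213, instructions: 4471_844679 },
    ]
}

/// Instructions retired per cycle.
fn ipc(r: &Run) -> f64 {
    r.instructions as f64 / r.cycles as f64
}

/// How many times more cycles `before` needs than `after`.
fn cycle_ratio(before: &Run, after: &Run) -> f64 {
    before.cycles as f64 / after.cycles as f64
}

/// One line per run: name and IPC rounded to two decimals.
fn report() -> String {
    runs()
        .iter()
        .map(|r| format!("{:<20} IPC {:.2}\n", r.name, ipc(r)))
        .collect()
}
```

By this arithmetic the new #S cuts cycles by a factor of about 1.2. The Rust program executes more instructions than lxf but at a higher IPC (about 4.2 versus about 2.8), which is why their cycle counts end up close.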
      
   I also looked at where Rust's buffered 2 variant spends its time, with   
   perf record and perf report:   
      
     18.62%  fillseq1::main   
     15.51%  cfree@GLIBC_2.2.5   
     11.40%  core::fmt::write   
      9.38%  core::fmt::num::imp::::_fmt   
      8.95%  malloc   
      7.99%  _ZN81_$LT$std..io..default_write_fmt..Adapter$LT$T$GT$$u20$as$u20$core..f
      7.58%  std::io::default_write_fmt   
      7.12%  __memmove_evex_unaligned_erms   
      5.27%  core::fmt::Formatter::pad   
      
   [Everything else is <2.5% individually, and <10% total.]   
      
   So malloc, free and memmove together consume about 31.6% (15.51% +
   8.95% + 7.12%), a significant part of the remaining time.
      
   Looking into fillseq1::main, there is nothing that catches my eye in   
   the hot part of the code.   
      
   - anton   
   --   
   M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html   
   comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html   
        New standard: https://forth-standard.org/   
   EuroForth 2025 CFP: http://www.euroforth.org/ef25/cfp.html   
   EuroForth 2025 registration: https://euro.theforth.net/   
      
