
   comp.lang.forth      Forth programmers eat a lot of Bratwurst      117,927 messages   


   Message 117,475 of 117,927   
   Anton Ertl to peter   
   Re: Parsing timestamps?   
   17 Jul 25 12:54:29   
   
   From: anton@mips.complang.tuwien.ac.at   
      
   peter writes:   
   >Ryzen 9950X   
   >   
   >        lxf64   
   >     5,010,566,495     NAI cycles:u   
   >     2,011,359,782     UNR cycles:u   
   >       646,926,001     REC cycles:u   
   >     3,589,863,082     SR  cycles:u   
   >   
   >        lxf64   
   >     7,019,247,519     NAI instructions:u   
   >     4,128,689,843     UNR instructions:u   
   >     4,643,499,656     REC instructions:u   
   >    25,019,182,759     SR  instructions:u   
   >   
   >   
   >        gforth-fast 20250219   
   >     2,048,316,578      NAI cycles:u   
   >     7,157,520,448      UNR cycles:u   
   >     3,589,638,677      REC cycles:u   
   >    17,199,889,916      SR  cycles:u   
   >   
   >        gforth-fast 20250219   
   >    13,107,999,739      NAI instructions:u   
   >     6,789,041,049      UNR instructions:u   
   >     9,348,969,966      REC instructions:u   
   >    50,108,032,223      SR  instructions:u   
   >   
   >        lxf   
   >     6,005,617,374      NAI cycles:u   
   >     6,004,157,635      UNR cycles:u   
   >     1,303,627,835      REC cycles:u   
   >     9,187,422,499      SR  cycles:u   
   >   
   >        lxf   
   >     9,010,888,196      NAI instructions:u   
   >     4,237,679,129      UNR instructions:u   
   >     4,955,258,040      REC instructions:u   
   >    26,018,680,499      SR  instructions:u   
      
   >lxf uses the x87 builtin fp stack, lxf64 uses sse4 and a large fp stack   
      
   Apparently the latency of ADDSD (SSE2) is down to 2 cycles on Zen5   
   (visible in lxf64 UNR and gforth-fast NAI), while the latency of FADD   
   (387) is still 6 cycles (lxf NAI and UNR).  I have no explanation why   
   NAI performs so much worse than UNR on lxf64, and why UNR performs   
   so much worse than NAI on gforth-fast.   
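
   The latency figures can be read off the cycle counts. The following is a minimal sketch, not part of the original post: it assumes each run performs about 10^9 dependent F+ operations (a count not stated in this message, but consistent with the F+/cycle figures given for REC), so that cycles divided by that count approximates the latency of one addition. The names `CYCLES`, `ASSUMED_FADDS` and `cycles_per_add` are made up for illustration.

```python
# Cycle counts (cycles:u) copied from the quoted measurements.
CYCLES = {
    ("lxf64", "UNR"): 2_011_359_782,
    ("gforth-fast", "NAI"): 2_048_316_578,
    ("lxf", "NAI"): 6_005_617_374,
    ("lxf", "UNR"): 6_004_157_635,
}

# ASSUMPTION: each run executes about 1e9 F+ operations in a dependency
# chain. This count is inferred, not stated in the post.
ASSUMED_FADDS = 1_000_000_000


def cycles_per_add(cycles, adds=ASSUMED_FADDS):
    """Cycles spent per F+; approximates latency when the adds are dependent."""
    return cycles / adds


if __name__ == "__main__":
    for (system, variant), cycles in CYCLES.items():
        print(f"{system:12s} {variant}: {cycles_per_add(cycles):.2f} cycles per F+")
```

   Under that assumption the SSE2 runs come out near 2 cycles per F+ and the x87 runs near 6.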
      
   For REC, the latency should not play a role.  There lxf64 performs at   
   7.2 IPC and 1.55 F+/cycle, whereas lxf performs only at 3.8 IPC and 0.77   
   F+/cycle.  My guess is that FADD can only be performed by one FPU,   
   which is connected to one dispatch port, and that other instructions   
   also need, or are at least assigned to, this dispatch port.   
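
   The REC throughput figures follow from the quoted counters. This is a small sketch, not from the original post: IPC is instructions divided by cycles, and F+/cycle again assumes about 10^9 F+ operations per run, a count inferred here rather than stated in the message. `REC`, `ASSUMED_FADDS`, `ipc` and `fadds_per_cycle` are illustrative names.

```python
# REC counters copied from the quoted measurements: (instructions:u, cycles:u).
REC = {
    "lxf64": (4_643_499_656, 646_926_001),
    "lxf": (4_955_258_040, 1_303_627_835),
}

# ASSUMPTION: about 1e9 F+ operations per run; inferred, not stated in the post.
ASSUMED_FADDS = 1_000_000_000


def ipc(instructions, cycles):
    """Instructions retired per cycle."""
    return instructions / cycles


def fadds_per_cycle(cycles, adds=ASSUMED_FADDS):
    """F+ operations completed per cycle under the assumed operation count."""
    return adds / cycles


if __name__ == "__main__":
    for system, (instructions, cycles) in REC.items():
        print(f"{system:6s} IPC={ipc(instructions, cycles):.1f} "
              f"F+/cycle={fadds_per_cycle(cycles):.2f}")
```

   With these inputs the results round to the 7.2 IPC and 1.55 F+/cycle given for lxf64, and the 3.8 IPC and 0.77 F+/cycle given for lxf.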
      
   - anton   
   --   
   M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html   
   comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html   
        New standard: https://forth-standard.org/   
   EuroForth 2025 CFP: http://www.euroforth.org/ef25/cfp.html   
      
