From: already5chosen@yahoo.com   
      
   On Thu, 30 Oct 2025 16:04:51 +0100   
   David Brown wrote:   
      
   > On 30/10/2025 13:07, bart wrote:   
   > >   
   > >   
   > > OK, "make -j" gave a real time of 30s, about three times faster.   
   > > (Not quite sure how that works, given that my machine has only two   
   > > cores.)   
   >   
   > You presumably understand how multi-tasking works when there are more   
   > processes than there are cores to run them. Sometimes you have more   
   > processes ready to run, in which case some have to wait. But   
   > sometimes processes are already waiting for something else (typically   
   > disk I/O here, but it could be networking or other things). So while   
   > one compile task is waiting for the disk, another one can be running.   
   > It's not common for the speedup from "make -j" or "make -j N" for   
   > some number N to be greater than the number of cores, but it can   
   > happen for small numbers of cores and slow disk.   
   >   
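
   A toy model of the effect David describes, as a sketch rather than a
   measurement: one core, identical jobs, and each job first blocks
   (disk, antivirus, whatever) and then computes. The function name, the
   wait/cpu split and the numbers are all my own assumptions, chosen
   only to show the shape of the effect.

```python
import heapq

def makespan(num_jobs, slots, wait, cpu):
    """Finish time for num_jobs identical jobs on ONE core.

    Hypothetical model: each job first blocks for `wait` time units
    without using the core, then needs `cpu` units of the core.
    At most `slots` jobs are in flight, like `make -j slots`.
    """
    pending = num_jobs      # jobs not started yet
    ready = []              # min-heap: times at which in-flight jobs stop waiting
    for _ in range(min(slots, pending)):
        heapq.heappush(ready, wait)   # first batch starts waiting at t=0
        pending -= 1
    core_free = 0           # time at which the single core becomes idle
    while ready:
        # earliest-ready job runs next, once the core is available
        start = max(heapq.heappop(ready), core_free)
        core_free = start + cpu
        if pending:
            # the finished job frees a slot; the next job starts waiting now
            heapq.heappush(ready, core_free + wait)
            pending -= 1
    return core_free

if __name__ == "__main__":
    for slots in (1, 2, 3, 33):
        print(f"-j {slots}: {makespan(33, slots, wait=2, cpu=1)} time units")
```

   With waiting twice as long as computing, three slots already keep the
   core busy all the time, and adding more slots changes nothing.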
      
   It *can* give a much higher speedup than the number of cores.
   Measurements taken on a relatively small MCU project: 33 modules,
   size:
       text    data     bss     dec     hex filename
      26953     156   28028   55137    d761
      
   Compiled on my corporate desktop.   
   Good hardware (Intel i7-14700, 8 P cores, 12 E cores, 28 logical CPUs,
   competent SSD: Samsung PM9F1).
   Bad software environment - very aggressive antivirus + 2 other   
   "management" crapware agents.   
      
   msys2, arm-none-eabi-gcc 13.3.0   
      
   2nd column: execution time with all cores enabled.
   3rd column: execution time with compilation locked to a single
   logical CPU (P-core).
   4th column: execution time with compilation locked to a single
   logical CPU (E-core).
      
   flags   tm-all      tm-one-P    tm-one-E
   none    0m20.689s   0m21.162s   0m44.608s
   -j 2    0m9.464s    0m11.199s   0m34.154s
   -j 3    0m6.855s    0m8.695s
   -j 4    0m4.970s    0m7.992s    0m21.895s
   -j 5    0m4.429s    0m7.632s
   -j 6    0m4.016s    0m7.340s
   -j 7    0m3.766s    0m7.296s
   -j 8    0m3.564s    0m7.248s
   -j 9    0m3.439s    0m7.245s    0m20.323s
   -j 10   0m3.562s    0m7.324s
   -j 28   0m3.741s    0m7.295s
   -j 33   0m3.623s    0m7.128s    0m18.098s
   -j      0m3.843s    0m7.187s    0m19.365s
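
   For anyone who wants the ratios rather than raw times, a small
   script that turns the table into speedups relative to the "none"
   row. The helper names are mine; the numbers are copied from the
   table, with None where a cell is empty.

```python
def seconds(t):
    """Convert a `time`-style string such as '0m20.689s' to seconds."""
    minutes, rest = t.split("m")
    return int(minutes) * 60 + float(rest.rstrip("s"))

# (flags, tm-all, tm-one-P, tm-one-E); None marks an empty cell
ROWS = [
    ("none",  "0m20.689s", "0m21.162s", "0m44.608s"),
    ("-j 2",  "0m9.464s",  "0m11.199s", "0m34.154s"),
    ("-j 3",  "0m6.855s",  "0m8.695s",  None),
    ("-j 4",  "0m4.970s",  "0m7.992s",  "0m21.895s"),
    ("-j 5",  "0m4.429s",  "0m7.632s",  None),
    ("-j 6",  "0m4.016s",  "0m7.340s",  None),
    ("-j 7",  "0m3.766s",  "0m7.296s",  None),
    ("-j 8",  "0m3.564s",  "0m7.248s",  None),
    ("-j 9",  "0m3.439s",  "0m7.245s",  "0m20.323s"),
    ("-j 10", "0m3.562s",  "0m7.324s",  None),
    ("-j 28", "0m3.741s",  "0m7.295s",  None),
    ("-j 33", "0m3.623s",  "0m7.128s",  "0m18.098s"),
    ("-j",    "0m3.843s",  "0m7.187s",  "0m19.365s"),
]

def speedups(rows):
    """Return (flags, [ratio per column]) relative to the first row."""
    base = [seconds(t) for t in rows[0][1:]]
    result = []
    for flags, *times in rows:
        ratios = [b / seconds(t) if t else None for b, t in zip(base, times)]
        result.append((flags, ratios))
    return result

if __name__ == "__main__":
    print(f"{'flags':6} {'all':>6} {'one-P':>6} {'one-E':>6}")
    for flags, ratios in speedups(ROWS):
        cells = [f"{'-':>6}" if r is None else f"{r:5.2f}x" for r in ratios]
        print(f"{flags:6}", *cells)
```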
      
   So on a single P-core I see an almost 3x speedup (21.2 s down to
   about 7.1-7.3 s) from concurrency alone, with no actual parallelism.
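
   A back-of-envelope reading of the single-P-core column, as a rough
   sketch only. It assumes the best measured time (7.128 s at -j 33) is
   close to pure CPU work, and that each job takes as long with
   neighbours as it does alone. Neither assumption is verified, and the
   names below are mine.

```python
SERIAL = 21.162      # measured: no -j, one P-core
SATURATED = 7.128    # measured: best one-P-core time, ASSUMED to be pure CPU work

def estimate(n_jobs):
    """Idealized time for `make -j n_jobs` on one core under the assumptions above."""
    # Either the core is the bottleneck, or the waits are spread over n_jobs slots.
    return max(SATURATED, SERIAL / n_jobs)

# Share of the serial run during which the core would be idle, if the
# assumption about SATURATED holds.
idle_fraction = 1 - SATURATED / SERIAL

# Measured one-P-core times from the table, keyed by the -j value.
MEASURED = {2: 11.199, 3: 8.695, 4: 7.992, 9: 7.245}

if __name__ == "__main__":
    print(f"core idle in serial build: about {idle_fraction:.0%}")
    for n, t in MEASURED.items():
        print(f"-j {n}: estimate {estimate(n):.2f}s, measured {t:.3f}s")
```

   Under those assumptions the core sits idle for roughly two thirds of
   the serial build. The measured times stay somewhat above the
   estimate at every -j value, as expected from a model that ignores
   scheduling overhead.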
      