   comp.arch      Apparently more than just beeps & boops      131,241 messages   

   Message 129,735 of 131,241   
   BGB to Anton Ertl   
   Re: Intel's Software Defined Super Cores   
   20 Sep 25 22:01:48   
   
   [continued from previous message]   
      
        Minecraft ran, but unplayable.   
          Even on lowest draw distance.   
        Doom 3, started up at least...   
          Severe graphical glitches (lighting didn't work correctly)   
          Dead slow.   
        Around 2.4 GB/sec in memcpy.   
          Around 2.0 GB/s in LZ4   
          Around 1500 Mpix/sec in CRAM decode.   
        Performs well in CPU based tasks.   
          OpenGL via Software rasterization almost as fast as the GPU.   
      
      Current PC (Ryzen 2700X, 3.7GHz, 8C16T)   
        No issues running any of these games.   
        Memcpy: 3.6 GB/sec.   
          DDR4-2133   
          Around 3.2 GB/sec in LZ4   
          Around 2000 Mpix/sec in CRAM decode.   
      
      
   As can be noted:   
      memcpy tests tend to measure lower than RAM bandwidth.   
      CRAM decode often tends to exceed memcpy.   
      My memcpy and LZ4 tests are single-threaded.   
        Multi-threading can often give higher total bandwidth.   
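
For reference, a single-threaded memcpy bandwidth test of the kind described above could look roughly like the following. This is only a sketch and not the benchmark that produced the numbers in this post: the name memcpy_bandwidth, the buffer size, and the use of clock() are all assumptions. It also shows one reason such a test lands below the nominal RAM bandwidth: it counts only the bytes copied, while each copied byte has to be both read and written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: copy `bytes` bytes `reps` times on one thread and
 * return copied bytes per second (0.0 on failure or if the timer did not
 * advance). Only copied bytes are counted, not read+write bus traffic. */
static double memcpy_bandwidth(size_t bytes, int reps)
{
    unsigned char *src, *dst;
    volatile unsigned char sink = 0;  /* keeps the copies observable */
    clock_t t0, t1;
    double secs;
    int i;

    assert(bytes > 0 && reps > 0);
    src = malloc(bytes);
    dst = malloc(bytes);
    if (!src || !dst) { free(src); free(dst); return 0.0; }

    /* Touch both buffers first so page faults are not timed. */
    memset(src, 0x55, bytes);
    memset(dst, 0, bytes);

    t0 = clock();
    for (i = 0; i < reps; i++) {
        src[0] = (unsigned char)i;    /* make each pass distinct */
        memcpy(dst, src, bytes);
        sink = dst[bytes - 1];        /* read back part of the result */
    }
    t1 = clock();

    free(src);
    free(dst);
    secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
    if (secs <= 0.0) return 0.0;
    return (double)bytes * (double)reps / secs;
}
```

Calling memcpy_bandwidth((size_t)64 << 20, 16) and dividing the result by 1e9 gives GB/sec. The buffer should be well beyond the last-level cache size, otherwise the test measures cache rather than RAM.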
      
      
   The bulk of time in CRAM decoding is spent in logic like:   
      tab[0]=colorA;   
      tab[1]=colorB;   
      px0=tab[(pix>>0)&1]; px1=tab[(pix>>1)&1];   
      px2=tab[(pix>>2)&1]; px3=tab[(pix>>3)&1];   
      ct[0]=px0; ct[1]=px1; ct[2]=px2; ct[3]=px3;   
      ct+=stride;   
      px0=tab[(pix>>4)&1]; px1=tab[(pix>>5)&1];   
      px2=tab[(pix>>6)&1]; px3=tab[(pix>>7)&1];   
      ct[0]=px0; ct[1]=px1; ct[2]=px2; ct[3]=px3;   
      ct+=stride;   
      px0=tab[(pix>> 8)&1]; px1=tab[(pix>> 9)&1];   
      px2=tab[(pix>>10)&1]; px3=tab[(pix>>11)&1];   
      ct[0]=px0; ct[1]=px1; ct[2]=px2; ct[3]=px3;   
      ct+=stride;   
      px0=tab[(pix>>12)&1]; px1=tab[(pix>>13)&1];   
      px2=tab[(pix>>14)&1]; px3=tab[(pix>>15)&1];   
      ct[0]=px0; ct[1]=px1; ct[2]=px2; ct[3]=px3;   
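
The unrolled logic above can be written as a small self-contained function. This is a sketch, not the actual decoder: the function name, the 16-bit pixel type, and the parameter layout are assumptions. It keeps the same bit order as the snippet (bit 4*row+col of pix selects colorB when set, colorA when clear) and the same stride in pixels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one 4x4 two-color block.
 * ct points at the first pixel of the block's first output row,
 * stride is the distance between output rows in pixels.
 * 16-bit pixels are an assumption; the post does not state the type. */
static void cram_block_2color(uint16_t *ct, ptrdiff_t stride, uint16_t pix,
                              uint16_t colorA, uint16_t colorB)
{
    uint16_t tab[2];
    int row;

    assert(ct != NULL);
    tab[0] = colorA;
    tab[1] = colorB;
    for (row = 0; row < 4; row++) {
        uint16_t *dst = ct + row * stride;   /* current output row */
        dst[0] = tab[(pix >> 0) & 1];
        dst[1] = tab[(pix >> 1) & 1];
        dst[2] = tab[(pix >> 2) & 1];
        dst[3] = tab[(pix >> 3) & 1];
        pix >>= 4;                           /* next row's selector bits */
    }
}
```

Each output pixel costs a shift, a mask, a table load, and a store, with only 2 bytes of input consumed per 16 pixels written. That is consistent with the decode rate being bounded mostly by stores rather than by input bandwidth.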
      
      
   > - anton   
      

