Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.arch    |    Apparently more than just beeps & boops    |    131,241 messages    |
|    Message 129,735 of 131,241    |
|    BGB to Anton Ertl    |
|    Re: Intel's Software Defined Super Cores    |
|    20 Sep 25 22:01:48    |
[continued from previous message]

  Minecraft ran, but unplayable.
  Even on lowest draw distance.
  Doom 3, started up at least...
  Severe graphical glitches (lighting didn't work correctly)
  Dead slow.
  Around 2.4 GB/sec in memcpy.
  Around 2.0 GB/sec in LZ4
  Around 1500 Mpix/sec in CRAM decode.
  Performs well in CPU-based tasks.
  OpenGL via software rasterization almost as fast as the GPU.

Current PC (Ryzen 2700X, 3.7GHz, 8C16T)
  No issues running any of these games.
  Memcpy: 3.6 GB/sec.
  DDR4-2133
  Around 3.2 GB/sec in LZ4
  Around 2000 Mpix/sec in CRAM decode.


As can be noted:
  memcpy tests tend to measure lower than RAM bandwidth.
  CRAM decode often tends to exceed memcpy.
  My memcpy and LZ4 tests are single-threaded.
  Multi-threading can often give higher total bandwidth.


The bulk of time in CRAM decoding is spent in logic like:
  tab[0]=colorA;
  tab[1]=colorB;
  px0=tab[(pix>>0)&1]; px1=tab[(pix>>1)&1];
  px2=tab[(pix>>2)&1]; px3=tab[(pix>>3)&1];
  ct[0]=px0; ct[1]=px1; ct[2]=px2; ct[3]=px3;
  ct+=stride;
  px0=tab[(pix>>4)&1]; px1=tab[(pix>>5)&1];
  px2=tab[(pix>>6)&1]; px3=tab[(pix>>7)&1];
  ct[0]=px0; ct[1]=px1; ct[2]=px2; ct[3]=px3;
  ct+=stride;
  px0=tab[(pix>> 8)&1]; px1=tab[(pix>> 9)&1];
  px2=tab[(pix>>10)&1]; px3=tab[(pix>>11)&1];
  ct[0]=px0; ct[1]=px1; ct[2]=px2; ct[3]=px3;
  ct+=stride;
  px0=tab[(pix>>12)&1]; px1=tab[(pix>>13)&1];
  px2=tab[(pix>>14)&1]; px3=tab[(pix>>15)&1];
  ct[0]=px0; ct[1]=px1; ct[2]=px2; ct[3]=px3;


> - anton
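The unrolled CRAM logic quoted in the post can be read as decoding one 4x4 block from a 16-bit selector mask and two colors. Below is a minimal sketch of that idea as a standalone function. The post does not state the pixel type, the stride units, or any function name, so the 32-bit pixels, the stride measured in pixels, and the name decode_block_4x4 are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (name and signature are assumptions, not from the post).
 * Decodes one 4x4 two-color block:
 *   pix    : 16-bit selector mask; bit (4*y + x) selects the color of pixel (x, y)
 *   colorA : color written where the selector bit is 0
 *   colorB : color written where the selector bit is 1
 *   ct     : pointer to the top-left output pixel of the block
 *   stride : distance between output rows, in pixels (assumed unit)
 */
static void decode_block_4x4(uint16_t pix, uint32_t colorA, uint32_t colorB,
                             uint32_t *ct, ptrdiff_t stride)
{
    uint32_t tab[2];
    int y;

    /* Two-entry lookup table, indexed by a single selector bit. */
    tab[0] = colorA;
    tab[1] = colorB;

    for (y = 0; y < 4; y++) {
        uint32_t *row = ct + y * stride;   /* start of output row y */
        unsigned bits = (unsigned)(pix >> (4 * y)) & 0xFu; /* 4 selector bits */

        row[0] = tab[(bits >> 0) & 1];
        row[1] = tab[(bits >> 1) & 1];
        row[2] = tab[(bits >> 2) & 1];
        row[3] = tab[(bits >> 3) & 1];
    }
}
```

The loop form is equivalent to the four unrolled rows in the post. A compiler will typically unroll it again, so the manual unrolling is mostly a matter of style. The per-pixel table lookup avoids a branch per pixel, which may be part of why this kind of decode can keep up with, or exceed, a plain memcpy in the figures quoted.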
(c) 1994, bbs@darkrealms.ca