home bbs files messages ]

Forums before death by AOL, social media and spammers... "We can't have nice things"

   comp.lang.asm.x86      Ahh, the lost art of x86 assembly      4,675 messages   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]

   Message 3,523 of 4,675   
   Robert Prins to Terje Mathisen   
   Re: Speeding up code - am I missing some   
   16 Aug 18 20:38:20   
   
   XPost: comp.lang.pl1, comp.lang.pascal.borland, comp.lang.pascal.misc   
   From: robert@nospicedham.prino.org   
      
   On 2018-08-16 16:02, Terje Mathisen wrote:   
   > (newsgroups limited to just clax86, my server does not allow indiscriminate   
   > cross-posting.)   
   >   
   > Robert, you do realize that anything you are going to print will take   
   billions   
   > of cycles for every second the printer needs to actually print the pages?   
      
   Really? ROFL...   
      
   Of course I do realise that once I/O is added, all bets are off. But as I've   
   mentioned in another reply you may already have seen, sometimes others approach   
   a problem from a completely different angle, with stunning results.   
      
   Somewhere in the back of my head I think I might be able to build the top-10 or   
   top<10 lists in my caching code, but that would mean adding another 5 pointers   
   to each item, and more arrays to keep track of start- and end-of-list   
   addresses.   
   It may be worth a try, but I remember that I once changed the PL/I version to   
   use arrays of pointers and indices in the actual allocated items, and code   
   generated with, I think Enterprise PL/I V3.9, was pretty AD 1985 Borland TP   
   3.01a-ish, even at the highest level of optimisation. Virtual Pascal would not   
   be any better, even two consecutive   
      
   a:= array[i].a;   
   b:= array[i].b   
      
   statements result in the calculation of the address of the variable in the   
   array   
   twice.   
      
   Robert   
   --   
   Robert AH Prins   
   robert(a)prino(d)org   
      
      
   >   
   > Terje   
   >   
   > Robert Prins wrote:   
   >> Hi all,   
   >>   
   >> Not really specific x86, Virtual Pascal, or PL/I, but as I've got the   
   >> same code written in all three languages, I'll give it a try here:   
   >>   
   >> I have an array (actually three, but that's not very relevant for the   
   >> issue) of pointers to items in a linked list. The array is sorted into a   
   >> specific order, and after the sort, I use it to print Top-N tables. For   
   >> the main Top-50 table (x 3) that's easy enough, the first 50 items are   
   >> selected.   
   >>   
   >> However, at this time the linked list also contains 5 sub-lists, each   
   >> containing additional sub-lists for a total of 390 (yes,   
   >> three-hunderd-and-ninety) sub-lists for which I want to print Top-10   
   >> tables (or if a sub-list contains just 4 items, a Top-4 table). Because   
   >> I don't really want to sort the main array six (x 3) times, I just scan   
   >> the array from the top, and pick out the items for each sub-list, until   
   >> I have 10 (or the 4 mentioned before) items. To speed up the process a   
   >> little, I make an initial pass over the array to find the first item for   
   >> each sub-list, but that doesn't really improve things a great deal.   
   >>   
   >> So given that "this" is the list, sorted on its main key, with the   
   >> second and third columns identifying additional sub-lists,   
   >>   
   >> 100, 'a', '1'   
   >> 099, 'b', '9'   
   >> 098, 'c', '1'   
   >> 097, 'b', '8'   
   >> 096, 'd', '3'   
   >> 095, 'a', '2'   
   >> 094, 'c', '8'   
   >> 093, 'c', '1'   
   >> 092, 'a', '3'   
   >> 091, 'f', '4'   
   >> 090, 'a', '2'   
   >>   
   >> the "Top-50" would contain the above.   
   >>   
   >> For sub-list 'a', I would scan the array, and the resulting "Top-10"   
   >> would be   
   >>   
   >> 100, 'a', '1'   
   >> 095, 'a', '2'   
   >> 092, 'a', '3'   
   >> 090, 'a', '2'   
   >>   
   >> Likewise, for sub-list 'b', I would scan the array, and the resulting   
   >> "Top-10" would be, and the initial fill-the-cache scan would have set   
   >> the starting index for the 'b' sub-list to the cell containing 099,   
   >> saving me one superfluous compare.   
   >>   
   >> 099, 'b', '9'   
   >> 097, 'b', '8'   
   >>   
   >> etc, and the same would happen for the sub-lists defined by column 3,   
   >> the first being   
   >>   
   >> 100, 'a', '1'   
   >> 098, 'c', '1'   
   >> 093, 'c', '1'   
   >>   
   >> As I wrote, I make an initial pass over the data to cache the first item   
   >> for each sub-list, some of them consist of a single item and are nearly   
   >> at the end of the sorted full array, and this saves some time, the count   
   >> of executed statements is reduced by around 10%, but if anyone can think   
   >> of a way to speed up things a bit more, without having to sort the   
   >> array(s) five more times (in an additional 10,000,000 executed   
   >> statements, see below), please share it with me.   
   >>   
   >> Some numbers:   
   >>   
   >> - the caching code executes around 120,000 PL/I statements   
   >> - sorting the three arrays executes around in 2,000,000 PL/I statements   
   >> (using a Heapsort)   
   >> - printing the 390 top-n tables executes around 7,600,000 PL/I   
   >> statements, and the "IF" statement that selects the data to be printed   
   >> has a hit-percentage of just 0.5%:   
   >>   
   >> IF STY_TCN = 'Tr' & STY = LIST.TR | /* 208 type 1 */ 1,841,754 executed   
   >> IF's   
   >>    STY_TCN = 'Na' & TCN = LIST.NA | /*  93 type 2 */   
   >>    STY_TCN = 'Yr' & STY = LIST.YR | /*  37 type 3 */   
   >>    STY_TCN = 'Co' & TCN = LIST.CO | /*  33 type 4 */   
   >>    STY_TCN = 'Ty' & TCN = LIST.TY | /*  19 type 5 */   
   >>    STY_TCN = '**' THEN              /*   1 main   */   
   >>   DO;   
   >>     #T = #T + 1;                        
    Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  9,594 - the   
   >> actual number of Top-N lines printed   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]


(c) 1994,  bbs@darkrealms.ca