|    comp.lang.asm.x86    |    Ahh, the lost art of x86 assembly    |    4,675 messages    |
|    Terje Mathisen to Robert Prins    |
|    Re: Converting some way to clever PL/I c    |
|    04 Aug 17 12:37:11    |
   
   From: terje.mathisen@nospicedham.tmsw.no   
      
   The problem/idea here is that you initialize the BCD int to an illegal   
   value which is effectively zero, but which will maintain that flag info   
   until the first real operation on it, right?   
      
   For pure binary integer code there is of course no reserved value   
   (should probably have been MININT, i.e. 0x8000/-32768 for a 16-bit
   int), so you cannot add any extra info here.   
      
   As soon as you reserve a single value as the starting point for your   
   sum, then you cannot handle arbitrary inputs, and since an input of zero   
   is legal and should be added, you must use a separate flag:   
      
      
    sum = 0;
    added_values = 0;
    foreach (a in arr[]) {
        if (a >= 0) {
            sum += a;
            added_values++;
        }
    }
      
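    ;; ESI->array, ECX has count
    xor edx,edx       ; EDX = sum
    xor ebx,ebx       ; EBX = number of values added
   next:
    lodsd             ; EAX = next element, ESI advances (DF clear)
    test eax,eax      ; set flags from EAX
    jl skip           ; negative: do not add
    add edx,eax
    inc ebx
   skip:
    loop next         ; decrement ECX, repeat while ECX != 0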
      
   What's expensive here is the test for >= 0 for each element, not the   
   separate flag value (in EBX): Updating this is totally free.   
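
As a rough C rendering of the pseudo-code and loop above (a sketch, not a tuned implementation): the names `sum_result` and `sum_nonnegative` are made up for illustration, and the 64-bit accumulator is my own choice to sidestep signed overflow, whereas the assembly version keeps a 32-bit sum in EDX.

```c
#include <stddef.h>
#include <stdint.h>

/* Result of summing the non-negative elements of an array. */
typedef struct {
    int64_t sum;    /* 64-bit accumulator (assumption, the asm uses 32 bits) */
    size_t  added;  /* number of elements that were actually added */
} sum_result;

/* Hypothetical helper: adds every element >= 0 and counts how many were added. */
sum_result sum_nonnegative(const int32_t *arr, size_t n)
{
    sum_result r = { 0, 0 };
    for (size_t i = 0; i < n; i++) {
        if (arr[i] >= 0) {       /* the per-element test is the costly part */
            r.sum += arr[i];
            r.added++;           /* the separate count costs next to nothing */
        }
    }
    return r;
}
```

The separate count is what lets the caller tell "nothing was added" (`added == 0`) apart from "only zeros were added" (`added > 0`, `sum == 0`), which no reserved starting value for the sum can do on its own.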
      
   If the pattern of valid/invalid values in the input array is   
   unpredictable, then you could consider CMOV operations:   
      
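   next:
    lodsd
    xor edi,edi       ; EDI = 0, the value added for a negative element
    test eax,eax

    setge bl          ; BL = 1 if EAX >= 0, else 0
    cmovge edi,eax    ; EDI = EAX if EAX >= 0

    add edx,edi       ; add the element, or 0 if it was negative
    or bh,bl          ; BH = sticky "something was added" flag
    loop next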
      
   The snippet above will take ~5 cycles/iteration while the branchy   
   version is at least one cycle faster when correctly predicted.   
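
The same branch-free idea can be sketched in C with a mask instead of CMOV. This is only an illustration: `masked_sum` and `sum_nonnegative_branchless` are hypothetical names, and whether a given compiler actually emits SETcc/CMOVcc (or a branch after all) for this source is not guaranteed.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t sum;        /* wraps modulo 2^32, like ADD on a 32-bit register */
    int      any_added;  /* 1 if at least one element was >= 0 */
} masked_sum;

/* Hypothetical helper: sums the elements >= 0 without an if in the loop body. */
masked_sum sum_nonnegative_branchless(const int32_t *arr, size_t n)
{
    masked_sum r = { 0, 0 };
    for (size_t i = 0; i < n; i++) {
        uint32_t ok   = (uint32_t)(arr[i] >= 0); /* 1 or 0, like SETGE */
        uint32_t mask = 0u - ok;                 /* all ones or all zeros */
        r.sum += (uint32_t)arr[i] & mask;        /* adds the element or 0 */
        r.any_added |= (int)ok;                  /* sticky flag, like OR BH,BL */
    }
    return r;
}
```

Every element costs the same amount of work here regardless of its sign, which is the trade-off described above: no misprediction penalty on unpredictable data, but a longer dependency chain than a well-predicted branch.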
      
   If only positive array elements were OK, then you could initialize the   
   sum to -1, and at the end check it:   
      
   If still -1 then no legal values were found, otherwise increment the sum   
   and print it.   
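
A minimal C sketch of that -1 trick, assuming only strictly positive elements count as legal. The function name `sum_positive` and the 64-bit accumulator are my own choices for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 and stores the total in *out if at least
 * one element > 0 was found, otherwise returns 0 and leaves *out untouched. */
int sum_positive(const int32_t *arr, size_t n, int64_t *out)
{
    int64_t sum = -1;            /* sentinel: one below the true total */
    for (size_t i = 0; i < n; i++) {
        if (arr[i] > 0)
            sum += arr[i];       /* first addition moves sum to >= 0 */
    }
    if (sum == -1)
        return 0;                /* still the sentinel: nothing was added */
    *out = sum + 1;              /* undo the -1 offset */
    return 1;
}
```

This only works because every legal element is at least 1, so the running value can never come back to -1 once something has been added. With zero allowed as an input, a single added 0 would leave the sentinel unchanged.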
      
   Terje   
      
   Robert Prins wrote:   
   > I've recently come across some really clever/very nasty PL/I code,   
   > that would, theoretically, save CPU by eliminating a conditional   
   > jump. It relies on initializing a BCD-encoded ***integer*** variable   
   > ("sum") with -0.1, which causes, on IBM mainframes, the last nibble
   > of the BCD encoded value to contain 0xD (rather than the normal 0xC).   
   > The author uses this to avoid a costly (Phuleeze, pass me a bucket!)   
   > test, so rather than coding:   
   >   
   > sum = -1;   
   >   
   > do i = 1 to whatever;
   >   if a(i) >= 0 then
   >     if sum <> -1 then sum = sum + a(i);
   >     else sum = a(i);
   > end;
   >   
   > if sum <> -1 then "print sum";   
   >   
   > the code can be simplified to   
   >   
   > sum = -0.1; /* fraction is discarded, but -sign (0xD) is kept! */   
   >   
   > do i = 1 to whatever; if a(i) >= 0 then sum = sum + a(i); end;   
   >   
   > if last_nibble(sum) <> 0xD then "print sum";   
   >   
   > where "last_nibble" is a simplification of using two actual PL/I   
   > builtin functions that actually allow access to the last nibble of a   
   > BCD encoded value, and the addition of any a(i) to "sum", even an   
   > a(i) = 0 will cause the CPU to normalize the last nibble of "sum" to   
   > 0xC.   
   >   
   > Testing this for big "whatever" (in an outer loop, and using a small
   > array "a" in the inner loop) makes no flipping difference (on a
   > Hercules-emulated z/OS system), which doesn't surprise yours truly
   > one iota. :)
   >   
   > However, I would be curious if there is a way to code something   
   > similar in x86 assembler, when using strictly integer values, which   
   > implies that sum/eax must be initialized to -1(?), and the addition   
   > is preceded by a "cmp eax, -1(?)" to set up a carry, but that doesn't   
   > seem to work.   
   >   
   > Or am I just on a wild goose chase?   
   >   
   > Obviously using a "cmp eax, -1" followed by a "sete dl / movzx edx,dl   
   > / add eax,edx / add eax, 'a(i)'" works, but might take a few   
   > nano-seconds more than a pretty much very well predicted conditional   
   > jump...   
   >   
   > Any thoughts?   
   >   
   > Robert   
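
For reference, the "cmp eax, -1 / sete dl / movzx edx,dl / add eax,edx / add eax, a(i)" sequence from the quoted message corresponds roughly to the following C. This is a sketch under my own assumptions: the name `sum_with_sentinel` is hypothetical, and a 64-bit accumulator replaces EAX to avoid signed overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the sum of the elements >= 0,
 * or -1 if no element was added at all. */
int64_t sum_with_sentinel(const int32_t *arr, size_t n)
{
    int64_t sum = -1;                 /* sentinel start value */
    for (size_t i = 0; i < n; i++) {
        if (arr[i] >= 0) {
            sum += (sum == -1);       /* cmp/sete/movzx/add: -1 becomes 0 */
            sum += arr[i];            /* then add the element itself */
        }
    }
    return sum;
}
```

Because only non-negative elements are ever added, the running sum is never -1 after the first addition, so the comparison fires at most once.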
      
      