home bbs files messages ]

Forums before death by AOL, social media and spammers... "We can't have nice things"

   comp.lang.asm.x86      Ahh, the lost art of x86 assembly      4,675 messages   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]

   Message 2,877 of 4,675   
   Terje Mathisen to wolfgang kern   
   Re: Look back to "just for the H@ck"   
   20 Jul 17 14:42:37   
   
   From: terje.mathisen@nospicedham.tmsw.no   
      
   wolfgang kern wrote:   
   > I answered Terje Mathisen: ...   
   >>>>> My first ascii executable used both POP BX and Base91. :-)   
   >>>> was it much shorter then ?   
   >   
   >>> It didn't need the Base64 decoder, but handling 16-bytes to 13   
   >>> isn't free either. This was more than 20 years ago so I don't   
   >>> remember the details, like what I did if the input wasn't a   
   >>> multiple of 13 bytes long?   
   >   
   > [C..] you now my inabilty to read this.   
   >   
   > I checked on how 16:13 encoding could work, but I sure missed the   
   > trick and I would need 5 bits more (2*13 hints for 13 byte).   
      
   When I wrote that sample code I realized that it was far simpler than I   
   remembered: Just take a pair of input chars, convert them to the 0-90   
   range (i.e. subtract 33), then multiply the second by 91 and add:   
      
   This gives you a value in the 0..8280 range, which is sufficient to   
   cover the 0..8191 needed to encode 13 bits, i.e. there are 89 unused   
   combinations.   
      
   To make it more efficient we can do the math and realize that   
      
      sum = (a-33) + (b-33)*91   
      
   is the same as   
      
      sum = a + b*91 - (1+91)*33   
          = a + b*91 - 3036   
      
   Since 91 = 64+16+8+2+1 = 9*5*2+1   
   we can handle the multiplication with a few LEAs:   
      
      lea eax,[eax+eax*8]   
      lea eax,[eax+eax*4]   
      lea eax,[eax+eax*2+1]   
      
   which might be either slower or faster than a plain MUL.   
   >   
   > a 16* 7-bit string to 14* 8-bit conversion will not work (no   
   > 00..20), even a 16:14 variant would gain a bit more.   
   >   
   > my yet shortest will be a 9:7 encoding. ie: (first two bytes contain   
   > 14 modify hints) a set bit 0 will sub 0x30 of 3rd byte, (bit 1..6   
   > work on 4th..9th) a set bit 8 will sub 0x7e of 3rd byte, so if both   
   > are set then sub both.   
   >   
   > and it can end early, so multiple of 7 isn't required. it gaines 5%   
   > over base64. Now this is already one byte less in our Hello World :)   
      
   Bit stealing trickery is probably less size-efficient than simple   
   multiplication, shift & mask.   
      
   Anyway, I just remmebered that my original base91 encoder used a very   
   dirty trick: It stored the output bytes in a pre-shuffled order, so that   
   the decoder could push out result bytes with minimal effort.   
      
   I.e. always write out 8 bits, then accumulate the remaining 5 bits and   
   write out when the accumulator has at least 8 bits:   
      
      xor bx,bx	; BH always zero   
   nextpair:   
      lodsb   
      cmp al,34   
       je done	; chr(34) was the end marker!   
      sub al,35   
       jb nextpair	; Skip all values below chr(35)   
      mov bl,al   
   second_char:   
      lodsb   
      sub al,35   
       jb second_char   
      
      xchg al,bl   
      mov ah,91   
      mul ah   
      add ax,bx   
      
      stosb   
      mob al,bh   
      shr ax,cl   
      or dl,al   
      sub cl,5   
       ja nextpair   
      
      mov ax,dx   
      stosb   
      mov dl,ah   
      add cl,8   
       jmp nextpair   
      
      
   Another option is to use 5:4 coding, i.e. 5 chars of 6.4 bits each   
   encodes a 32-bit value, since this will only need 85 ascii values.   
      
   With the normal A-Zaz0-9 covering 62 of them, we need 23 more...   
      
   Here's the table:   
      
   69 chars 6.1 bits   
   74       6.2   
   79       6.3   
   85       6.4   
   91       6.5   
   97       6.6 (This needs the full 32-126 range plus two control chars!)   
      
   Terje   
      
   --   
   -    
   "almost all programming can be viewed as an exercise in caching"   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   

[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]


(c) 1994,  bbs@darkrealms.ca