... darkrealms ...

Forums before death by AOL, social media and spammers... "We can't have nice things"
comp.lang.asm.x86
Ahh, the lost art of x86 assembly
4,675 messages
[ << oldest | < older | list | newer > | newest >> ]
Message 3,435 of 4,675
Terje Mathisen to Steve
Re: Compression At Bit level
12 Jun 18 17:26:34
   From: terje.mathisen@nospicedham.tmsw.no   
      
   Steve wrote:   
   > Hi Rudy,   
   >   
   > "R.Wieser"  writes:   
   >> Bilal Ahmad,   
   >>   
   >>> And I was just giving an example of Huffman encoding, I didn't   
   >>> mean specifically Huffman algo   
   >>   
   >> I didn't either, but as you specifically named that one I used it   
   >> in an example to how to google it.   You could do the same for all   
   >> kinds of compression methods: LZH, LZW, ARC, ZIP, you name it.   
   >> All of them can be implemented in Assembly, and I've seen code   
   >> examples to all of the above floating on the 'web somewhere.   
   >>   
   >>> And i think it would be difficult to build a tree at asm level.   
   >>   
   >> Not really.  But if you think so, why don't you consider building   
   >> trees as the first step to your goal ?   Forget everything else,   
   >> and just concentrate on building (and traversing) a tree.  You will   
   >> definitily learn a lot from experimenting with it. :-)   
   >>   
   >> Than again, you could also take a peek at (one of) the above   
   >> compression methods, and look at how they did it.  In other words:   
   >> Learning from the masters.   
   >>   
   >>> Like, we pack bits in some certain design, I tried to implement   
   >>> it, but I think it is not that much efficient.   
   >>   
   >> Hmmm...   There are quite a number smart*sses (in a positive way)   
   >> in this newsgroup, and quite a few will definitily want to help you   
   >> *as long as* you would post something they could actually look at   
   >> and comment upon.   
   >   
   > I have written assembly code for run length encoding (PCX), packbits   
   > (TGA and Apple paint/draw), LZW (GIF), and Huffman encoding (for the   
   > heck of it.  The LZW took quite a while to write, and is really   
   > horrible to look at.  The others were much easier to write and test   
   > for correct operation.   
   >   
   > That code was mostly written to decode or encode image data.  And as   
   > such, the compression was secondary to the proper display of the   
   > images.  If you want to consider compression as the primary goal;   
   > then either characterize your data to find the appropriate algorithm,   
   > or try them all and pick the best.  Run length and packbits are   
   > probably the easiest to implement.  Huffman wasn't bad, but never had   
   > to do any real world task.  LZW was difficult, though I made most of   
   > the problems myself by poor planning and lousy, redundant coding.   
      
   I strongly suggest you all take a look at LZ4!   
      
   This is a dead simple algorithm, in that the decoder will fit very   
   nicely indeed in the x86 register set, and it is in fact quite possible   
   to decode data faster than a REP MOVS can copy it.   
      
   Encoding is as usual quite a bit harder, particularly if you want to   
   search for the best possible compression, but it is still fast enough   
   that Google and many others use it to compress packetized data requests   
   on the fly.   
      
   (One hint: The core idea to get great decompression speed is to realize   
   that you quite often get RLL encoded source data which you can convert   
   from REP MOVSB to REP STOSD/STOSQ or even SSE/AVX stores.)   
      
   Terje   
      
   --   
   -    
   "almost all programming can be viewed as an exercise in caching"   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)
[ << oldest | < older | list | newer > | newest >> ]