Qbasicnews.com

Full Version: Does anybody know anything about compression?
Not just the algorithms: does anybody actually know the math behind compressing information? I tried, and managed to compress only a few digits with an elaborate method based on a base-36 number system (involving Greek letters, our letters and our numbers). I would like to get into that. Is there a really good tutorial out there? Let me know :)
The basic principle is that you assign short codes to the symbols that come up in the file a lot. I think I could set it up, but I'm not sure if it would be that great, or that fast.
data+compression+methods

:bounce:
RLE =Just about Anybody can....

LZW = Antoni Gual/Rich GelDreich
Quote:RLE =Just about Anybody can....

lol. Depends what you're compressing. You won't get very far compressing a novel that way, but it works for images...
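To make the novel-vs-image point concrete, here is a rough Python sketch of the three-byte RLE scheme mentioned later in the thread (a run longer than 3 becomes 3 bytes). The marker byte `ESC = 0xFF` and the function names are my own assumptions for illustration, not any standard format:

```python
ESC = 0xFF  # assumed marker byte; any value works if both sides agree


def rle_encode(data):
    """Encode runs longer than 3 as (ESC, run length, value)."""
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        run = 1
        # Measure the run, capped at 255 so the length fits in one byte.
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        if run > 3 or value == ESC:
            # The marker byte itself is always escaped so decoding is unambiguous.
            out += bytes((ESC, run, value))
        else:
            # Short runs are cheaper left as they are.
            out += bytes([value]) * run
        i += run
    return bytes(out)


def rle_decode(packed):
    """Reverse rle_encode."""
    out = bytearray()
    i = 0
    while i < len(packed):
        if packed[i] == ESC:
            run, value = packed[i + 1], packed[i + 2]
            out += bytes([value]) * run
            i += 3
        else:
            out.append(packed[i])
            i += 1
    return bytes(out)


# A made-up image row with long runs of the same pixel value.
scanline = bytes([0] * 40 + [7, 7] + [255] + [3] * 300)
# Ordinary text has almost no long runs, so it passes through unchanged.
text = b"a novel has few long runs"

print(len(scanline), "->", len(rle_encode(scanline)))
print(len(text), "->", len(rle_encode(text)))
```

The image-like row shrinks a lot, while the sentence comes out exactly the same size, which is the whole problem with RLE on text.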
*Digs out notes on compression*

If you just want algorithms then you'll find plenty of examples on the net, with RLE, LZH, gzip, bzip, Huffman encoding, etc. Also look at prefix codes and block codes for simple compression techniques.

The actual theoretical math is (IMHO) rather ugly. To start off with, you have the compression ratio
Code:
ratio = (LengthAfter / LengthBefore) * 100
and compression rate
Code:
rate = ((LengthBefore - LengthAfter) / LengthBefore) * 100

Those two are rather straightforward and are the basis of measurement for any given compression method. There are limits to compression: not all data can be compressed (if all messages are equally likely, there is no redundancy to remove), and not all data can be compressed equally well.
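As a quick illustration of the two measures, here is a small Python sketch. It uses the standard `zlib` module purely as a stand-in compressor (my choice, not something from this thread), and the helper names are made up for the example:

```python
import random
import zlib


def compression_ratio(length_before, length_after):
    """Compressed size as a percentage of the original size."""
    return (length_after / length_before) * 100


def compression_rate(length_before, length_after):
    """Percentage of the original size that was saved."""
    return ((length_before - length_after) / length_before) * 100


def measure(data):
    """Compress data with zlib and return (ratio, rate)."""
    packed = zlib.compress(data)
    return (compression_ratio(len(data), len(packed)),
            compression_rate(len(data), len(packed)))


# Highly repetitive input: plenty of redundancy to remove.
repetitive = b"ABAB" * 1024

# Pseudo-random bytes (fixed seed so runs are repeatable): every byte
# value is about equally likely, so there is nothing to exploit.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(4096))

for name, data in (("repetitive", repetitive), ("random", noisy)):
    ratio, rate = measure(data)
    print(f"{name}: ratio {ratio:.1f}%, rate {rate:.1f}%")
```

The ratio and the rate always add up to 100. On the random input the rate should come out around zero or slightly negative, since the compressor's own overhead makes the output a little bigger than the input.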

The main mathematical basis for compression is message probabilities and distributions. For example, with 8-bit characters (extended ASCII) you have 256 possible messages; in a novel, the message 'e' is far more probable than messages such as 'z' and 'x'. Encoding message 'e' with a short bit string and message 'z' with a longer one is how compression is achieved.
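One way to see the short-code-for-'e' idea in action is Huffman coding. The sketch below is a minimal Python version with hypothetical function names and a made-up sample sentence; it builds the code as plain bit strings for readability and does not pack them into bytes:

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def encode(text, code):
    """Concatenate the bit strings for each symbol."""
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Read bits until they match a codeword; prefix codes make this unambiguous."""
    reverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)


sample = "the bees freeze when the breeze sweeps the trees"
code = huffman_code(sample)
bits = encode(sample, code)

print("code for 'e':", code["e"], " code for 'z':", code["z"])
print("fixed 8-bit size:", 8 * len(sample), "bits; Huffman size:", len(bits), "bits")
```

In the sample, 'e' shows up far more often than 'z', so it gets a codeword that is no longer than the one for 'z', and the whole sentence takes fewer bits than 8 per character.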

There are formulas for the upper and lower bounds of compression, as well as things like noiseless coding, block code construction etc. (I won't post them here because they are too complex to write in text, unless you have LaTeX).
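For anyone who does have LaTeX handy, these are the three standard results I would expect those notes to be built on (my own pick of formulas, not a copy of the notes above). The symbols are assumptions for this sketch: p_i is the probability of message i, \ell_i is the length in bits of its codeword, and n is the number of possible messages.

```latex
% Entropy of the source, in bits per message: the lower bound on average code length
H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i

% Kraft inequality: satisfied by the codeword lengths of any binary prefix code
\sum_{i=1}^{n} 2^{-\ell_i} \le 1

% Average length L of an optimal prefix code (e.g. a Huffman code)
H(X) \le L < H(X) + 1, \qquad L = \sum_{i=1}^{n} p_i \ell_i
```

When all n messages are equally likely, H(X) = \log_2 n, which is its maximum. For 256 equally likely byte values that is 8 bits per byte, so no code can beat simply storing the bytes.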

Do a Google search on Shannon, Kraft and Huffman, who all did large amounts of work on the mathematics and theory behind compression. Hope this helps.
There are different ways of compressing, of which RLE is (I think) the easiest.

A harder one is the RDX compression.
rle = representing a run of more than 3 repeated pixels as 3 characters.

lzw = representing repeated patterns as 2 characters.

huffman/jpg = placing a voodoo doll in the center of a room, lighting a circle of fire around you and dancing around the room praying to Antoni Gual to send magic jpeg decoding rays from the sky.
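For the lzw line above, here is a rough Python sketch of textbook LZW. The function names are hypothetical, and it returns a plain list of integer codes; packing each code into a fixed number of bits (or the "2 characters" mentioned above) is left out:

```python
def lzw_compress(data):
    """Return a list of integer codes for the given bytes."""
    # Start with one table entry per possible byte value.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep growing the match while the pattern is already known.
            current = candidate
        else:
            # Emit the longest known pattern and learn the new one.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the bytes, recreating the same table on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Code refers to the entry being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("bad LZW code")
        out += entry
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(out)


data = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(data)
print(len(data), "bytes ->", len(codes), "codes")
print(codes)
```

Codes below 256 are single bytes; anything from 256 up stands for a repeated pattern the encoder has already seen, which is why the decoder never needs the table sent along with the data.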
All hail Antoni!!!!!


:rotfl:
:rotfl: :bounce: :king: :bounce: :rotfl: