Does anybody know anything about compression?
Not just the algorithms, but does anybody actually know the math behind compressing information? I tried, and only managed to compress a few digits with an elaborate method based on a base-36 number system (involving Greek letters, our letters and our numbers). I would like to get into that. Is there a really good tutorial out there? Let me know :)
The basic principle is that you assign short codes to the symbols that recur most often in the file. I think I could set it up, but I'm not sure it would be that good, or that fast.

"Peace cannot be obtained without war. Why? If there is already peace, it is unnecessary for war. If there is no peace, there is already war."

Visit www.neobasic.net to see rubbish in all its finest.
RLE = Just about anybody can....

LZW = Antoni Gual/Rich Geldreich
My smiley is 24 bit.
[Image: anya2.jpg]

Genso's Junkyard:
Quote: RLE = Just about anybody can....

lol. Depends what you're compressing. You won't get very far compressing a novel that way, but it works for images...
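
To make the point about images versus novels concrete, here is a minimal run-length encoding sketch in Python. It is only an illustration: the (count, value) pair layout, the 255 cap on run length and the two sample inputs are assumptions picked for the example, not any particular file format.

```python
def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse consecutive repeats into (count, value) pairs."""
    runs: list[tuple[int, int]] = []
    for value in data:
        # Extend the current run if the byte repeats and the count still fits in one byte.
        if runs and runs[-1][1] == value and runs[-1][0] < 255:
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            runs.append((1, value))
    return runs


def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    """Expand (count, value) pairs back into the original bytes."""
    return bytes(value for count, value in runs for _ in range(count))


image_row = bytes([0] * 20 + [255] * 12)  # long runs, like a flat-colour scanline
prose = b"Peace cannot be obtained"       # almost no repeated neighbours

# Each run costs two bytes (count + value) in this hypothetical layout.
print(len(image_row), "->", 2 * len(rle_encode(image_row)), "bytes")
print(len(prose), "->", 2 * len(rle_encode(prose)), "bytes")
```

The scanline shrinks because it is only two runs, while the prose grows because nearly every character becomes its own two-byte run.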
*Digs out notes on compression*

If you just want algorithms then you'll find plenty of examples on the net, with RLE, LZH, gzip, bzip, Huffman encoding, etc. Also look at prefix codes and block codes for simple compression techniques.

The actual theoretical math is (IMHO) rather ugly. To start off with you have compression ratio
ratio = (LengthAfter / LengthBefore) * 100
and compression rate
rate = ((LengthBefore - LengthAfter) / LengthBefore) * 100
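
As a quick worked example of these two measures, here is a small Python sketch. The function names and the 1000-byte and 250-byte figures are made up for illustration.

```python
def compression_ratio(length_before: int, length_after: int) -> float:
    """Size of the compressed data as a percentage of the original."""
    return length_after / length_before * 100


def compression_rate(length_before: int, length_after: int) -> float:
    """Percentage of the original size that was saved."""
    return (length_before - length_after) / length_before * 100


before, after = 1000, 250  # hypothetical file sizes in bytes
print(compression_ratio(before, after))  # 25.0
print(compression_rate(before, after))   # 75.0
```

The two numbers always add up to 100, so they are two views of the same measurement.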

Those two are rather straightforward and are the basis of measurement for any given compression method. There are limits to compression: not all data can be compressed (for instance, when all messages are equally likely there is no redundancy to remove), and not all data can be compressed equally well.
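
One way to see those limits in practice is to feed two very different inputs to an off-the-shelf compressor. The sketch below uses Python's standard `zlib` module (DEFLATE, the same method gzip uses). The sample data, the fixed seed and the helper name `rate` are assumptions for the demonstration.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
# Every byte value is equally likely, so there is no redundancy to exploit.
random_data = bytes(rng.getrandbits(8) for _ in range(10_000))
# Highly predictable data: the same short phrase over and over.
repetitive_data = b"lemon trees are nice " * 500


def rate(before: bytes, after: bytes) -> float:
    """Compression rate as defined above: percentage of the original saved."""
    return (len(before) - len(after)) / len(before) * 100


for name, data in [("random", random_data), ("repetitive", repetitive_data)]:
    packed = zlib.compress(data, 9)
    print(f"{name}: {len(data)} -> {len(packed)} bytes, rate {rate(data, packed):.1f}%")
```

The random input should come out slightly larger than it went in (a negative rate), while the repetitive input shrinks to a tiny fraction of its size.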

The main mathematical basis for compression is message probabilities and distributions. For example, using an 8-bit character set such as extended ASCII, you have 256 possible messages; in a novel, message 'e' is far more probable than messages such as 'z' and 'x'. Encoding message 'e' with a short bit string and message 'z' with a longer one is how compression is achieved.
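
The idea of short bit strings for probable messages can be sketched with Huffman coding, one standard way of building such a code. This is a minimal Python illustration rather than a production encoder: the function name, the sample sentence and the choice to keep partial codes in dictionaries are all assumptions made for brevity.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict[str, str]:
    """Build a prefix code where frequent symbols get shorter bit strings."""
    freq = Counter(text)
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # only one distinct symbol: give it a 1-bit code
        return {sym: "0" for sym in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


sample = "the quick brown fox sees three green trees near the eerie creek"
codes = huffman_codes(sample)
encoded_bits = sum(len(codes[ch]) for ch in sample)
print("code length for 'e':", len(codes["e"]), "for 'x':", len(codes["x"]))
print(encoded_bits, "bits vs", 8 * len(sample), "bits at 8 bits per character")
```

In this sample 'e' is the most common letter, so it ends up with one of the shortest codes, while the single 'x' gets one of the longest.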

There are formulas for the upper and lower bounds of compression, as well as things like noiseless coding, block code construction, etc. (I won't post them here because they are too complex to write in text, unless you have LaTeX.)
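
For reference, three of the best-known results in this area can be written in LaTeX as below. This is a sketch taken from standard information theory, not necessarily the exact formulas in the notes mentioned above; the symbols p_i (probability of message i), l_i (codeword length in bits), H (entropy) and L (expected code length) are introduced here for the example.

```latex
% Entropy of a source that emits message i with probability p_i (bits per message)
H = -\sum_{i} p_i \log_2 p_i

% Kraft inequality: codeword lengths l_i of any binary prefix code must satisfy
\sum_{i} 2^{-l_i} \le 1

% Expected length L of an optimal prefix code (e.g. a Huffman code) is bounded by
H \le L < H + 1, \qquad L = \sum_{i} p_i \, l_i
```

The entropy is the lower bound on the average number of bits per message. When all 256 byte values are equally likely it equals 8 bits, which is why such data cannot be compressed.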

Do a Google search on Shannon, Kraft and Huffman, who all did large amounts of work on the mathematics and theory behind compression. Hope this helps.
Jesus saves.... Passes to Moses, shoots, he scores!
There are different ways of compressing, of which RLE is (I think) the easiest.

A harder one is the RDX compression.
rle = representing runs of repeated pixels longer than 3 as 3 characters.

lzw = representing repeated patterns as 2 characters.

huffman/jpg = placing a voodoo doll in the center of a room, lighting a circle of fire around you and dancing around the room praying to antoni gual to send magic jpeg decoding rays from the sky.
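
For the lzw line above, here is a rough Python sketch of how repeated patterns get replaced by dictionary codes. It is a simplified illustration: codes are kept as plain integers instead of being packed into fixed-width fields, the dictionary is allowed to grow without limit, and the function names and sample text are assumptions.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Replace repeated byte patterns with integer dictionary codes."""
    table = {bytes([i]): i for i in range(256)}  # start with every single byte
    current = b""
    out: list[int] = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the known pattern
        else:
            out.append(table[current])     # emit the code for the longest match
            table[candidate] = len(table)  # learn the new, longer pattern
            current = bytes([byte])
    if current:
        out.append(table[current])
    return out


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the same dictionary on the fly and expand the codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = previous + previous[:1]  # code refers to the entry being created
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


text = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(text)
print(len(text), "bytes ->", len(codes), "codes")
assert lzw_decompress(codes) == text
```

A real encoder would write each code in a fixed number of bits (12 is a common choice), which is roughly where the "2 characters" per pattern comes from.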
[i]"I know what you're thinking. Did he fire six shots or only five? Well, to tell you the truth, in all this excitement, I've kinda lost track myself. But being as this is a .44 Magnum ... you've got to ask yourself one question: 'Do I feel lucky?' Well, do ya punk?"[/i] - Dirty Harry
All hail Antoni!!!!!

:rotfl: :bounce: :king: :bounce: :rotfl:
