memory - Understanding Huffman Compression? -


I am trying to understand the Huffman compression algorithm.

Let's take this word: YESSSS. According to the Huffman tree, we will have:

  • S: 4 times -> code: 0
  • Y: one time -> code: 01
  • E: one time -> code: 00

This eventually produces: 01 00 0 0 0 0

So far everything is clear.

Now my problem is the separation between the binary words. How is it stored in memory?

In other words, how will the computer know that:

  • The first letter is two bits
  • The second letter is two bits
  • The fourth letter is only one bit

Because 01 00 0 0 0 0 is the same bit sequence as 01 00 00 00, so it could mean either:

  • 01 00 0 0 0 0: YESSSS
  • 01 00 00 00: YEEE

Any thoughts please?

The encoder has to ensure that no longer bit sequence can be misinterpreted as a combination of shorter ones. In your example that is not the case: 00 cannot be told apart as either E or SS. A real Huffman tree never produces such a table, because no code it assigns is the beginning (prefix) of another code.
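To make the ambiguity concrete, here is a minimal Python sketch. The helper names `is_prefix_free` and `all_decodings` and the alternative table `fixed_codes` are made up for illustration; only `question_codes` comes from the question.

```python
def is_prefix_free(codes):
    """Return True if no code word is the beginning of another code word."""
    words = list(codes.values())
    for i, a in enumerate(words):
        for j, b in enumerate(words):
            if i != j and b.startswith(a):
                return False
    return True


def all_decodings(bits, codes):
    """List every way the bit string can be split into code words."""
    if not bits:
        return [""]
    results = []
    for sym, code in codes.items():
        if bits.startswith(code):
            rest = all_decodings(bits[len(code):], codes)
            results += [sym + tail for tail in rest]
    return results


# Table from the question: "0" (S) is a prefix of both "01" and "00".
question_codes = {"S": "0", "Y": "01", "E": "00"}
# One possible prefix-free table (an assumption, chosen for illustration).
fixed_codes = {"S": "0", "Y": "10", "E": "11"}

print(is_prefix_free(question_codes))                 # False
print(is_prefix_free(fixed_codes))                    # True
print(len(all_decodings("01000000", question_codes)))  # 13
print(all_decodings("10110000", fixed_codes))          # ['YESSSS']
```

With the table from the question, the same eight bits decode to 13 different strings, YESSSS and YEEE among them. With a prefix-free table there is exactly one reading, so no separator is needed.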

If you look at the example bit string chart here, you will notice that the short bit strings for the three most common characters (space, 'a', 'e') do not appear at the beginning of any of the longer bit strings. The same goes for the mid-length 4-bit characters: none of those bit strings appears at the beginning of the bit strings for the less frequent 5-bit characters.
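This property follows from how the table is built. As a rough sketch (not the code behind any particular chart), the function below builds a Huffman table with Python's standard `heapq` and `collections.Counter`. The name `huffman_codes` is an assumption for illustration, and which branch gets 0 or 1 is arbitrary, so the exact bit patterns can differ between implementations.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a Huffman code table {symbol: bit string} for the given text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code built so far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prepend the branch bit to each code.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


codes = huffman_codes("YESSSS")
print(codes)
print(sum(len(codes[ch]) for ch in "YESSSS"), "bits in total")  # 8 bits in total
```

Every symbol ends up at a leaf of the tree, so no code can be the beginning of another. For YESSSS, S gets a 1-bit code and Y and E each get 2-bit codes that do not start with S's bit.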

With this restriction in place, decoding becomes a matter of reading at least as many bits as the shortest bit string in your compression table. For example, if your shortest character compresses to 4 bits, you never read fewer than 4 bits of compressed data at a time. If those 4 bits do not match any of the short characters, you read one more bit and check the 5-bit strings, and so on until you find a match. Then you start again with another 4 bits.
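The reading loop described above can be sketched as follows. This is a simplified illustration that works on a string of '0' and '1' characters rather than packed bytes, and the `decode` function and the code table are hypothetical.

```python
def decode(bits, codes):
    """Decode a string of '0'/'1' characters using a prefix-free code table."""
    by_code = {code: sym for sym, code in codes.items()}
    shortest = min(len(code) for code in by_code)
    longest = max(len(code) for code in by_code)
    out = []
    pos = 0
    while pos < len(bits):
        # Start with the shortest possible code, then extend one bit at a time.
        for length in range(shortest, longest + 1):
            candidate = bits[pos:pos + length]
            if candidate in by_code:
                out.append(by_code[candidate])
                pos += length
                break
        else:
            raise ValueError(f"no code matches at bit offset {pos}")
    return "".join(out)


# Hypothetical prefix-free table for the letters in the question.
codes = {"S": "0", "Y": "10", "E": "11"}
print(decode("10110000", codes))  # YESSSS
```

Because the table is prefix-free, the first match is always the right one, so the decoder never needs to backtrack or look for a separator.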

Adding a separator character would defeat the purpose of compressing the data. You would be increasing your compressed alphabet from n characters to n + 1, and by the nature of compression the separator would most likely get an "average" or "long" bit sequence. That means that if your data contains a large number of characters with "short" sequences, each compressed character could end up taking more space than it did originally.

For example, in a sequence of characters that each compress to 3 bits, adding a 7-bit separator means you are now using 10 bits per character. Your data has grown by 25% instead of shrinking: what took 8 bits now requires 10.
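The arithmetic can be checked with a few lines of Python. The `size_change` helper is made up for this illustration; the 8, 3 and 7 bit figures are the ones from the example above.

```python
def size_change(original_bits, code_bits, separator_bits):
    """Relative size change per character when a separator follows each code."""
    new_bits = code_bits + separator_bits
    return (new_bits - original_bits) / original_bits


# 8-bit characters, 3-bit codes, a 7-bit separator after each one.
print(f"{size_change(8, 3, 7):+.1%}")  # +25.0%
# The same 3-bit codes with no separator, as a prefix-free code allows.
print(f"{size_change(8, 3, 0):+.1%}")  # -62.5%
```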

