Are there conditions under which given text $T$ that huffman code and arithmetic code will produce the exact encoding of $T$?
-
$\begingroup$ Of course. I assume it’s homework, so I’ll just say it’s really easy to show. $\endgroup$gnasher729– gnasher7292019-01-28 20:24:26 +00:00Commented Jan 28, 2019 at 20:24
-
$\begingroup$ This is for preparation for the exam.A friend told me that the exact opposite. Meaning that there are no conditions since Huffman is based on frequencies and the arithmetic takes the frequencies and uses them to find an interval. So if you say it's true what is the reason for that? $\endgroup$Kate– Kate2019-01-28 21:03:28 +00:00Commented Jan 28, 2019 at 21:03
-
$\begingroup$ That's an argument why they are usually different. Figure out why arithmetic coding is usually more efficient. Then figure out in which situations is is not more efficient, but exactly as efficient as Huffman. And there you get your conditions. $\endgroup$gnasher729– gnasher7292019-01-28 21:30:07 +00:00Commented Jan 28, 2019 at 21:30
2 Answers
yes, if all probabilities are in form of $2^{-k}$ then the arithmetic encoding will be the same as huffman encoding. both achieving the expected length of $H(X)$ (entropy).
In such conditions branching in arithmetic coding will be exact and behave the same way huffman coding does.
If the text consists of all bytes from 0 to 255 k times, in any order, then n = 256 and each byte will be represented by a single byte, and depending on the algorithm that choose the encodings, it is possible that 0 is mapped to 0, 1 is mapped to 1 etc, so T is encoded to T.
That is also possible if all bytes don’t have the exact same but approximately the same frequency, and n is not a multiple of 256. 0 to 255 may be mapped to the 8 bit codes 0 to 255 if the frequencies grow from c/n to 2c/n for some c, but it depends on the exact mapping algorithm.
Arithmetic coding may give the exact same encoding, but the frequencies would have to be much closer to 1/n.